Disease mapping with spatial CAR models

NIMBLE provides the ability to specify and fit conditional autoregressive (CAR) models, both intrinsic and proper models.

Here we’ll use an intrinsic CAR (ICAR) model for disease mapping. We’ll illustrate with data on hospital admissions due to respiratory disease in 2010 from the 134 Intermediate Geographies (IG) north of the river Clyde in the Greater Glasgow and Clyde health board. The data are available in the CARBayesdata package and can be transformed into the neighborhood information needed by NIMBLE using functions from the spdep package. In particular, we need to provide vectors indicating which regions are neighbors of which other regions (adj), the weights for each pair of neighbors (weights), and the number of neighbors for each region (num).

As usual, the mean of the Poisson counts includes an offset for the expected count. We’ll also include a covariate, the percentage of people defined to be income deprived in each spatial region.

library(nimble, warn.conflicts = FALSE)

## nimble version 1.3.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
## 
## Note for advanced users who have written their own MCMC samplers:
##   As of version 0.13.0, NIMBLE's protocol for handling posterior
##   predictive nodes has changed in a way that could affect user-defined
##   samplers in some situations. Please see Section 15.5.1 of the User Manual.

library(CARBayesdata, quietly = TRUE)
library(sp, quietly = TRUE)
library(spdep, quietly = TRUE)

## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`

## Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE

data(GGHB.IZ)
data(respiratorydata)

We handle the spatial analysis here with nb2WB from the package spdep.

respiratorydata_spatial <- merge(x = GGHB.IZ, y = respiratorydata, by = "IZ", all.x = FALSE)
W.nb <- poly2nb(respiratorydata_spatial, row.names =  rownames(respiratorydata_spatial))
## Determine neighborhood/adjacency information needed for neighborhood-based CAR model
nbInfo <- nb2WB(W.nb)

# A vector of indices indicating which regions are neighbors of which.
nbInfo$adj

##   [1]   2   3   5  87  91  93   1   3  91 113   1   2   5  11 113   6   7  10
##  [19]  98 101 105 106   1   3   9  11  19  87  90 104 109   4  10  12 105 108
##  [37]   4  10  20 106 109  11  15  16 113 114   5  19  20 109   4   6   7  12
##  [55]  13  14  20   3   5   8  16  19  22 113   6  10  13  17  26 108 111 112
##  [73] 122 124 126  10  12  14  17  10  13  17  20   8  16  18  24 113 114 116
##  [91]   8  11  15  18  22  12  13  14  20  25  26  15  16  21  22  24   5   9
## [109]  11  20  22  28   7   9  10  14  17  19  23  25  28 109  18  22  24  11
## [127]  16  18  19  21  24  27  28  20  25  15  18  21  22  17  20  23  26  12
## [145]  17  25 126  22  28  19  20  22  27  30  33  37  38  29  31  34  38  30
## [163]  34  35  41  44  39  40  29  36  37  39  30  31  35  38  41  42  31  34
## [181]  41  33  37  39  43  29  33  36  38  43  29  30  34  37  42  43  32  33
## [199]  36  40  43  46  47  32  39  46  55  31  34  35  42  44  45  34  38  41
## [217]  43  45  52  36  37  38  39  42  47  52  54  31  41  45  50  51  58  41
## [235]  42  44  50  52  59  39  40  47  48  55  39  43  46  48  54  56  46  47
## [253]  55  56  53  57  61  44  45  58  59  70  44  58  70  42  43  45  54  59
## [271]  63  49  55  61  65  43  47  52  56  63  40  46  48  53  56  62  65  66
## [289]  72  47  48  54  55  62  63  74  49  60  61  44  50  51  70  45  50  52
## [307]  63  70  91 113  57  61  64  67  68  73  49  53  57  60  65  68  71  55
## [325]  56  63  66  74  76  83  52  54  56  59  62  74  91 113  60  67  69  75
## [343]  78  53  55  61  71  72  81  55  62  72  76  60  64  73  75  60  61  71
## [361]  73  77  79  64  78  80  82  50  51  58  59 113  61  65  68  79  81  55
## [379]  65  66  76  81  88  90  60  67  68  75  77  85  56  62  63  83  91  93
## [397]  64  67  73  78  85  62  66  72  83  87  90  68  73  79  85  86  64  69
## [415]  75  80  82  85  68  71  77  81  84  86  69  78  82  92  95  65  71  72
## [433]  79  84  88  69  78  80  85  95  62  74  76  87  93  79  81  86  88  89
## [451]  73  75  77  78  82  86  94  95  98 101  77  79  84  85  89  96  98   1
## [469]   5  76  83  90  93  72  81  84  89  90  96  97 100 104  84  86  88  96
## [487]   5  72  76  87  88 104   1   2  59  63  74  93 113  80  95  99 103 117
## [505]   1  74  83  87  91  85  95 101 102 105  80  82  85  92  94  99 102  86
## [523]  88  89  98 100 106  88 100 104 109   4  85  86  96 101 106  92  95 102
## [541] 103 107  88  96  97 106 109   4  85  94  98 102 105  94  95  99 101 105
## [559] 107  92  99 107 117 119   5  88  90  97 109   4   6  94 101 102 107 108
## [577]   4   7  96  98 100 109  99 102 103 105 108 110 119   6  12 105 107 110
## [595] 111 112   5   7   9  20  97 100 104 106 107 108 112 119 122  12 108 112
## [613]  12 108 110 111 122 124   2   3   8  11  15  59  63  70  91 114   8  15
## [631] 113 115 116 114 116  15 114 115  92 103 118 119 117 119 120 121 123 103
## [649] 107 110 117 118 120 122 118 119 121 122 118 120 122 123 126  12 110 112
## [667] 119 120 121 124 126 118 121 126 127  12 112 122 126 127 129 130  12  26
## [685] 121 122 123 124 127 132 123 125 126 129 132 131 125 127 130 132 125 129
## [703] 131 132 128 130 132 133 126 127 129 130 131 133 134 131 132 134 132 133

# A vector of weights. In this case, all weights are 1.
head(nbInfo$weights)

## [1] 1 1 1 1 1 1

# A vector of length N. num[n] indicates how many neighbors region n contains.
# This helps map the adj vector to the starting region.
nbInfo$num

##   [1]  6  4  5  7  9  5  5  5  4  7  7 11  4  4  7  5  6  5  6 10  3  8  2  4  4
##  [26]  4  2  4  4  4  5  2  4  6  3  4  5  6  7  4  6  6  8  6  6  5  6  4  3  5
##  [51]  3  6  4  5  9  7  3  4  7  6  7  7  8  5  6  4  4  6  4  5  5  7  6  6  5
##  [76]  6  5  6  6  5  6  5  5  5 10  7  6  9  4  6  7  5  5  5  7  6  4  6  5  5
## [101]  6  6  5  5  7  6  7  7  8  5  3  6 10  5  2  3  4  5  7  4  5  8  4  4  3
## [126]  8  5  1  4  4  4  7  3  2

Now we have the three pieces of information we need. We’re ready to use the dcar_normal distribution in a nimbleModel.

nregions <- nrow(respiratorydata_spatial)

code <- nimbleCode({
    # priors
    beta ~ dnorm(0, sd = 100)
    sigma ~ dunif(0, 100)   # prior for variance components based on Gelman (2006)
    tau <- 1 / sigma^2
    # latent process
    s[1:N] ~ dcar_normal(adj[1:L], weights[1:L], num[1:N], tau, zero_mean = 0)
    # likelihood
    for(i in 1:N) {
        log(lambda[i]) <- log(expected[i]) + beta*x[i] + s[i]
        y[i] ~ dpois(lambda[i])
    }
})

x <- respiratorydata_spatial$incomedep
x <- x - mean(x)  # center for improved MCMC performance

set.seed(1)

constants <- list(N = nregions, L = length(nbInfo$adj), 
               adj = nbInfo$adj, weights = nbInfo$weights, num = nbInfo$num,
                        x = x, expected = respiratorydata_spatial$expected)
data <- list(y = respiratorydata_spatial$observed)
inits <- list(beta = 0, sigma = 1, s = rnorm(nregions))
      
model <- nimbleModel(code, constants = constants, data = data, inits = inits)

## Defining model

## Building model

## Setting data and initial values

## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.

## Checking model sizes and dimensions

cModel <- compileNimble(model)

## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.

conf <- configureMCMC(model, monitors = c('beta', 'sigma', 's'))

## ===== Monitors =====
## thin = 1: beta, s, sigma
## ===== Samplers =====
## RW sampler (2)
##   - beta
##   - sigma
## CAR_normal sampler (1)
##   - s[1:134]

conf$printSamplers()

## [1] RW sampler: beta
## [2] RW sampler: sigma
## [3] CAR_normal sampler: s[1:134]

MCMC <- buildMCMC(conf)
cMCMC <- compileNimble(MCMC, project = cModel)

## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.

samples <- runMCMC(cMCMC, niter = 10000, nburnin = 1000)

## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

Here are density plots for some posterior samples of interest.

plot(density(samples[,"sigma"]))

plot(density(samples[,"s[1]"]))

plot(density(samples[,"beta"]))

In some cases, MCMC mixing in CAR models (particularly when considering the CAR hyperparameter(s)) can be slow, and considering NIMBLE’s HMC sampler (in the nimbleHMC package) may be worthwhile.