Reversible Jump MCMC (RJMCMC) is a method for sampling across models with a different numbers of dimensions, which finds natural application in Bayesian variable selection problems. In this setting, RJMCMC requires a proposal distribution for the coefficient of a variable when the variable is proposed to be included in the model. NIMBLE uses a univariate normal proposal distribution.
There are two ways to use RJMCMC variable selection in your model. If you know the prior probability for inclusion of a variable in the model, you can use that directly in RJMCMC without modifying your model. If you need the prior probability for inclusion in the model to be a model node itself, such as if it will have a prior and be estimated, you will need to write the model with indicator variables. The latter is more flexible and is illustrated in the example below.
More information can be found in the user manual and via help(configureRJ)
.
For simplicity we consider a linear regression example with 15 explanatory variables, of which five have real effects. The other 10 have null effects, i.e. the corresponding coefficient is equal to zero.
library(nimble, warn.conflicts = FALSE)
## nimble version 0.12.2 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
lmCode <- nimbleCode({
sigma ~ dunif(0, 20)
psi ~ dunif(0,1) ## prior on inclusion probability
for(i in 1:numVars) {
z[i] ~ dbern(psi) ## indicator variable for each coefficient
beta[i] ~ dnorm(0, sd = 100)
zbeta[i] <- z[i] * beta[i] ## indicator * beta
}
for(i in 1:n) {
pred.y[i] <- inprod(X[i, 1:numVars], zbeta[1:numVars])
y[i] ~ dnorm(pred.y[i], sd = sigma)
}
})
## Generate data and define model constants
set.seed(1)
numVars = 15
n = 100
## Simulate explanatory variables
X <- matrix(rnorm(n*numVars), nrow = n, ncol = numVars)
lmConstants <- list(numVars = numVars, n = n, X = X)
lmModel <- nimbleModel(lmCode, constants = lmConstants)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## [Note] This model is not fully initialized. This is not an error.
## To see which variables are not initialized, use model$initializeInfo().
## For more information on model initialization, see help(modelInitialization).
We can simulate data using the nimbleModel
we just created by initializing the model parameters and calling the simulate()
method.
## Set the first give coefficients to be non-zero and the remaining
## ones to be zero.
true_betas <- c(c(0.1, 0.2, 0.3, 0.4, 0.5),
rep(0, 10))
lmModel$beta <- true_betas
lmModel$sigma <- 1
lmModel$z <- rep(1, 15)
lmModel$psi <- 0.5
this_will_be_NA <- lmModel$calculate() ## Update nodes that depend on values we just set..
set.seed(0) ## Make this example reproducible
lmModel$simulate('y') ## Simulate y
lmModel$calculate() ## Update calculations
## [1] -226.7382
lmModel$setData('y') ## Mark the simulated y values as data
lmData = list(y = lmModel$y) ## Record the simulated y values for use below
The RJCMCM sampler can be added to the MCMC configuration by calling the function configureRJ()
. In this example, z
are indicator variables associated with the coefficients beta
. We pass them to configureRJ
using the arguments indicatorNodes
and targetNodes
. The control
argument allows us to specify the mean and the scale of the normal proposal distribution.
lmConf <- configureMCMC(lmModel)
## ===== Monitors =====
## thin = 1: beta, psi, sigma
## ===== Samplers =====
## RW sampler (2)
## - sigma
## - psi
## conjugate sampler (15)
## - beta[] (15 elements)
## binary sampler (15)
## - z[] (15 elements)
lmConf$addMonitors('z')
## thin = 1: beta, psi, sigma, z
configureRJ(lmConf,
targetNodes = 'beta',
indicatorNodes = 'z',
control = list(mean = 0, scale = .2))
Checking the assigned samplers, we can see that indicator parameters are assigned to an RJ_indicator
sampler whose targetNode
is the corresponding coefficient, while beta
parameters are assigned to a RJ_toggled
sampler, which is used only when its target parameter is included in the model.
## Check the assigned samplers
lmConf$printSamplers(c("z[1]", "beta[1]"))
## [4] RJ_toggled sampler: beta[1], samplerType: conjugate_dnorm_dnorm_linear
## [3] RJ_indicator sampler: z[1], mean: 0, scale: 0.20000000000000001, targetNode: beta[1]
mcmcRJ <- buildMCMC(lmConf)
cLmModel <- compileNimble(lmModel)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
CMCMCRJ <- compileNimble(mcmcRJ, project = lmModel)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
set.seed(100)
system.time(samplesRJ <- runMCMC(CMCMCRJ, niter = 100000, nburnin = 10000, thin = 10))
## Running chain 1 ...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## user system elapsed
## 22.669 0.000 22.678
For each variable we can look at the sampled values of the indicator and corresponding coefficient.
plot(samplesRJ[,'z[5]'], pch = '.', main = "z[5] traceplot")
plot(samplesRJ[,'beta[5]'], pch = '.', main = "beta[5] traceplot")
psi
Here is the posterior density of inclusion probability.
plot(density(samplesRJ[,'psi']), main = "Inclusion probability", xlab = "Probability", ylab = "Density")
Here are the marginal probabilities of inclusion for each variable.
zNames <- lmModel$expandNodeNames('z')
zCols <- which(colnames(samplesRJ) %in% zNames)
posterior_inclusion_prob <- colMeans(samplesRJ[,zCols])
plot(1:length(true_betas), posterior_inclusion_prob,
xaxt = 'none',
xlab = "beta", ylab = "Inclusion probability",
main = "Inclusion probabilities for each beta")
axis(1, seq(0,15,1))
Finally we can look at the posterior probabilities for every model that was explored by the MCMC, coded by the vector of indicator variables z
.
library(data.table)
binary <- as.data.table((samplesRJ[, zCols] != 0)+ 0)
res <- binary[,.N, by=names(binary)]
res <-res[order(N, decreasing = T)]
res <- res[, prob := N/dim(samplesRJ)[1]]
res[1:5, ]
## z[1] z[2] z[3] z[4] z[5] z[6] z[7] z[8] z[9] z[10] z[11] z[12] z[13] z[14]
## 1: 0 0 1 1 1 0 0 0 0 0 0 0 0 0
## 2: 0 0 0 1 1 0 0 0 0 0 0 0 0 0
## 3: 0 0 1 0 1 0 0 0 0 0 0 0 0 0
## 4: 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 5: 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## z[15] N prob
## 1: 0 3305 0.36722222
## 2: 0 3044 0.33822222
## 3: 0 1222 0.13577778
## 4: 0 1130 0.12555556
## 5: 0 174 0.01933333