Parallelization takes advantage of multi-core processors to improve computation time by running multiple independent processes simultaneously.
Due to how NIMBLE builds models and algorithms and automatically generates C++ code, parallelizing NIMBLE code requires some care. Naive approaches, such as building a single model and then using standard R parallelization packages to run its MCMC in parallel, are likely to fail, or worse, to behave incorrectly without failing.
That said, parallelizing NIMBLE pipelines is possible and can be very convenient if done properly. The key consideration is to ensure that all NIMBLE execution, including model building, is conducted inside the parallelized code. This ensures that all models and algorithms are independent objects that don’t interfere with each other.
Note that we have work in progress to make parallelizing NIMBLE code easier and more powerful.
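To see why, consider this sketch of the naive approach, shown only as the pattern to avoid: the compiled model and MCMC wrap external C++ objects that cannot be shared across R processes, so workers receiving them will fail or silently misbehave.

## Anti-pattern: do NOT build and compile in the main session and
## then hand the compiled objects to parallel workers.
# myModel <- nimbleModel(code = myCode, data = list(y = myData),
#                        constants = list(length_y = 1000))
# myMCMC <- buildMCMC(myModel)
# compiled <- compileNimble(myModel, myMCMC)
# chain_output <- parLapply(cl, 1:4, function(seed)
#   runMCMC(compiled$myMCMC, niter = 10000, setSeed = seed))  # fails or misbehaves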
In this example, we'll use basic functionality from the parallel package to run MCMC in parallel chains.
library(parallel)
First, we create a cluster, specifying the number of cores we want the cluster to use.
this_cluster <- makeCluster(4)
(Side note: do not use makeForkCluster when parallelizing NIMBLE code; it will not work. makeCluster sets up a PSOCK cluster, which will work. If you really want to use forking and are able to set up separate directories for NIMBLE's generated code for each worker, that should allow you to use forked processes. One option is to use unixtools::set.tempdir() and another is to use the dirName argument to compileNimble().)
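For instance, inside each worker's function one might compile into a worker-specific directory; the directory naming below is an illustrative assumption, not a required convention:

## Give each worker its own directory for NIMBLE's generated C++ code
## (illustrative; the path is based on the worker's process ID).
workerDir <- file.path(tempdir(), paste0("nimble_gen_", Sys.getpid()))
dir.create(workerDir, recursive = TRUE, showWarnings = FALSE)
CmyModel <- compileNimble(myModel, dirName = workerDir)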
We'll set up a situation where we run four independent MCMC simulations. As part of this, we'll also calculate WAIC, a model selection tool, from the combined chains. Note that calculating WAIC in this way only allows calculation of the default conditional WAIC and requires that all parent nodes of data nodes be monitored, as discussed further in help(calculateWAIC).
The key is to create a function that includes all the modeling steps and to run that function in parallel. Note that myCode could have been created outside the parallelized code, but all the other steps need to be done inside the parallelized code.
set.seed(10120)
useWAIC <- TRUE
# Simulate some data
myData <- rgamma(1000, shape = 0.4, rate = 0.8)
library(nimble)
## nimble version 1.3.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
##
## Note for advanced users who have written their own MCMC samplers:
## As of version 0.13.0, NIMBLE's protocol for handling posterior
## predictive nodes has changed in a way that could affect user-defined
## samplers in some situations. Please see Section 15.5.1 of the User Manual.
##
## Attaching package: 'nimble'
## The following object is masked from 'package:stats':
##
## simulate
## The following object is masked from 'package:base':
##
## declare
myCode <- nimbleCode({
  a ~ dunif(0, 100)
  b ~ dnorm(0, 100)
  for (i in 1:length_y) {
    y[i] ~ dgamma(shape = a, rate = b)
  }
})
# Create a function with all the needed code
run_MCMC_allcode <- function(seed, data, code, useWAIC = TRUE) {
  library(nimble)

  myModel <- nimbleModel(code = code,
                         data = list(y = data),
                         constants = list(length_y = 1000),
                         inits = list(a = 0.5, b = 0.5))
  CmyModel <- compileNimble(myModel)

  if(useWAIC) {
    ## WAIC requires monitoring the stochastic parents of the data nodes.
    monitors <- myModel$getParents(myModel$getNodeNames(dataOnly = TRUE),
                                   stochOnly = TRUE)
    ## One may also wish to add additional monitors.
    myMCMC <- buildMCMC(CmyModel, monitors = monitors)
  } else {
    myMCMC <- buildMCMC(CmyModel)  # fall back to NIMBLE's default monitors
  }
  CmyMCMC <- compileNimble(myMCMC)

  results <- runMCMC(CmyMCMC, niter = 10000, setSeed = seed)
  return(results)
}
Now, we execute the desired code using parLapply, which is equivalent to lapply with each process running in parallel.
chain_output <- parLapply(cl = this_cluster, X = 1:4,
                          fun = run_MCMC_allcode,
                          data = myData, code = myCode,
                          useWAIC = useWAIC)
# It's good practice to close the cluster when you're done with it.
stopCluster(this_cluster)
Alternatively, we could use foreach with the doParallel backend (via registerDoParallel, which uses makeCluster by default under the hood) to invoke run_MCMC_allcode in parallel.
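Here is a minimal sketch of that alternative, assuming the foreach and doParallel packages are installed; we create an explicit PSOCK cluster rather than relying on forking:

library(doParallel)
cl <- makeCluster(4)   # PSOCK cluster, as above
registerDoParallel(cl)
chain_output <- foreach(seed = 1:4) %dopar% {
  run_MCMC_allcode(seed, data = myData, code = myCode, useWAIC = useWAIC)
}
stopCluster(cl)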
We ran four independent MCMCs; here are trace plots of b from each chain.
par(mfrow = c(2, 2))
for (i in 1:4) {
  this_output <- chain_output[[i]]
  plot(this_output[, "b"], type = "l", ylab = "b")
}
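As an optional check that is not part of the original workflow, one could also assess convergence across the chains numerically with the coda package, assuming it is installed:

library(coda)
## Combine the four sample matrices into an mcmc.list
mcmc_chains <- mcmc.list(lapply(chain_output, mcmc))
gelman.diag(mcmc_chains)   # potential scale reduction factors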
We can calculate WAIC using NIMBLE’s off-line WAIC function, passing in the full matrix of samples from all the chains.
if(useWAIC) {
  ## We'll need a model object with which to do the WAIC calculation on the MCMC samples.
  myModel <- nimbleModel(code = myCode,
                         data = list(y = myData),
                         constants = list(length_y = 1000))
  CmyModel <- compileNimble(myModel)       # calculateWAIC needs the compiled model to exist
  samples <- do.call(rbind, chain_output)  # single matrix of samples from all chains
  waic <- calculateWAIC(samples, myModel)
  print(waic)
}
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## [Note] This model is not fully initialized. This is not an error.
## To see which variables are not initialized, use model$initializeInfo().
## For more information on model initialization, see help(modelInitialization).
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## Calculating WAIC.
## nimbleList object of type waicNimbleList
## Field "WAIC":
## [1] -315.0209
## Field "lppd":
## [1] 159.3933
## Field "pWAIC":
## [1] 1.882831
In the example above, we provided a different random number seed for each chain. However, we didn't start the chains from different (dispersed) starting values, which is often a good idea as part of assessing convergence. To do so, you could generate random initial values inside the function so that they differ between the chains.
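For example, here is a sketch of drawing dispersed initial values inside the worker function; the uniform ranges below are illustrative assumptions:

## Inside run_MCMC_allcode, before building the model:
set.seed(seed)  # make the randomly drawn inits reproducible per chain
inits <- list(a = runif(1, 0.1, 1), b = runif(1, 0.1, 1))  # illustrative ranges
myModel <- nimbleModel(code = code,
                       data = list(y = data),
                       constants = list(length_y = 1000),
                       inits = inits)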
Alternatively, you could generate your initial values outside of the parallelization (either randomly or via some other approach) and pass the initial values in as an argument to run_MCMC_allcode. Here's an example:
per_chain_info <- list(
  list(seed = 1, inits = list(a = 0.5, b = 0.5)),
  list(seed = 2, inits = list(a = 0.3, b = 0.3)),
  list(seed = 3, inits = list(a = 0.2, b = 0.6)),
  list(seed = 4, inits = list(a = 0.6, b = 0.2)))
run_MCMC_allcode <- function(info, data, code, useWAIC = TRUE) {
  ## Remaining code as shown above, except pass `info$inits` as the
  ## `inits` argument to `nimbleModel` and `info$seed` as the
  ## `setSeed` argument to `runMCMC`.
  ## ....
}
this_cluster <- makeCluster(4)  # the earlier cluster was stopped, so make a new one
chain_output <- parLapply(cl = this_cluster, X = per_chain_info,
                          fun = run_MCMC_allcode,
                          data = myData, code = myCode,
                          useWAIC = useWAIC)
# It's good practice to close the cluster when you're done with it.
stopCluster(this_cluster)
Note that NIMBLE generally expects user-defined distributions, functions, and samplers to be defined in the global environment. As discussed above, for parallelization it is important to do all the model and MCMC building and compilation inside a function. Therefore, in order to achieve this but still have the user-defined elements in the global environment, we recommend using assign to copy them into the global environment from within your function.
After defining your distribution, function, or sampler inside your overall function, here is how you can do that:
## for a user-defined distribution
assign('dfoo', dfoo, envir = .GlobalEnv)
assign('rfoo', rfoo, envir = .GlobalEnv)
## for a user-defined function
assign('myfun', myfun, envir = .GlobalEnv)
## for a user-defined sampler
assign('sampler_mySampler', sampler_mySampler, envir = .GlobalEnv)
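For instance, here is a minimal sketch of defining a user-defined distribution inside the worker function and copying it into the global environment before building the model; the toy exponential distribution dfoo is purely illustrative:

run_MCMC_allcode <- function(seed, data, code, useWAIC = TRUE) {
  library(nimble)
  ## Toy user-defined exponential density (illustrative only)
  dfoo <- nimbleFunction(
    run = function(x = double(0), rate = double(0, default = 1),
                   log = integer(0, default = 0)) {
      returnType(double(0))
      logProb <- log(rate) - rate * x
      if(log) return(logProb) else return(exp(logProb))
    })
  ## Matching random-generation function (illustrative only)
  rfoo <- nimbleFunction(
    run = function(n = integer(0), rate = double(0, default = 1)) {
      returnType(double(0))
      return(rexp(1, rate = rate))
    })
  ## Copy into the global environment so NIMBLE can find them
  assign('dfoo', dfoo, envir = .GlobalEnv)
  assign('rfoo', rfoo, envir = .GlobalEnv)
  ## ... then build, compile, and run the model as shown earlier ...
}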