Chapter 7 MCMC
NIMBLE provides a variety of paths to creating and executing an MCMC algorithm, which differ greatly in their simplicity of use, and also in the options available and customizability.
The most direct approach to invoking the MCMC engine is using the nimbleMCMC
function (Section 7.1). This oneline call creates and executes an MCMC, and provides a wide range of options for controlling the MCMC: specifying monitors, burnin, and thinning, running multiple MCMC chains with different initial values, and returning posterior samples, summary statistics, and/or a WAIC value. However, this approach is restricted to using NIMBLE’s default MCMC algorithm; further customization of, for example, the specific samplers employed, is not possible.
The lengthier and more customizable approach to invoking the MCMC engine on a particular NIMBLE model object involves the following steps:
(Optional) Create and customize an MCMC configuration for a particular model:
 Use
configureMCMC
to create an MCMC configuration (see Section 7.2). The configuration contains a list of samplers with the node(s) they will sample. (Optional) Customize the MCMC configuration:
 Add, remove, or reorder the list of samplers (Section 7.10 and
help(samplers)
in R for details), including adding your own samplers (Section 15.5);  Change the tuning parameters or adaptive properties of individual samplers;
 Change the variables to monitor (record for output) and thinning intervals for MCMC samples.
 Add, remove, or reorder the list of samplers (Section 7.10 and
 Use
 Use
buildMCMC
to build the MCMC object and its samplers either from the model (using default MCMC configuration) or from a customized MCMC configuration (Section 7.3).  Compile the MCMC object (and the model), unless one is debugging and wishes to run the uncompiled MCMC.
 Run the MCMC and extract the samples (Sections 7.4, 7.5 and 7.6).
Optionally, calculate the WAIC using the posterior samples (Section 7.7).
Prior to version 0.8.0, NIMBLE provided two additional functions, MCMCsuite
and compareMCMCs
, to facilitate comparison of multiple MCMC algorithms, either internal or external to NIMBLE. Those capabilities have been redesigned and moved into a separate package called compareMCMCs
.
Endtoend examples of MCMC in NIMBLE can be found in Sections 2.52.6 and Section 7.11.
7.1 Oneline invocation of MCMC: nimbleMCMC
The most direct approach to executing an MCMC algorithm in NIMBLE is using nimbleMCMC
. This single function can be used to create an underlying model and associated MCMC algorithm, compile both of these, execute the MCMC, and return samples, summary statistics, and a WAIC value. This approach circumvents the longer (and more flexible) approach using nimbleModel
, configureMCMC
, buildMCMC
, compileNimble
, and runMCMC
, which is described subsequently.
The nimbleMCMC
function provides control over the:
 number of MCMC iterations in each chain;
 number of MCMC chains to execute;
 number of burnin samples to discard from each chain;
 thinning interval on which samples should be recorded;
 model variables to monitor and return posterior samples;
 initial values, or a function for generating initial values for each chain;
 setting the random number seed;
 returning posterior samples as a matrix or a
coda
mcmc
object;  returning posterior summary statistics; and
 returning a WAIC value calculated using samples from all chains.
This entry point for using nimbleMCMC
is the code
, constants
, data
, and inits
arguments that are used for building a NIMBLE model (see Chapters 5 and 6). However, when using nimbleMCMC
, the inits
argument can also specify a list of lists of initial values that will be used for each MCMC chain, or a function that generates a list of initial values, which will be generated at the onset of each chain. As an alternative entry point, a NIMBLE model
object can also be supplied to nimbleMCMC
, in which case this model will be used to build the MCMC algorithm.
Based on its arguments, nimbleMCMC
optionally returns any combination of
 Posterior samples,
 Posterior summary statistics, and
 WAIC value.
The above are calculated and returned for each MCMC chain. Additionally, posterior summary statistics are calculated for all chains combined when multiple chains are run.
Several example usages of nimbleMCMC
are shown below:
code < nimbleCode({
mu ~ dnorm(0, sd = 1000)
sigma ~ dunif(0, 1000)
for(i in 1:10)
x[i] ~ dnorm(mu, sd = sigma)
})
data < list(x = c(2, 5, 3, 4, 1, 0, 1, 3, 5, 3))
initsFunction < function() list(mu = rnorm(1,0,1), sigma = runif(1,0,10))
# execute one MCMC chain, monitoring the "mu" and "sigma" variables,
# with thinning interval 10. fix the random number seed for reproducible
# results. by default, only returns posterior samples.
mcmc.out < nimbleMCMC(code = code, data = data, inits = initsFunction,
monitors = c("mu", "sigma"), thin = 10,
niter = 20000, nchains = 1, setSeed = TRUE)
# note that the inits argument to nimbleModel must be a list of
# initial values, whereas nimbleMCMC can accept inits as a function
# for generating new initial values for each chain.
initsList < initsFunction()
Rmodel < nimbleModel(code, data = data, inits = initsList)
# using the existing Rmodel object, execute three MCMC chains with
# specified burnin. return samples, summary statistics, and WAIC.
mcmc.out < nimbleMCMC(model = Rmodel,
niter = 20000, nchains = 3, nburnin = 2000,
summary = TRUE, WAIC = TRUE)
# run ten chains, generating random initial values for each
# chain using the inits function specified above.
# only return summary statistics from each chain; not all the samples.
mcmc.out < nimbleMCMC(model = Rmodel, nchains = 10, inits = initsFunction,
samples = FALSE, summary = TRUE)
See help(nimbleMCMC)
for further details.
7.2 The MCMC configuration
The MCMC configuration contains information needed for building an MCMC. When no customization is needed, one can jump directly to the buildMCMC
step below. An MCMC configuration is an object of class MCMCconf
, which includes:
 The model on which the MCMC will operate
 The model nodes which will be sampled (updated) by the MCMC
 The samplers and their internal configurations, called control parameters
 Two sets of variables that will be monitored (recorded) during execution of the MCMC and thinning intervals for how often each set will be recorded. Two sets are allowed because it can be useful to monitor different variables at different intervals
7.2.1 Default MCMC configuration
Assuming we have a model named Rmodel
, the following will generate a default MCMC configuration:
mcmcConf < configureMCMC(Rmodel)
The default configuration will contain a single sampler for each node in the model, and the default ordering follows the topological ordering of the model.
7.2.1.1 Default assignment of sampler algorithms
The default sampler assigned to a stochastic node is determined by the following, in order of precedence:
 If the node has no stochastic dependents, a
posterior_predictive
sampler is assigned. This sampler sets the new value for the node simply by simulating from its distribution.  If the node has a conjugate relationship between its prior distribution and the distributions of its stochastic dependents, a
conjugate
(`Gibbs’) sampler is assigned.  If the node follows a multinomial distribution, then a
RW_multinomial
sampler is assigned. This is a discrete randomwalk sampler in the space of multinomial outcomes.  If a node follows a Dirichlet distribution, then a
RW_dirichlet
sampler is assigned. This is a random walk sampler in the space of the simplex defined by the Dirichlet.  If the node follows any other multivariate distribution, then a
RW_block
sampler is assigned for all elements. This is a MetropolisHastings adaptive randomwalk sampler with a multivariate normal proposal (Roberts and Sahu 1997).  If the node is binaryvalued (strictly taking values 0 or 1), then a
binary
sampler is assigned. This sampler calculates the conditional probability for both possible node values and draws the new node value from the conditional distribution, in effect making a Gibbs sampler.  If the node is otherwise discretevalued, then a
slice
sampler is assigned (R. M. Neal 2003).  If none of the above criteria are satisfied, then a
RW
sampler is assigned. This is a MetropolisHastings adaptive randomwalk sampler with a univariate normal proposal distribution.
These sampler assignment rules can be inspected, reordered, and easily modified using the system option nimbleOptions("MCMCdefaultSamplerAssignmentRules")
and customized samplerAssignmentRules
objects.
Details of each sampler and its control parameters can be found by invoking help(samplers)
.
7.2.1.2 Sampler assignment rules
The behavior of configureMCMC
can be customized to control how samplers are assigned. A new set of sampler assignment rules can be created using samplerAssignmentRules
, which can be modified using the addRule
and reorder
methods, then passed as an argument to configureMCMC
. Alternatively, the default behavior of configureMCMC
can be altered by setting the system option MCMCdefaultSamplerAssignmentRules
to a custom samplerAssignmentRules
object. See help(samplerAssignmentRules)
for details.
7.2.1.3 Options to control default sampler assignments
As a lightweight alternative to using samplerAssignmentRules
, very basic control of default sampler assignments is provided via two arguments to configureMCMC
. The useConjugacy
argument controls whether conjugate samplers are assigned when possible, and the multivariateNodesAsScalars
argument controls whether scalar elements of multivariate nodes are sampled individually. See help(configureMCMC)
for usage details.
7.2.1.4 Default monitors
The default MCMC configuration includes monitors on all toplevel stochastic nodes of the model. Only variables that are monitored will have their samples saved for use outside of the MCMC. MCMC configurations include two sets of monitors, each with different thinning intervals. By default, the second set of monitors (monitors2
) is empty.
7.2.1.5 Automated parameter blocking
The default configuration may be replaced by one generated from an automated parameter blocking algorithm. This algorithm determines groupings of model nodes that, when jointly sampled with a RW_block
sampler, increase overall MCMC efficiency. Overall efficiency is defined as the effective sample size of the slowestmixing node divided by computation time. This is done by:
autoBlockConf < configureMCMC(Rmodel, autoBlock = TRUE)
Note that this using autoBlock = TRUE
compiles and runs MCMCs, progressively exploring different sampler assignments, so it takes some time and generates some output. It is most useful for determining effective blocking strategies that can be reused for later runs. The additional control argument autoIt
may also be provided to indicate the number of MCMC samples to be used in each trial of the automated blocking procedure (default 20,000).
7.2.2 Customizing the MCMC configuration
The MCMC configuration may be customized in a variety of ways, either through additional named arguments to configureMCMC
or by calling methods of an existing MCMCconf
object.
7.2.2.1 Controlling which nodes to sample
One can create an MCMC configuration with default samplers on just a particular set of nodes using the nodes
argument to configureMCMC
. The value for the nodes
argument may be a character vector containing node and/or variable names. In the case of a variable name, a default sampler will be added for all stochastic nodes in the variable. The order of samplers will match the order of nodes
. Any deterministic nodes will be ignored.
If a data node is included in nodes
, it will be assigned a sampler. This is the only way in which a default sampler may be placed on a data node and will result in overwriting data values in the node.
7.2.2.2 Creating an empty configuration
If you plan to customize the choice of all samplers, it can be useful to obtain a configuration with no sampler assignments at all. This can be done by any of nodes = NULL
, nodes = character()
, or nodes = list()
.
7.2.2.3 Overriding the default sampler assignment rules
The default rules used for assigning samplers to model nodes can be overridden using the rules
argument to configureMCMC
. This argument must be an object of class samplerAssignmentRules
, which defines an ordered set of rules for assigning samplers. Rules can be modified and reordered, to give different precedence to particular samplers, or to assign userdefined samplers (see section 15.5). The following example creates a new set of rules (which initially contains the default assignment rules), reorders the rules, adds a new rule, then uses these rules to create an MCMC configuration object.
my_rules < samplerAssignmentRules()
my_rules$reorder(c(8, 1:7))
my_rules$addRule(condition = quote(model$getDistribution(node) == "dmnorm"),
sampler = new_dmnorm_sampler)
mcmcConf < configureMCMC(Rmodel, rules = my_rules)
In addition, the default behavior of configureMCMC
can be altered by setting the system option nimbleOptions(MCMCdefaultSamplerAssignmentRules = my_rules)
, or reset to the original default behavior using nimbleOptions(MCMCdefaultSamplerAssignmentRules = samplerAssignmentRules())
.
7.2.2.4 Overriding the default sampler control list values
The default values of control list elements for all sampling algorithms may be overridden through use of the control
argument to configureMCMC
, which should be a named list. Named elements in the control
argument will be used for all default samplers and any subsequent sampler added via addSampler
(see below). For example, the following will create the default MCMC configuration, except all RW
samplers will have their initial scale
set to 3, and none of the samplers (RW
, or otherwise) will be adaptive.
mcmcConf < configureMCMC(Rmodel, control = list(scale = 3, adaptive = FALSE))
When adding samplers to a configuration using addSampler
, the default control list can also be overridden.
7.2.2.5 Adding samplers to the configuration: addSampler
Samplers may be added to a configuration using the addSampler
method of the MCMCconf
object. The first argument gives the node(s) to be sampled, called the target
, as a character vector. The second argument gives the type of sampler, which may be provided as a character string or a nimbleFunction object. Valid character strings are indicated in help(samplers)
(do not include "sampler_"
). Added samplers can be labeled with a name
argument, which is used in output of printSamplers
.
Writing a new sampler as a nimbleFunction is covered in Section 15.5.
The hierarchy of precedence for control list elements for samplers is:
 The
control
list argument toaddSampler
;  The
control
list argument toconfigureMCMC
;  The default values, as defined in the sampling algorithm
setup
function.
Samplers added by addSampler
will be appended to the end of current sampler list. Adding a sampler for a node will not automatically remove any existing samplers on that node.
7.2.2.6 Printing, reordering, modifying and removing samplers: printSamplers, removeSamplers, setSamplers, and getSamplerDefinition
The current, ordered, list of all samplers in the MCMC configuration may be printed by calling the printSamplers
method. When you want to see only samplers acting on specific model nodes or variables, provide those names as an argument to printSamplers
. The printSamplers
method accepts arguments controlling the level of detail displayed as discussed in its R help information.
# Print all samplers
mcmcConf$printSamplers()
# Print all samplers operating on node "a[1]",
# or any of the "beta[]" variables
mcmcConf$printSamplers(c("a[1]", "beta"))
# Print all conjugate and slice samplers
mcmcConf$printSamplers(type = c("conjugate", "slice"))
# Print all RW samplers operating on "x"
mcmcConf$printSamplers("x", type = "RW")
# Print the first 100 samplers
mcmcConf$printSamplers(1:100)
# Print all samplers in their order of execution
mcmcConf$printSamplers(executionOrder = TRUE)
Samplers may be removed from the configuration object using removeSamplers
, which accepts a character vector of node or variable names, or a numeric vector of indices.
# Remove all samplers acting on "x" or any component of it
mcmcConf$removeSamplers("x")
# Remove all samplers acting on "alpha[1]" and "beta[1]"
mcmcConf$removeSamplers(c("alpha[1]", "beta[1]"))
# Remove the first five samplers
mcmcConf$removeSamplers(1:5)
# Providing no argument removes all samplers
mcmcConf$removeSamplers()
Samplers to retain may be specified reordered using setSamplers
, which also accepts a character vector of node or variable names, or a numeric vector of indices.
# Set the list of samplers to those acting on any components of the
# model variables "x", "y", or "z".
mcmcConf$setSamplers(c("x", "y", "z"))
# Set the list of samplers to only those acting on model nodes
# "alpha[1]", "alpha[2]", ..., "alpha[10]"
mcmcConf$setSamplers("alpha[1:10]")
# Truncate the current list of samplers to the first 10 and the 100th
mcmcConf$setSamplers(ind = c(1:10, 100))
The nimbleFunction definition underlying a particular sampler may be viewed using the getSamplerDefinition
method, using the sampler index as an argument. A node name argument may also be supplied, in which case the definition of the first sampler acting on that node is returned. In all cases, getSamplerDefinition
only returns the definition of the first sampler specified either by index or node name.
# Return the definition of the third sampler in the mcmcConf object
mcmcConf$getSamplerDefinition(3)
# Return the definition of the first sampler acting on node "x",
# or the first of any indexed nodes comprising the variable "x"
mcmcConf$getSamplerDefinition("x")
7.2.2.7 Customizing individual sampler configurations: getSamplers, setSamplers, setName, setSamplerFunction, setTarget, and setControl
Each sampler in an MCMCconf
object is represented by a sampler configuration as a samplerConf
object. Each samplerConf
is a reference class object containing the following (required) fields: name
(a character string), samplerFunction
(a valid nimbleFunction sampler), target
(the model node to be sampled), and control
(list of control arguments). The MCMCconf
method getSamplers
allows access to the samplerConf
objects. These can be modified and then passed as an argument to setSamplers
to overwrite the current list of samplers in the MCMC configuration object. However, no checking of the validity of this modified list is performed; if the list of samplerConf objects is corrupted to be invalid, incorrect behavior will result at the time of calling buildMCMC
. The fields of a samplerConf
object can be modified using the access functions setName(name)
, setSamplerFunction(fun)
, setTarget(target, model)
, and setControl(control)
.
Here are some examples:
# retrieve samplerConf list
samplerConfList < mcmcConf$getSamplers()
# change the name of the first sampler
samplerConfList[[1]]$setName("newNameForThisSampler")
# change the sampler function of the second sampler,
# assuming existance of a nimbleFunction 'anotherSamplerNF',
# which represents a valid nimbleFunction sampler.
samplerConfList[[2]]$setSamplerFunction(anotherSamplerNF)
# change the 'adaptive' element of the control list of the third sampler
control < samplerConfList[[3]]$control
control$adaptive < FALSE
samplerConfList[[3]]$setControl(control)
# change the target node of the fourth sampler
samplerConfList[[4]]$setTarget("y", model) # model argument required
# use this modified list of samplerConf objects in the MCMC configuration
mcmcConf$setSamplers(samplerConfList)
7.2.2.8 Customizing the sampler execution order
The ordering of sampler execution can be controlled as well. This allows for sampler functions to execute multiple times within a single MCMC iteration, or the execution of different sampler functions to be interleaved with one another.
The sampler execution order is set using the function setSamplerExecutionOrder
, and the current ordering of execution is retrieved using getSamplerExecutionOrder
. For example, assuming the MCMC configuration object mcmcConf
contains five samplers:
# first sampler to execute twice, in succession:
mcmcConf$setSamplerExecutionOrder(c(1, 1, 2, 3, 4, 5))
# first sampler to execute multiple times, interleaved:
mcmcConf$setSamplerExecutionOrder(c(1, 2, 1, 3, 1, 4, 1, 5))
# fourth sampler to execute 10 times, only
mcmcConf$setSamplerExecutionOrder(rep(4, 10))
# omitting the argument to setSamplerExecutionOrder()
# resets the ordering to each sampler executing once, sequentially
mcmcConf$setSamplerExecutionOrder()
# retrieve the current ordering of sampler execution
ordering < mcmcConf$getSamplerExecutionOrder()
# print the sampler functions in the order of execution
mcmcConf$printSamplers(executionOrder = TRUE)
7.2.2.9 Monitors and thinning intervals: printMonitors, getMonitors, addMonitors, setThin, and resetMonitors
An MCMC configuration object contains two independent sets of variables to monitor, each with their own thinning interval: thin
corresponding to monitors
, and thin2
corresponding to monitors2
. Monitors operate at the variable level. Only entire model variables may be monitored. Specifying a monitor on a node, e.g., x[1]
, will result in the entire variable x
being monitored.
The variables specified in monitors
and monitors2
will be recorded (with thinning interval thin
) into objects called mvSamples
and mvSamples2
, contained within the MCMC object. These are both modelValues objects; modelValues are NIMBLE data structures used to store multiple sets of values of model variables^{17}. These can be accessed as the member data mvSamples
and mvSamples2
of the MCMC object, and they can be converted to matrices using as.matrix
(see Section 7.6).
Monitors may be added to the MCMC configuration either in the original call to configureMCMC
or using the addMonitors
method:
# Using an argument to configureMCMC
mcmcConf < configureMCMC(Rmodel, monitors = c("alpha", "beta"),
monitors2 = "x")
# Calling a member method of the mcmcconf object
# This results in the same monitors as above
mcmcConf$addMonitors("alpha", "beta")
mcmcConf$addMonitors2("x")
Similarly, either thinning interval may be set at either step:
# Using an argument to configureMCMC
mcmcConf < configureMCMC(Rmodel, thin = 1, thin2 = 100)
# Calling a member method of the mcmcConf object
# This results in the same thinning intervals as above
mcmcConf$setThin(1)
mcmcConf$setThin2(100)
The current lists of monitors and thinning intervals may be displayed using the printMonitors
method. Both sets of monitors (monitors
and monitors2
) may be reset to empty character vectors by calling the resetMonitors
method. The methods getMonitors
and getMonitors2
return the currently specified monitors
and monitors2
as character vectors.
7.2.2.10 Monitoring model logprobabilities
To record model logprobabilities from an MCMC, one can add monitors for logProb variables (which begin with the prefix logProb_
) that correspond to variables with (any) stochastic nodes. For example, to record and extract logprobabilities for the variables alpha
, sigma_mu
, and Y
:
mcmcConf < configureMCMC(Rmodel)
mcmcConf$addMonitors("logProb_alpha", "logProb_sigma_mu", "logProb_Y")
Rmcmc < buildMCMC(mcmcConf)
Cmodel < compileNimble(Rmodel)
Cmcmc < compileNimble(Rmcmc, project = Rmodel)
Cmcmc$run(10000)
samples < as.matrix(Cmcmc$mvSamples)
The samples
matrix will contain both MCMC samples and model logprobabilities.
7.3 Building and compiling the MCMC
Once the MCMC configuration object has been created, and customized to one’s liking, it may be used to build an MCMC function:
Rmcmc < buildMCMC(mcmcConf, enableWAIC = TRUE)
buildMCMC
is a nimbleFunction. The returned object Rmcmc
is an instance of the nimbleFunction specific to configuration mcmcConf
(and of course its associated model).
Note that if you would like to be able to calculate the WAIC of the model after the MCMC function has been run, you must set enableWAIC = TRUE
as an argument to either configureMCMC
or buildMCMC
, or set nimbleOptions(MCMCenableWAIC = TRUE)
, which will enable WAIC calculations for all subsequently built MCMC functions. For more information on WAIC calculations, see Section 7.7 or help(buildMCMC)
in R.
When no customization is needed, one can skip configureMCMC
and simply provide a model object to buildMCMC
. The following two MCMC functions will be identical:
mcmcConf < configureMCMC(Rmodel) # default MCMC configuration
Rmcmc1 < buildMCMC(mcmcConf)
Rmcmc2 < buildMCMC(Rmodel) # uses the default configuration for Rmodel
For speed of execution, we usually want to compile the MCMC function to C++ (as is the case for other NIMBLE functions). To do so, we use compileNimble
. If the model has already been compiled, it should be provided as the project
argument so the MCMC will be part of the same compiled project. A typical compilation call looks like:
Cmcmc < compileNimble(Rmcmc, project = Rmodel)
Alternatively, if the model has not already been compiled, they can be compiled together in one line:
Cmcmc < compileNimble(Rmodel, Rmcmc)
Note that if you compile the MCMC with another object (the model in this case), you’ll need to explicitly refer to the MCMC component of the resulting object to be able to run the MCMC:
Cmcmc$Rmcmc$run(niter = 1000)
7.4 Userfriendly execution of MCMC algorithms: runMCMC
Once an MCMC algorithm has been created using buildMCMC
, the function runMCMC
can be used to run multiple chains and extract posterior samples, summary statistics and/or a WAIC value. This is a simpler approach to executing an MCMC algorithm, than the process of executing and extracting samples as described in Sections 7.5 and 7.6.
runMCMC
also provides several userfriendly options such as burnin, thinning, running multiple chains, and different initial values for each chain. However, using runMCMC
does not support several lowerlevel options, such as timing the individual samplers internal to the MCMC, continuing an existing MCMC run (picking up where it left off), or modifying the sampler execution ordering.
runMCMC
takes arguments that will control the following aspects of the MCMC:
 Number of iterations in each chain;
 Number of chains;
 Number of burnin samples to discard from each chain;
 Thinning interval for recording samples;
 Initial values, or a function for generating initial values for each chain;
 Setting the random number seed;
 Returning the posterior samples as a
coda
mcmc
object;  Returning summary statistics calculated from each chains; and
 Returning a WAIC value calculated using samples from all chains.
The following examples demonstrate some uses of runMCMC
, and assume the existence of Cmcmc
, a compiled MCMC algorithm.
# run a single chain, and return a matrix of samples
mcmc.out < runMCMC(Cmcmc)
# run three chains of 10000 samples, discard initial burnin of 1000,
# record samples thereafter using a thinning interval of 10,
# and return of list of sample matrices
mcmc.out < runMCMC(Cmcmc, niter=10000, nburnin=1000, thin=10, nchains=3)
# run three chains, returning posterior samples, summary statistics,
# and the WAIC value for each chain
mcmc.out < runMCMC(Cmcmc, nchains = 3, summary = TRUE, WAIC = TRUE)
# run two chains, and specify the initial values for each
initsList < list(list(mu = 1, sigma = 1),
list(mu = 2, sigma = 10))
mcmc.out < runMCMC(Cmcmc, nchains = 2, inits = initsList)
# run ten chains of 100,000 iterations each, using a function to
# generate initial values and a fixed random number seed for each chain.
# only return summary statistics from each chain; not all the samples.
initsFunction < function()
list(mu = rnorm(1,0,1), sigma = runif(1,0,100))
mcmc.out < runMCMC(Cmcmc, niter = 100000, nchains = 10,
inits = initsFunction, setSeed = TRUE,
samples = FALSE, summary = TRUE)
See help(runMCMC)
for further details.
7.5 Running the MCMC
The MCMC algorithm (either the compiled or uncompiled version) can be executed using the member method mcmc$run
(see help(buildMCMC)
in R). The run
method has one required argument, niter
, the number of iterations to be run.
The run
method has optional arguments nburnin
, thin
and thin2
. These can be used to specify the number of prethinning burnin samples to discard, and the postburnin thinning intervals for recording samples (corresponding to monitors
and monitors2
). If either thin
and thin2
are provided, they will override the thinning intervals that were specified in the original MCMC configuration object.
The run
method has an optional reset
argument. When reset = TRUE
(the default value), the following occurs prior to running the MCMC:
 All model nodes are checked and filled or updated as needed, in valid (topological) order. If a stochastic node is missing a value, it is populated using a call to
simulate
and its log probability value is calculated. The values of deterministic nodes are calculated from their parent nodes. If any righthandsideonly nodes (e.g., explanatory variables) are missing a value, an error results.  All MCMC sampler functions are reset to their initial state: the initial values of any sampler control parameters (e.g.,
scale
,sliceWidth
, orpropCov
) are reset to their initial values, as were specified by the original MCMC configuration.  The internal modelValues objects
mvSamples
andmvSamples2
are each resized to the appropriate length for holding the requested number of samples (niter/thin
, andniter/thin2
, respectively).
When mcmc$run(niter, reset = FALSE)
is called, the MCMC picks up from where it left off, continuing the previous chain and expanding the output as needed. No values in the model are checked or altered, and sampler functions are not reset to their initial states.
7.5.1 Measuring sampler computation times: getTimes
If you want to obtain the computation time spent in each sampler, you can set time=TRUE
as a runtime argument and then use the method getTimes()
obtain the times. For example,
Cmcmc$run(niter, time = TRUE)
Cmcmc$getTimes()
will return a vector of the total time spent in each sampler, measured in seconds.
7.6 Extracting MCMC samples
After executing the MCMC, the output samples can be extracted as follows:
mvSamples < mcmc$mvSamples
mvSamples2 < mcmc$mvSamples2
These modelValues objects can be converted into matrices using as.matrix
:
samplesMatrix < as.matrix(mvSamples)
samplesMatrix2 < as.matrix(mvSamples2)
The column names of the matrices will be the node names of nodes in the monitored variables. Then, for example, the mean of the samples for node x[2]
could be calculated as:
mean(samplesMatrix[, "x[2]"])
Obtaining samples as matrices is most common, but see Section 14.1 for more about programming with modelValues objects, especially if you want to write nimbleFunctions to use the samples.
7.7 Calculating WAIC
Once an MCMC algorithm has been run, as described in Section 7.5, the WAIC (Watanabe 2010) can be calculated from the posterior samples produced by the MCMC algorithm. Note that in order to calculate the WAIC value after running an MCMC algorithm, the argument enableWAIC = TRUE
must have been supplied to configureMCMC
or to buildMCMC
, or the enableWAIC
NIMBLE option must have been set to TRUE
.
The WAIC is calculated by calling the member method mcmc$calculateWAIC
(see help(buildMCMC)
in R for more details). The calculateWAIC
method has one required argument, nburnin
, the number of initial samples to discard prior to WAIC calculation. nburnin
defaults to 0.
Cmcmc$calculateWAIC(nburnin = 100)
Note that there is not a unique value of WAIC for a model. WAIC is calculated from Equations 5, 12, and 13 in (Gelman, Hwang, and Vehtari 2014) (i.e. using WAIC2). In NIMBLE, the set of all stochastic nodes monitored by the MCMC object will be treated as \(\theta\) for the purposes of Equation 5 from (Gelman, Hwang, and Vehtari 2014).
In many cases one would want \(\theta\) to be the set of all stochastic nodes in the model, in which case the user must set the MCMC monitors to include all stochastic nodes; by default the MCMC monitors are only the toplevel nodes of the model.
7.8 kfold crossvalidation
The runCrossValidate
function in NIMBLE performs kfold crossvalidation on a nimbleModel
fit via MCMC. More information can be found by calling help(runCrossValidate)
.
7.9 Reversible Jump MCMC
Reversible Jump MCMC (Green 1995) (RJMCMC) is a general framework for MCMC simulation in which the dimension of the parameter space can vary between iterates of the Markov chain. The reversible jump sampler can be viewed as an extension of the MetropolisHastings algorithm onto more general state spaces.
Given two models \(M_1\) and \(M_2\) of possibly different dimensions, the core idea of this technique is to remove the difference in the dimensions of models \(M_1\) and \(M_2\) by supplementing the corresponding parameters \(\boldsymbol{\theta_1}\) and \(\boldsymbol{\theta_2}\) with auxiliary variables \(\boldsymbol{u}_{1 \rightarrow 2}\) and \(\boldsymbol{u}_{2 \rightarrow 1}\) such that \((\boldsymbol{\theta_1}, \boldsymbol{u}_{1 \rightarrow 2})\) and \((\boldsymbol{\theta_2}, \boldsymbol{u}_{2 \rightarrow 1})\) are in bijection, \((\boldsymbol{\theta_2}, \boldsymbol{u}_{2 \rightarrow 1}) = \Psi(\boldsymbol{\theta_1}, \boldsymbol{u}_{1 \rightarrow 2})\). The corresponding MetropolisHastings acceptance probability is generalized accounting for the proposal density for the auxiliary variables.
NIMBLE implements RJMCMC for variable selection using a univariate normal distribution as proposal density and supports two types of model specifications, depending on whether or not indicator variables are included. A reversible jump sampler can be added to a MCMC configuration object by calling the function configureRJ
, which handles different model specification by assigning the corresponding specialized samplers, RJ
and RJ_indicator
. Information about the model specification is passed to configureRJ
function by providing one of arguments indicatorNodes
or priorProb
. The configureRJ
function modifies the MCMC configuration, and assigns a RJ
sampler directly to the nodes specified in targetNodes
when priorProb
is provided. Otherwise a RJ_indicator
sampler is assigned to nodes in indicatorNodes
. In addition, the configureRJ
removes the default NIMBLE sampler assigned to each node in targetNodes
, and adds a toggled version of the default sampler that is run only when the node is the model.
The RJ
and RJ_indicator
both use a univariate normal distribution as proposal density, whose mean and scale can be provided in the control
list argument of configureRJ
(by default mean = 0
and scale = 1
). An optional argument called fixedValue
can be provided when the priorProb
argument is present; this indicates the value taken by nodes in targetNodes
when out of the model (by default 0). All arguments in control
can accept one value that will be used for all samplers assigned to nodes targetNodes
; alternatively each sampler can be customized by passing a vector of values whose length must equal the length of targetNodes
.
In the following we provide an example for the two different model specifications.
7.9.1 Using indicator variables
Here we consider a normal linear regression in which two covariates x1
and x2
are given and only x1
has to be included in the model.
## Linear regression with intercept and two covariates
code < nimbleCode({
beta0 ~ dnorm(0, sd = 100)
beta1 ~ dnorm(0, sd = 100)
beta2 ~ dnorm(0, sd = 100)
sigma ~ dunif(0, 100)
z1 ~ dbern(psi) ## indicator variable associated with beta1
z2 ~ dbern(psi) ## indicator variable associated with beta2
psi ~ dbeta(1, 1) ## hyperprior on inclusion probability
for(i in 1:N) {
Ypred[i] < beta0 + beta1 * z1 * x1[i] + beta2 * z2 * x2[i]
Y[i] ~ dnorm(Ypred[i], sd = sigma)
}
})
## simulate some data
set.seed(1)
N < 100
x1 < runif(N, 1, 1)
x2 < runif(N, 1, 1) ## this covariate is not included
Y < rnorm(N, 1.5 + 2 * x1, sd = 1)
## build the model and configure default MCMC
rIndicatorModel < nimbleModel(code, constants = list(N = N),
data = list(Y = Y, x1 = x1, x2 = x2),
inits= list(beta0 = 0, beta1 = 0, beta2 = 0, sigma = sd(Y),
z2 = 1, z1 = 1, psi = 0.5))
indicatorModelConf < configureMCMC(rIndicatorModel)
## print NIMBLE default samplers
indicatorModelConf$printSamplers()
## [1] conjugate_dnorm_dnorm sampler: beta0
## [2] conjugate_dnorm_dnorm sampler: beta1
## [3] conjugate_dnorm_dnorm sampler: beta2
## [4] RW sampler: sigma
## [5] conjugate_dbeta_dbern sampler: psi
## [6] binary sampler: z1
## [7] binary sampler: z2
At this point we may want to modify the current configuration by adding a reversible jump sampler to allow for selection on beta[1]
and beta[2]
variables, which can be done by calling the configureRJ
function.
configureRJ(mcmcConf = indicatorModelConf,
targetNodes = c("beta1", "beta2"),
indicatorNodes = c('z1', 'z2'),
control = list(mean = c(1, 0), scale = 2))
The targetNodes
argument should indicate the nodes for which we want to do variable selection. targetNodes
will be expanded as described in Section 13.3.1.1. I.e., either targetNodes = "beta"
or targetNodes = c("beta[1]", "beta[2]")
will assign a RJ sampler to each of beta[1]
and beta[2]
if beta
is a vector of two values. The same applies to indicatorNodes
, which provides the indicator variables paired with nodes in targetNodes
. Notice that indicatorNodes
must be provided consistently with respect to targetNodes
. E.g. if targetNodes = "beta"
is a vector, then one should have indicatorNodes = "z"
with z
a vector as well; something like indicatorNodes = c("z1", "z2")
will throw an error.
indicatorModelConf$printSamplers()
## [1] conjugate_dnorm_dnorm sampler: beta0
## [2] RW sampler: sigma
## [3] conjugate_dbeta_dbern sampler: psi
## [4] RJ_indicator sampler: z1, mean: 1, scale: 2, targetNode: beta1
## [5] toggled sampler: beta1, samplerConf: conjugate_dnorm_dnorm
## [6] RJ_indicator sampler: z2, mean: 0, scale: 2, targetNode: beta2
## [7] toggled sampler: beta2, samplerConf: conjugate_dnorm_dnorm
An RJ_indicator
sampler is assigned to z[1]
and z[2]
in place of the binary
sampler, while the samplers for beta[1]
and beta[2]
have been changed to toggled
samplers, which still use the default conjugate_dnorm_dnorm
sampler but only when the corresponding indicator variable is equal to \(1\), thereby including the coefficient in the model. The two RJ_indicator
samplers have different means for the proposal distribution but the same scale, as given in the control
list of configureRJ
.
Notice that the order of the sampler has changed, since configureRJ
calls removeSampler
for nodes in targetNodes
and indicatorNodes
, and subsequently addSampler
, which appends the sampler to the end of current sampler list. Order can be modified by using setSamplers
.
Also note that configureRJ
modifies the existing sampler and returns NULL
.
7.9.2 Without indicator variables
We consider the same regression setting, but writing the model without the use of indicator variables.
## Linear regression with intercept and two covariates
code < nimbleCode({
beta0 ~ dnorm(0, sd = 100)
beta1 ~ dnorm(0, sd = 100)
beta2 ~ dnorm(0, sd = 100)
sigma ~ dunif(0, 100)
for(i in 1:N) {
Ypred[i] < beta0 + beta1 * x1[i] + beta2 * x2[i]
Y[i] ~ dnorm(Ypred[i], sd = sigma)
}
})
## build the model
rNoIndicatorModel < nimbleModel(code, constants = list(N = N),
data = list(Y = Y, x1 = x1, x2 = x2),
inits= list(beta0 = 0, beta1 = 0, beta2 = 0, sigma = sd(Y)))
noIndicatorModelConf < configureMCMC(rNoIndicatorModel)
## print NIMBLE default samplers
noIndicatorModelConf$printSamplers()
## [1] conjugate_dnorm_dnorm sampler: beta0
## [2] conjugate_dnorm_dnorm sampler: beta1
## [3] conjugate_dnorm_dnorm sampler: beta2
## [4] RW sampler: sigma
In this case, since there are no indicator variables, we need to pass to configureRJ
the prior inclusion probabilities for each node in targetNodes
, by specifying either one common value or a vector of values for the argument priorProb
. Notice that this case does not allow for a stochastic prior.
configureRJ(mcmcConf = noIndicatorModelConf,
targetNodes = c("beta1", "beta2"),
priorProb = 0.5,
control = list(mean = 0, scale = 2, fixedValue = c(1.5, 0)))
## print samplers after configureRJ
noIndicatorModelConf$printSamplers()
## [1] conjugate_dnorm_dnorm sampler: beta0
## [2] RW sampler: sigma
## [3] RJ sampler: beta1, priorProb: 0.5, mean: 0, scale: 2, fixedValue: 1.5
## [4] toggled sampler: beta1, samplerConf: conjugate_dnorm_dnorm, fixedValue: 1.5
## [5] RJ sampler: beta2, priorProb: 0.5, mean: 0, scale: 2, fixedValue: 0
## [6] toggled sampler: beta2, samplerConf: conjugate_dnorm_dnorm, fixedValue: 0
Since there are no indicator variables, the RJ
sampler is assigned directly to beta[1]
and beta[2]
along with the toggled
sampler. In addition in this case one can set the coefficient to a value different from \(0\) via the fixedValue
argument in the control
list.
If fixedValue
is given when using indicatorNodes
the values provided in fixedValue
are ignored. However the same behavior can be obtained in this situation, using a different model specification. For example, the model in 7.9.1 can be modified to have beta1
equal to \(1.5\) when in the model as follows:
for(i in 1:N) {
Ypred[i] < beta0 + (1  z1) * 1.5 * beta1 * x1[i] +
z1 * beta1 * x1[i] + beta2 * z2 * x2[i]
Y[i] ~ dnorm(Ypred[i], sd = sigma)
}
7.10 Samplers provided with NIMBLE
Most documentation of MCMC samplers provided with NIMBLE can be found by invoking help(samplers)
in R. Here we provide additional explanation of conjugate samplers and how complete customization can be achieved by making a sampler use an arbitrary loglikelihood function, such as to build a particle MCMC algorithm.
7.10.1 Conjugate (‘Gibbs’) samplers
By default, configureMCMC()
and buildMCMC()
will assign conjugate samplers to all nodes satisfying a conjugate relationship, unless the option useConjugacy = FALSE
is specified.
The current release of NIMBLE supports conjugate sampling of the relationships listed in Table 7.1^{18}.
Prior Distribution  Sampling (Dependent Node) Distribution  Parameter 

Beta  Bernoulli  prob 
Binomial  prob 

Negative Binomial  prob 

Dirichlet  Multinomial  prob 
Categorical  prob 

Flat  Normal  mean 
Lognormal  meanlog 

Gamma  Poisson  lambda 
Normal  tau 

Lognormal  taulog 

Gamma  rate 

Inverse Gamma  scale 

Exponential  rate 

Double Exponential  rate 

Weibull  lambda 

Halfflat  Normal  sd 
Lognormal  sdlog 

Inverse Gamma  Normal  var 
Lognormal  varlog 

Gamma  scale 

Inverse Gamma  rate 

Exponential  scale 

Double Exponential  scale 

Normal  Normal  mean 
Lognormal  meanlog 

Multivariate Normal  Multivariate Normal  mean 
Wishart  Multivariate Normal  prec 
Inverse Wishart  Multivariate Normal  cov 
Conjugate sampler functions may (optionally) dynamically check that their own posterior likelihood calculations are correct. If incorrect, a warning is issued. However, this functionality will roughly double the runtime required for conjugate sampling. By default, this option is disabled in NIMBLE. This option may be enabled with nimbleOptions(verifyConjugatePosteriors = TRUE)
.
If one wants information about conjugate node relationships for other purposes, they can be obtained using the checkConjugacy
method on a model. This returns a named list describing all conjugate relationships. The checkConjugacy
method can also accept a character vector argument specifying a subset of node names to check for conjugacy.
7.10.2 Customized loglikelihood evaluations: RW_llFunction sampler
Sometimes it is useful to control the loglikelihood calculations used for an MCMC updater instead of simply using the model. For example, one could use a sampler with a loglikelihood that analytically (or numerically) integrates over latent model nodes. Or one could use a sampler with a loglikelihood that comes from a stochastic approximation such as a particle filter (see below), allowing composition of a particle MCMC (PMCMC) algorithm (Andrieu, Doucet, and Holenstein 2010). The RW_llFunction
sampler handles this by using a MetropolisHastings algorithm with a normal proposal distribution and a userprovided loglikelihood function. To allow compiled execution, the loglikelihood function must be provided as a specialized instance of a nimbleFunction. The loglikelihood function may use the same model as the MCMC as a setup argument (as does the example below), but if so the state of the model should be unchanged during execution of the function (or you must understand the implications otherwise).
The RW_llFunction
sampler can be customized using the control
list argument to set the initial proposal distribution scale and the adaptive properties for the MetropolisHastings sampling. In addition, the control
list argument must contain a named llFunction
element. This is the specialized nimbleFunction that calculates the loglikelihood; it must accept no arguments and return a scalar double number. The return value must be the total loglikelihood of all stochastic dependents of the target
nodes – and, if includesTarget = TRUE
, of the target node(s) themselves – or whatever surrogate is being used for the total loglikelihood. This is a required control
list element with no default. See help(samplers)
for details.
Here is a complete example:
code < nimbleCode({
p ~ dunif(0, 1)
y ~ dbin(p, n)
})
Rmodel < nimbleModel(code, data = list(y=3), inits = list(p=0.5, n=10))
llFun < nimbleFunction(
setup = function(model) { },
run = function() {
y < model$y
p < model$p
n < model$n
ll < lfactorial(n)  lfactorial(y)  lfactorial(ny) +
y * log(p) + (ny) * log(1p)
returnType(double())
return(ll)
}
)
RllFun < llFun(Rmodel)
mcmcConf < configureMCMC(Rmodel, nodes = NULL)
mcmcConf$addSampler(target = "p", type = "RW_llFunction",
control = list(llFunction = RllFun, includesTarget = FALSE))
Rmcmc < buildMCMC(mcmcConf)
7.10.3 Particle MCMC sampler
For state space models, a particle MCMC (PMCMC) sampler can be specified for toplevel parameters. This sampler is described in Section 8.1.2.
7.11 Detailed MCMC example: litters
Here is a detailed example of specifying, building, compiling, and running two MCMC algorithms. We use the litters
example from the BUGS examples.
###############################
##### model configuration #####
###############################
# define our model using BUGS syntax
litters_code < nimbleCode({
for (i in 1:G) {
a[i] ~ dgamma(1, .001)
b[i] ~ dgamma(1, .001)
for (j in 1:N) {
r[i,j] ~ dbin(p[i,j], n[i,j])
p[i,j] ~ dbeta(a[i], b[i])
}
mu[i] < a[i] / (a[i] + b[i])
theta[i] < 1 / (a[i] + b[i])
}
})
# list of fixed constants
constants < list(G = 2,
N = 16,
n = matrix(c(13, 12, 12, 11, 9, 10, 9, 9, 8, 11, 8, 10, 13,
10, 12, 9, 10, 9, 10, 5, 9, 9, 13, 7, 5, 10, 7, 6,
10, 10, 10, 7), nrow = 2))
# list specifying model data
data < list(r = matrix(c(13, 12, 12, 11, 9, 10, 9, 9, 8, 10, 8, 9, 12, 9,
11, 8, 9, 8, 9, 4, 8, 7, 11, 4, 4, 5 , 5, 3, 7, 3,
7, 0), nrow = 2))
# list specifying initial values
inits < list(a = c(1, 1),
b = c(1, 1),
p = matrix(0.5, nrow = 2, ncol = 16),
mu = c(.5, .5),
theta = c(.5, .5))
# build the R model object
Rmodel < nimbleModel(litters_code,
constants = constants,
data = data,
inits = inits)
###########################################
##### MCMC configuration and building #####
###########################################
# generate the default MCMC configuration;
# only wish to monitor the derived quantity "mu"
mcmcConf < configureMCMC(Rmodel, monitors = "mu")
# check the samplers assigned by default MCMC configuration
mcmcConf$printSamplers()
# doublecheck our monitors, and thinning interval
mcmcConf$printMonitors()
# build the executable R MCMC function
mcmc < buildMCMC(mcmcConf)
# let's try another MCMC, as well,
# this time using the crossLevel sampler for toplevel nodes
# generate an empty MCMC configuration
# we need a new copy of the model to avoid compilation errors
Rmodel2 < Rmodel$newModel()
mcmcConf_CL < configureMCMC(Rmodel2, nodes = NULL, monitors = "mu")
# add two crossLevel samplers
mcmcConf_CL$addSampler(target = c("a[1]", "b[1]"), type = "crossLevel")
mcmcConf_CL$addSampler(target = c("a[2]", "b[2]"), type = "crossLevel")
# let's check the samplers
mcmcConf_CL$printSamplers()
# build this second executable R MCMC function
mcmc_CL < buildMCMC(mcmcConf_CL)
###################################
##### compile to C++, and run #####
###################################
# compile the two copies of the model
Cmodel < compileNimble(Rmodel)
Cmodel2 < compileNimble(Rmodel2)
# compile both MCMC algorithms, in the same
# project as the R model object
# NOTE: at this time, we recommend compiling ALL
# executable MCMC functions together
Cmcmc < compileNimble(mcmc, project = Rmodel)
Cmcmc_CL < compileNimble(mcmc_CL, project = Rmodel2)
# run the default MCMC function,
# and example the mean of mu[1]
Cmcmc$run(1000)
cSamplesMatrix < as.matrix(Cmcmc$mvSamples)
mean(cSamplesMatrix[, "mu[1]"])
# run the crossLevel MCMC function,
# and examine the mean of mu[1]
Cmcmc_CL$run(1000)
cSamplesMatrix_CL < as.matrix(Cmcmc_CL$mvSamples)
mean(cSamplesMatrix_CL[, "mu[1]"])
###################################
#### run multiple MCMC chains #####
###################################
# run 3 chains of the crossLevel MCMC
samplesList < runMCMC(Cmcmc_CL, niter=1000, nchains=3)
lapply(samplesList, dim)
7.12 Comparing different MCMCs with MCMCsuite and compareMCMCs
Please see the compareMCMCs
package for the features previously provided by MCMCsuite
and compareMCMCs
in NIMBLE (until version 0.8.0). The compareMCMCs
package provides tools to automatically run MCMC in nimble (including multiple sampler configurations), WinBUGS, OpenBUGS, JAGS, Stan, or any other engine for which you provide a simple common interface. The package makes it easy to manage comparison metrics and generate html pages with comparison figures.
References
Roberts, G. O., and S. K. Sahu. 1997. “Updating Schemes, Correlation Structure, Blocking and Parameterization for the Gibbs Sampler.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 59 (2): 291–317. doi:10.1111/14679868.00070.
Neal, Radford M. 2003. “Slice Sampling.” The Annals of Statistics 31 (3): 705–41.
Watanabe, Sumio. 2010. “Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory.” Journal of Machine Learning Research 11 (Dec): 3571–94.
Gelman, A., J. Hwang, and A. Vehtari. 2014. “Understanding Predictive Information Criteria for Bayesian Model.” Statistics and Computing 24 (6): 997–1016.
Green, Peter J. 1995. “Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination.” Biometrika 82 (4). Oxford University Press: 711–32.
Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein. 2010. “Particle Markov Chain Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (3): 269–342. doi:10.1111/j.14679868.2009.00736.x.