# Chapter 2 Lightning introduction

## 2.1 A brief example

Here we’ll give a simple example of building a model and running some algorithms on the model, as well as creating our own user-specified algorithm. The goal is to give you a sense for what one can do in the system. Later sections will provide more detail.

We’ll use the *pump* model example from BUGS^{3}. We could load the model from the standard BUGS example file formats (Section 6.1.2), but instead we’ll show how to enter it directly in R.

In this ‘lightning introduction’ we will:

- Create the model for the pump example.
- Compile the model.
- Create a basic MCMC configuration for the pump model.
- Compile and run the MCMC
- Customize the MCMC configuration and compile and run that.
- Create, compile and run a Monte Carlo Expectation Maximization (MCEM) algorithm, which illustrates some of the flexibility NIMBLE provides to combine R and NIMBLE.
- Write a short
`nimbleFunction`

to generate simulations from designated nodes of any model.

## 2.2 Creating a model

First we define the model code, its constants, data, and initial values for MCMC.

```
pumpCode <- nimbleCode({
for (i in 1:N){
theta[i] ~ dgamma(alpha,beta)
lambda[i] <- theta[i]*t[i]
x[i] ~ dpois(lambda[i])
}
alpha ~ dexp(1.0)
beta ~ dgamma(0.1,1.0)
})
pumpConsts <- list(N = 10,
t = c(94.3, 15.7, 62.9, 126, 5.24,
31.4, 1.05, 1.05, 2.1, 10.5))
pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))
pumpInits <- list(alpha = 1, beta = 1,
theta = rep(0.1, pumpConsts$N))
```

Here `x[i]`

is the number of failures recorded during a time duration of length `t[i]`

for the `i`

\(^{th}\) pump. `theta[i]`

is a failure rate, and the goal is estimate parameters `alpha`

and `beta`

. Now let’s create the model and look at some of its nodes.

```
pump <- nimbleModel(code = pumpCode, name = "pump", constants = pumpConsts,
data = pumpData, inits = pumpInits)
pump$getNodeNames()
```

```
## [1] "alpha" "beta" "lifted_d1_over_beta"
## [4] "theta[1]" "theta[2]" "theta[3]"
## [7] "theta[4]" "theta[5]" "theta[6]"
## [10] "theta[7]" "theta[8]" "theta[9]"
## [13] "theta[10]" "lambda[1]" "lambda[2]"
## [16] "lambda[3]" "lambda[4]" "lambda[5]"
## [19] "lambda[6]" "lambda[7]" "lambda[8]"
## [22] "lambda[9]" "lambda[10]" "x[1]"
## [25] "x[2]" "x[3]" "x[4]"
## [28] "x[5]" "x[6]" "x[7]"
## [31] "x[8]" "x[9]" "x[10]"
```

`pump$x`

`## [1] 5 1 5 14 3 19 1 1 4 22`

`pump$logProb_x`

```
## [1] -2.998011 -1.118924 -1.882686 -2.319466 -4.254550 -20.739651
## [7] -2.358795 -2.358795 -9.630645 -48.447798
```

`pump$alpha`

`## [1] 1`

`pump$theta`

`## [1] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1`

`pump$lambda`

`## [1] 9.430 1.570 6.290 12.600 0.524 3.140 0.105 0.105 0.210 1.050`

Notice that in the list of nodes, NIMBLE has introduced a new node, `lifted_d1_over_beta`

. We call this a ‘lifted’ node. Like R, NIMBLE allows alternative parameterizations, such as the scale or rate parameterization of the gamma distribution. Choice of parameterization can generate a lifted node, as can using a link function or a distribution argument that is an expression. It’s helpful to know why they exist, but you shouldn’t need to worry about them.

Thanks to the plotting capabilities of the `igraph`

package that NIMBLE uses to represent the directed acyclic graph, we can plot the model (Figure 2.1).

`pump$plotGraph()`

You are in control of the model. By default, `nimbleModel`

does its best to initialize a model, but let’s say you want to re-initialize `theta`

. To simulate from the prior for `theta`

(overwriting the initial values previously in the model) we first need to be sure the parent nodes of all `theta[i]`

nodes are fully initialized, including any non-stochastic nodes such as lifted nodes. We then use the `simulate`

function to simulate from the distribution for `theta`

. Finally we use the `calculate`

function to calculate the dependencies of `theta`

, namely `lambda`

and the log probabilities of `x`

to ensure all parts of the model are up to date. First we show how to use the model’s `getDependencies`

method to query information about its graph.

```
# Show all dependencies of alpha and beta terminating in stochastic nodes
pump$getDependencies(c("alpha", "beta"))
```

```
## [1] "alpha" "beta" "lifted_d1_over_beta"
## [4] "theta[1]" "theta[2]" "theta[3]"
## [7] "theta[4]" "theta[5]" "theta[6]"
## [10] "theta[7]" "theta[8]" "theta[9]"
## [13] "theta[10]"
```

```
# Now show only the deterministic dependencies
pump$getDependencies(c("alpha", "beta"), determOnly = TRUE)
```

`## [1] "lifted_d1_over_beta"`

```
# Check that the lifted node was initialized.
pump[["lifted_d1_over_beta"]] # It was.
```

`## [1] 1`

```
# Now let's simulate new theta values
set.seed(1) # This makes the simulations here reproducible
pump$simulate("theta")
pump$theta # the new theta values
```

```
## [1] 0.15514136 1.88240160 1.80451250 0.83617765 1.22254365 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
```

```
# lambda and logProb_x haven't been re-calculated yet
pump$lambda # these are the same values as above
```

`## [1] 9.430 1.570 6.290 12.600 0.524 3.140 0.105 0.105 0.210 1.050`

`pump$logProb_x`

```
## [1] -2.998011 -1.118924 -1.882686 -2.319466 -4.254550 -20.739651
## [7] -2.358795 -2.358795 -9.630645 -48.447798
```

`pump$getLogProb("x") # The sum of logProb_x`

`## [1] -96.10932`

`pump$calculate(pump$getDependencies(c("theta")))`

`## [1] -262.204`

`pump$lambda # Now they have.`

```
## [1] 14.6298299 29.5537051 113.5038360 105.3583839 6.4061287
## [6] 36.3723548 1.0395209 0.3227420 0.1987001 1.6506161
```

`pump$logProb_x`

```
## [1] -6.002009 -26.167496 -94.632145 -65.346457 -2.626123 -7.429868
## [7] -1.000761 -1.453644 -9.840589 -39.096527
```

Notice that the first `getDependencies`

call returned dependencies from `alpha`

and `beta`

down to the next stochastic nodes in the model. The second call requested only deterministic dependencies. The call to `pump$simulate("theta")`

expands `"theta"`

to include all nodes in `theta`

. After simulating into `theta`

, we can see that `lambda`

and the log probabilities of `x`

still reflect the old values of `theta`

, so we `calculate`

them and then see that they have been updated.

## 2.3 Compiling the model

Next we compile the model, which means generating C++ code, compiling that code, and loading it back into R with an object that can be used just like the uncompiled model. The values in the compiled model will be initialized from those of the original model in R, but the original and compiled models are distinct objects so any subsequent changes in one will not be reflected in the other.

```
Cpump <- compileNimble(pump)
Cpump$theta
```

```
## [1] 0.15514136 1.88240160 1.80451250 0.83617765 1.22254365 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
```

Note that the compiled model is used when running any NIMBLE algorithms via C++, so the model needs to be compiled before (or at the same time as) any compilation of algorithms, such as the compilation of the MCMC done in the next section.

## 2.4 One-line invocation of MCMC

The most direct approach to invoking NIMBLE’s MCMC engine is using the `nimbleMCMC`

function. This function would generally take the code, data, constants, and initial values as input, but it can also accept the (compiled or uncompiled) model object as an argument. It provides a variety of options for executing and controlling multiple chains of NIMBLE’s default MCMC algorithm, and returning posterior samples, posterior summary statistics, and/or WAIC values.

For example, to execute two MCMC chains of 10,000 samples each, and return samples, summary statistics, and WAIC values:

```
mcmc.out <- nimbleMCMC(code = pumpCode, constants = pumpConsts,
data = pumpData, inits = pumpInits,
nchains = 2, niter = 10000,
summary = TRUE, WAIC = TRUE)
names(mcmc.out)
```

`## [1] "samples" "summary" "WAIC"`

`mcmc.out$summary`

```
## $chain1
## Mean Median St.Dev. 95%CI_low 95%CI_upp
## alpha 0.6980435 0.6583506 0.2703768 0.2878982 1.314046
## beta 0.9286260 0.8215685 0.5496913 0.1836991 2.287270
##
## $chain2
## Mean Median St.Dev. 95%CI_low 95%CI_upp
## alpha 0.6910196 0.6580365 0.2654838 0.2771956 1.285815
## beta 0.9162727 0.8143443 0.5375082 0.1857723 2.270243
##
## $all.chains
## Mean Median St.Dev. 95%CI_low 95%CI_upp
## alpha 0.6945316 0.6580365 0.2679578 0.2832985 1.299932
## beta 0.9224494 0.8182816 0.5436554 0.1854908 2.278544
```

`mcmc.out$waic`

`## NULL`

See Section 7.1 or `help(nimbleMCMC)`

for more details about using `nimbleMCMC`

.

## 2.5 Creating, compiling and running a basic MCMC configuration

At this point we have initial values for all of the nodes in the model, and we have both the original and compiled versions of the model. As a first algorithm to try on our model, let’s use NIMBLE’s default MCMC. Note that conjugate relationships are detected for all nodes except for `alpha`

, on which the default sampler is a random walk Metropolis sampler.

`pumpConf <- configureMCMC(pump, print = TRUE)`

```
## [1] RW sampler: alpha
## [2] conjugate_dgamma_dgamma sampler: beta
## [3] conjugate_dgamma_dpois sampler: theta[1]
## [4] conjugate_dgamma_dpois sampler: theta[2]
## [5] conjugate_dgamma_dpois sampler: theta[3]
## [6] conjugate_dgamma_dpois sampler: theta[4]
## [7] conjugate_dgamma_dpois sampler: theta[5]
## [8] conjugate_dgamma_dpois sampler: theta[6]
## [9] conjugate_dgamma_dpois sampler: theta[7]
## [10] conjugate_dgamma_dpois sampler: theta[8]
## [11] conjugate_dgamma_dpois sampler: theta[9]
## [12] conjugate_dgamma_dpois sampler: theta[10]
```

`pumpConf$addMonitors(c("alpha", "beta", "theta"))`

`## thin = 1: alpha, beta, theta`

```
pumpMCMC <- buildMCMC(pumpConf)
CpumpMCMC <- compileNimble(pumpMCMC, project = pump)
niter <- 1000
set.seed(1)
samples <- runMCMC(CpumpMCMC, niter = niter)
par(mfrow = c(1, 4), mai = c(.6, .4, .1, .2))
plot(samples[ , "alpha"], type = "l", xlab = "iteration",
ylab = expression(alpha))
plot(samples[ , "beta"], type = "l", xlab = "iteration",
ylab = expression(beta))
plot(samples[ , "alpha"], samples[ , "beta"], xlab = expression(alpha),
ylab = expression(beta))
plot(samples[ , "theta[1]"], type = "l", xlab = "iteration",
ylab = expression(theta[1]))
```

```
acf(samples[, "alpha"]) # plot autocorrelation of alpha sample
acf(samples[, "beta"]) # plot autocorrelation of beta sample
```

Notice the posterior correlation between `alpha`

and `beta`

. A measure of the mixing for each is the autocorrelation for each parameter, shown by the `acf`

plots.

## 2.6 Customizing the MCMC

Let’s add an adaptive block sampler on `alpha`

and `beta`

jointly and see if that improves the mixing.

```
pumpConf$addSampler(target = c("alpha", "beta"), type = "RW_block",
control = list(adaptInterval = 100))
pumpMCMC2 <- buildMCMC(pumpConf)
# need to reset the nimbleFunctions in order to add the new MCMC
CpumpNewMCMC <- compileNimble(pumpMCMC2, project = pump,
resetFunctions = TRUE)
set.seed(1)
CpumpNewMCMC$run(niter)
```

`## NULL`

```
samplesNew <- as.matrix(CpumpNewMCMC$mvSamples)
par(mfrow = c(1, 4), mai = c(.6, .4, .1, .2))
plot(samplesNew[ , "alpha"], type = "l", xlab = "iteration",
ylab = expression(alpha))
plot(samplesNew[ , "beta"], type = "l", xlab = "iteration",
ylab = expression(beta))
plot(samplesNew[ , "alpha"], samplesNew[ , "beta"], xlab = expression(alpha),
ylab = expression(beta))
plot(samplesNew[ , "theta[1]"], type = "l", xlab = "iteration",
ylab = expression(theta[1]))
```

```
acf(samplesNew[, "alpha"]) # plot autocorrelation of alpha sample
acf(samplesNew[, "beta"]) # plot autocorrelation of beta sample
```

We can see that the block sampler has decreased the autocorrelation for both `alpha`

and `beta`

. Of course these are just short runs, and what we are really interested in is the effective sample size of the MCMC per computation time, but that’s not the point of this example.

Once you learn the MCMC system, you can write your own samplers and include them. The entire system is written in nimbleFunctions.

## 2.7 Running MCEM

NIMBLE is a system for working with algorithms, not just an MCMC engine. So let’s try maximizing the marginal likelihood for `alpha`

and `beta`

using Monte Carlo Expectation Maximization^{4}.

```
pump2 <- pump$newModel()
box = list( list(c("alpha","beta"), c(0, Inf)))
pumpMCEM <- buildMCEM(model = pump2, latentNodes = "theta[1:10]",
boxConstraints = box)
pumpMLE <- pumpMCEM$run()
```

```
## Iteration Number: 1.
## Current number of MCMC iterations: 1000.
## Parameter Estimates:
## alpha beta
## 0.8160625 1.1230921
## Convergence Criterion: 1.001.
## Iteration Number: 2.
## Current number of MCMC iterations: 1000.
## Parameter Estimates:
## alpha beta
## 0.8045037 1.1993128
## Convergence Criterion: 0.0223464.
## Monte Carlo error too big: increasing MCMC sample size.
## Iteration Number: 3.
## Current number of MCMC iterations: 1250.
## Parameter Estimates:
## alpha beta
## 0.8203178 1.2497067
## Convergence Criterion: 0.004913688.
## Monte Carlo error too big: increasing MCMC sample size.
## Monte Carlo error too big: increasing MCMC sample size.
## Monte Carlo error too big: increasing MCMC sample size.
## Iteration Number: 4.
## Current number of MCMC iterations: 3032.
## Parameter Estimates:
## alpha beta
## 0.8226618 1.2602452
## Convergence Criterion: 0.0004201048.
```

`pumpMLE`

```
## alpha beta
## 0.8226618 1.2602452
```

Both estimates are within 0.01 of the values reported by George, Makov, and Smith (1993)^{5}. Some discrepancy is to be expected since it is a Monte Carlo algorithm.

## 2.8 Creating your own functions

Now let’s see an example of writing our own algorithm and using it on the model. We’ll do something simple: simulating multiple values for a designated set of nodes and calculating every part of the model that depends on them. More details on programming in NIMBLE are in Part IV.

Here is our *nimbleFunction*:

```
simNodesMany <- nimbleFunction(
setup = function(model, nodes) {
mv <- modelValues(model)
deps <- model$getDependencies(nodes)
allNodes <- model$getNodeNames()
},
run = function(n = integer()) {
resize(mv, n)
for(i in 1:n) {
model$simulate(nodes)
model$calculate(deps)
copy(from = model, nodes = allNodes,
to = mv, rowTo = i, logProb = TRUE)
}
})
simNodesTheta1to5 <- simNodesMany(pump, "theta[1:5]")
simNodesTheta6to10 <- simNodesMany(pump, "theta[6:10]")
```

Here are a few things to notice about the nimbleFunction.

- The
`setup`

function is written in R. It creates relevant information specific to our model for use in the run-time code.

- The
`setup`

code creates a*modelValues*object to hold multiple sets of values for variables in the model provided. - The
`run`

function is written in NIMBLE. It carries out the calculations using the information determined once for each set of`model`

and`nodes`

arguments by the setup code. The run-time code is what will be compiled. - The
`run`

code requires type information about the argument`n`

. In this case it is a scalar integer.

- The for-loop looks just like R, but only sequential integer iteration is allowed.
- The functions
`calculate`

and`simulate`

, which were introduced above in R, can be used in NIMBLE. - The special function
`copy`

is used here to record values from the model into the modelValues object.

- Multiple instances, or ‘specializations’, can be made by calling
`simNodesMany`

with different arguments. Above,`simNodesTheta1to5`

has been made by calling`simNodesMany`

with the`pump`

model and nodes`"theta[1:5]"`

as inputs to the`setup`

function, while`simNodesTheta6to10`

differs by providing`"theta[6:10]"`

as an argument. The returned objects are objects of a uniquely generated R reference class with fields (member data) for the results of the`setup`

code and a`run`

method (member function).

By the way, `simNodesMany`

is very similar to a standard `nimbleFunction`

provided with NIMBLE, `simNodesMV`

.

Now let’s execute this nimbleFunction in R, before compiling it.

```
set.seed(1) # make the calculation repeatable
pump$alpha <- pumpMLE[1]
pump$beta <- pumpMLE[2]
# make sure to update deterministic dependencies of the altered nodes
pump$calculate(pump$getDependencies(c("alpha","beta"), determOnly = TRUE))
```

`## [1] 0`

```
saveTheta <- pump$theta
simNodesTheta1to5$run(10)
simNodesTheta1to5$mv[["theta"]][1:2]
```

```
## [[1]]
## [1] 0.21829875 1.93210969 0.62296551 0.34197266 3.45729601 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
##
## [[2]]
## [1] 0.82759981 0.08784057 0.34414959 0.29521943 0.14183505 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
```

`simNodesTheta1to5$mv[["logProb_x"]][1:2]`

```
## [[1]]
## [1] -10.250111 -26.921849 -25.630612 -15.594173 -11.217566 -7.429868
## [7] -1.000761 -1.453644 -9.840589 -39.096527
##
## [[2]]
## [1] -61.043876 -1.057668 -11.060164 -11.761432 -3.425282 -7.429868
## [7] -1.000761 -1.453644 -9.840589 -39.096527
```

In this code we have initialized the values of `alpha`

and `beta`

to their MLE and then recorded the `theta`

values to use below. Then we have requested 10 simulations from `simNodesTheta1to5`

. Shown are the first two simulation results for `theta`

and the log probabilities of `x`

. Notice that `theta[6:10]`

and the corresponding log probabilities for `x[6:10]`

are unchanged because the nodes being simulated are only `theta[1:5]`

. In R, this function runs slowly.

Finally, let’s compile the function and run that version.

```
CsimNodesTheta1to5 <- compileNimble(simNodesTheta1to5,
project = pump, resetFunctions = TRUE)
Cpump$alpha <- pumpMLE[1]
Cpump$beta <- pumpMLE[2]
Cpump$calculate(Cpump$getDependencies(c("alpha","beta"), determOnly = TRUE))
```

`## [1] 0`

```
Cpump$theta <- saveTheta
set.seed(1)
CsimNodesTheta1to5$run(10)
```

`## NULL`

`CsimNodesTheta1to5$mv[["theta"]][1:2]`

```
## [[1]]
## [1] 0.21829875 1.93210969 0.62296551 0.34197266 3.45729601 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
##
## [[2]]
## [1] 0.82759981 0.08784057 0.34414959 0.29521943 0.14183505 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
```

`CsimNodesTheta1to5$mv[["logProb_x"]][1:2]`

```
## [[1]]
## [1] -10.250111 -26.921849 -25.630612 -15.594173 -11.217566 -2.782156
## [7] -1.042151 -1.004362 -1.894675 -3.081102
##
## [[2]]
## [1] -61.043876 -1.057668 -11.060164 -11.761432 -3.425282 -2.782156
## [7] -1.042151 -1.004362 -1.894675 -3.081102
```

Given the same initial values and the same random number generator seed, we got identical results for `theta[1:5]`

and their dependencies, but it happened much faster.

### References

George, E.I., U.E. Makov, and A.F.M. Smith. 1993. “Conjugate Likelihood Distributions.” *Scandinavian Journal of Statistics* 20 (2): 147–56.