In previous implementations of the BUGS language (WinBUGS, OpenBUGS, and JAGS), there is no way to extend the language with new distributions or functions (other than diving into the low-level source code). NIMBLE tries to make it easy to add distributions and functions and then to share your code with others.
In this example, we’ll add a user-defined distribution in such a way that it can also be used with derivative-based (AD-based) algorithms.
In a variety of situations, scientists consider the possibility that some measurements will yield zeros for reasons separate from other considerations. For example, ecologists counting plants or animals of a particular species at many sites may get a zero when the species is absent but get a Poisson-distributed count when it is present. The explanatory variables they wish to consider as influencing the probability of a so-called structural zero (the species is absent) may be different from those influencing the abundance when it is present. This is done by having one parameter representing the probability of a structural zero and another parameter representing the Poisson mean. Each parameter may itself be predicted by other variables, and a zero in the data could result from either a structural zero or a Poisson sample that just happens to be zero. Similar ideas are used for zero-inflated binomial and zero-inflated negative binomial models. Hurdle models represent a subtly different approach, in which the probability of a zero is mixed with the zero-truncated probability of non-zeros.
Since such distributions were not built into earlier dialects of BUGS, users of that language have made use of a couple of indirect techniques to accomplish them. One approach is to add a Bernoulli (0/1) latent variable to the model for each observed zero. The latent variable indicates whether the corresponding data value is a structural or non-structural zero, and the MCMC must sample it. The other approach is to use a trick for providing an arbitrary log-likelihood to BUGS (for reasons unrelated to zero-inflation, this is sometimes called the “zero trick” – are you confused yet?). The former approach is more general but adds computational cost since the MCMC must sample the latent variables. The latter trick avoids the computational cost but is less general because every time one wants to consider a different distribution one must be sure to get the log likelihood correct, which could become cumbersome and difficult for others to follow.
There are plenty of other situations where one may wish to add a distribution to their model that was not envisioned in earlier dialects of BUGS.
We won’t labor through the indirect techniques mentioned for implementing zero inflation. Also, we won’t do this example in an interesting real model because all the other model components would get in the way. Instead we’ll just show a toy example. Say we have \(N\) observations, \(y[i], i = 1 \ldots N\), each from a zero-inflated Poisson. The parameters are the probability \(p\) of a structural zero and the mean \(\lambda\) of counts that are not structural zeros (but may be non-structural zeros).
We’ll call the density function for our new zero-inflated Poisson distribution “dZIP”.
Of course we need to load the package first:
library(nimble, warn.conflicts = FALSE)
## nimble version 1.3.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
##
## Note for advanced users who have written their own MCMC samplers:
## As of version 0.13.0, NIMBLE's protocol for handling posterior
## predictive nodes has changed in a way that could affect user-defined
## samplers in some situations. Please see Section 15.5.1 of the User Manual.
The BUGS code for a toy model with our user-defined distribution is:
ZIPcode <- nimbleCode({
p ~ dunif(0,1) # Probability of presence.
lambda ~ dunif(0,10)
for (i in 1:N)
y[i] ~ dZIP(lambda, zeroProb = 1-p) # Note NIMBLE allows R-like named-parameter syntax.
})
Here there is nothing more than the parameters \(p\) and \(lambda\), the data \(y\), and the constant \(N\).
Before we can use this code to build a model, we need to define dZIP. We do that with a nimbleFunction, which is a lot like an R function.
dZIP <- nimbleFunction(
run = function(x = double(), lambda = double(),
zeroProb = double(), log = logical(0, default = 0)) {
returnType(double())
## For use with AD, we cannot use an `if` statement to handle the mixture.
prob <- zeroProb * dbinom(x, size = 1, prob = 0) + (1 - zeroProb) * dpois(x, lambda)
if (log) return(log(prob))
return(prob)
},
buildDerivs = 'run' # Needed when used with AD-based algorithms.
)
This example doesn’t include a full introduction to nimbleFunctions, but here are some brief points:
double()
, integer()
and
logical()
notation means those arguments are scalar
doubles, integers, and logicals (TRUE
or
FALSE
), respectively.returnType(double())
means this function will
return a scalar double.buildDerives = 'run'
as an
argument to nimbleFunction()
.You are not required to provide an “r” function (for example,
rZIP
) for random number generation unless any algorithms
you use would need to generate from the distribution. Two general
situations in which the “r” function is necessary are for model
initialization when no initial value is provided for a node following
the user-defined distribution and for sampling posterior predictive
nodes (those nodes with no downstream data dependencies) that follow the
user-defined distribution.
For completeness, we show the “r” function here.
rZIP <- nimbleFunction(
run = function(n = integer(), lambda = double(), zeroProb = double()) {
returnType(double())
isStructuralZero <- rbinom(1, prob = zeroProb, size = 1)
if (isStructuralZero) return(0)
return(rpois(1, lambda))
})
A brief note on this:
n
, is the number of random draws
you want, but at this moment in NIMBLE’s development the n
argument isn’t really used. We require and assume n = 1. It is there for
compatibility with the standard argument list of “r” functions in R and
for future implementation.NIMBLE will automatically find your distribution without you doing
anything else. In certain cases, you may need to use the
registerDistributions
function first to tell NIMBLE some
information about your function that NIMBLE does not determine
automatically. In this case, using registerDistributions
is
not needed, but we’ll illustrate its use for reference.
registerDistributions(list(
dZIP = list(
BUGSdist = "dZIP(lambda, zeroProb)",
discrete = TRUE,
range = c(0, Inf),
types = c('value = integer()', 'lambda = double()', 'zeroProb = double()')
)))
Now we are ready to build and compile the model. We’ll also use the model to generate a data set and then run MCMC on it.
set.seed(1)
ZIPmodel <- nimbleModel(ZIPcode, constants = list(N = 100),
buildDerivs = TRUE) ## Needed when used with AD-based algorithms.
## Defining model
## Building model
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## [Note] This model is not fully initialized. This is not an error.
## To see which variables are not initialized, use model$initializeInfo().
## For more information on model initialization, see help(modelInitialization).
ZIPmodel$p <- 0.6 ## Choose values of p and lambda
ZIPmodel$lambda <- 1.8
ZIPmodel$simulate(ZIPmodel$getDependencies(c("p", "lambda"), self=FALSE)) ## Simulate `y`.
simulatedData <- ZIPmodel$y
simulatedData
## [1] 1 4 4 0 0 0 1 2 3 2 0 3 0 2 1 0 3 2 2 3 0 0 2 3 0 0 2 0 2 0 0 3 1 0 2 0 4
## [38] 1 2 2 0 3 3 1 4 0 3 0 2 1 0 2 1 1 2 0 0 0 1 0 0 0 1 0 0 0 4 5 0 1 0 0 1 0
## [75] 0 2 3 2 1 2 0 0 0 2 5 2 0 1 0 1 0 3 0 1 2 2 1 1 0 3
ZIPmodel$setData(list(y = simulatedData)) ## Set those values as data in the model.
cZIPmodel <- compileNimble(ZIPmodel) ## Compile the model.
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
For info on how to run MCMC in NIMBLE, see other examples. Here we’ll
use buildHMC
to set up an HMC sampler, but we could use
buildMCMC
to set up NIMBLE’s default sampling.
nsamples <- 2000
library(nimbleHMC)
ZIPmcmc <- buildHMC(ZIPmodel, control = list(warmup = nsamples/2))
## ===== Monitors =====
## thin = 1: lambda, p
## ===== Samplers =====
## NUTS sampler (1)
## - p, lambda
cZIPmcmc <- compileNimble(ZIPmcmc, project = ZIPmodel)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
samples <- runMCMC(cZIPmcmc, niter = nsamples)
## running chain 1...
## [Note] NUTS sampler (nodes: p, lambda) is using 1000 warmup iterations.
## Since `warmupMode` is 'default' and `nburnin` = 0,
## the number of warmup iterations is equal to `niter/2`.
## No samples will be discarded, so the first half of the samples returned
## are from the warmup period, and the second half of the samples are post-warmup.
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
Let’s look at a summary and trace plots to see if everything looks reasonable. As stated in the message, the second half are the valid HMC samples, although in this case they all look similar.
summary(samples[(nsamples/2 + 1):nsamples, ])
## lambda p
## Min. :1.283 Min. :0.5271
## 1st Qu.:1.655 1st Qu.:0.6698
## Median :1.787 Median :0.7133
## Mean :1.792 Mean :0.7134
## 3rd Qu.:1.923 3rd Qu.:0.7534
## Max. :2.519 Max. :0.9212
plot(samples[,'lambda'], type = 'l', main = 'lambda trace plot')
plot(samples[,'p'], type = 'l', main = 'p trace plot')
It worked!