Version 0.11.0 of NIMBLE released
- added the ‘posterior_predictive_branch’ MCMC sampler, which samples jointly from the predictive distribution of networks of entirely non-data nodes, to improve MCMC mixing,
- added a model method to find parent nodes, called getParents(), analogous to getDependencies(),
- improved efficiency of conjugate samplers,
- allowed use of the elliptical slice sampler for univariate nodes, which can be useful for posteriors with multiple modes,
- allowed model definition using if-then-else without an else clause, and
- fixed a bug giving incorrect node names and potentially affecting algorithm behavior for models with more than 100,000 elements in a vector node or any dimension of a multi-dimensional node.
Please see the release notes on our website for more details.
Bayesian Nonparametric Models in NIMBLE: General Multivariate Models
(Prepared by Claudia Wehrhahn)
Overview
NIMBLE is a hierarchical modeling package that uses nearly the same language for model specification as the popular MCMC packages WinBUGS, OpenBUGS and JAGS, while making the modeling language extensible — you can add distributions and functions — and also allowing customization of the algorithms used to estimate the parameters of the model.
NIMBLE supports Markov chain Monte Carlo (MCMC) inference for Bayesian nonparametric (BNP) mixture models. Specifically, NIMBLE provides functionality for fitting models involving Dirichlet process priors using either the Chinese Restaurant Process (CRP) or a truncated stick-breaking (SB) representation.
In version 0.10.1, we’ve extended NIMBLE to be able to handle more general multivariate models when using the CRP prior. In particular, one can now easily use the CRP prior when multiple observations (or multiple latent variables) are being jointly clustered. For example, in a longitudinal study, one may want to cluster at the individual level, i.e., to jointly cluster all of the observations for each of the individuals in the study. (Formerly this was only possible in NIMBLE by specifying the observations for each individual as coming from a single multivariate distribution.)
This allows one to specify a multivariate mixture kernel as the product of univariate ones. This is particularly useful when working with discrete data. In general, multivariate extensions of well-known univariate discrete distributions, such as the Bernoulli, Poisson and Gamma, are not straightforward. For example, for multivariate count data, a multivariate Poisson distribution might appear to be a good fit, yet its definition is not trivial, inference is cumbersome, and the model lacks flexibility to deal with overdispersion. See Inouye et al. (2017) for a review on multivariate distributions for count data based on the Poisson distribution.
In this post, we illustrate NIMBLE’s new extended BNP capabilities by modelling multivariate discrete data. Specifically, we show how to model multivariate count data from a longitudinal study under a nonparametric framework. The modeling approach is simple and introduces correlation in the measurements within subjects.
For more detailed information on NIMBLE and Bayesian nonparametrics in NIMBLE, see the User Manual.
BNP analysis of epileptic seizure count data
We illustrate the use of nonparametric multivariate mixture models for modeling counts of epileptic seizures from a longitudinal study of the drug progabide as an adjuvant antiepileptic chemotherapy. The data, originally reported in Leppik et al. (1985), arise from a clinical trial of 59 people with epilepsy. At four clinic visits, subjects reported the number of seizures occurring over successive two-week periods. Additional data include the baseline seizure count and the age of the patient. Patients were randomized to receive either progabide or a placebo, in addition to standard chemotherapy.
load(url("https://r-nimble.org/nimbleExamples/seizures.Rda")) names(seizures)
## [1] "id" "seize" "visit" "trt" "age"
head(seizures)
## id seize visit trt age ## 1 101 76 0 1 18 ## 2 101 11 1 1 18 ## 3 101 14 2 1 18 ## 4 101 9 3 1 18 ## 5 101 8 4 1 18 ## 6 102 38 0 1 32
Model formulation
We model the joint distribution of the baseline number of seizures and the counts from each of the two-week periods as a Dirichlet Process mixture (DPM) of products of Poisson distributions. Let , where denotes the seizure count for patient measured at visit , for , and . The value for is the baseline count. The model takes the form
where and corresponds to a product of Gamma distributions.
Our specification uses a product of Poisson distributions as the kernel in the DPM which, at first sight, would suggest independence of the repeated seizure count measurements. However, because we are mixing over the parameters, this specification in fact induces dependence within subjects, with the strength of the dependence being inferred from the data. In order to specify the model in NIMBLE, first we translate the information in seize into a matrix and then we write the NIMBLE code.
We specify this model in NIMBLE with the following code in R. The vector xi contains the latent cluster IDs, one for each patient.
n <- 59 J <- 5 data <- list(y = matrix(seizures$seize, ncol = J, nrow = n, byrow = TRUE)) constants <- list(n = n, J = J) code <- nimbleCode({ for(i in 1:n) { for(j in 1:J) { y[i, j] ~ dpois(lambda[xi[i], j]) } } for(i in 1:n) { for(j in 1:J) { lambda[i, j] ~ dgamma(shape = 1, rate = 0.1) } } xi[1:n] ~ dCRP(conc = alpha, size = n) alpha ~ dgamma(shape = 1, rate = 1) })
Running the MCMC
The following code sets up the data and constants, initializes the parameters, defines the model object, and builds and runs the MCMC algorithm. For speed, the MCMC runs using compiled C++ code, hence the calls to compileNimble to create compiled versions of the model and the MCMC algorithm.
Because the specification is in terms of a Chinese restaurant process, the default sampler selected by NIMBLE is a collapsed Gibbs sampler. More specifically, because the baseline distribution is conjugate to the product of Poisson kernels, Algorithm 2 from Neal (2000) is used.
set.seed(1) inits <- list(xi = 1:n, alpha = 1, lambda = matrix(rgamma(J*n, shape = 1, rate = 0.1), ncol = J, nrow = n)) model <- nimbleModel(code, data=data, inits = inits, constants = constants, dimensions = list(lambda = c(n, J)))
## defining model...
## building model...
## setting data and initial values...
## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... ## checking model sizes and dimensions... ## model building finished.
cmodel <- compileNimble(model)
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details. ## compilation finished.
conf <- configureMCMC(model, monitors = c('xi','lambda', 'alpha'), print = TRUE)
## ===== Monitors ===== ## thin = 1: xi, lambda, alpha ## ===== Samplers ===== ## CRP_concentration sampler (1) ## - alpha ## CRP_cluster_wrapper sampler (295) ## - lambda[] (295 elements) ## CRP sampler (1) ## - xi[1:59]
mcmc <- buildMCMC(conf) cmcmc <- compileNimble(mcmc, project = model)
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details. ## compilation finished.
samples <- runMCMC(cmcmc, niter=55000, nburnin = 5000, thin=10)
## running chain 1...
## |-------------|-------------|-------------|-------------| ## |-------------------------------------------------------|
We can extract posterior samples for some parameters of interest. The following are trace plots of the posterior samples for the concentration parameter, , and the number of clusters.
xiSamples <- samples[, grep('xi', colnames(samples))] # samples of cluster IDs nGroups <- apply(xiSamples, 1, function(x) length(unique(x))) concSamples <- samples[, grep('alpha', colnames(samples))] par(mfrow=c(1, 2)) ts.plot(concSamples, xlab = "Iteration", ylab = expression(alpha), main = expression(paste('Traceplot for ', alpha))) ts.plot(nGroups, xlab = "Iteration", ylab = "Number of components", main = "Number of clusters")
Assessing the posterior
We can compute the posterior predictive distribution for a new observation , , which in turn allows us to obtain univariate or multivariate marginals or conditionals, or any other density estimate of interest. As an illustration, we compute the bivariate posterior predictive distribution for the number of seizures at baseline and at the 4th hospital visit. This is done in two steps. First, we compute posterior samples of the random measure , which can be done using the getSamplesDPmeasure() function. Based on the MCMC output, getSamplesDPmeasure() returns a list of matrices, each of them corresponding to a single posterior sample from , using its stick-breaking (SB) representation. The first column of each of these matrices contains the weights of the SB representation of while the rest of the columns contain the atoms of the SB representation of , here . Second, we compute the bivariate posterior predictive distribution of the seizure counts at baseline and at the fourth visit, based on the posterior samples of . We use a compiled nimble function, called ‘bivariate’, to speed up the computations of the bivariate posterior predictive density.
# samples from the random measure samplesG <- getSamplesDPmeasure(cmcmc)
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## compilation finished.
niter <- length(samplesG) weightsIndex <- grep('weights', colnames(samplesG[[1]])) lambdaIndex <- grep('lambda', colnames(samplesG[[1]])) ygrid <- 0:45 # function used to compute bivariate posterior predictive bivariateFun <- nimbleFunction( run = function(w = double(1), lambda1 = double(1), lambda5 = double(1), ytilde = double(1)) { returnType(double(2)) ngrid <- length(ytilde) out <- matrix(0, ncol = ngrid, nrow = ngrid) for(i in 1:ngrid) { for(j in 1:ngrid) { out[i, j] <- sum(w * dpois(ytilde[i], lambda1) * dpois(ytilde[j], lambda5)) } } return(out) } ) cbivariateFun <- compileNimble(bivariateFun)
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details. ## compilation finished.
# computing bivariate posterior predictive of seizure counts are baseline and fourth visit bivariate <- matrix(0, ncol = length(ygrid), nrow = length(ygrid)) for(iter in 1:niter) { weights <- samplesG[[iter]][, weightsIndex] # posterior weights lambdaBaseline <- samplesG[[iter]][, lambdaIndex[1]] # posterior rate of baseline lambdaVisit4 <- samplesG[[iter]][, lambdaIndex[5]] # posterior rate at fourth visit bivariate <- bivariate + cbivariateFun(weights, lambdaBaseline, lambdaVisit4, ygrid) } bivariate <- bivariate / niter
The following code creates a heatmap of the posterior predictive bivariate distribution of the number of seizures at baseline and at the fourth hospital visit, showing that there is a positive correlation between these two measurements.
collist <- colorRampPalette(c('white', 'grey', 'black')) image.plot(ygrid, ygrid, bivariate, col = collist(6), xlab = 'Baseline', ylab = '4th visit', ylim = c(0, 15), axes = TRUE)
In order to describe the uncertainty in the posterior clustering structure of the individuals in the study, we present a heat map of the posterior probability of two subjects belonging to the same cluster. To do this, first we compute the posterior pairwise clustering matrix that describes the probability of two individuals belonging to the same cluster, then we reorder the observations and finally plot the associated heatmap.
pairMatrix <- apply(xiSamples, 2, function(focal) { colSums(focal == xiSamples) }) pairMatrix <- pairMatrix / niter newOrder <- c(1, 35, 13, 16, 32, 33, 2, 29, 39, 26, 28, 52, 17, 15, 23, 8, 31, 38, 9, 46, 45, 11, 49, 44, 50, 41, 54, 21, 3, 40, 47, 48, 12, 6, 14, 7, 18, 22, 30, 55, 19, 34, 56, 57, 4, 5, 58, 10, 43, 25, 59, 20, 27, 24, 36, 37, 42, 51, 53) reordered_pairMatrix <- pairMatrix[newOrder, newOrder] image.plot(1:n, 1:n, reordered_pairMatrix , col = collist(6), xlab = 'Patient', ylab = 'Patient', axes = TRUE) axis(1, at = 1:n, labels = FALSE, tck = -.02) axis(2, at = 1:n, labels = FALSE, tck = -.02) axis(3, at = 1:n, tck = 0, labels = FALSE) axis(4, at = 1:n, tck = 0, labels = FALSE)
References
Inouye, D.I., E. Yang, G.I. Allen, and P. Ravikumar. 2017. A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution. Wiley Interdisciplinary Reviews: Computational Statistics 9: e1398.
Leppik, I., F. Dreifuss, T. Bowman, N. Santilli, M. Jacobs, C. Crosby, J. Cloyd, et al. 1985. A Double-Blind Crossover Evaluation of Progabide in Partial Seizures: 3: 15 Pm8. Neurology 35.
Neal, R. 2000. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9: 249–65.
NIMBLE virtual short course, May 26-28
We’ll be holding a virtual training workshop on NIMBLE, May 26-28, from 8 am to 1 pm US Pacific (California) time each day. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).
The workshop will roughly follow the material covered in our June 2020 virtual training, in particular:
- the basic concepts and workflows for using NIMBLE and converting BUGS or JAGS models to work in NIMBLE.
- overview of different MCMC sampling strategies and how to use them in NIMBLE.
- writing new distributions and functions for more flexible modeling and more efficient computation.
- tips and tricks for improving computational efficiency.
- using advanced model components, including Bayesian non-parametric distributions (based on Dirichlet process priors), conditional auto-regressive (CAR) models for spatially correlated random fields, and reversible jump samplers for variable selection.
- an introduction to programming new algorithms in NIMBLE.
- calling R and compiled C++ code from compiled NIMBLE models or functions.
If participant interests vary sufficiently, the third session will be split into two tracks. One of these will likely focus on ecological models. The other will be chosen based on attendee interest from topics such as (a) advanced NIMBLE programming including writing new MCMC samplers, (b) advanced spatial or Bayesian non-parametric modeling, or (c) non-MCMC algorithms in NIMBLE, such as sequential Monte Carlo.
If you are interested in attending, please pre-register at https://forms.gle/6AtNgfdUdvhni32Q6. This will hold a spot for you and allow us to learn about your specific interests. No payment is necessary to pre-register. Fees to finalize registration will be $100 (regular) or $50 (student). We will offer a process for students to request a fee waiver.
The workshop will assume attendees have a basic understanding of hierarchical/Bayesian models and MCMC, the BUGS (or JAGS) model language, and some familiarity with R.
NIMBLE is hiring a programmer
The NIMBLE development team is hiring for a one-year programmer position. We are looking for someone with R and C++ experience. There is also a possibility of a part-time position. The deadline for full consideration is February 12. Application information is here: https://aprecruit.berkeley.edu/JPF02822.
Version 0.10.1 of NIMBLE released
– In particular, it fixes a bug in retrieving parameter values from distributions that was introduced in version 0.10.0. The bug can cause incorrect behavior of conjugate MCMC samplers under certain model structures (such as particular state-space models), so we strongly encourage users to upgrade to 0.10.1.
– In addition, version 0.10.1 restricts use of WAIC to the conditional version of WAIC (conditioning on all parameters directly involved in the likelihood). Previous versions of nimble gave incorrect results when not conditioning on all parameters directly involved in the likelihood (i.e., when not monitoring all such parameters). In a future version of nimble we plan to make a number of improvements to WAIC, including allowing use of marginal versions of WAIC, where the WAIC calculation integrates over random effects.
NIMBLE’s sequential Monte Carlo (SMC) algorithms are now in the nimbleSMC package
We’ve moved NIMBLE’s various sequential Monte Carlo (SMC) algorithms (bootstrap particle filter, auxiliary particle filter, ensemble Kalman filter, iterated filter2, and particle MCMC algorithms) to the new nimbleSMC package. So if you want to use any of these methods as of nimble version 0.10.0, please make sure to install the nimbleSMC package. Any existing code you have that uses any SMC functionality should continue to work as is.
Version 0.10.0 of NIMBLE released
We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).
- greatly extended NIMBLE’s Chinese Restaurant Process (CRP)-based Bayesian nonparametrics functionality by allowing multiple observations to be grouped together;
- fixed a bug giving incorrect results in our cross-validation function, runCrossValidate();
- moved NIMBLE’s sequential Monte Carlo (SMC, aka particle filtering) methods into the nimbleSMC package; and
- improved the efficiency of model and MCMC building and compilation.
Please see the release notes on our website for more details.
New NIMBLE cheatsheat available
Version 0.9.1 of NIMBLE released
We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.9.1 is primarily a bug fix release but also provides some minor improvements in functionality.
Users of NIMBLE in R 4.0 on Windows MUST upgrade to this release for NIMBLE to work.
New features and bug fixes include:
- switched to use of system2() from system() to avoid an issue on Windows in R 4.0;
- modified various adaptive MCMC samplers so the exponent controlling the scale decay of the adaptation is adjustable by user;
- allowed pmin() and pmax() to be used in models;
- improved handling of NA values in the dCRP distribution; and
- improved handling of cases where indexing goes beyond the extent of a variable in expandNodeNames() and related queries of model structure.
Please see the release notes on our website for more details.
nimbleEcology: custom NIMBLE distributions for ecologists
Prepared by Ben Goldstein.
What is nimbleEcology?
nimbleEcology is an auxiliary nimble package for ecologists.
nimbleEcology contains a set of distributions corresponding to some common ecological models. When the package is loaded, these distributions are registered to NIMBLE and can be used directly in models.
nimbleEcology contains distributions often used in modeling abundance, occupancy and capture-recapture studies.
Why use nimbleEcology?
Ecological models for abundance, occupancy and capture-recapture often involve many discrete latent states. Writing such models can be error-prone and in some cases can lead to slow MCMC mixing. We’ve put together a collection of distributions in nimble to make writing these models easier
- Easy to use. Using a nimbleEcology distribution is easier than writing out probabilities or hierarchical model descriptions.
- Minimize errors. You don’t have to lose hours looking for the misplaced minus sign; the distributions are checked and tested.
- Integrate over latent states. nimbleEcology implementations integrate or sum likelihoods over latent states. This eliminates the need for sampling these latent variables, which in some cases can provide efficiency gains, and allows maximum likelihood (ML) estimation methods with hierarchical models.
How to use
nimbleEcology can be installed directly from CRAN as follows.
install.packages("nimbleEcology")
Once nimbleEcology is installed, load it using library. It will also load nimble.
library(nimbleEcology)
## Loading required package: nimble
## nimble version 0.10.0 is loaded. ## For more information on NIMBLE and a User Manual, ## please visit http://R-nimble.org.
## ## Attaching package: 'nimble'
## The following object is masked from 'package:stats': ## ## simulate
## Loading nimbleEcology. ## Registering the following user-defined functions: ## dOcc, dDynOcc, dCJS, dHMM, dDHMM
Note the message indicating which distribution families have been loaded.
Which distributions are available?
The following distributions are available in nimbleEcology.
- dOcc (occupancy model)
- dDynOcc (dynamic occupancy model)
- dHMM (hidden Markov model)
- dDHMM (dynamic hidden Markov model)
- dCJS (Cormack-Jolly-Seber or mark-recapture model)
- dNmixture (N-mixture model)
- dYourNewDistribution Do you have a custom distribution that would fit the package? Are we missing a distribution you need? Let us know! We actively encourage contributions through GitHub or direct communication.
Example code
The following code illustrates a NIMBLE model definition for an occupancy model using nimbleEcology. The model is specified, built, and used to simulate some data according to the occupancy distribution.
library(nimbleEcology) occ_code <- nimbleCode({ psi ~ dunif(0, 1) p ~ dunif(0, 1) for (s in 1:nsite) { x[s, 1:nvisit] ~ dOcc_s(probOcc = psi, probDetect = p, len = nvisit) } }) occ_model <- nimbleModel(occ_code, constants = list(nsite = 10, nvisit = 5), inits = list(psi = 0.5, p = 0.5))
## defining model...
## building model...
## setting data and initial values...
## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... ## checking model sizes and dimensions... This model is not fully initialized. This is not an error. To see which variables are not initialized, use model$initializeInfo(). For more information on model initialization, see help(modelInitialization). ## model building finished.
set.seed(94) occ_model$simulate("x") occ_model$x
## [,1] [,2] [,3] [,4] [,5] ## [1,] 0 0 0 0 0 ## [2,] 0 0 0 0 0 ## [3,] 0 0 0 0 0 ## [4,] 0 0 0 0 0 ## [5,] 1 1 1 0 1 ## [6,] 0 0 0 1 0 ## [7,] 0 0 0 0 0 ## [8,] 0 0 0 0 1 ## [9,] 1 1 1 0 0 ## [10,] 0 1 0 0 0
How to learn more
Once the package is installed, you can check out the package vignette with vignette(“nimbleEcology”).
Documentation is available for each distribution family using the R syntax ?distribution, for example
?dHMM
For more detail on marginalization in these distributions, see the paper “One size does not fit all: Customizing MCMC methods for hierarchical models using NIMBLE” (Ponisio et al. 2020).