Beta version of NIMBLE with automatic differentiation, including HMC sampling and Laplace approximation

We’re excited to announce that NIMBLE now supports automatic differentiation (AD), also known as algorithmic differentiation, in a beta version available on our website. In this beta version, NIMBLE now provides:

  • Hamiltonian Monte Carlo (HMC) sampling for an entire parameter vector or arbitrary subsets of the parameter vector (i.e., combined with other samplers for the remaining parameters). 
  • Laplace approximation for approximate integration over latent states in a model, allowing maximum likelihood estimation and MCMC based on the marginal likelihood (via the RW_llFunction samplers).
  • The ability for users and algorithm developers to write nimbleFunctions that calculate derivatives of functions, including many but not all mathematical operations that are supported in the NIMBLE language.

We’re making this beta release available to allow our users to test and evaluate the AD functionality and the new algorithms, but it is not recommended for production use at this stage. So please give it a try, and let us know of any problems or suggestions you have, either via the nimble-users list, bug reports to our GitHub repository, or email to nimble.stats@gmail.com

You can download the beta version and view an extensive draft manual for the AD functionality.

We plan to release this functionality in the next NIMBLE release on CRAN in the coming months. 

Keep everything organized in Ledger Live for Mac by tagging and taking notes so you can easily take note of your transactions and later refer to them.

Version 0.12.2 of NIMBLE released, including an important bug fix for some models using Bayesian nonparametrics with the dCRP distribution

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

Version 0.12.2 is a bug fix release. In particular, this release fixes a bug in our Bayesian nonparametric distribution (BNP) functionality that gives incorrect MCMC results for some models, specifically when using the dCRP distribution when the parameters of the mixture components (i.e., the clusters) have hyperparameters (i.e., the base measure parameters) that are unknown and sampled during the MCMC. Here is an example basic model structure that is affected by the bug:

k[1:n] ~ dCRP(alpha, n)
for(i in 1:n) {
  y[i] ~ dnorm(mu[k[i]], 1)
  mu[i] ~ dnorm(mu0, 1) ## mixture component parameters with hyperparameter
}
mu0 ~ dnorm(0, 1) ## unknown cluster hyperparameter

(There is no problem without the hyperparameter layer – i.e., if mu0 is a fixed value – which is the situation in many models.)

We strongly encourage users using models with this type of structure to rerun their analyses, and we apologize for this issue.

Other changes in this release include:

  • Fixing an issue with reversible jump variable selection under a similar situation to the BNP issue discussed above (in particular where there are unknown hyperparameters of the regression coefficients being considered, which would likely be an unusual use case).
  • Fixing a bug preventing setup of conjugate samplers for dwishart or dinvwishart nodes when using dynamic indexing.
  • Fixing a bug preventing use of truncation bounds specified via `data` or `constants`.
  • Fixing a bug preventing MCMC sampling with the LKJ prior for 2×2 matrices.
  • Fixing a bug in `runCrossValidate` affecting extraction of multivariate nodes.
  • Fixing a bug producing incorrect subset assignment into logical vectors in nimbleFunction code.
  • Fixing a bug preventing use of `nimbleExternalCall` with a constant expression.
  • Fixing a bug preventing use of recursion in nimbleFunctions without setup code.

Please see the release notes on our website for more details.

 

Keep everything organized in Ledger Live for Mac by tagging and taking notes so you can easily take note of your transactions and later refer to them.

Version 0.12.1 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

Version 0.12.1, in combination with version 0.12.0 (which was released just last week), provides a variety of new functionality (in particular enhanced WAIC functionality and adding the LKJ distribution) plus bug fixes affecting MCMC in specific narrow cases described below and that warrant upgrading for some users. The changes include:

  • Completely revamping WAIC in NIMBLE, creating an online version that does not require any particular variable monitors. The new WAIC can calculate conditional or marginal WAIC and can group data nodes into joint likelihood terms if desired. In addition there is a new calculateWAIC() function that will calculate the basic conditional WAIC from MCMC output without having to enable the WAIC when creating the MCMC.
  • Adding the LKJ distribution, useful for prior distributions for correlation matrices, along with random walk samplers for them. These samplers operate in an unconstrained transformed parameter space and are assigned by default during MCMC configuration.
  • Fixing a bug introduced in conjugacy processing in version 0.11.0 that causes incorrect MCMC sampling only in specific cases. The impacted cases have terms of the form “a[i] + x[i] * beta” (or more simply “x[i] * beta”), with beta subject to conjugate sampling and either (i) ‘x’ provided via NIMBLE’s constants argument and x[1] == 1 or (ii) ‘a’ provided via NIMBLE’s constants argument and a[1] == 0.
  • Fixing an error in the sampler for the proper CAR distribution (dcar_proper) that gives incorrect MCMC results when the mean of the proper CAR is not the same value for all locations, e.g., when embedding covariate effects directly in the `mu` parameter of the `dcar_proper` distribution.
  • Fixing isData(‘y’) to return TRUE whenever any elements of a multivariate data node (‘y’) are flagged as data. As a result, attempting to carry out MCMC on the non-data elements will now fail. Formerly if only some elements were flagged as data, `isData` would only check the first element, potentially leading to other elements that were flagged as data being overwritten.
  • Error trapping cases where a BNP model has a differing number of dependent stochastic nodes (e.g., observations) or dependent deterministic nodes per group of elements clustered jointly (using functionality introduced in version 0.10.0). Previously we were not error trapping this, and incorrect MCMC results would be obtained.
  • Improving the formatting of standard logging messages.

Version 0.11.1 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

Version 0.11.1 is a bug fix release, fixing a bug that was introduced in Version 0.11.0 (which was released on April 17, 2021) that affected MCMC sampling in MCMCs using the “posterior_predictive_branch” sampler introduced in version 0.11.0. This sampler would be listed by name when the MCMC configuration object is created and would be assigned to any set of multiple nodes that (as a group of nodes) have no data dependencies and are therefore sampled as a group from their predictive distributions.

For those currently using version 0.11.0, please update your version of NIMBLE. For users currently using other versions, this release won’t directly affect you, but we generally encourage you to update as we release new versions.

Version 0.11.0 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).
Version 0.11.0 provides a variety of new functionality, improved error trapping, and bug fixes, including:
  • added the ‘posterior_predictive_branch’ MCMC sampler, which samples jointly from the predictive distribution of networks of entirely non-data nodes, to improve MCMC mixing,
  • added a model method to find parent nodes, called getParents(), analogous to getDependencies(),
  • improved efficiency of conjugate samplers,
  • allowed use of the elliptical slice sampler for univariate nodes, which can be useful for posteriors with multiple modes,
  • allowed model definition using if-then-else without an else clause, and
  • fixed a bug giving incorrect node names and potentially affecting algorithm behavior for models with more than 100,000 elements in a vector node or any dimension of a multi-dimensional node.

Please see the release notes on our website for more details.

Version 0.10.1 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).
We’ve released version 0.10.1.  Version 0.10.1 is primarily a bug fix release:

–  In particular, it fixes a bug in retrieving parameter values from distributions that was introduced in version 0.10.0. The bug can cause incorrect behavior of conjugate MCMC samplers under certain model structures (such as particular state-space models), so we strongly encourage users to upgrade to 0.10.1.
– In addition, version 0.10.1 restricts use of WAIC to the conditional version of WAIC (conditioning on all parameters directly involved in the likelihood). Previous versions of nimble gave incorrect results when not conditioning on all parameters directly involved in the likelihood (i.e., when not monitoring all such parameters). In a future version of nimble we plan to make a number of improvements to WAIC, including allowing use of marginal versions of WAIC, where the WAIC calculation integrates over random effects.

Please see the release notes on our website for more details.

NIMBLE’s sequential Monte Carlo (SMC) algorithms are now in the nimbleSMC package

We’ve moved NIMBLE’s various sequential Monte Carlo (SMC) algorithms (bootstrap particle filter, auxiliary particle filter, ensemble Kalman filter, iterated filter2, and particle MCMC algorithms) to the new nimbleSMC package. So if you want to use any of these methods as of nimble version 0.10.0, please make sure to install the nimbleSMC package. Any existing code you have that uses any SMC functionality should continue to work as is.

As development work on NIMBLE has proceeded over the years, we’ve added additional functionality to the core nimble package, so we’ve reached the stage where it makes sense to break off some of the functionality into separate packages. Our goal is to make the overall NIMBLE platform more modular and therefore easier to maintain and use.

Version 0.10.0 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

Version 0.10.0 provides new features, improvements in speed of building models and algorithms, bug fixes, and various improvements.
New features and bug fixes include:
  • greatly extended NIMBLE’s Chinese Restaurant Process (CRP)-based Bayesian nonparametrics functionality by allowing multiple observations to be grouped together;
  • fixed a bug giving incorrect results in our cross-validation function, runCrossValidate();
  • moved NIMBLE’s sequential Monte Carlo (SMC, aka particle filtering) methods into the nimbleSMC package; and
  • improved the efficiency of model and MCMC building and compilation.

Please see the release notes on our website for more details.

Version 0.9.1 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.9.1 is primarily a bug fix release but also provides some minor improvements in functionality.

Users of NIMBLE in R 4.0 on Windows MUST upgrade to this release for NIMBLE to work.

New features and bug fixes include:

  • switched to use of system2() from system() to avoid an issue on Windows in R 4.0;
  • modified various adaptive MCMC samplers so the exponent controlling the scale decay of the adaptation is adjustable by user;
  • allowed pmin() and pmax() to be used in models;
  • improved handling of NA values in the dCRP distribution; and
  • improved handling of cases where indexing goes beyond the extent of a variable in expandNodeNames() and related queries of model structure.

Please see the release notes on our website for more details.

nimbleEcology: custom NIMBLE distributions for ecologists

Prepared by Ben Goldstein.

What is nimbleEcology?

nimbleEcology is an auxiliary nimble package for ecologists.

nimbleEcology contains a set of distributions corresponding to some common ecological models. When the package is loaded, these distributions are registered to NIMBLE and can be used directly in models.

nimbleEcology contains distributions often used in modeling abundance, occupancy and capture-recapture studies.

Why use nimbleEcology?

Ecological models for abundance, occupancy and capture-recapture often involve many discrete latent states. Writing such models can be error-prone and in some cases can lead to slow MCMC mixing. We’ve put together a collection of distributions in nimble to make writing these models easier

  • Easy to use. Using a nimbleEcology distribution is easier than writing out probabilities or hierarchical model descriptions.
  • Minimize errors. You don’t have to lose hours looking for the misplaced minus sign; the distributions are checked and tested.
  • Integrate over latent states. nimbleEcology implementations integrate or sum likelihoods over latent states. This eliminates the need for sampling these latent variables, which in some cases can provide efficiency gains, and allows maximum likelihood (ML) estimation methods with hierarchical models.

How to use

nimbleEcology can be installed directly from CRAN as follows.

install.packages("nimbleEcology")

Once nimbleEcology is installed, load it using library. It will also load nimble.

library(nimbleEcology)
## Loading required package: nimble
## nimble version 0.10.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit http://R-nimble.org.
## 
## Attaching package: 'nimble'
## The following object is masked from 'package:stats':
## 
##     simulate
## Loading nimbleEcology. 
## Registering the following user-defined functions: 
## dOcc, dDynOcc, dCJS, dHMM, dDHMM

Note the message indicating which distribution families have been loaded.

Which distributions are available?

The following distributions are available in nimbleEcology.

  • dOcc (occupancy model)
  • dDynOcc (dynamic occupancy model)
  • dHMM (hidden Markov model)
  • dDHMM (dynamic hidden Markov model)
  • dCJS (Cormack-Jolly-Seber or mark-recapture model)
  • dNmixture (N-mixture model)
  • dYourNewDistribution Do you have a custom distribution that would fit the package? Are we missing a distribution you need? Let us know! We actively encourage contributions through GitHub or direct communication.

Example code

The following code illustrates a NIMBLE model definition for an occupancy model using nimbleEcology. The model is specified, built, and used to simulate some data according to the occupancy distribution.

library(nimbleEcology)

occ_code <- nimbleCode({
  psi ~ dunif(0, 1)
  p ~ dunif(0, 1)
  for (s in 1:nsite) {
    x[s, 1:nvisit] ~ dOcc_s(probOcc = psi, probDetect = p,
                            len = nvisit)
  }
})

occ_model <- nimbleModel(occ_code,
               constants = list(nsite = 10, nvisit = 5),
               inits = list(psi = 0.5, p = 0.5))
## defining model...
## building model...
## setting data and initial values...
## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions... This model is not fully initialized. This is not an error. To see which variables are not initialized, use model$initializeInfo(). For more information on model initialization, see help(modelInitialization).
## model building finished.
set.seed(94)
occ_model$simulate("x")
occ_model$x
##       [,1] [,2] [,3] [,4] [,5]
##  [1,]    0    0    0    0    0
##  [2,]    0    0    0    0    0
##  [3,]    0    0    0    0    0
##  [4,]    0    0    0    0    0
##  [5,]    1    1    1    0    1
##  [6,]    0    0    0    1    0
##  [7,]    0    0    0    0    0
##  [8,]    0    0    0    0    1
##  [9,]    1    1    1    0    0
## [10,]    0    1    0    0    0

How to learn more

Once the package is installed, you can check out the package vignette with vignette(“nimbleEcology”).

Documentation is available for each distribution family using the R syntax ?distribution, for example

?dHMM

For more detail on marginalization in these distributions, see the paper “One size does not fit all: Customizing MCMC methods for hierarchical models using NIMBLE” (Ponisio et al. 2020).