nimbleEcology: custom NIMBLE distributions for ecologists

Prepared by Ben Goldstein.

What is nimbleEcology?

nimbleEcology is an auxiliary nimble package for ecologists.

nimbleEcology contains a set of distributions corresponding to some common ecological models. When the package is loaded, these distributions are registered to NIMBLE and can be used directly in models.

nimbleEcology contains distributions often used in modeling abundance, occupancy and capture-recapture studies.

Why use nimbleEcology?

Ecological models for abundance, occupancy and capture-recapture often involve many discrete latent states. Writing such models can be error-prone and in some cases can lead to slow MCMC mixing. We’ve put together a collection of distributions in nimble to make writing these models easier

  • Easy to use. Using a nimbleEcology distribution is easier than writing out probabilities or hierarchical model descriptions.
  • Minimize errors. You don’t have to lose hours looking for the misplaced minus sign; the distributions are checked and tested.
  • Integrate over latent states. nimbleEcology implementations integrate or sum likelihoods over latent states. This eliminates the need for sampling these latent variables, which in some cases can provide efficiency gains, and allows maximum likelihood (ML) estimation methods with hierarchical models.

How to use

nimbleEcology can be installed directly from CRAN as follows.

install.packages("nimbleEcology")

Once nimbleEcology is installed, load it using library. It will also load nimble.

library(nimbleEcology)
## Loading required package: nimble
## nimble version 0.10.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit http://R-nimble.org.
## 
## Attaching package: 'nimble'
## The following object is masked from 'package:stats':
## 
##     simulate
## Loading nimbleEcology. 
## Registering the following user-defined functions: 
## dOcc, dDynOcc, dCJS, dHMM, dDHMM

Note the message indicating which distribution families have been loaded.

Which distributions are available?

The following distributions are available in nimbleEcology.

  • dOcc (occupancy model)
  • dDynOcc (dynamic occupancy model)
  • dHMM (hidden Markov model)
  • dDHMM (dynamic hidden Markov model)
  • dCJS (Cormack-Jolly-Seber or mark-recapture model)
  • dNmixture (N-mixture model)
  • dYourNewDistribution Do you have a custom distribution that would fit the package? Are we missing a distribution you need? Let us know! We actively encourage contributions through GitHub or direct communication.

Example code

The following code illustrates a NIMBLE model definition for an occupancy model using nimbleEcology. The model is specified, built, and used to simulate some data according to the occupancy distribution.

library(nimbleEcology)

occ_code <- nimbleCode({
  psi ~ dunif(0, 1)
  p ~ dunif(0, 1)
  for (s in 1:nsite) {
    x[s, 1:nvisit] ~ dOcc_s(probOcc = psi, probDetect = p,
                            len = nvisit)
  }
})

occ_model <- nimbleModel(occ_code,
               constants = list(nsite = 10, nvisit = 5),
               inits = list(psi = 0.5, p = 0.5))
## defining model...
## building model...
## setting data and initial values...
## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions... This model is not fully initialized. This is not an error. To see which variables are not initialized, use model$initializeInfo(). For more information on model initialization, see help(modelInitialization).
## model building finished.
set.seed(94)
occ_model$simulate("x")
occ_model$x
##       [,1] [,2] [,3] [,4] [,5]
##  [1,]    0    0    0    0    0
##  [2,]    0    0    0    0    0
##  [3,]    0    0    0    0    0
##  [4,]    0    0    0    0    0
##  [5,]    1    1    1    0    1
##  [6,]    0    0    0    1    0
##  [7,]    0    0    0    0    0
##  [8,]    0    0    0    0    1
##  [9,]    1    1    1    0    0
## [10,]    0    1    0    0    0

How to learn more

Once the package is installed, you can check out the package vignette with vignette(“nimbleEcology”).

Documentation is available for each distribution family using the R syntax ?distribution, for example

?dHMM

For more detail on marginalization in these distributions, see the paper “One size does not fit all: Customizing MCMC methods for hierarchical models using NIMBLE” (Ponisio et al. 2020).

Version 0.9.0 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.9.0 provides some new features as well as providing a variety of speed improvements, better output handling, and bug fixes.

New features and bug fixes include:

  • added an iterated filtering 2 (IF2) algorithm (a sequential Monte Carlo (SMC) method) for parameter estimation via maximum likelihood;
  • fixed several bugs in our SMC algorithms;
  • improved the speed of MCMC configuration;
  • improved the user interface for interacting with the MCMC configuration; and
  • improved our conjugacy checking system to detect some additional cases of conjugacy.

Please see the release notes on our website for more details.

 

Version 0.8.0 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.8.0 provides some new features, speed improvements, and a variety of bug fixes and better error/warning messages.

New features include:

  • a reversible jump MCMC sampler for variable selection via configureRJ();
  • greatly improved speed of MCMC sampling for Bayesian nonparametric models with a dCRP distribution by not sampling parameters for empty clusters;
  • experimental faster MCMC configuration, available by setting nimbleOptions(oldConjugacyChecking = FALSE) and nimbleOptions(useNewConfigureMCMC = TRUE);
  • and improved warning and error messages for MCEM and slice sampling.

Please see the release notes on our website for more details.

Version 0.7.1 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

Version 0.7.1 is primarily a maintenance release with a couple important bug fixes and a few additional features. Users with large models and users of the dCRP Bayesian nonparametric distribution are strongly encouraged to update to this version to pick up the bug fixes related to these uses.

New features include:

  • recognition of normal-normal conjugacy in additional multivariate regression settings;
  • handling of six-dimensional arrays in models.

Please see the release notes on our website for more details.

Version 0.7.0 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.7.0 provides a variety of new features, as well as various bug fixes.

New features include:

  • greatly improved efficiency of sampling for Bayesian nonparametric (BNP) mixture models that use the dCRP (Chinese Restaurant process) distribution;
  • addition of the double exponential (Laplace) distribution for use in models and nimbleFunctions;
  • a new “RW_wishart” MCMC sampler, for sampling non-conjugate Wishart and inverse-Wishart nodes;
  • handling of the normal-inverse gamma conjugacy for BNP mixture models using the dCRP distribution;
  • enhanced functionality of the getSamplesDPmeasure function for posterior sampling from BNP random measures with Dirichlet process priors.
  • handling of five-dimensional arrays in models;
  • enhanced warning messages; and
  • an HTML version of the NIMBLE manual.

Please see the NEWS file in the installed package for more details.

Version 0.6-12 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. Version 0.6-12 is primarily a maintenance release with various bug fixes.

Changes include:

  • a fix for the bootstrap particle filter to correctly calculate weights when particles are not resampled (the filter had been omitting the previous weights when calculating the new weights);
  • addition of an option to print MCMC samplers of a particular type;
  • avoiding an overly-aggressive check for ragged arrays when building models; and
  • avoiding assigning a sampler to non-conjugacy inverse-Wishart nodes (thereby matching our handling of Wishart nodes).

Please see the NEWS file in the installed package for more details.

Version 0.6-11 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website.

Version 0.6-11 has important new features, notably support for Bayesian nonparametric mixture modeling, and more are on the way in the next few months.

New features include:

  • support for Bayesian nonparametric mixture modeling using Dirichlet process mixtures, with specialized MCMC samplers automatically assigned in NIMBLE’s default MCMC (See Chapter 10 of the manual for details);
  • additional resampling methods available with the auxiliary and bootstrap particle filters;
  • user-defined filtering algorithms can be used with NIMBLE’s particle MCMC samplers;
  • MCMC thinning intervals can be modified at MCMC run-time;
  • both runMCMC() and nimbleMCMC() now drop burn-in samples before thinning, making their behavior consistent with each other;
  • increased functionality for the ‘setSeed’ argument in nimbleMCMC() and runMCMC();
  • new functionality for specifying the order in which sampler functions are executed in an MCMC; and
  • invalid dynamic indexes now result in a warning and NaN values but do not cause execution to error out, allowing MCMC sampling to continue.

Please see the NEWS file in the installed package for more details

Version 0.6-10 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. Version 0.6-10 primarily contains updates to the NIMBLE internals that may speed up building and compilation of models and algorithms, as well as a few bug fixes.

Changes include:

  • some steps of model and algorithm building and compilation are faster;
  • compiled execution with multivariate distributions or function arguments may be faster;
  • data can now be provided as a numeric data frame rather than a matrix;
  • to run WAIC, a user now must set ‘enableWAIC’ to TRUE, either in NIMBLE’s options or as an argument to buildMCMC();
  • if ‘enableWAIC’ is TRUE, buildMCMC() will now check to make sure that the nodes monitored by the MCMC algorithm will lead to a valid WAIC calculation; and
  • the use of identityMatrix() is deprecated in favor of diag().

Please see the NEWS file in the installed package for more details

Version 0.6-9 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website. Version 0.6-9 is primarily a maintenance release with various bug fixes and fixes for CRAN packaging issues.

New features include:

  • dimensions in a model will now be determined from either ‘inits’ or ‘data’ if not otherwise available;
  • one can now specify “nBootReps = NA” in the runCrossValidate() function, which will prevent the Monte Carlo error from being calculated;
  • runCrossValidate() now returns the averaged loss over all k folds, instead of the summed loss;
  • We’ve added the besselK function to the NIMBLE language;
  • and a variety of bug fixes.

Please see the NEWS file in the installed package for more details

Version 0.6-8 of NIMBLE released

We’ve released the newest version of NIMBLE on CRAN and on our website a week ago. Version 0.6-8 has a few new features, and more are on the way in the next few months.

New features include:

  • the proper Gaussian CAR (conditional autoregressive) model can now be used in BUGS code as dcar_proper, which behaves similarly to BUGS’ car.proper distribution;
  • a new nimbleMCMC function that provides one-line invocation of NIMBLE’s MCMC engine, akin to usage of JAGS and WinBUGS through R;
  • a new runCrossValidate function that will conduct k-fold cross-validation of NIMBLE models fit by MCMC;
  • dynamic indexing in BUGS code is now allowed by default;
  • and a variety of bug fixes and efficiency improvements.

Please see the NEWS file in the installed package for more details.