Comparing two MCMC Samplers
We next compare the performance of two MCMC samplers on the state space model described above. The first sampler we consider is NIMBLE’s RW_block
sampler, a Metropolis-Hastings sampler with a multivariate normal proposal distribution. This sampler has an adaptive routine that modifies the proposal covariance to look like the empirical covariance of the posterior samples of the parameters. However, as we shall see below, this proposal covariance adaptation does not lead to efficient sampling for our state space model.
We first build and compile the MCMC algorithm.
RW_mcmcConfig <- configureMCMC(stateSpaceModel)
RW_mcmcConfig$removeSamplers(c('a', 'b', 'sigOE', 'sigPN'))
RW_mcmcConfig$addSampler(target = c('a', 'b', 'sigOE', 'sigPN'), type = 'RW_block')
RW_mcmc <- buildMCMC(RW_mcmcConfig)
C_RW_mcmc <- compileNimble(RW_mcmc, project = stateSpaceModel)
We next run the compiled MCMC algorithm for 50,000 iterations, recording the overall MCMC efficiency from the posterior output. The overall efficiency here is defined as $\min(\text{ESS}) / T$, where ESS denotes the effective sample size of a parameter and $T$ the total run-time of the sampling algorithm. The minimum is taken over all parameters that were sampled. We repeat this process 5 times to get a very rough idea of the average minimum efficiency for this combination of model and sampler.
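To build intuition for the ESS component of this metric, here is a self-contained base-R sketch (the analysis below uses coda's effectiveSize instead; the crude truncated-autocorrelation estimator here is only illustrative). A strongly autocorrelated chain carries far fewer effective samples than its nominal length:

```r
# Crude ESS approximation: N / (1 + 2 * sum of initial positive
# autocorrelations). coda::effectiveSize is used in the real analysis.
essApprox <- function(x) {
  n   <- length(x)
  rho <- acf(x, lag.max = min(100, n - 1), plot = FALSE)$acf[-1]
  cut <- which(rho <= 0)[1]            # truncate at first non-positive lag
  if (!is.na(cut)) rho <- rho[seq_len(cut - 1)]
  n / (1 + 2 * sum(rho))
}

set.seed(1)
# AR(1) chain with strong autocorrelation: ESS is well below N = 5000
chain <- as.numeric(arima.sim(n = 5000, list(ar = 0.9)))
essApprox(chain)
```

Dividing such an ESS by the run-time, and minimizing over parameters, gives the efficiency metric used throughout this comparison.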
RW_minEfficiency <- numeric(5)
for(i in 1:5){
runTime <- system.time(C_RW_mcmc$run(50000, progressBar = FALSE))['elapsed']
RW_mcmcOutput <- as.mcmc(as.matrix(C_RW_mcmc$mvSamples))
RW_minEfficiency[i] <- min(effectiveSize(RW_mcmcOutput)/runTime)
}
summary(RW_minEfficiency)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3323 0.4800 0.5505 0.7567 0.7341 1.6869
Examining a trace plot of the output below, we see that the $a$ and $b$ parameters are mixing especially poorly.
plot(RW_mcmcOutput, density = FALSE)
Plotting the posterior samples of $a$ against those of $b$ reveals a strong negative correlation. This presents a problem for the Metropolis-Hastings sampler: we have found that the adaptive algorithms used to tune the proposal covariance are often slow to reach a covariance that performs well for blocks of strongly correlated parameters.
plot.default(RW_mcmcOutput[,'a'], RW_mcmcOutput[,'b'])
cor(RW_mcmcOutput[,'a'], RW_mcmcOutput[,'b'])
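To see concretely why a mismatched proposal covariance hurts, here is a self-contained sketch (plain R, independent of the model and samplers above) of random-walk Metropolis-Hastings on a bivariate normal target whose correlation is comparable to that between $a$ and $b$. An isotropic proposal accepts far less often than one whose covariance matches the target's shape, which is exactly what the adaptive routine must slowly learn:

```r
# Random-walk MH on a strongly correlated bivariate normal target,
# comparing an isotropic proposal with one shaped like the target.
set.seed(1)
rho    <- -0.95                                # like the a-b correlation
Sigma  <- matrix(c(1, rho, rho, 1), 2, 2)
SigInv <- solve(Sigma)
logTarget <- function(x) -0.5 * drop(t(x) %*% SigInv %*% x)

rwMH <- function(n, propChol) {
  x <- c(0, 0); acc <- 0
  out <- matrix(NA_real_, n, 2)
  for (i in 1:n) {
    prop <- x + drop(propChol %*% rnorm(2))
    if (log(runif(1)) < logTarget(prop) - logTarget(x)) {
      x <- prop; acc <- acc + 1
    }
    out[i, ] <- x
  }
  list(samples = out, accRate = acc / n)
}

naive    <- rwMH(10000, diag(2))          # isotropic proposal
informed <- rwMH(10000, t(chol(Sigma)))   # matches target covariance
c(naive = naive$accRate, informed = informed$accRate)
```

The informed proposal's acceptance rate is substantially higher; the adaptive RW_block sampler must discover this covariance from its own samples, which is slow when the chain starts out mixing poorly.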
In such situations with strong posterior correlation, we have found the automated factor slice sampler (AFSS; Tibbits et al., 2014) to often run much more efficiently, so we next build and compile an MCMC algorithm using the AFSS sampler. Our hope is that the AFSS sampler will be better able to produce efficient samples in the face of high posterior correlation.
AFSS_mcmcConfig <- configureMCMC(stateSpaceModel)
AFSS_mcmcConfig$removeSamplers(c('a', 'b', 'sigOE', 'sigPN'))
AFSS_mcmcConfig$addSampler(target = c('a', 'b', 'sigOE', 'sigPN'), type = 'AF_slice')
AFSS_mcmc <- buildMCMC(AFSS_mcmcConfig)
C_AFSS_mcmc <- compileNimble(AFSS_mcmc, project = stateSpaceModel, resetFunctions = TRUE)
We again run the AFSS MCMC algorithm 5 times, each run consisting of 50,000 MCMC iterations.
AFSS_minEfficiency <- numeric(5)
for(i in 1:5){
runTime <- system.time(C_AFSS_mcmc$run(50000, progressBar = FALSE))['elapsed']
AFSS_mcmcOutput <- as.mcmc(as.matrix(C_AFSS_mcmc$mvSamples))
AFSS_minEfficiency[i] <- min(effectiveSize(AFSS_mcmcOutput)/runTime)
}
summary(AFSS_minEfficiency)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 9.467 9.686 10.549 10.889 10.724 14.020
Note that the minimum overall efficiency of the AFSS sampler is approximately 28 times that of the RW_block sampler. Additionally, trace plots from the output of the AFSS sampler show that the $a$ and $b$ parameters are mixing much more effectively than they were under the RW_block sampler.
plot(AFSS_mcmcOutput, density = FALSE)
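For reference, the roughly 28-fold figure matches the ratio of the smallest observed minimum efficiencies from the two summaries above (an interpretation of how the figure was computed; the mean-to-mean and median-to-median ratios are smaller):

```r
# Ratio of the Min. entries from the AFSS and RW_block summaries
9.467 / 0.3323
```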
Tibbits, M. M., C. Groendyke, M. Haran, et al. (2014). "Automated factor slice sampling". In: Journal of Computational and Graphical Statistics 23.2, pp. 543–563.

Turek, D., P. de Valpine, C. J. Paciorek, et al. (2017). "Automated parameter blocking for efficient Markov chain Monte Carlo sampling". In: Bayesian Analysis 12.2, pp. 465–490.