NIMBLE short course at March ENAR meeting in Nashville

Posted on January 4, 2020January 4, 2020 by Chris Paciorek

We’ll be holding a half-day short course on NIMBLE on March 22 at the ENAR meeting in Nashville, Tennessee.

The annual ENAR meeting is a major biostatistics conference, sponsored by the eastern North American region of the International Biometric Society.

The short course will focus on usage of NIMBLE in applied statistics and is titled:

“Programming with hierarchical statistical models:
Using the BUGS-compatible NIMBLE system for MCMC and more”

More details on the conference and the short course are available at the conference website.

Variable selection in NIMBLE using reversible jump MCMC

Posted on January 2, 2020January 2, 2020 by Chris Paciorek

Prepared by Sally Paganin.

Reversible Jump MCMC Overview

Reversible Jump MCMC (RJMCMC) is a general framework for MCMC simulation in which the dimension of the parameter space (i.e., the number of parameters) can vary between iterations of the Markov chain. It can be viewed as an extension of the Metropolis-Hastings algorithm onto more general state spaces. A common use case for RJMCMC is for variable selection in regression-style problems, where the dimension of the parameter space varies as variables are included or excluded from the regression specification.

Recently we added an RJMCMC sampler for variable selection to NIMBLE, using a univariate normal distribution as proposal distribution. There are two ways to use RJMCMC variable selection in your model. If you know the prior probability for inclusion of a variable in the model, you can use that directly in the RJMCMC without modifying your model. If you need the prior probability for inclusion in the model to be a model node itself, such as if it will have a prior and be estimated, you will need to write the model with indicator variables. In this post we will illustrate the basic usage of NIMBLE RJMCMC in both situations.

More information can be found in the NIMBLE User Manual and via help(configureRJ).

Linear regression example

In the following we consider a linear regression example in which we have 15 explanatory variables, and five of those are real effects while the others have no effect. First we simulate some data.

N <- 100
p <- 15
set.seed(1)
X <- matrix(rnorm(N*p), nrow = N, ncol = p)
true_betas <- c(c(0.1, 0.2, 0.3, 0.4, 0.5),
                rep(0, 10))

y <- rnorm(N, X%*%true_betas, sd = 1)

summary(lm(y ~ X))

## 
## Call:
## lm(formula = y ~ X)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.64569 -0.50496  0.01613  0.59480  1.97766 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.04190    0.11511  -0.364   0.7168    
## X1           0.06574    0.12113   0.543   0.5888    
## X2           0.21162    0.11469   1.845   0.0686 .  
## X3           0.16606    0.10823   1.534   0.1287    
## X4           0.66582    0.11376   5.853 9.06e-08 ***
## X5           0.51343    0.09351   5.490 4.18e-07 ***
## X6           0.01506    0.11399   0.132   0.8952    
## X7          -0.12203    0.10127  -1.205   0.2316    
## X8           0.18177    0.10168   1.788   0.0774 .  
## X9          -0.09645    0.10617  -0.908   0.3663    
## X10          0.15986    0.11294   1.416   0.1606    
## X11          0.03806    0.10530   0.361   0.7186    
## X12          0.05354    0.10834   0.494   0.6225    
## X13         -0.02510    0.10516  -0.239   0.8119    
## X14         -0.07184    0.12842  -0.559   0.5774    
## X15         -0.04327    0.11236  -0.385   0.7011    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.035 on 84 degrees of freedom
## Multiple R-squared:  0.5337,	Adjusted R-squared:  0.4505 
## F-statistic:  6.41 on 15 and 84 DF,  p-value: 7.726e-09

Reversible jump with indicator variables

Next we set up the model. In this case we explicitly include indicator variables that include or exclude the corresponding predictor variable. For this example we assume the indicator variables are exchangeable and we include the inclusion probability in the inference.

lmIndicatorCode <- nimbleCode({
  sigma ~ dunif(0, 20)  ## uniform prior per Gelman (2006)
  psi ~ dunif(0,1)    ## prior on inclusion probability

  for(i in 1:numVars) {
    z[i] ~ dbern(psi) ## indicator variable for each coefficient
    beta[i] ~ dnorm(0, sd = 100)
    zbeta[i] <- z[i] * beta[i]  ## indicator * beta
  }
  for(i in 1:N) {
    pred.y[i] <- inprod(X[i, 1:numVars], zbeta[1:numVars])
    y[i] ~ dnorm(pred.y[i], sd = sigma)
  }
})

## Set up the model.
lmIndicatorConstants <- list(N = 100, numVars = 15)
lmIndicatorInits <- list(sigma = 1, psi = 0.5,
                         beta = rnorm(lmIndicatorConstants$numVars),
                         z = sample(0:1, lmIndicatorConstants$numVars, 0.5))

lmIndicatorData  <- list(y = y, X = X)
lmIndicatorModel <- nimbleModel(code = lmIndicatorCode, constants = lmIndicatorConstants,
                                inits = lmIndicatorInits, data = lmIndicatorData)

## defining model...

## building model...

## setting data and initial values...

## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions...
## model building finished.

The above model code can potentially be used to set up variable selection in NIMBLE without using RJMCMC, since the indicator variables can turn the regression parameters off and on. However, in that case the MCMC sampling can be inefficient because a given regression parameter can wander widely in the parameter space when the corresponding variable is not in the model. This can make it difficult for the variable to be put back into the model, unless the prior for that variable is (perhaps artificially) made somewhat informative. Configuring RJMCMC sampling via our NIMBLE function configureRJ results in the MCMC not sampling the regression coefficients for variables for iterations where the variables are not in the model.

Configuring RJMCMC

The RJMCMC sampler can be added to the MCMC configuration by calling the function configureRJ(). In the example considered we introduced z as indicator variables associated with the regression coefficients beta. We can pass these, respectively, to configureRJ using the arguments indicatorNodes and targetNodes. The control arguments allow one to specify the mean and the scale of the normal proposal distribution used when proposing to put a coefficient back into the model.

lmIndicatorConf <- configureMCMC(lmIndicatorModel)
lmIndicatorConf$addMonitors('z')

## thin = 1: sigma, psi, beta, z

configureRJ(lmIndicatorConf,
            targetNodes = 'beta',
            indicatorNodes = 'z',
            control = list(mean = 0, scale = .2))

Checking the assigned samplers we see that the indicator variables are each assigned an RJ_indicator sampler whose targetNode is the corresponding coefficient, while the beta parameters have a RJ_toggled sampler. The latter sampler is a modified version of the original sampler to the targetNode that is invoked only when the variable is currently in the model.

## Check the assigned samplers
lmIndicatorConf$printSamplers(c("z[1]", "beta[1]"))

## [3] RJ_indicator sampler: z[1],  mean: 0,  scale: 0.2,  targetNode: beta[1]
## [4] RJ_toggled sampler: beta[1],  samplerType: conjugate_dnorm_dnorm

Build and run the RJMCMC

mcmcIndicatorRJ <- buildMCMC(lmIndicatorConf)

cIndicatorModel <- compileNimble(lmIndicatorModel)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.

## compilation finished.

CMCMCIndicatorRJ <- compileNimble(mcmcIndicatorRJ, project = lmIndicatorModel)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## compilation finished.

set.seed(1)
system.time(samplesIndicator <- runMCMC(CMCMCIndicatorRJ, niter = 10000, nburnin = 1000))

## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

##    user  system elapsed 
##   2.749   0.004   2.753

Looking at the results

We can look at the sampled values of the indicator and corresponding coefficient for some of the variables.

par(mfrow = c(2, 2))
plot(samplesIndicator[,'beta[3]'], pch = 16, cex = 0.4, main = "beta[3] traceplot")
plot(samplesIndicator[,'z[3]'], pch = 16, cex = 0.4, main = "z[3] traceplot")
plot(samplesIndicator[,'beta[5]'], pch = 16, cex = 0.4, main = "beta[5] traceplot")
plot(samplesIndicator[,'z[5]'], pch = 16, cex = 0.4, main = "z[5] traceplot")

Individual inclusion probabilities

Now let’s look at the inference on the variable selection problem. We see that the fourth and fifth predictors are almost always included (these are the ones with the largest true coefficient values), while the others, including some variables that are truly associated with the outcome but have smaller true coefficient values, are almost never included.

par(mfrow = c(1, 1))
zCols <- grep("z\\[", colnames(samplesIndicator))
posterior_inclusion_prob <- colMeans(samplesIndicator[, zCols])
plot(1:length(true_betas), posterior_inclusion_prob,
        xlab = "beta", ylab = "inclusion probability",
        main = "Inclusion probabilities for each beta")

Reversible jump without indicator variables

If we assume that the inclusion probabilities for the coefficients are known, we can use the RJMCMC with model code written without indicator variables.

lmNoIndicatorCode <- nimbleCode({
  sigma ~ dunif(0, 20)

  for(i in 1:numVars) {
    beta[i] ~ dnorm(0, sd = 100)
  }
  for(i in 1:N) {
    pred.y[i] <- inprod(X[i, 1:numVars], beta[1:numVars])
    y[i] ~ dnorm(pred.y[i], sd = sigma)
  }
})

## Define model constants, inits and data
lmNoIndicatorConstants <- list(N = 100, numVars = 15)
lmNoIndicatorInits <- list(sigma = 1,
                           beta = rnorm(lmNoIndicatorConstants$numVars))
lmNoIndicatorData  <- list(y = y, X = X)

lmNoIndicatorModel <- nimbleModel(code = lmNoIndicatorCode,
                                  constants = lmNoIndicatorConstants,
                                  inits = lmNoIndicatorInits,
                                  data = lmNoIndicatorData)

## defining model...

## building model...

## setting data and initial values...

## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions...
## model building finished.

Configuring RJMCMC with no indicator variables

Again, the RJMCMC sampler can be added to the MCMC configuration by calling the function configureRJ() for nodes specified in targetNodes, but since there are no indicator variables we need to provide the prior inclusion probabilities. We use the priorProb argument, and we can provide either a vector of values or a common value.

lmNoIndicatorConf <- configureMCMC(lmNoIndicatorModel)
configureRJ(lmNoIndicatorConf,
            targetNodes = 'beta',
            priorProb = 0.5,
            control = list(mean = 0, scale = .2))

Since there are no indicator variables in this case, a RJ_fixed_prior sampler is assigned directly to each of coefficents along with the RJ_toggled sampler, which still uses the default sampler for the node, but only if the corresponding variable is in the model at a given iteration. In addition in this case one can set the coefficient to a value different from zero via the fixedValue argument in the control list.

## Check the assigned samplers
lmNoIndicatorConf$printSamplers(c("beta[1]"))

## [2] RJ_fixed_prior sampler: beta[1],  priorProb: 0.5,  mean: 0,  scale: 0.2,  fixedValue: 0
## [3] RJ_toggled sampler: beta[1],  samplerType: conjugate_dnorm_dnorm,  fixedValue: 0

Build and run the RJMCMC

mcmcNoIndicatorRJ <- buildMCMC(lmNoIndicatorConf)

cNoIndicatorModel <- compileNimble(lmNoIndicatorModel)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.

## compilation finished.

CMCMCNoIndicatorRJ <- compileNimble(mcmcNoIndicatorRJ, project = lmNoIndicatorModel)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## compilation finished.

set.seed(100)
system.time(samplesNoIndicator <- runMCMC(CMCMCNoIndicatorRJ, niter = 10000, nburnin = 1000))

## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

##    user  system elapsed 
##   3.014   0.004   3.050

Looking at the results

In this case we just look at one of the model coefficients.

plot(samplesNoIndicator[,'beta[1]'], pch = 16, cex = 0.4, main = "beta[1] traceplot")

Individual inclusion proportion

We can calculate the proportion of times each coefficient is included in the model.

betaCols <- grep("beta\\[", colnames(samplesNoIndicator))
posterior_inclusion_proportions <- colMeans(apply(samplesNoIndicator[, betaCols],
                                                  2, function(x) x != 0))
posterior_inclusion_proportions

##      beta[1]      beta[2]      beta[3]      beta[4]      beta[5] 
## 0.0017777778 0.0097777778 0.0017777778 1.0000000000 0.9996666667 
##      beta[6]      beta[7]      beta[8]      beta[9]     beta[10] 
## 0.0015555556 0.0015555556 0.0031111111 0.0018888889 0.0028888889 
##     beta[11]     beta[12]     beta[13]     beta[14]     beta[15] 
## 0.0006666667 0.0015555556 0.0007777778 0.0024444444 0.0007777778

Version 0.9.0 of NIMBLE released

Posted on December 21, 2019December 21, 2019 by Chris Paciorek

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.9.0 provides some new features as well as providing a variety of speed improvements, better output handling, and bug fixes.

New features and bug fixes include:

added an iterated filtering 2 (IF2) algorithm (a sequential Monte Carlo (SMC) method) for parameter estimation via maximum likelihood;
fixed several bugs in our SMC algorithms;
improved the speed of MCMC configuration;
improved the user interface for interacting with the MCMC configuration; and
improved our conjugacy checking system to detect some additional cases of conjugacy.

Please see the release notes on our website for more details.

New NIMBLE code examples

Posted on December 19, 2019December 19, 2019 by Chris Paciorek

We’ve posted a variety of new code examples for NIMBLE at https://r-nimble.org/examples.

These include some introductory code for getting started with NIMBLE, as well as examples of setting up specific kinds of models such as GLMMs, spatial models, item-response theory models, and a variety of models widely used in ecology. We also have examples of doing variable selection via reversible jump MCMC, using MCEM, and writing new distributions for use with NIMBLE.

And if you have examples that you think might be useful to make available to others, please get in touch with us.

NIMBLE short course, June 3-4, 2020 at UC Berkeley

Posted on December 11, 2019December 11, 2019 by Chris Paciorek

We’ll be holding a two-day training workshop on NIMBLE, June 3-4, 2020 in Berkeley, California. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC).

The tutorial will cover

the basic concepts and workflows for using NIMBLE and converting BUGS or JAGS models to work in NIMBLE.
overview of different MCMC sampling strategies and how to use them in NIMBLE.
writing new distributions and functions for more flexible modeling and more efficient computation.
tips and tricks for improving computational efficiency.
using advanced model components, including Bayesian non-parametric distributions (based on Dirichlet process priors), conditional auto-regressive (CAR) models for spatially correlated random fields, and reversible jump samplers for variable selection.
an introduction to programming new algorithms in NIMBLE.
calling R and compiled C++ code from compiled NIMBLE models or functions.

If participant interests vary sufficiently, the second half-day will be split into two tracks. One of these will likely focus on ecological models. The other will be chosen based on attendee interest from topics such as (a) advanced NIMBLE programming including writing new MCMC samplers, (b) advanced spatial or Bayesian non-parametric modeling, or (c) non-MCMC algorithms in NIMBLE such as sequential Monte Carlo. Prior to the workshop, we will survey attendee interests and adjust content to meet attendee interests.

If you are interested in attending, please pre-register to hold a spot at https://forms.gle/6AtNgfdUdvhni32Q6. The form also asks if you are interested in a relatively cheap dormitory-style housing option. No payment is necessary to pre-register. Fees to finalize registration will be $230 (regular) or $115 (student). We hope to be able to offer student travel awards; more information will follow.

The workshop will assume attendees have a basic understanding of hierarchical/Bayesian models and MCMC, the BUGS (or JAGS) model language, and some familiarity with R.

NIMBLE short course at Bayes Comp 2020 conference

Posted on September 13, 2019September 13, 2019 by Chris Paciorek

We’ll be giving a short course on NIMBLE on January 7, 2020 at the Bayes Comp 2020 conference being held January 7-10 in Gainesville, Florida, USA.

Bayes Comp is a popular biennial ISBA-sponsored conference focused on computational methods/algorithms/technologies for Bayesian inference.

The short course focuses on programming algorithms in NIMBLE and is titled:

“Developing, modifying, and sharing Bayesian algorithms (MCMC samplers, SMC, and more) using the NIMBLE platform in R.”

More details on the conference and the short course are available at the conference website.

Version 0.8.0 of NIMBLE released

Posted on June 7, 2019June 7, 2019 by Chris Paciorek

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.8.0 provides some new features, speed improvements, and a variety of bug fixes and better error/warning messages.

New features include:

a reversible jump MCMC sampler for variable selection via configureRJ();
greatly improved speed of MCMC sampling for Bayesian nonparametric models with a dCRP distribution by not sampling parameters for empty clusters;
experimental faster MCMC configuration, available by setting nimbleOptions(oldConjugacyChecking = FALSE) and nimbleOptions(useNewConfigureMCMC = TRUE);
and improved warning and error messages for MCEM and slice sampling.

Please see the release notes on our website for more details.

Version 0.7.1 of NIMBLE released

Posted on March 15, 2019March 15, 2019 by Chris Paciorek

Version 0.7.1 is primarily a maintenance release with a couple important bug fixes and a few additional features. Users with large models and users of the dCRP Bayesian nonparametric distribution are strongly encouraged to update to this version to pick up the bug fixes related to these uses.

New features include:

recognition of normal-normal conjugacy in additional multivariate regression settings;
handling of six-dimensional arrays in models.

Please see the release notes on our website for more details.

Version 0.7.0 of NIMBLE released

Posted on February 5, 2019February 5, 2019 by Chris Paciorek

We’ve released the newest version of NIMBLE on CRAN and on our website. NIMBLE is a system for building and sharing analysis methods for statistical models, especially for hierarchical models and computationally-intensive methods (such as MCMC and SMC). Version 0.7.0 provides a variety of new features, as well as various bug fixes.

New features include:

greatly improved efficiency of sampling for Bayesian nonparametric (BNP) mixture models that use the dCRP (Chinese Restaurant process) distribution;
addition of the double exponential (Laplace) distribution for use in models and nimbleFunctions;
a new “RW_wishart” MCMC sampler, for sampling non-conjugate Wishart and inverse-Wishart nodes;
handling of the normal-inverse gamma conjugacy for BNP mixture models using the dCRP distribution;
enhanced functionality of the getSamplesDPmeasure function for posterior sampling from BNP random measures with Dirichlet process priors.
handling of five-dimensional arrays in models;
enhanced warning messages; and
an HTML version of the NIMBLE manual.

Please see the NEWS file in the installed package for more details.

Bayesian Nonparametric Models in NIMBLE, Part 2: Nonparametric Random Effects

Posted on December 6, 2018December 6, 2018 by Chris Paciorek

Bayesian nonparametrics in NIMBLE: Nonparametric random effects

Overview

NIMBLE is a hierarchical modeling package that uses nearly the same language for model specification as the popular MCMC packages WinBUGS, OpenBUGS and JAGS, while making the modeling language extensible — you can add distributions and functions — and also allowing customization of the algorithms used to estimate the parameters of the model.

Recently, we added support for Markov chain Monte Carlo (MCMC) inference for Bayesian nonparametric (BNP) mixture models to NIMBLE. In particular, starting with version 0.6-11, NIMBLE provides functionality for fitting models involving Dirichlet process priors using either the Chinese Restaurant Process (CRP) or a truncated stick-breaking (SB) representation of the Dirichlet process prior.

We will illustrate NIMBLE’s BNP capabilities using two examples. In a previous post, we showed how to use nonparametric mixture models with different kernels for density estimation. In this post, we will take a parametric generalized linear mixed model and show how to switch to a nonparametric representation of the random effects that avoids the assumption of normally-distributed random effects.

For more detailed information on NIMBLE and Bayesian nonparametrics in NIMBLE, see the NIMBLE User Manual.

Parametric meta analysis of Avandia myocardial infarctions (MIs)

We will illustrate the use of nonparametric mixture models for modeling random effects distributions in the context of a meta-analysis of the side effects of a formerly very popular drug for diabetes called Avandia. The data we analyze played a role in raising serious questions about the safety of this drug. The question is whether Avandia use increases the risk of myocardial infarction (heart attack). There are 48 studies (the 49th study in the data file is different in some ways and excluded here), each with treatment and control arms.

dat <- read.csv('https://rawgit.com/nimble-dev/nimble-demos/master/intro_bnp/avandia.csv')
head(dat)

##   trial nAvandia avandiaMI nControl controlMI
## 1     1      357         2      176         0
## 2     2      391         2      207         1
## 3     3      774         1      185         1
## 4     4      213         0      109         1
## 5     5      232         1      116         0
## 6     6       43         0       47         1

dat <- dat[-49, ]

Model formulation

We begin with a standard generalized linear mixed model (GLMM)-based meta analysis. The vectors $n$ and $x$ contain the total number of patients in the control and the number of patients suffering from myocardial infarctions in the control group of each study, respectively. Similarly, the vectors $m$ and $y$ contain similar information for patients receiving the drug Avandia. The model takes the form

$x_{i} \mid \theta, \gamma_i \sim \mbox{Bin} \left(n_i, \frac{\exp\left\{ \gamma_i \right\}}{1 + \exp\left\{ \gamma_i \right\}} \right) , \quad\quad y_{i} \mid \theta, \gamma_i \sim \mbox{Bin} \left(m_i, \frac{\exp\left\{ \theta + \gamma_i \right\}}{1 + \exp\left\{ \theta + \gamma_i \right\}} \right)$

where the random effects, $\gamma_i$ , follow a common normal distribution, $\gamma_i \sim \mbox{N}(0, \tau^2)$ , and the $\theta$ and $\tau^2$ are given reasonably non-informative priors. The parameter $\theta$ quantifies the difference in risk between the control and treatment arms, while the $\gamma_i$ quantify study-specific variation.

This model can be specified in NIMBLE using the following code:

x <- dat$controlMI
n <- dat$nControl
y <- dat$avandiaMI
m <- dat$nAvandia

nStudies <- nrow(dat)
data <- list(x = x, y = y)
constants = list(n = n, m = m, nStudies = nStudies)

codeParam <- nimbleCode({
    for(i in 1:nStudies) {
        y[i] ~ dbin(size = m[i], prob = q[i]) # avandia MIs
        x[i] ~ dbin(size = n[i], prob = p[i]) # control MIs
        q[i] <- expit(theta + gamma[i])       # Avandia log-odds
        p[i] <- expit(gamma[i])               # control log-odds
        gamma[i] ~ dnorm(mu, var = tau2)      # study effects
    }
    theta ~ dflat()        # effect of Avandia
    # random effects hyperparameters
    mu ~ dnorm(0, 10)
    tau2 ~ dinvgamma(2, 1)
})

Running the MCMC

Let’s run a basic MCMC.

set.seed(9)
inits = list(theta = 0, mu = 0, tau2 = 1, gamma = rnorm(nStudies))

samples <- nimbleMCMC(code = codeParam, data = data, inits = inits,
                      constants = constants, monitors = c("mu", "tau2", "theta", "gamma"),
                      thin = 10, niter = 22000, nburnin = 2000, nchains = 1, setSeed = TRUE)

## defining model...

## building model...

## setting data and initial values...

## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions...
## checking model calculations...
## model building finished.
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compiler details.
## compilation finished.
## runMCMC's handling of nburnin changed in nimble version 0.6-11. Previously, nburnin samples were discarded *post-thinning*.  Now nburnin samples are discarded *pre-thinning*.  The number of samples returned will be floor((niter-nburnin)/thin).
## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

par(mfrow = c(1, 4), cex = 1.1, mgp = c(1.8,.7,0))
ts.plot(samples[ , 'theta'], xlab = 'iteration', ylab = expression(theta))
hist(samples[ , 'theta'], xlab = expression(theta), main = 'effect of Avandia')

gammaCols <- grep('gamma', colnames(samples))
gammaMn <- colMeans(samples[ , gammaCols])
hist(gammaMn, xlab = 'posterior means', main = 'random effects distribution')
hist(samples[1000, gammaCols], xlab = 'single draw',
                   main = 'random effects distribution')

The results suggests there is an overall difference in risk between the control and treatment arms. But what about the normality assumption? Are our conclusions robust to that assumption? Perhaps the random effects distribution are skewed. (And recall that the estimates above of the random effects are generated under the normality assumption, which pushes the estimated effects to look more normal…)

DP-based random effects modeling for meta analysis

Model formulation

Now, we use a nonparametric distribution for the $\gamma_i$ s. More specifically, we assume that each $\gamma_i$ is generated from a location-scale mixture of normal distributions:

$\gamma_i \mid \mu_i, \tau_i^2 \sim \mbox{N}(\mu_i, \tau_i^2), \quad\quad (\mu_i, \tau_i^2) \mid G \sim G, \quad\quad G \sim \mbox{DP}(\alpha, H),$

where $H$ is a normal-inverse-gamma distribution.

This specification induces clustering among the random effects. As in the case of density estimation problems, the DP prior allows the data to determine the number of components, from as few as one component (i.e., simplifying to the parametric model), to as many as $n$ components, i.e., one component for each observation. This allows the distribution of the random effects to be multimodal if the data supports such behavior, greatly increasing its flexibility. This model can be specified in NIMBLE using the following code:

codeBNP <- nimbleCode({
    for(i in 1:nStudies) {
        y[i] ~ dbin(size = m[i], prob = q[i])   # avandia MIs
        x[i] ~ dbin(size = n[i], prob = p[i])   # control MIs
        q[i] <- expit(theta + gamma[i])         # Avandia log-odds
        p[i] <- expit(gamma[i])                 # control log-odds
        gamma[i] ~ dnorm(mu[i], var = tau2[i])  # random effects from mixture dist.
        mu[i] <- muTilde[xi[i]]                 # mean for random effect from cluster xi[i]
        tau2[i] <- tau2Tilde[xi[i]]             # var for random effect from cluster xi[i]
    }
    # mixture component parameters drawn from base measures
    for(i in 1:nStudies) {
        muTilde[i] ~ dnorm(mu0, var = var0)
        tau2Tilde[i] ~ dinvgamma(a0, b0)
    }
    # CRP for clustering studies to mixture components
    xi[1:nStudies] ~ dCRP(alpha, size = nStudies)
    # hyperparameters
    alpha ~ dgamma(1, 1)
    mu0 ~ dnorm(0, 10)
    var0 ~ dinvgamma(2, 1)
    a0 ~ dinvgamma(2, 1)
    b0 ~ dinvgamma(2, 1)
    theta ~ dflat()          # effect of Avandia
})

Running the MCMC

The following code compiles the model and runs a collapsed Gibbs sampler for the model

inits <- list(gamma = rnorm(nStudies), xi = sample(1:2, nStudies, replace = TRUE),
              alpha = 1, mu0 = 0, var0 = 1, a0 = 1, b0 = 1, theta = 0,
              muTilde = rnorm(nStudies), tau2Tilde = rep(1, nStudies))

samplesBNP <- nimbleMCMC(code = codeBNP, data = data, inits = inits,
               constants = constants,
               monitors = c("theta", "gamma", "alpha", "xi", "mu0", "var0", "a0", "b0"),
               thin = 10, niter = 22000, nburnin = 2000, nchains = 1, setSeed = TRUE)

## defining model...

## building model...

## setting data and initial values...

## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
## checking model sizes and dimensions...
## checking model calculations...
## model building finished.
## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compiler details.
## compilation finished.
## runMCMC's handling of nburnin changed in nimble version 0.6-11. Previously, nburnin samples were discarded *post-thinning*.  Now nburnin samples are discarded *pre-thinning*.  The number of samples returned will be floor((niter-nburnin)/thin).
## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

gammaCols <- grep('gamma', colnames(samplesBNP))
gammaMn <- colMeans(samplesBNP[ , gammaCols])
xiCols <- grep('xi', colnames(samplesBNP))

par(mfrow = c(1,5), cex = 1.1, mgp = c(1.8,.7,0))
ts.plot(samplesBNP[ , 'theta'], xlab = 'iteration', ylab = expression(theta),
   main = expression(paste('traceplot for ', theta)))
hist(samplesBNP[ , 'theta'], xlab = expression(theta), main = 'effect of Avandia')
hist(gammaMn, xlab = 'posterior means',
              main = "random effects distrib'n")
hist(samplesBNP[1000, gammaCols], xlab = 'single draw',
                   main = "random effects distrib'n")

# How many mixture components are inferred?
xiRes <- samplesBNP[ , xiCols]
nGrps <- apply(xiRes, 1, function(x) length(unique(x)))
ts.plot(nGrps, xlab = 'iteration', ylab = 'number of components',
   main = 'number of components')

The primary inference seems robust to the original parametric assumption. This is probably driven by the fact that there is not much evidence of lack of normality in the random effects distribution (as evidenced by the fact that the posterior distribution of the number of mixture components places a large amount of probability on exactly one component).

More information and future development

Please see our User Manual for more details.

We’re in the midst of improvements to the existing BNP functionality as well as adding additional Bayesian nonparametric models, such as hierarchical Dirichlet processes and Pitman-Yor processes, so please add yourself to our announcement or user support/discussion Google groups.