Comparing two MCMC Samplers
We next compare the performance of two MCMC samplers on the state space model described above. The first sampler we consider is NIMBLE’s RW_block
sampler, a Metropolis-Hastings sampler with a multivariate normal proposal distribution. This sampler has an adaptive routine that modifies the proposal covariance to look like the empirical covariance of the posterior samples of the parameters. However, as we shall see below, this proposal covariance adaptation does not lead to efficient sampling for our state space model.
We first build and compile the MCMC algorithm.
RW_mcmcConfig <- configureMCMC(stateSpaceModel)
RW_mcmcConfig$removeSamplers(c('a', 'b', 'sigOE', 'sigPN'))
RW_mcmcConfig$addSampler(target = c('a', 'b', 'sigOE', 'sigPN'), type = 'RW_block')
RW_mcmc <- buildMCMC(RW_mcmcConfig)
C_RW_mcmc <- compileNimble(RW_mcmc, project = stateSpaceModel)
We next run the compiled MCMC algorithm for 50,000 iterations, recording the overall MCMC efficiency from the posterior output. The overall efficiency here is defined as $\min(\text{ESS}) / T$, where ESS denotes the effective sample size of a parameter and $T$ the total run-time of the sampling algorithm. The minimum is taken over all parameters that were sampled. We repeat this process 5 times to get a very rough idea of the average minimum efficiency for this combination of model and sampler.
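To build intuition for the ESS component of this metric, here is a self-contained base-R sketch (the analysis below uses coda's effectiveSize instead; the crude truncated-autocorrelation estimator here is only illustrative). A strongly autocorrelated chain carries far fewer effective samples than its nominal length:

```r
# Crude ESS approximation: N / (1 + 2 * sum of initial positive
# autocorrelations). coda::effectiveSize is used in the real analysis.
essApprox <- function(x) {
  n   <- length(x)
  rho <- acf(x, lag.max = min(100, n - 1), plot = FALSE)$acf[-1]
  cut <- which(rho <= 0)[1]            # truncate at first non-positive lag
  if (!is.na(cut)) rho <- rho[seq_len(cut - 1)]
  n / (1 + 2 * sum(rho))
}

set.seed(1)
# AR(1) chain with strong autocorrelation: ESS is well below N = 5000
chain <- as.numeric(arima.sim(n = 5000, list(ar = 0.9)))
essApprox(chain)
```

Dividing such an ESS by the run-time, and minimizing over parameters, gives the efficiency metric used throughout this comparison.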
RW_minEfficiency <- numeric(5)
for(i in 1:5){
runTime <- system.time(C_RW_mcmc$run(50000, progressBar = FALSE))['elapsed']
RW_mcmcOutput <- as.mcmc(as.matrix(C_RW_mcmc$mvSamples))
RW_minEfficiency[i] <- min(effectiveSize(RW_mcmcOutput)/runTime)
}
summary(RW_minEfficiency)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3323 0.4800 0.5505 0.7567 0.7341 1.6869
Examining a trace plot of the output below, we see that the $a$ and $b$ parameters are mixing especially poorly.
plot(RW_mcmcOutput, density = FALSE)
Plotting the posterior samples of $a$ against those of $b$ reveals a strong negative correlation. This presents a problem for the Metropolis-Hastings sampler: we have found that the adaptive algorithms used to tune the proposal covariance are often slow to reach a covariance that performs well for blocks of strongly correlated parameters.
plot.default(RW_mcmcOutput[,'a'], RW_mcmcOutput[,'b'])
cor(RW_mcmcOutput[,'a'], RW_mcmcOutput[,'b'])
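To see concretely why a mismatched proposal covariance hurts, here is a self-contained sketch (plain R, independent of the model and samplers above) of random-walk Metropolis-Hastings on a bivariate normal target whose correlation is comparable to that between $a$ and $b$. An isotropic proposal accepts far less often than one whose covariance matches the target's shape, which is exactly what the adaptive routine must slowly learn:

```r
# Random-walk MH on a strongly correlated bivariate normal target,
# comparing an isotropic proposal with one shaped like the target.
set.seed(1)
rho    <- -0.95                                # like the a-b correlation
Sigma  <- matrix(c(1, rho, rho, 1), 2, 2)
SigInv <- solve(Sigma)
logTarget <- function(x) -0.5 * drop(t(x) %*% SigInv %*% x)

rwMH <- function(n, propChol) {
  x <- c(0, 0); acc <- 0
  out <- matrix(NA_real_, n, 2)
  for (i in 1:n) {
    prop <- x + drop(propChol %*% rnorm(2))
    if (log(runif(1)) < logTarget(prop) - logTarget(x)) {
      x <- prop; acc <- acc + 1
    }
    out[i, ] <- x
  }
  list(samples = out, accRate = acc / n)
}

naive    <- rwMH(10000, diag(2))          # isotropic proposal
informed <- rwMH(10000, t(chol(Sigma)))   # matches target covariance
c(naive = naive$accRate, informed = informed$accRate)
```

The informed proposal's acceptance rate is substantially higher; the adaptive RW_block sampler must discover this covariance from its own samples, which is slow when the chain starts out mixing poorly.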
In such situations with strong posterior correlation, we have found the automated factor slice sampler (AFSS; Tibbits et al., 2014) to often run much more efficiently, so we next build and compile an MCMC algorithm using the AFSS sampler. Our hope is that the AFSS sampler will be better able to produce efficient samples in the face of high posterior correlation.
AFSS_mcmcConfig <- configureMCMC(stateSpaceModel)
AFSS_mcmcConfig$removeSamplers(c('a', 'b', 'sigOE', 'sigPN'))
AFSS_mcmcConfig$addSampler(target = c('a', 'b', 'sigOE', 'sigPN'), type = 'AF_slice')
AFSS_mcmc <- buildMCMC(AFSS_mcmcConfig)
C_AFSS_mcmc <- compileNimble(AFSS_mcmc, project = stateSpaceModel, resetFunctions = TRUE)
We again run the AFSS MCMC algorithm 5 times, each run consisting of 50,000 MCMC iterations.
AFSS_minEfficiency <- numeric(5)
for(i in 1:5){
runTime <- system.time(C_AFSS_mcmc$run(50000, progressBar = FALSE))['elapsed']
AFSS_mcmcOutput <- as.mcmc(as.matrix(C_AFSS_mcmc$mvSamples))
AFSS_minEfficiency[i] <- min(effectiveSize(AFSS_mcmcOutput)/runTime)
}
summary(AFSS_minEfficiency)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 9.467 9.686 10.549 10.889 10.724 14.020
Note that the minimum overall efficiency of the AFSS sampler is approximately 28 times that of the RW_block sampler. Additionally, trace plots from the output of the AFSS sampler show that the $a$ and $b$ parameters are mixing much more effectively than they were under the RW_block sampler.
plot(AFSS_mcmcOutput, density = FALSE)
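For reference, the roughly 28-fold figure matches the ratio of the smallest observed minimum efficiencies from the two summaries above (an interpretation of how the figure was computed; the mean-to-mean and median-to-median ratios are smaller):

```r
# Ratio of the Min. entries from the AFSS and RW_block summaries
9.467 / 0.3323
```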
Tibbits, M. M., C. Groendyke, M. Haran, et al. (2014). "Automated factor slice sampling". In: Journal of Computational and Graphical Statistics 23.2, pp. 543–563.

Turek, D., P. de Valpine, C. J. Paciorek, et al. (2017). "Automated parameter blocking for efficient Markov chain Monte Carlo sampling". In: Bayesian Analysis 12.2, pp. 465–490.