Chapter 12 Creating user-defined distributions and functions for models

NIMBLE allows you to define your own functions and distributions as nimbleFunctions for use in model code. As a result, NIMBLE frees you from being constrained to the functions and distributions discussed in Chapter 5. For example, instead of setting up a Dirichlet prior with multinomial data and needing to use MCMC, one could recognize that this results in a Dirichlet-multinomial distribution for the data and provide that as a user-defined distribution instead.

Since NIMBLE allows you to wrap calls to external compiled code or arbitrary R functions as nimbleFunctions, and since you can define model functions and distributions as nimbleFunctions, you can combine these features to build external compiled code or arbitrary R functions into a model. See Sections 11.6-11.7.

As of version 1.2.0, NIMBLE also supports using more advanced nimbleFunctions, those with setup code and possibly multiple methods (see Chapter 15), to provide functions and/or distributions for use in models. This allows the functions and/or distributions to store information internally. See section 12.3 below.

Note that NIMBLE generally expects user-defined distributions or functions to be defined in the global environment. If you define them in a function (which would generally be the case if you are using them in the context of parallelization), one approach would be to assign them to the global environment in your function:

## for a user-defined function
assign('myfun', myfun, envir = .GlobalEnv)
## for a user-defined distribution
assign('dfoo', dfoo, envir = .GlobalEnv)
assign('rfoo', rfoo, envir = .GlobalEnv)
## similarly for 'p' and 'q' functions if you define them

12.1 User-defined functions

To provide a new function for use in BUGS code, simply create a nimbleFunction as discussed in Chapter 11. Then use it in your BUGS code. That’s it.

Writing nimbleFunctions requires that you declare the dimensionality of arguments and the returned object (Section 11.4). Make sure that the dimensionality specified in your nimbleFunction matches how you use it in BUGS code. For example, if you define scalar parameters in your BUGS code you will want to define nimbleFunctions that take scalar arguments. Here is an example that returns twice its input argument:

timesTwo <- nimbleFunction(    
    run = function(x = double(0)) {
        returnType(double(0))
        return(2*x)
    })

code <- nimbleCode({
    for(i in 1:3) {
        mu[i] ~ dnorm(0, 1)
        mu_times_two[i] <- timesTwo(mu[i])
    }
})

The x = double(0) argument and returnType(double(0)) establish that the input and output will both be zero-dimensional (scalar) numbers.

You can define nimbleFunctions that take inputs and outputs with more dimensions. Here is an example that takes a vector (1-dimensional) as input and returns a vector with twice the input values:

vectorTimesTwo  <- nimbleFunction(    
    run = function(x = double(1)) {
        returnType(double(1))
        return(2*x)
    }
    )
code <- nimbleCode({
    for(i in 1:3) {
        mu[i] ~ dnorm(0, 1)
    }
    mu_times_two[1:3] <- vectorTimesTwo(mu[1:3])
})

There is a subtle difference between the mu_times_two variables in the two examples. In the first example, there are individual nodes for each mu_times_two[i]. In the second example, there is a single multivariate node, mu_times_two[1:3]. Each implementation could be more efficient for different needs. For example, suppose an algorithm modifies the value of mu[2] and then updates nodes that depend on it. In the first example, mu_times_two[2] would be updated. In the second example mu_times_two[1:3] would be updated because it is a single, vector node.

At present in compiled use of a model, you cannot provide a scalar argument where the user-defined nimbleFunction expects a vector; unlike in R, scalars are not simply vectors of length 1.

12.2 User-defined distributions

To provide a user-defined distribution, you need to define density (‘d’) and (optionally) simulation (‘r’) nimbleFunctions for your distribution. In many cases you can then simply use your distribution in BUGS code as you would any distribution already provided by NIMBLE, while in a few special cases²⁴ you need to explicitly register your distribution as described in Section 12.2.1.

You need to provide the simulation (‘r’) function if any algorithm used with a model that uses the distribution needs to simulate from the distribution. This is not the case for NIMBLE’s built-in MCMC sampler functions, and therefore the simulation function is not generally required for standard MCMC in NIMBLE. However, the ‘r’ function is necessary for initialization of nodes that are assigned the user-defined distribution if no initial value is provided, and for sampling posterior predictive nodes (those nodes with no downstream data dependencies) that are assigned the user-defined distribution.

You can optionally provide distribution (‘p’) and quantile (‘q’) functions, which will allow truncation to be applied to a user-defined distribution. You can also provide a list of alternative parameterizations, but only if you explicitly register the distribution.

Here is an extended example of providing a univariate exponential distribution (solely for illustration as this is already provided by NIMBLE) and a multivariate Dirichlet-multinomial distribution.

dmyexp <- nimbleFunction(
    run = function(x = double(0), rate = double(0, default = 1), 
        log = integer(0, default = 0)) {
        returnType(double(0))
        logProb <- log(rate) - x*rate
        if(log) return(logProb)
        else return(exp(logProb)) 
    })

rmyexp <- nimbleFunction(
    run = function(n = integer(0), rate = double(0, default = 1)) {
        returnType(double(0))
        if(n != 1) print("rmyexp only allows n = 1; using n = 1.")
        dev <- runif(1, 0, 1)
        return(-log(1-dev) / rate)
    })

pmyexp <- nimbleFunction(
    run = function(q = double(0), rate = double(0, default = 1), 
        lower.tail = integer(0, default = 1), 
        log.p = integer(0, default = 0)) {
        returnType(double(0))
        if(!lower.tail) { 
            logp <- -rate * q
            if(log.p) return(logp)
            else return(exp(logp))
        } else {
            p <- 1 - exp(-rate * q)
            if(!log.p) return(p)
            else return(log(p))
        }
    })

qmyexp <- nimbleFunction(
    run = function(p = double(0), rate = double(0, default = 1), 
        lower.tail = integer(0, default = 1), 
        log.p = integer(0, default = 0)) {
        returnType(double(0))
        if(log.p) p <- exp(p)
        if(!lower.tail) p <- 1 - p
        return(-log(1 - p) / rate)
    })

ddirchmulti <- nimbleFunction(
    run = function(x = double(1), alpha = double(1), size = double(0), 
        log = integer(0, default = 0)) {
        returnType(double(0))
        logProb <- lgamma(size+1) - sum(lgamma(x+1)) + lgamma(sum(alpha)) -
            sum(lgamma(alpha)) + sum(lgamma(alpha + x)) -
            lgamma(sum(alpha) + size)
        if(log) return(logProb)
        else return(exp(logProb))
    })

rdirchmulti <- nimbleFunction(
    run = function(n = integer(0), alpha = double(1), size = double(0)) {
        returnType(double(1))
        if(n != 1) print("rdirchmulti only allows n = 1; using n = 1.")
        p <- rdirch(1, alpha)
        return(rmulti(1, size = size, prob = p))
    })

code <- nimbleCode({
     y[1:K] ~ ddirchmulti(alpha[1:K], n)
     for(i in 1:K) {
         alpha[i] ~ dmyexp(1/3)
      }
     })

model <- nimbleModel(code, constants = list(K = 5, n = 10))

The distribution-related functions should take as input the parameters for a single parameterization, which will be the canonical parameterization that NIMBLE will use.

Here are more details on the requirements for distribution-related nimbleFunctions, which follow R’s conventions:

Your distribution-related functions must have names that begin with ‘d’, ‘r’, ‘p’ and ‘q’. The name of the distribution must not be identical to any of the NIMBLE-provided distributions.
All simulation (‘r’) functions must take n as their first argument. Note that you may simply have your function only handle n=1 and return an warning for other values of n.
NIMBLE uses doubles for numerical calculations, so we suggest simply using doubles in general, even for integer-valued parameters or values of random variables. In fact, non-scalars must be declared as doubles.
All density functions must have as their last argument log and implement return of the log probability density. NIMBLE algorithms typically use only log = 1 (i.e., TRUE), but we recommend you implement the log = 0 (i.e., FALSE) case for completeness.
All distribution and quantile functions must have their last two arguments be (in order) lower.tail and log.p. These functions must work for lower.tail = 1 (i.e., TRUE) and log.p = 0 (i.e., FALSE), as these are the inputs we use when working with truncated distributions. It is your choice whether you implement the necessary calculations for other combinations of these inputs, but again we recommend doing so for completeness.
Define the nimbleFunctions in R’s global environment. Don’t expect R’s standard scoping to work²⁵.

12.2.1 Using registerDistributions for alternative parameterizations and providing other information

Behind the scenes, NIMBLE uses the function registerDistributions to set up new distributions for use in BUGS code. In some circumstances, you will need to call registerDistributions directly to provide information that NIMBLE can’t obtain automatically from the nimbleFunctions you write.

The cases in which you’ll need to explicitly call registerDistributions are when you want to do any of the following:

provide alternative parameterizations,
indicate a distribution is discrete, and
provide the range of possible values for a distribution.

If you would like to allow for multiple parameterizations, you can do this via the Rdist element of the list provided to registerDistributions as illustrated below. If you provide CDF (‘p’) and inverse CDF (quantile, i.e. ‘q’) functions, be sure to specify pqAvail = TRUE when you call registerDistributions. Here’s an example of using registerDistributions to provide an alternative parameterization (scale instead of rate) and to provide the range for the user-defined exponential distribution. We can then use the alternative parameterization in our BUGS code.

registerDistributions(list(
    dmyexp = list(
        BUGSdist = "dmyexp(rate, scale)",
        Rdist = "dmyexp(rate = 1/scale)",
        altParams = c("scale = 1/rate", "mean = 1/rate"),
        pqAvail = TRUE, 
        range = c(0, Inf)
        )
    ))

code <- nimbleCode({
     y[1:K] ~ ddirchmulti(alpha[1:K], n)
     for(i in 1:K) {
         alpha[i] ~ T(dmyexp(scale = 3), 0, 100)
      }
     })

model <- nimbleModel(code, constants = list(K = 5, n = 10), 
                     inits = list(alpha = rep(1, 5)))

There are a few rules for how you specify the information about a distribution that you provide to registerDistributions:

The function name in the BUGSdist entry in the list provided to registerDistributions will be the name you can use in BUGS code.
The names of your nimbleFunctions must match the function name in the Rdist entry. If missing, the Rdist entry defaults to be the same as the BUGSdist entry.
Your distribution-related functions must take as arguments the parameters in default order, starting as the second argument and in the order used in the parameterizations in the Rdist argument to registerDistributions or the BUGSdist argument if there are no alternative parameterizations.
You must specify a types entry in the list provided to registerDistributions if the distribution is multivariate or if any parameter is non-scalar.

Further details on using registerDistributions can be found via R help on registerDistributions. NIMBLE uses the same list format as registerDistributions to define its distributions. This list can be found in the R/distributions_inputList.R file in the package source code directory or as the R list nimble:::distributionsInputList.

12.3 Advanced user-defined functions and distributions

Sometimes it is useful for a user-defined function or distribution to use external information and/or hold variables that persist between calls. One can do this by creating nimbleFunctions that use setup code and have one or more methods. If you are familiar with object-oriented programming, this variety of nimbleFunction defines a class. Here we will give a toy example that is not fully explained because the NIMBLE programming concepts are covered in Chapter 15. After studying that chapter, the example here should make more sense.

As a toy example, consider this basic linear regression model with one explanatory variable:

code <- nimbleCode({
  beta0 ~ dnorm(0, sd = 100)
  beta1 ~ dnorm(0, sd = 100)
  sigma ~ dhalfflat()
  for(i in 1:n) {
    y[i] ~ dnorm(beta0 + beta1*x[i], sd = sigma)
  }
})
# Simulate some data:
x <- runif(10)
y <- rnorm(10, 3 + 0.5 * x, sd = 0.2)
model <- nimbleModel(code, constants = list(n = 10, x = x), 
                     data = list(y = y),
                     inits = list(beta0=3.1,beta1=0.4,sigma=0.3))

Now suppose we want to rewrite this model in an equivalent way by providing a “distribution” that actually contains all the steps to calculate the log likelihood for the linear regression internally (for all of y[1:10] at once), including that it contains the explanatory variable x. We could do that as follows:

nimbleOptions(allowNFobjInModel=TRUE) # This option should be TRUE by default,
                                        # but we include this step to be sure.
linear_reg <- nimbleFunction(
  setup = function(predictor) {}, # Call x "predictor" here for clarity
  methods = list(
    dcalc = function(x = double(1), # Note that this will be y!
                     beta0=double(), beta1=double(), sigma=double(),
                     log=integer(0, default=0)) {
      llh <- sum(dnorm(x, beta0+beta1*predictor, sd=sigma, log=TRUE))
      if(log) return(llh)
      return(exp(llh))
      returnType(double())
    }))
my_linear_reg <- linear_reg(x)
code2 <- nimbleCode({
  beta0 ~ dnorm(0, sd = 100)
  beta1 ~ dnorm(0, sd = 100)
  sigma ~ dhalfflat()
  y[1:n] ~ my_linear_reg$dcalc(beta0, beta1, sigma)
})
model <- nimbleModel(code2, constants = list(n = 10), 
                     data = list(y = y),
                     inits = list(beta0=3.1,beta1=0.4,sigma=0.3))

What is going on, and why would one even think of doing it this way? linear_reg here is defined as a nimbleFunction to create objects that will hold values of x internally and use them when the dcalc function (or “method”) is called. my_linear_reg is one instance of linear_reg that holds the specific values of x provided. The method dcalc has the format (arguments, return type, and prefix “d”) of a user-defined distribution, so it can be used that way via my_linear_reg$dcalc. One can provide a corresponding rcalc method, but as above this is only required if it will actually be needed by an algorithm. The suffix “calc” could be replaced with anything, but a user-defined distribution must have the prefix “d” (and can even be named just “d”), and a corresponding simulation must begin with “r”. Any method in an object like my_linear_reg can be used as a user-defined function, and the same nimbleFunction (linear_reg) can define multiple user-defined functions and/or distributions.

(Note the deliberate illustration of a potential confusion: The random variable argument to a user-defined distribution is always named “x”, which in this example will get the values of y[1:10].)

Now, why would one want to do this? There are several reasons. First, sometimes instead of a short vector like x, there may be quite large data objects involved in some calculation. It can be wasteful (of time and memory when building and compiling models) to provide large objects in models for the sole purpose of passing them to user-defined functions or distributions. It can work well to access them internally as shown with x in this example. One can also consider caching intermediate values from costly calculations or other tricks to speed up calculations.

It is vital when using this approach to understand clearly how an algorithm will use a model, and there is potential to create problems. For example, NIMBLE’s MCMC system uses a set of samplers, each of which assumes the model is fully up-to-date when it begins its operations. “Up-to-date” means that model variables all have the correct current values and all deterministic and stochastic calculations are up-to-date with those values. Sometimes, samplers copy values around to achieve this. However, a sampler has no way to know what variables a nimbleFunction object (e.g. my_linear_reg) is managing internally, and thus has no way to keep them up to date, if necessary. Techniques such as caching or “memoization” should be careful in this regard.

One can also register a distribution in a nimbleFunction object manually, for the reasons given above. The name of the distribution is simply the string “my_linear_reg$dcalc”, to use the current example.

12.4 User-defined model macros

As discussed in Section 5.2.8, users can use macros that provide a succinct syntax for expressing components of a model. The nimbleMacros package provides an initial set of macros, focused on the components of generalized linear (mixed) models.

In addition, NIMBLE users can write their own macros for use in model code.

As a toy example, suppose you wanted to create a macro called MYDIST, which creates a normal distribution with mean 0 and provided standard deviation. The macro would look like this in the NIMBLE model code:

y ~ MYDIST(sdev = 2)

Note that by convention, we define the macro name in all capital letters.

We want the final, processed output to look like:

y ~ dnorm(0, sd = 2)

One would then need to create the MYDIST macro using buildMacro.

See the vignette for the nimbleMacros package and help(buildMacro) for more information about how to use buildMacro to create a new macro.

These include providing alternative parameterizations, specifying the range of the distribution, or specifying that the distribution is a discrete distribution.↩︎
NIMBLE can’t use R’s standard scoping because it doesn’t work for R reference classes, and nimbleFunctions are implemented as custom-generated reference classes.↩︎