Chapter 11 Writing simple nimbleFunctions

11.1 Introduction to simple nimbleFunctions

nimbleFunctions are the heart of programming in NIMBLE. In this chapter, we introduce simple nimbleFunctions that contain only one function to be executed, in either compiled or uncompiled form, but no setup function or additional methods.

Defining a simple nimbleFunction is like defining an R function: nimbleFunction returns a function that can be executed, and it can also be compiled. Simple nimbleFunctions are useful for doing math or the other kinds of processing available in NIMBLE when no model or modelValues is needed. These can be used for any purpose in R programming. They can also be used as new functions and distributions in NIMBLE’s extension of BUGS (Chapter 12).

Here’s a basic example implementing the textbook calculation of least squares estimation of linear regression parameters21:

solveLeastSquares <- nimbleFunction(
    run = function(X = double(2), y = double(1)) { # type declarations
        ans <- inverse(t(X) %*% X) %*% (t(X) %*% y)
        return(ans)
        returnType(double(2))  # return type declaration
    } )

X <- matrix(rnorm(400), nrow = 100)
y <- rnorm(100)
solveLeastSquares(X, y)
##              [,1]
## [1,]  0.088547575
## [2,]  0.002025635
## [3,] -0.090387283
## [4,] -0.072660391
CsolveLeastSquares <- compileNimble(solveLeastSquares)
CsolveLeastSquares(X, y)
##              [,1]
## [1,]  0.088547575
## [2,]  0.002025635
## [3,] -0.090387283
## [4,] -0.072660391

In this example, we fit a linear model for 100 random response values (y) to four columns of randomly generated explanatory variables (X). We ran the nimbleFunction solveLeastSquares uncompiled, natively in R, allowing testing and debugging (Section 15.7). Then we compiled it and showed that the compiled version does the same thing, but faster22. NIMBLE’s compiler creates C++ that uses the Eigen (http://eigen.tuxfamily.org) library for linear algebra.

Notice that the actual NIMBLE code is written as an R function definition that is passed to nimbleFunction as the run argument. Hence we call it the run code. run code is written in the NIMBLE language. This is similar to a narrow subset of R with some additional features. Formally, we view it as a distinct language that encompasses what can be compiled from a nimbleFunction.

To write nimbleFunctions, you will need to learn:

  • what R functions are supported for NIMBLE compilation and any ways they differ from their regular R counterparts;
  • how NIMBLE handles types of variables;
  • how to declare types of nimbleFunction arguments and return values;
  • that compiled nimbleFunctions always pass arguments to each other by reference.

The next sections cover each of these topics in turn.

11.2 R functions (or variants) implemented in NIMBLE

11.2.1 Finding help for NIMBLE’s versions of R functions

Often, R help pages are available for NIMBLE’s versions of R functions using the prefix ‘nim’ and capitalizing the next letter. For example, help on NIMBLE’s version of numeric can be found by help(nimNumeric). In some cases help is found directly using the name of the function as it appears in R.

11.2.2 Basic operations

Basic R operations supported for NIMBLE compilation are listed in Table 11.1.

Table 11.1: Basic R manipulation functions in NIMBLE. For some of these functions, in addition to the usual use of help(), you can find help in R for NIMBLE’s version of a function by using the “nim” prefix and capitalizing the next letter. E.g. help(nimC) for help with c().
Function Comments (differences from R)
c() No recursive argument.
rep() No rep.int or rep_len arguments.
seq() and ‘:’ Negative integer sequences from ‘:’, e.g. , 2:1 do not work.
which() No arr.ind or useNames arguments.
diag() Works like R in three ways: diag(vector) returns a matrix with vector on the diagonal;
diag(matrix) returns the diagonal vector of matrix;
diag(n) returns an \(n \times n\) identity matrix. No nrow or ncol arguments.
diag()<- Works for assigning the diagonal vector of a matrix.
dim() Works on a vector as well as higher-dimensional arguments.
length()
is.na() Does not correctly handle NAs (or NaNs) from R that are type 'logical' or 'integer',
so convert these using as.numeric() before passing from R to NIMBLE.
is.nan()
any() One argument only; NAs treated as FALSE.
all() One argument only; NAs treated as FALSE.
numeric() Allows additional arguments to control initialization.
logical() Allows additional arguments to control initialization.
integer() Allows additional arguments to control initialization.
matrix() Allows additional arguments to control initialization.
array() Allows additional arguments to control initialization.
indexing Arbitrary integer and logical indexing is supported for objects of one or two dimensions.
For higher-dimensional objects, only : indexing works and then only to create an object
of at most two dimensions.

In addition to the above functions, we provide functions any_na() and any_nan() for finding if there are any NA or NaN values in a vector. These are equivalent to using any(is.na()) and any(is.nan()) in R, with the exception noted in the table above regarding is.na().

Other R functions with numeric arguments and return value can be called during compiled execution by wrapping them as a nimbleRcall (see Section 11.7).

Next we cover some of these functions in more detail.

11.2.2.1 numeric, integer, logical, matrix and array

numeric, integer, or logical will create a 1-dimensional vector of floating-point (or ‘double’ [precision]), integer, or logical values, respectively. The length argument specifies the vector length (default 0), and the value argument specifies the initial value either as a scalar (used for all vector elements, with default 0) or a vector. If a vector and its length is not equal to length, the remaining values will be filled by R-style recycling if the recycle argument is TRUE (which is the default). The init argument specifies whether or not to initialize the elements in compiled code (default TRUE). If first use of the variable does not rely on initial values, using init = FALSE will yield slightly more efficient performance.

matrix creates a 2-dimensional matrix object of either floating-point (if type = "double", the default), integer (if type = "integer"), or logical (if type = "logical") values. As in R, nrow and ncol arguments specify the number of rows and columns, respectively. The value and init arguments are used in the same way as for numeric, integer, and logical.

array creates a vector or higher-dimensional object, depending on the dim argument, which takes a vector of sizes for each dimension. The type, value and init argument behave the same as for matrix.

The best way to create an identity matrix is with diag(n), which returns an \(n \times n\) identity matrix. NIMBLE also provides a deprecated nimbleFunction identityMatrix that does the same thing.

Examples of these functions, and the related function \(`setSize`\) for changing the size of a numeric object, are given in Section 11.3.2.4.

11.2.2.2 length and dim

length behaves like R’s length function. It returns the entire length of X. That means if X is multivariate, length(X) returns the product of the sizes of the dimensions. NIMBLE’s version of dim, which has synonym nimDim, behaves like R’s dim function for matrices or arrays and like R’s length function for vectors. In other words, regardless of whether the number of dimensions is 1 or more, it returns a vector of the sizes.

11.2.2.3 Deprecated creation of non-scalar objects using declare

Previous versions of NIMBLE provided a function declare for declaring variables. The more R-like functions numeric, integer, logical, matrix and array are intended to replace declare, but declare is still supported for backward compatibility. In a future version of NIMBLE, declare may be removed.

11.2.3 Math and linear algebra

Numeric scalar and matrix mathematical operations are listed in Tables 11.2-11.3.

As in R, many scalar operations in NIMBLE will work component-wise on vectors or higher dimensional objects. For example if B and C are vectors, A = B + C will add them and create vector C by component-wise addition of B and C. In the current version of NIMBLE, component-wise operations generally only work for vectors and matrices, not arrays with more than two dimensions. The only exception is assignment: A = B will work up to NIMBLE’s current limit of four dimensions.

Table 11.2: Functions operating on scalars, many of which can operate on each element (component-wise) of vectors and matrices. Status column indicates if the function is currently provided in NIMBLE. Vector input column indicates if the function can take a vector as an argument (i.e., if the function is vectorized).
Usage Description Comments Status Vector input
x | y, x & y logical OR (\(|\)) and AND(&) yes yes
!x logical not yes yes
x > y, x >= y greater than (and or equal to) yes yes
x < y, x <= y less than (and or equal to) yes yes
x != y, x == y (not) equals yes yes
x + y, x - y, x * y component-wise operators mix of scalar and vector yes yes
x / y component-wise division vector \(x\) and scalar \(y\) yes yes
x^y, pow(x, y) power \(x^y\); vector \(x\),scalar \(y\) yes yes
x %% y modulo (remainder) yes no
min(x1, x2), min. (max.) of two scalars yes See pmin,
max(x1, x2) pmax
exp(x) exponential yes yes
log(x) natural logarithm yes yes
sqrt(x) square root yes yes
abs(x) absolute value yes yes
step(x) step function at 0 0 if \(x<0\), 1 if \(x>=0\) yes yes
equals(x, y) equality of two scalars 1 if \(x==y\), 0 if \(x != y\) yes yes
cube(x) third power \(x^3\) yes yes
sin(x), cos(x), trigonometric functions yes yes
tan(x)
asin(x), acos(x), inverse trigonometric functions yes yes
atan(x)
asinh(x), acosh(x), inv. hyperbolic trig. functions yes yes
atanh(x)
logit(x) logit \(\log(x/(1-x))\) yes yes
ilogit(x), expit(x) inverse logit \(\exp(x)/(1 + \exp(x))\) yes yes
probit(x) probit (Gaussian quantile) \(\Phi^{-1}(x)\) yes yes
iprobit(x), phi(x) inverse probit (Gaussian CDF) \(\Phi(x)\) yes yes
cloglog(x) complementary log log \(\log(-\log(1-x))\) yes yes
icloglog(x) inverse complementary log log \(1 - \exp(-\exp(x))\) yes yes
ceiling(x) ceiling function \(\lceil(x)\rceil\) yes yes
floor(x) floor function \(\lfloor(x)\rfloor\) yes yes
round(x) round to integer yes yes
trunc(x) truncation to integer yes yes
lgamma(x), loggam(x) log gamma function \(\log \Gamma(x)\) yes yes
besselK(x, nu, modified bessel function returns vector even if yes yes
...expon.scaled) of the second kind \(x\) a matrix/array
log1p(x) log of 1 + x \(\log(1+x)\) yes yes
lfactorial(x), log factorial \(\log x!\) yes yes
logfact(x)
qDIST(x, PARAMS) “q” distribution functions canonical parameterization yes yes
pDIST(x, PARAMS) “p” distribution functions canonical parameterization yes yes
rDIST(x, PARAMS) “r” distribution functions canonical parameterization yes yes
dDIST(x, PARAMS) “d” distribution functions canonical parameterization yes yes
sort(x) no
rank(x, s) no
ranked(x, s) no
order(x) no
Table 11.3: Functions operating on vectors and matrices. Status column indicates if the function is currently provided in NIMBLE.
Usage Description Comments Status
inverse(x) matrix inverse \(x\) symmetric, positive def. yes
chol(x) matrix Cholesky factorization \(x\) symmetric, positive def. yes
returns upper triang. matrix
t(x) matrix transpose \(x^\top\) yes
x%*%y matrix multiply \(xy\); \(x\), \(y\) conformant yes
inprod(x, y) dot product \(x^\top y\); \(x\) and \(y\) vectors yes
solve(x,y) solve system of equations \(x^{-1} y\); \(y\) matrix or vector yes
forwardsolve(x, y) solve lower-triangular system of equations \(x^{-1} y\); \(x\) lower-triangular yes
backsolve(x, y) solve upper-triangular system of equations \(x^{-1} y\); \(x\) upper-triangular yes
logdet(x) log matrix determinant \(\log|x|\) yes
asRow(x) convert vector to 1-row matrix sometimes automatic yes
asCol(x) convert vector to 1-column matrix sometimes automatic yes
sum(x) sum of elements of x yes
mean(x) mean of elements of x yes
sd(x) standard deviation of elements of x yes
prod(x) product of elements of x yes
min(x), max(x) min. (max.) of elements of x yes
pmin(x, y), pmax(x,y) vector of mins (maxs) of elements of yes
x and y
interp.lin(x, v1, v2) linear interpolation no
eigen(x) matrix eigendecomposition; returns a \(x\) symmetric yes
nimbleList of type eigenNimbleList
svd(x) matrix singular value decomposition; yes
returns a nimbleList of type
svdNimbleList

More information on the nimbleLists returned by the eigen and svd functions in NIMBLE can be found in Section 14.2.2.

11.2.4 Distribution functions

Distribution ‘d’, ‘r’, ‘p’, and ‘q’ functions can all be used from nimbleFunctions (and in BUGS model code), but care is needed in the syntax, as follows.

  • Names of the distributions generally (but not always) match those of R, which sometimes differ from BUGS. See the list below.
  • Supported parameterizations are also indicated in the list below.
  • For multivariate distributions (multivariate normal, Dirichlet, and Wishart), ‘r’ functions only return one random draw at a time, and the first argument must always be 1.
  • R’s recycling rule (re-use of an argument as needed to accommodate longer values of other arguments) is generally followed, but the returned object is always a scalar or a vector, not a matrix or array.

As in R (and nimbleFunctions), arguments are matched by order or by name (if given). Standard arguments to distribution functions in R (log, log.p, lower.tail) can be used and have the same defaults. User-defined distributions for BUGS (Chapter 12) can also be used from nimbleFunctions.

For standard distributions, we rely on R’s regular help pages (e.g., help(dgamma). For distributions unique to NIMBLE (e.g., dexp_nimble, ddirch), we provide help pages.

Supported distributions, listed by their ‘d’ function, include:

  • dbinom(x, size, prob, log)
  • dcat(x, prob, log)
  • dmulti(x, size, prob, log)
  • dnbinom(x, size, prob, log)
  • dpois(x, lambda, log)
  • dbeta(x, shape1, shape2, log)
  • dchisq(x, df, log)
  • ddexp(x, location, rate, log)
  • ddexp(x, location, scale, log)
  • dexp(x, rate, log)
  • dexp_nimble(x, rate, log)
  • dexp_nimble(x, scale, log)
  • dgamma(x, shape, rate, log)
  • dgamma(x, shape, scale, log)
  • dinvgamma(x, shape, rate, log)
  • dinvgamma(x, shape, scale, log)
  • dlnorm(x, meanlog, sdlog, log)
  • dlogis(x, location, scale, log)
  • dnorm(x, mean, sd, log)
  • dt_nonstandard(x, df, mu, sigma, log)
  • dt(x, df, log)
  • dunif(x, min, max, log)
  • dweibull(x, shape, scale, log)
  • ddirch(x, alpha, log)
  • dlkj_corr_cholesky(x, shape, size, log)
  • dmnorm_chol(x, mean, cholesky, prec_param, log)
  • dmvt_chol(x, mu, cholesky, df, prec_param, log)
  • dwish_chol(x, cholesky, df, scale_param, log)
  • dinvwish_chol(x, cholesky, df, scale_param, log)

In the last three, cholesky stands for Cholesky decomposition of the relevant matrix; prec_param is a logical indicating whether the Cholesky is of a precision matrix (TRUE) or covariance matrix (FALSE)23; and scale_param is a logical indicating whether the Cholesky is of a scale matrix (TRUE) or an inverse scale matrix (FALSE).

11.2.5 Flow control: if-then-else, for, while, and stop

These basic flow-control structures use the same syntax as in R. However, for-loops are limited to sequential integer indexing. For example, for(i in 2:5) {...} works as it does in R. Decreasing index sequences are not allowed. Unlike in R, if is not itself a function that returns a value.

We plan to include more flexible for-loops in the future, but for now we’ve included just one additional useful feature: for(i in seq_along(NFL)) will work as in R, where NFL is a nimbleFunctionList. This is described in Section 15.4.8.

stop, or equivalently nimStop, throws control to R’s error-handling system and can take a character argument that will be displayed in an error message.

11.2.6 print and cat

print, or equivalently nimPrint, prints an arbitrary set of outputs in order and adds a newline character at the end. cat or nimCat is identical, except without a newline at the end.

11.2.7 Checking for user interrupts: checkInterrupt

When you write algorithms that will run for a long time in C++, you may want to explicitly check whether a user has tried to interrupt the execution (i.e., by pressing Control-C). Simply include checkInterrupt in run code at places where a check should be done. If there has been an interrupt waiting to be handled, the process will stop and return control to R.

11.2.8 Optimization: optim and nimOptim

NIMBLE provides a nimOptim function that partially implement’s R’s optim function with some minor differences. nimOptim can use methods from R’s optim, including ‘Nelder-Mead’, ‘BFGS’, ‘CG’, ‘L-BFGS-B’, but it does not support methods ‘SANN’ and ‘Brent’. As of version 1.2.0, it also allows users to provide their own optimization methods as R functions (although this feature is subject to change in future versions). Support for R’s nlminb is provided in this way. NIMBLE’s nimOptim allows gradients to be supplied using user-provided functions if available or finite differences otherwise. Currently nimOptim does not support extra parameters to the function being optimized (via \dots), but a work-around is to create a new nimbleFunction that calls another one with the additional parameters. Finally, nimOptim requires a nimbleList datatype for the control parameter, whereas R’s optim uses a simple R list. To define the control parameter, create a default value with the nimOptimDefaultControl function, and set any desired fields.

See help(nimOptim) for details, including example usage.

11.2.9 Integration: integrate and nimIntegrate

NIMBLE provides a nimIntegrate function that implement’s R’s integrate function, which carries out adaptive Gauss-Kronrod quadrature to integrate a function of one variable over a finite or infinite interval.

In addition to general use in a nimbleFunction, this can, of course, be used in a user-defined function in model code, e.g., to implement models that involve an integral, such as certain point process and survival models.

One could also consider using nimIntegrate to numerically integrate over a parameter in a model, such as to remove a parameter that is not mixing well in an MCMC. Note that NIMBLE Laplace approximation and adaptive Gauss-Hermite quadrature for integrating over continuous parameters such as random effects. See section 16.2 and help(Laplace) for more information on those.

Here is an example of using nimIntegrate to implement a vectorized von Mises distribution.

integrand <- nimbleFunction(
    run = function(x = double(1), theta = double(1)) {
        return( exp(theta[1] * cos(x) + theta[2] * cos(2 * (x + theta[3]))) )
        returnType(double(1))
})

dGenVonMises <- nimbleFunction(
  run = function(x = double(1), mu1 = double(), mu2 = double(),
                 kappa1 = double(), kappa2 = double(),
  limits = double(1), log = integer(0, default = 0)){
    range <- limits[2] - limits[1]
    mu1R <- (mu1 - range/2)/range*2*pi
    mu2R <- (mu2 - range/2)/range*2*pi
    z <- (x - range/2)/range*2*pi
    d <- (mu1R - mu2R) %% pi
    num <- exp(kappa1 * cos(z - mu1R) + kappa2 * cos(2 * (z - mu2R)))
    tmp <- c(kappa1, kappa2, d)
    den <- nimIntegrate(integrand, lower = 0, upper = 2*pi, tmp)[1]
    dens <- num/den
    result <- dens*2*pi/range
    if(log) {
        return(log(result))
    } else return(result)
    returnType(double(1))
  }
)

cgvonmises <- compileNimble(dGenVonMises)
x <- 0:360
# plot(x, circular::dgenvonmises(circular(x, type = "angles", units = "degrees"), -1, 1, 2, 1))
# plot(x, cgvonmises(x, mu1=20, mu2=300, kappa1=2, kappa2=1, limits = c(0, 360)), type = 'l')

11.2.10 ‘nim’ synonyms for some functions

NIMBLE uses some keywords, such as dim and print, in ways similar but not identical to R. In addition, there are some keywords in NIMBLE that have the same names as R functions with quite different functionality. For example, step is part of the BUGS language, but it is also an R function for stepwise model selection. And equals is part of the BUGS language but is also used in the testthat package, which we use in testing NIMBLE.

NIMBLE tries to avoid conflicts by replacing some keywords immediately upon creating a nimbleFunction. These replacements include

  • c \(\rightarrow\) nimC
  • copy \(\rightarrow\) nimCopy
  • dim \(\rightarrow\) nimDim
  • print \(\rightarrow\) nimPrint
  • cat \(\rightarrow\) nimCat
  • step \(\rightarrow\) nimStep
  • equals \(\rightarrow\) nimEquals
  • rep \(\rightarrow\) nimRep
  • round \(\rightarrow\) nimRound
  • seq \(\rightarrow\) nimSeq
  • stop \(\rightarrow\) nimStop
  • switch \(\rightarrow\) nimSwitch
  • numeric, integer, logical \(\rightarrow\) nimNumeric, nimInteger, nimLogical
  • matrix, array \(\rightarrow\) nimMatrix, nimArray
  • optim \(\rightarrow\) nimOptim
  • integrate \(\rightarrow\) nimIntegrate

This system gives programmers the choice between using the keywords like nimPrint directly, to avoid confusion in their own code about which ‘print’ is being used, or to use the more intuitive keywords like print but remember that they are not the same as R’s functions.

11.3 How NIMBLE handles types of variables

Variables in the NIMBLE language are statically typed. Once a variable is used for one type, it can’t subsequently be used for a different type. This rule facilitates NIMBLE’s compilation to C++. The NIMBLE compiler often determines types automatically, but sometimes the programmer needs to explicitly provide them.

The elemental types supported by NIMBLE include double (floating-point), integer, logical, and character. The type of a numeric or logical object refers to the number of dimensions and the elemental type of the elements. Hence if x is created as a double matrix, it can only be used subsequently for a double matrix. The size of each dimension is not part of its type and thus can be changed. Up to four dimensions are supported for double, integer, and logical. Only vectors (one dimension) are supported for character. Unlike R, NIMBLE supports true scalars, which have 0 dimensions.

11.3.1 nimbleList data structures

A nimbleList is a data structure that can contain arbitrary other NIMBLE objects, including other nimbleLists. Like other NIMBLE types, nimbleLists are strongly typed: each nimbleList is created from a configuration that declares what types of objects it will hold. nimbleLists are covered in Chapter 14.2.

11.3.2 How numeric types work

R’s dynamic types support easy programming because one type can sometimes be transformed to another type automatically when an expression is evaluated. NIMBLE’s static types makes it stricter than R.

11.3.2.1 When NIMBLE can automatically set a numeric type

When a variable is first created by assignment, its type is determined automatically by that assignment. For example, if x has not appeared before, then

x <- A %*% B # assume A and B are double matrices or vectors

will create x to be a double matrix of the correct size (determined during execution).

11.3.2.1.1 Avoid changing types of a variable within a nimbleFunction

Because NIMBLE is statically typed, you cannot use the same variable name for two objects of different types (including objects of different dimensions).

Suppose we have (implicitly) created x as a double matrix. If x is used subsequently, it can only be used as a double matrix. This is true even if it is assigned a new value, which will again set its size automatically but cannot change its type.

x <- A %*% B # assume A and B are double matrices or vectors
x <- nimMatrix(0, nrow = 5, ncol = 2)  # OK: 'x' is still a double matrix
x <- rnorm(10)  # NOT OK: 'x' is a double vector

11.3.2.2 When a numeric object needs to be created before being used

If the contents of a variable are to be populated by assignment into some indices in steps, the variable must be created first. Further, it must be large enough for its eventual contents; it will not be automatically resized if assignments are made beyond its current size. For example, in the following code, x must be created before being filled with contents for specific indices.

x <- numeric(10)

for(i in 1:10) 
   x[i] <- foo(y[i]) 

11.3.2.3 How NIMBLE handles NA

NIMBLE supports use of NA with some caveats. In the compiled version of a nimbleFunction:

  • NA values in a logical scalar or vector will not work;
  • Assigning NA to a new variable will make that variable have type double (in R, this would be a logical!);
  • Non-trivial use of NA in integer variables may fail by having the value become a large (negative) number;
  • Use of NA in doubles should generally work.

These issues arise because NIMBLE uses R’s encoding of NA values in C(++) but uses native compiled math and type-casting in C++, which do not always preserve R’s NA encodings. In summary, NA should generally work for doubles and should be used cautiously (i.e. after you test that what you need to work actually works) for integers. For some needs, NaN is a suitable alternative to NA.

11.3.2.4 Changing the sizes of existing objects: setSize

setSize changes the size of an object, preserving its contents in column-major order.

# Example of creating and resizing a floating-point vector
# myNumericVector will be of length 10, with all elements initialized to 2 
myNumericVector <- numeric(10, value = 2) 
# resize this numeric vector to be length 20; last 10 elements will be 0
setSize(myNumericVector, 20)
# Example of creating a 1-by-10 matrix with values 1:10 and resizing it
myMatrix <- matrix(1:10, nrow = 1, ncol = 10) 
# resize this matrix to be a 10-by-10 matrix
setSize(myMatrix, c(10, 10))
# The first column will have the 1:10

11.3.2.5 Confusions between scalars and length-one vectors

In R, there is no such thing is a true scalar; scalars can always be treated as vectors of length one. NIMBLE allows true scalars, which can create confusions. For example, consider the following code:

myfun <- nimbleFunction(
    run = function(i = integer()) { # i is an integer scalar
        randomValues <- rnorm(10)   # double vector
        a <- randomValues[i]        # double scalar
        b <- randomValues[i:i]      # double vector
        d <- a + b                  # double vector
        f <- c(i)                   # integer vector
    })

In the line that creates b, the index range i:i is not evaluated until run time. Even though i:i will always evaluate to simpy i, the compiler does not determine that. Since there is a vector index range provided, the result of randomValues[i:i] is determined to be a vector. The following line then creates d as a vector, because a vector plus a scalar returns a vector. Another way to create a vector from a scalar is to use c, as illustrated in the last line.

11.3.2.6 Confusions between vectors and one-column or one-row matrices

Consider the following code:

myfun <- nimbleFunction(
    run = function() { 
        A <- matrix(value = rnorm(9), nrow = 3)
        B <- rnorm(3)
        Cmatrix <- A %*% B                # double matrix, one column
        Cvector <- (A %*% B)[,1]          # double vector
        Cmatrix <- (A %*% B)[,1]          # error, vector assigned to matrix
        Cmatrix[,1] <- (A %*% B)[,1]      # ok, if Cmatrix is large enough
    })

This creates a matrix A, a vector B, and matrix-multiplies them. The vector B is automatically treated as a one-column matrix in matrix algebra computations. The result of matrix multiplication is always a matrix, but a programmer may expect a vector, since they know the result will have one column. To make it a vector, simply extract the first column. More information about such handling is provided in the next section.

11.3.2.7 Understanding dimensions and sizes from linear algebra

As much as possible, NIMBLE behaves like R when determining types and sizes returned from linear algebra expressions, but in some cases this is not possible because R uses run-time information while NIMBLE must determine dimensions at compile time. For example, when matrix multiplying a matrix by a vector, R treats the vector as a one-column matrix unless treating it as a one-row matrix is the only way to make the expression valid, as determined at run time. NIMBLE usually must assume during compilation that it should be a one-column matrix, unless it can determine not just the number of dimensions but the actual sizes during compilation. When needed asRow and asCol can control how a vector will be treated as a matrix.

Here is a guide to such issues. Suppose v1 and v2 are vectors, and M1 is a matrix. Then

  • v1 + M1 promotes v1 to a 1-column matrix if M1 is a one-column matrix. Otherwise, this results in a run-time error. This behavior occurs for all component-wise binary functions.
  • v1 %*% M1 defaults to promoting v1 to a 1-row matrix, unless it is known at compile-time that M1 has 1 row, in which case v1 is promoted to a 1-column matrix.
  • M1 %*% v1 defaults to promoting v1 to a 1-column matrix, unless it is known at compile time that M1 has 1 column, in which case v1 is promoted to a 1-row matrix.
  • v1 %*% v2 promotes v1 to a 1-row matrix and v2 to a 1-column matrix, so the returned values is a 1x1 matrix with the inner product of v1 and v2. If you want the inner product as a scalar, use inprod(v1, v2).
  • asRow(v1) explicitly promotes v1 to a 1-row matrix. Therefore v1 %*% asRow(v2) gives the outer product of v1 and v2.
  • asCol(v1) explicitly promotes v1 to a 1-column matrix.
  • The default promotion for a vector is to a 1-column matrix. Therefore, v1 %*% t(v2) is equivalent to v1 %*% asRow(v2) .
  • When indexing, dimensions with scalar indices will be dropped. For example, M1[1,] and M1[,1] are both vectors. If you do not want this behavior, use drop=FALSE just as in R. For example, M1[1,,drop=FALSE] is a matrix.
  • The left-hand side of an assignment can use indexing, but if so it must already be correctly sized for the result. For example, Y[5:10, 20:30] <- x will not work – and could crash your R session with a segmentation fault – if Y is not already at least 10x30 in size. This can be done by setSize(Y, c(10, 30)). See Section 11.3.2.4 for more details. Note that non-indexed assignment to Y, such as Y <- x, will automatically set Y to the necessary size.

Here are some examples to illustrate the above points, assuming M2 is a square matrix.

  • Y <- v1 + M2 %*% v2 will return a 1-column matrix. If Y is created by this statement, it will be a 2-dimensional variable. If Y already exists, it must already be 2-dimesional, and it will be automatically re-sized for the result.
  • Y <- v1 + (M2 %*% v2)[,1] will return a vector. Y will either be created as a vector or must already exist as a vector and will be re-sized for the result.

11.3.2.8 Size warnings and the potential for crashes

For matrix algebra, NIMBLE cannot ensure perfect behavior because sizes are not known until run time. Therefore, it is possible for you to write code that will crash your R session. In Version 1.2.1, NIMBLE attempts to issue a warning if sizes are not compatible, but it does not halt execution. Therefore, if you execute A <- M1 %*% M2, and M1 and M2 are not compatible for matrix multiplication, NIMBLE will output a warning that the number of rows of M1 does not match the number of columns of M2. After that warning the statement will be executed and may result in a crash. Another easy way to write code that will crash is to do things like Y[5:10, 20:30] <- x without ensuring Y is at least 10x30. In the future we hope to prevent crashes, but in Version 1.2.1 we limit ourselves to trying to provide useful information.

11.4 Declaring argument and return types

NIMBLE requires that types of arguments and the type of the return value be explicitly declared.

As illustrated in the example in Section 11.1, the syntax for a type declaration is:

type(nDim, sizes)

where type is double, integer, logical or character. (In more general nimbleFunction programming, a type can also be a nimbleList type, discussed in Section 14.2.)

For example run = function(x = double(1)) { ...} sets the single argument of the run function to be a vector of numeric values of unknown size.

For type(nDim, sizes), nDim is the number of dimensions, with 0 indicating scalar and omission of nDim defaulting to a scalar. sizes is an optional vector of fixed, known sizes.
For example, double(2, c(4, 5)) declares a \(4 \times 5\) matrix. If sizes are omitted, they will be set either by assignment or by setSize.

In the case of scalar arguments only, a default value can be provided. For example, to provide 1.2 as a default:

myfun <- nimbleFunction(
    run = function(x = double(0, default = 1.2)) {
})

Functions with return values must have their return type explicitly declared using returnType, which can occur anywhere in the run code. For example returnType(integer(2)) declares the return type to be a matrix of integers. A return type of void() means there is no return value, which is the default if no returnType statement is included.

Note that because all values in models are stored as doubles and because of some limitations in NIMBLE’s automatic casting, non-scalar return values of user-defined distributions must be doubles.

11.5 Compiled nimbleFunctions pass arguments by reference

Uncompiled nimbleFunctions pass arguments like R does, by copy. If x is passed as an argument to function foo, and foo modifies x internally, it is modifying its copy of x, not the original x that was passed to it.

Compiled nimbleFunctions pass arguments to other compiled nimbleFunctions by reference (or pointer). This is very different. Now if foo modifies x internally, it is modifying the same x that was passed to it. This allows much faster execution but is obviously a fundamentally different behavior.

Uncompiled execution of nimbleFunctions is primarily intended for debugging. However, debugging of how nimbleFunctions interact via arguments requires testing the compiled versions.

11.6 Calling external compiled code

If you have a function in your own compiled C or C++ code and an appropriate header file, you can generate a nimbleFunction that wraps access to that function, which can then be used in other nimbleFunctions. See help(nimbleExternalCall) for an example. This also contains an example of using an externally compiled function in the BUGS code of a model.

11.7 Calling uncompiled R functions from compiled nimbleFunctions

Sometimes one may want to combine R functions with compiled nimbleFunctions. Obviously a compiled nimbleFunction can be called from R. An R function with numeric inputs and output can be called from compiled nimbleFunctions. The call to the R function is wrapped in a nimbleFunction returned by nimbleRcall. See help(nimbleRcall) for an example, including an example of using the resulting function in the BUGS code of a model.


  1. Of course, in general, explicitly calculating the inverse is not the recommended numerical recipe for least squares.↩︎

  2. On the machine this is being written on, the compiled version runs a few times faster than the uncompiled version. However we refrain from formal speed tests.↩︎

  3. For the multivariate t, these are more properly termed the ‘inverse scale’ and ‘scale’ matrices↩︎