Chapter 11 Writing simple nimbleFunctions
11.1 Introduction to simple nimbleFunctions
nimbleFunctions are the heart of programming in NIMBLE. In this chapter, we introduce simple nimbleFunctions that contain only one function to be executed, in either compiled or uncompiled form, but no setup function or additional methods.
Defining a simple nimbleFunction is like defining an R function:
nimbleFunction
returns a function that can be executed, and it
can also be compiled. Simple nimbleFunctions are useful for doing math or the other
kinds of processing available in NIMBLE when no model or modelValues
is needed. These can be used for any purpose in R programming. They
can also be used as new functions and distributions in NIMBLE’s
extension of BUGS (Chapter 12).
Here’s a basic example implementing the textbook calculation of least squares estimation of linear regression parameters21:
solveLeastSquares <- nimbleFunction(
run = function(X = double(2), y = double(1)) { # type declarations
ans <- inverse(t(X) %*% X) %*% (t(X) %*% y)
return(ans)
returnType(double(2)) # return type declaration
} )
X <- matrix(rnorm(400), nrow = 100)
y <- rnorm(100)
solveLeastSquares(X, y)
## [,1]
## [1,] 0.088547575
## [2,] 0.002025635
## [3,] -0.090387283
## [4,] -0.072660391
## [,1]
## [1,] 0.088547575
## [2,] 0.002025635
## [3,] -0.090387283
## [4,] -0.072660391
In this example, we fit a linear model for 100 random response values (y
) to four columns of
randomly generated explanatory variables (X
). We ran the
nimbleFunction
solveLeastSquares
uncompiled,
natively in R, allowing testing and debugging (Section 15.7). Then we compiled it and showed that the compiled version does the same thing,
but faster22. NIMBLE’s compiler creates C++ that uses the Eigen (http://eigen.tuxfamily.org)
library for linear algebra.
Notice that the actual NIMBLE code is written as an R function
definition that is passed to nimbleFunction
as the run
argument. Hence we call it the run code. run code is
written in the NIMBLE language. This is similar to a narrow subset of
R with some additional features. Formally, we view it as a distinct language
that encompasses what can be compiled from a nimbleFunction.
To write nimbleFunctions, you will need to learn:
- what R functions are supported for NIMBLE compilation and any ways they differ from their regular R counterparts;
- how NIMBLE handles types of variables;
- how to declare types of
nimbleFunction
arguments and return values; - that compiled nimbleFunctions always pass arguments to each other by reference.
The next sections cover each of these topics in turn.
11.2 R functions (or variants) implemented in NIMBLE
11.2.1 Finding help for NIMBLE’s versions of R functions
Often, R help pages are available for NIMBLE’s versions of R functions
using the prefix ‘nim’ and capitalizing the next letter. For
example, help on NIMBLE’s version of numeric
can be found by help(nimNumeric)
. In some cases help is found directly using the name of the function as it appears in R.
11.2.2 Basic operations
Basic R operations supported for NIMBLE compilation are listed in Table 11.1.
Function | Comments (differences from R) |
---|---|
c() |
No recursive argument. |
rep() |
No rep.int or rep_len arguments. |
seq() and ‘:’ |
Negative integer sequences from ‘:’, e.g. , 2:1 do not work. |
which() |
No arr.ind or useNames arguments. |
diag() |
Works like R in three ways: diag(vector) returns a matrix with vector on the diagonal; |
diag(matrix) returns the diagonal vector of matrix ; |
|
diag(n) returns an \(n \times n\) identity matrix. No nrow or ncol arguments. |
|
diag()<- |
Works for assigning the diagonal vector of a matrix. |
dim() |
Works on a vector as well as higher-dimensional arguments. |
length() |
|
is.na() |
Does not correctly handle NAs (or NaNs) from R that are type 'logical' or 'integer' , |
so convert these using as.numeric() before passing from R to NIMBLE. |
|
is.nan() |
|
any() |
One argument only; NAs treated as FALSE. |
all() |
One argument only; NAs treated as FALSE. |
numeric() |
Allows additional arguments to control initialization. |
logical() |
Allows additional arguments to control initialization. |
integer() |
Allows additional arguments to control initialization. |
matrix() |
Allows additional arguments to control initialization. |
array() |
Allows additional arguments to control initialization. |
indexing | Arbitrary integer and logical indexing is supported for objects of one or two dimensions. |
For higher-dimensional objects, only : indexing works and then only to create an object |
|
of at most two dimensions. |
In addition to the above functions, we provide functions any_na()
and any_nan()
for finding if there are any NA
or NaN
values in a vector. These are equivalent to using any(is.na())
and any(is.nan())
in R, with the exception noted in the table above regarding is.na()
.
Other R functions with numeric arguments and return value can be called during compiled execution by wrapping them as a nimbleRcall
(see Section 11.7).
Next we cover some of these functions in more detail.
11.2.2.1 numeric, integer, logical, matrix and array
numeric
, integer
, or logical
will create a 1-dimensional vector of
floating-point (or ‘double’ [precision]), integer, or logical values, respectively. The length
argument specifies the vector
length (default 0), and the value
argument specifies the initial
value either as a scalar (used for all vector elements, with default
0) or a vector. If a vector and its length is not equal to length
, the remaining values will be filled by R-style recycling if the recycle
argument is TRUE
(which is the default). The init
argument specifies
whether or not to initialize the elements in compiled code (default TRUE
). If
first use of the variable does not rely on initial values, using
init = FALSE
will yield slightly more efficient performance.
matrix
creates a 2-dimensional matrix object of either floating-point (if
type = "double"
, the default), integer (if type = "integer"
), or logical (if type = "logical"
) values.
As in R, nrow
and ncol
arguments specify the number of rows and columns, respectively.
The value
and init
arguments are used in the same way as for
numeric
, integer
, and logical
.
array
creates a vector or higher-dimensional object, depending
on the dim
argument, which takes a vector of sizes for each
dimension. The type
, value
and init
argument behave
the same as for matrix
.
The best way to create an identity matrix is with diag(n)
, which
returns an \(n \times n\) identity matrix. NIMBLE also provides a deprecated
nimbleFunction identityMatrix
that does the same thing.
Examples of these functions, and the related function \(`setSize`\) for changing the size of a numeric object, are given in Section 11.3.2.4.
11.2.2.2 length and dim
length
behaves like R’s length
function. It
returns the entire length of X. That means if X
is
multivariate, length(X)
returns the product of the sizes of
the dimensions. NIMBLE’s version of dim
, which has synonym nimDim
, behaves like R’s
dim
function for matrices or arrays and like R’s
length
function for vectors. In other words, regardless of
whether the number of dimensions is 1 or more, it returns a vector
of the sizes.
11.2.2.3 Deprecated creation of non-scalar objects using declare
Previous versions of NIMBLE provided a function declare
for
declaring variables. The more R-like functions numeric
,
integer
, logical
, matrix
and array
are intended to replace
declare
, but declare
is still supported for backward
compatibility. In a future version of NIMBLE, declare
may be removed.
11.2.3 Math and linear algebra
Numeric scalar and matrix mathematical operations are listed in Tables 11.2-11.3.
As in R, many scalar operations in NIMBLE will work component-wise on vectors or
higher dimensional objects. For example if B and C are vectors, A = B + C
will add them and create vector C by component-wise
addition of B and C. In the current version of NIMBLE, component-wise operations
generally only work for vectors and matrices, not arrays with more
than two dimensions. The only exception is assignment: A = B
will
work up to NIMBLE’s current limit of four dimensions.
Usage | Description | Comments | Status | Vector input |
---|---|---|---|---|
x | y , x & y |
logical OR (\(|\)) and AND(&) | yes | yes | |
!x |
logical not | yes | yes | |
x > y, x >= y |
greater than (and or equal to) | yes | yes | |
x < y, x <= y |
less than (and or equal to) | yes | yes | |
x != y, x == y |
(not) equals | yes | yes | |
x + y, x - y, x * y |
component-wise operators | mix of scalar and vector | yes | yes |
x / y |
component-wise division | vector \(x\) and scalar \(y\) | yes | yes |
x^y, pow(x, y) |
power | \(x^y\); vector \(x\),scalar \(y\) | yes | yes |
x %% y |
modulo (remainder) | yes | no | |
min(x1, x2), |
min. (max.) of two scalars | yes | See pmin , |
|
max(x1, x2) |
pmax |
|||
exp(x) |
exponential | yes | yes | |
log(x) |
natural logarithm | yes | yes | |
sqrt(x) |
square root | yes | yes | |
abs(x) |
absolute value | yes | yes | |
step(x) |
step function at 0 | 0 if \(x<0\), 1 if \(x>=0\) | yes | yes |
equals(x, y) |
equality of two scalars | 1 if \(x==y\), 0 if \(x != y\) | yes | yes |
cube(x) |
third power | \(x^3\) | yes | yes |
sin(x), cos(x), |
trigonometric functions | yes | yes | |
tan(x) |
||||
asin(x), acos(x), |
inverse trigonometric functions | yes | yes | |
atan(x) |
||||
asinh(x), acosh(x), |
inv. hyperbolic trig. functions | yes | yes | |
atanh(x) |
||||
logit(x) |
logit | \(\log(x/(1-x))\) | yes | yes |
ilogit(x), expit(x) |
inverse logit | \(\exp(x)/(1 + \exp(x))\) | yes | yes |
probit(x) |
probit (Gaussian quantile) | \(\Phi^{-1}(x)\) | yes | yes |
iprobit(x), phi(x) |
inverse probit (Gaussian CDF) | \(\Phi(x)\) | yes | yes |
cloglog(x) |
complementary log log | \(\log(-\log(1-x))\) | yes | yes |
icloglog(x) |
inverse complementary log log | \(1 - \exp(-\exp(x))\) | yes | yes |
ceiling(x) |
ceiling function | \(\lceil(x)\rceil\) | yes | yes |
floor(x) |
floor function | \(\lfloor(x)\rfloor\) | yes | yes |
round(x) |
round to integer | yes | yes | |
trunc(x) |
truncation to integer | yes | yes | |
lgamma(x), loggam(x) |
log gamma function | \(\log \Gamma(x)\) | yes | yes |
besselK(x, nu, |
modified bessel function | returns vector even if | yes | yes |
...expon.scaled) |
of the second kind | \(x\) a matrix/array | ||
log1p(x) |
log of 1 + x | \(\log(1+x)\) | yes | yes |
lfactorial(x), |
log factorial | \(\log x!\) | yes | yes |
logfact(x) |
||||
qDIST(x, PARAMS) |
“q” distribution functions | canonical parameterization | yes | yes |
pDIST(x, PARAMS) |
“p” distribution functions | canonical parameterization | yes | yes |
rDIST(x, PARAMS) |
“r” distribution functions | canonical parameterization | yes | yes |
dDIST(x, PARAMS) |
“d” distribution functions | canonical parameterization | yes | yes |
sort(x) |
no | |||
rank(x, s) |
no | |||
ranked(x, s) |
no | |||
order(x) |
no |
Usage | Description | Comments | Status |
---|---|---|---|
inverse(x) |
matrix inverse | \(x\) symmetric, positive def. | yes |
chol(x) |
matrix Cholesky factorization | \(x\) symmetric, positive def. | yes |
returns upper triang. matrix | |||
t(x) |
matrix transpose | \(x^\top\) | yes |
x%*%y |
matrix multiply | \(xy\); \(x\), \(y\) conformant | yes |
inprod(x, y) |
dot product | \(x^\top y\); \(x\) and \(y\) vectors | yes |
solve(x,y) |
solve system of equations | \(x^{-1} y\); \(y\) matrix or vector | yes |
forwardsolve(x, y) |
solve lower-triangular system of equations | \(x^{-1} y\); \(x\) lower-triangular | yes |
backsolve(x, y) |
solve upper-triangular system of equations | \(x^{-1} y\); \(x\) upper-triangular | yes |
logdet(x) |
log matrix determinant | \(\log|x|\) | yes |
asRow(x) |
convert vector to 1-row matrix | sometimes automatic | yes |
asCol(x) |
convert vector to 1-column matrix | sometimes automatic | yes |
sum(x) |
sum of elements of x |
yes | |
mean(x) |
mean of elements of x |
yes | |
sd(x) |
standard deviation of elements of x |
yes | |
prod(x) |
product of elements of x |
yes | |
min(x), max(x) |
min. (max.) of elements of x |
yes | |
pmin(x, y), pmax(x,y) |
vector of mins (maxs) of elements of | yes | |
x and y |
|||
interp.lin(x, v1, v2) |
linear interpolation | no | |
eigen(x) |
matrix eigendecomposition; returns a | \(x\) symmetric | yes |
nimbleList of type eigenNimbleList |
|||
svd(x) |
matrix singular value decomposition; | yes | |
returns a nimbleList of type | |||
svdNimbleList |
More information on the nimbleLists returned by the eigen
and svd
functions in NIMBLE can be found in Section 14.2.2.
11.2.4 Distribution functions
Distribution ‘d’, ‘r’, ‘p’, and ‘q’ functions can all be used from nimbleFunctions (and in BUGS model code), but care is needed in the syntax, as follows.
- Names of the distributions generally (but not always) match those of R, which sometimes differ from BUGS. See the list below.
- Supported parameterizations are also indicated in the list below.
- For multivariate distributions (multivariate normal, Dirichlet, and Wishart), ‘r’ functions only return one random draw at a time, and the first argument must always be 1.
- R’s recycling rule (re-use of an argument as needed to accommodate longer values of other arguments) is generally followed, but the returned object is always a scalar or a vector, not a matrix or array.
As in R (and nimbleFunctions), arguments are matched by order or by name
(if given). Standard arguments to distribution functions in R
(log
, log.p
, lower.tail
) can be used and have the same
defaults. User-defined distributions for BUGS (Chapter
12) can also be used from
nimbleFunctions.
For standard distributions, we rely on R’s regular help pages
(e.g., help(dgamma)
. For distributions unique to NIMBLE
(e.g., dexp_nimble
, ddirch
), we provide help pages.
Supported distributions, listed by their ‘d’ function, include:
dbinom(x, size, prob, log)
dcat(x, prob, log)
dmulti(x, size, prob, log)
dnbinom(x, size, prob, log)
dpois(x, lambda, log)
dbeta(x, shape1, shape2, log)
dchisq(x, df, log)
ddexp(x, location, rate, log)
ddexp(x, location, scale, log)
dexp(x, rate, log)
dexp_nimble(x, rate, log)
dexp_nimble(x, scale, log)
dgamma(x, shape, rate, log)
dgamma(x, shape, scale, log)
dinvgamma(x, shape, rate, log)
dinvgamma(x, shape, scale, log)
dlnorm(x, meanlog, sdlog, log)
dlogis(x, location, scale, log)
dnorm(x, mean, sd, log)
dt_nonstandard(x, df, mu, sigma, log)
dt(x, df, log)
dunif(x, min, max, log)
dweibull(x, shape, scale, log)
ddirch(x, alpha, log)
dlkj_corr_cholesky(x, shape, size, log)
dmnorm_chol(x, mean, cholesky, prec_param, log)
dmvt_chol(x, mu, cholesky, df, prec_param, log)
dwish_chol(x, cholesky, df, scale_param, log)
dinvwish_chol(x, cholesky, df, scale_param, log)
In the last three, cholesky
stands for Cholesky decomposition of the relevant matrix;
prec_param
is a logical indicating whether the Cholesky is of a
precision matrix (TRUE) or covariance matrix (FALSE)23; and scale_param
is a logical indicating whether the Cholesky is of a scale matrix (TRUE) or an inverse scale matrix (FALSE).
11.2.5 Flow control: if-then-else, for, while, and stop
These basic flow-control structures use the same syntax as in R.
However, for
-loops are limited to sequential integer
indexing. For example, for(i in 2:5) {...}
works as it does
in R. Decreasing index sequences are not allowed. Unlike in R,
if
is not itself a function that returns a value.
We plan to include more flexible for
-loops in the future, but for
now we’ve included just one additional useful feature: for(i in seq_along(NFL))
will work as in R, where NFL
is a
nimbleFunctionList
. This is described in Section 15.4.8.
stop
, or equivalently nimStop
, throws
control to R’s error-handling system and can take a character
argument that will be displayed in an error message.
11.2.6 print and cat
print
, or equivalently
nimPrint
, prints an arbitrary set of outputs in order and adds a
newline character at the end. cat
or nimCat
is identical,
except without a newline at the end.
11.2.7 Checking for user interrupts: checkInterrupt
When you write algorithms that will run for a long time in C++, you
may want to explicitly check whether a user has tried to interrupt the
execution (i.e., by pressing Control-C). Simply include
checkInterrupt
in run code at places where a check should
be done. If there has been an
interrupt waiting to be handled, the process will stop and return control to R.
11.2.8 Optimization: optim and nimOptim
NIMBLE provides a nimOptim
function that partially implement’s R’s optim
function with some minor differences. nimOptim
can use methods from R’s
optim
, including ‘Nelder-Mead’, ‘BFGS’, ‘CG’, ‘L-BFGS-B’, but it does not
support methods ‘SANN’ and ‘Brent’. As of version 1.2.0, it also allows users to
provide their own optimization methods as R functions (although this feature is
subject to change in future versions). Support for R’s nlminb
is provided in
this way. NIMBLE’s nimOptim
allows gradients to be supplied using
user-provided functions if available or finite differences otherwise. Currently
nimOptim
does not support extra parameters to the function being optimized
(via \dots
), but a work-around is to create a new nimbleFunction
that calls
another one with the additional parameters. Finally, nimOptim
requires a
nimbleList
datatype for the control
parameter, whereas R’s optim
uses a
simple R list
. To define the control
parameter, create a default value with
the nimOptimDefaultControl
function, and set any desired fields.
See help(nimOptim)
for details, including example usage.
11.2.9 Integration: integrate and nimIntegrate
NIMBLE provides a nimIntegrate
function that implement’s R’s
integrate
function, which carries out adaptive Gauss-Kronrod
quadrature to integrate a function of one variable over a finite
or infinite interval.
In addition to general use in a nimbleFunction, this can, of course, be used in a user-defined function in model code, e.g., to implement models that involve an integral, such as certain point process and survival models.
One could also consider using nimIntegrate
to numerically integrate over a
parameter in a model, such as to remove a parameter that is not mixing well in
an MCMC. Note that NIMBLE Laplace approximation and adaptive Gauss-Hermite
quadrature for integrating over continuous parameters such as random effects.
See section 16.2 and help(Laplace)
for more information on those.
Here is an example of using nimIntegrate
to implement a vectorized von Mises
distribution.
integrand <- nimbleFunction(
run = function(x = double(1), theta = double(1)) {
return( exp(theta[1] * cos(x) + theta[2] * cos(2 * (x + theta[3]))) )
returnType(double(1))
})
dGenVonMises <- nimbleFunction(
run = function(x = double(1), mu1 = double(), mu2 = double(),
kappa1 = double(), kappa2 = double(),
limits = double(1), log = integer(0, default = 0)){
range <- limits[2] - limits[1]
mu1R <- (mu1 - range/2)/range*2*pi
mu2R <- (mu2 - range/2)/range*2*pi
z <- (x - range/2)/range*2*pi
d <- (mu1R - mu2R) %% pi
num <- exp(kappa1 * cos(z - mu1R) + kappa2 * cos(2 * (z - mu2R)))
tmp <- c(kappa1, kappa2, d)
den <- nimIntegrate(integrand, lower = 0, upper = 2*pi, tmp)[1]
dens <- num/den
result <- dens*2*pi/range
if(log) {
return(log(result))
} else return(result)
returnType(double(1))
}
)
cgvonmises <- compileNimble(dGenVonMises)
x <- 0:360
# plot(x, circular::dgenvonmises(circular(x, type = "angles", units = "degrees"), -1, 1, 2, 1))
# plot(x, cgvonmises(x, mu1=20, mu2=300, kappa1=2, kappa2=1, limits = c(0, 360)), type = 'l')
11.2.10 ‘nim’ synonyms for some functions
NIMBLE uses some keywords, such as dim
and print
, in ways
similar but not identical to R. In addition, there are some keywords
in NIMBLE that have the same names as R functions with quite different functionality.
For example, step
is part of the BUGS language, but it is also an
R function for stepwise model selection. And equals
is part of
the BUGS language but is also used in the testthat
package, which
we use in testing NIMBLE.
NIMBLE tries to avoid conflicts by replacing some keywords immediately upon creating a nimbleFunction. These replacements include
c
\(\rightarrow\)nimC
copy
\(\rightarrow\)nimCopy
dim
\(\rightarrow\)nimDim
print
\(\rightarrow\)nimPrint
cat
\(\rightarrow\)nimCat
step
\(\rightarrow\)nimStep
equals
\(\rightarrow\)nimEquals
rep
\(\rightarrow\)nimRep
round
\(\rightarrow\)nimRound
seq
\(\rightarrow\)nimSeq
stop
\(\rightarrow\)nimStop
switch
\(\rightarrow\)nimSwitch
numeric, integer, logical
\(\rightarrow\)nimNumeric, nimInteger, nimLogical
matrix, array
\(\rightarrow\)nimMatrix, nimArray
optim
\(\rightarrow\)nimOptim
integrate
\(\rightarrow\)nimIntegrate
This system gives programmers the choice between using the keywords like
nimPrint
directly, to avoid confusion in their own code about
which ‘print’ is being used, or to use the more intuitive keywords
like print
but remember that they are not the same as R’s functions.
11.3 How NIMBLE handles types of variables
Variables in the NIMBLE language are statically typed. Once a variable is used for one type, it can’t subsequently be used for a different type. This rule facilitates NIMBLE’s compilation to C++. The NIMBLE compiler often determines types automatically, but sometimes the programmer needs to explicitly provide them.
The elemental types supported by NIMBLE include double
(floating-point), integer, logical, and character. The type of
a numeric or logical object refers to the number
of dimensions and the elemental type of the elements. Hence if x
is created as a double matrix, it can only be used subsequently for a double
matrix. The size of each dimension is not part of its type and thus
can be changed. Up to four
dimensions are supported for double, integer, and logical. Only
vectors (one dimension) are supported for character. Unlike R, NIMBLE supports true
scalars, which have 0 dimensions.
11.3.1 nimbleList data structures
A nimbleList
is a data structure that can contain arbitrary other NIMBLE objects, including other nimbleList
s.
Like other NIMBLE types, nimbleLists are strongly typed: each nimbleList
is created from a configuration that
declares what types of objects it will hold. nimbleList
s are covered in Chapter 14.2.
11.3.2 How numeric types work
R’s dynamic types support easy programming because one type can sometimes be transformed to another type automatically when an expression is evaluated. NIMBLE’s static types makes it stricter than R.
11.3.2.1 When NIMBLE can automatically set a numeric type
When a variable is first created by assignment, its type is determined
automatically by that assignment. For example, if x
has not appeared before, then
will create x
to be a double matrix of the correct size
(determined during execution).
11.3.2.1.1 Avoid changing types of a variable within a nimbleFunction
Because NIMBLE is statically typed, you cannot use the same variable name for two objects of different types (including objects of different dimensions).
Suppose we have (implicitly) created x
as a double matrix.
If x
is used subsequently, it can
only be used as a double matrix. This is true even if it is assigned a
new value, which will again set its size automatically but cannot
change its type.
11.3.2.2 When a numeric object needs to be created before being used
If the contents of a variable are to be populated by assignment into
some indices in steps, the variable must be created first. Further,
it must be large enough for its eventual contents; it will not be automatically
resized if assignments are made beyond its current size. For example,
in the following code, x
must be created before being filled with
contents for specific indices.
11.3.2.3 How NIMBLE handles NA
NIMBLE supports use of NA
with some caveats. In the compiled version of a nimbleFunction:
NA
values in a logical scalar or vector will not work;- Assigning
NA
to a new variable will make that variable have type double (in R, this would be a logical!); - Non-trivial use of
NA
in integer variables may fail by having the value become a large (negative) number; - Use of
NA
in doubles should generally work.
These issues arise because NIMBLE uses R’s encoding of NA values in C(++) but uses native compiled math and type-casting in C++, which do not always preserve R’s NA encodings. In summary, NA should generally work for doubles and should be used cautiously (i.e. after you test that what you need to work actually works) for integers. For some needs, NaN is a suitable alternative to NA.
11.3.2.4 Changing the sizes of existing objects: setSize
setSize
changes the size of an object, preserving its contents in
column-major order.
11.3.2.5 Confusions between scalars and length-one vectors
In R, there is no such thing is a true scalar; scalars can always be treated as vectors of length one. NIMBLE allows true scalars, which can create confusions. For example, consider the following code:
myfun <- nimbleFunction(
run = function(i = integer()) { # i is an integer scalar
randomValues <- rnorm(10) # double vector
a <- randomValues[i] # double scalar
b <- randomValues[i:i] # double vector
d <- a + b # double vector
f <- c(i) # integer vector
})
In the line that creates b
, the index range i:i
is not
evaluated until run time. Even though i:i
will always evaluate
to simpy i
, the compiler does not determine that. Since there is
a vector index range provided, the result of randomValues[i:i]
is
determined to be a vector. The following line then creates d
as
a vector, because a vector plus a scalar returns a vector. Another
way to create a vector from a scalar is to use c
, as
illustrated in the last line.
11.3.2.6 Confusions between vectors and one-column or one-row matrices
Consider the following code:
myfun <- nimbleFunction(
run = function() {
A <- matrix(value = rnorm(9), nrow = 3)
B <- rnorm(3)
Cmatrix <- A %*% B # double matrix, one column
Cvector <- (A %*% B)[,1] # double vector
Cmatrix <- (A %*% B)[,1] # error, vector assigned to matrix
Cmatrix[,1] <- (A %*% B)[,1] # ok, if Cmatrix is large enough
})
This creates a matrix A
, a vector B
, and matrix-multiplies
them. The vector B
is automatically treated as a one-column
matrix in matrix algebra computations. The result of matrix multiplication is always a matrix, but
a programmer may expect a vector, since they know the result will have
one column. To make it a vector, simply extract the first column.
More information about such handling is provided in the next section.
11.3.2.7 Understanding dimensions and sizes from linear algebra
As much as possible, NIMBLE behaves like R when determining types and
sizes returned from linear algebra expressions, but in some cases this
is not possible because R uses run-time information while NIMBLE must
determine dimensions at compile time. For example, when matrix
multiplying a matrix by a vector, R treats the vector as a one-column
matrix unless treating it as a one-row matrix is the only way to make
the expression valid, as determined at run time. NIMBLE usually must assume during compilation
that it should be a one-column matrix, unless it can determine not
just the number of dimensions but the actual sizes during
compilation. When needed asRow
and asCol
can control how a
vector will be treated as a matrix.
Here is a guide to such issues. Suppose v1
and v2
are vectors, and M1
is a matrix. Then
v1 + M1
promotesv1
to a 1-column matrix ifM1
is a one-column matrix. Otherwise, this results in a run-time error. This behavior occurs for all component-wise binary functions.v1 %*% M1
defaults to promotingv1
to a 1-row matrix, unless it is known at compile-time thatM1
has 1 row, in which casev1
is promoted to a 1-column matrix.M1 %*% v1
defaults to promotingv1
to a 1-column matrix, unless it is known at compile time thatM1
has 1 column, in which casev1
is promoted to a 1-row matrix.v1 %*% v2
promotesv1
to a 1-row matrix andv2
to a 1-column matrix, so the returned values is a 1x1 matrix with the inner product ofv1
andv2
. If you want the inner product as a scalar, useinprod(v1, v2)
.asRow(v1)
explicitly promotesv1
to a 1-row matrix. Thereforev1 %*% asRow(v2)
gives the outer product ofv1
andv2
.asCol(v1)
explicitly promotesv1
to a 1-column matrix.
- The default promotion for a vector is to a 1-column matrix.
Therefore,
v1 %*% t(v2)
is equivalent tov1 %*% asRow(v2)
. - When indexing, dimensions with scalar indices will be dropped.
For example,
M1[1,]
andM1[,1]
are both vectors. If you do not want this behavior, usedrop=FALSE
just as in R. For example,M1[1,,drop=FALSE]
is a matrix. - The left-hand side of an assignment can use indexing, but if so
it must already be correctly sized for the result. For example,
Y[5:10, 20:30] <- x
will not work – and could crash your R session with a segmentation fault – if Y is not already at least 10x30 in size. This can be done bysetSize(Y, c(10, 30))
. See Section 11.3.2.4 for more details. Note that non-indexed assignment toY
, such asY <- x
, will automatically setY
to the necessary size.
Here are some examples to illustrate the above points, assuming M2
is
a square matrix.
Y <- v1 + M2 %*% v2
will return a 1-column matrix. If Y is created by this statement, it will be a 2-dimensional variable. If Y already exists, it must already be 2-dimesional, and it will be automatically re-sized for the result.Y <- v1 + (M2 %*% v2)[,1]
will return a vector. Y will either be created as a vector or must already exist as a vector and will be re-sized for the result.
11.3.2.8 Size warnings and the potential for crashes
For matrix algebra, NIMBLE cannot ensure perfect behavior because
sizes are not known until run time. Therefore, it is possible for you
to write code that will crash your R session. In Version 1.2.1, NIMBLE
attempts to issue a warning if sizes are not compatible, but it does not
halt execution. Therefore, if you execute A <- M1 %*% M2
, and M1
and M2
are not compatible for matrix multiplication,
NIMBLE will output a warning that the number of rows of M1
does not
match the number of columns of M2
. After that warning the statement
will be executed and may result in a crash. Another easy way to write
code that will crash is to do things like Y[5:10, 20:30] <- x
without ensuring Y is at least 10x30. In the future we hope to
prevent crashes, but in Version 1.2.1 we limit ourselves to trying to provide
useful information.
11.4 Declaring argument and return types
NIMBLE requires that types of arguments and the type of the return value be explicitly declared.
As illustrated in the example in Section 11.1, the syntax for a type declaration is:
type(nDim, sizes)
where type
is double
, integer
, logical
or character
.
(In more general nimbleFunction programming, a type can also be a
nimbleList type, discussed in Section 14.2.)
For example run = function(x = double(1)) { ...
} sets the single argument of the run function to be a vector of numeric values of unknown size.
For type(nDim, sizes)
, nDim
is the number of dimensions, with 0 indicating scalar and omission of nDim
defaulting to a scalar.
sizes
is an optional vector of fixed, known sizes.
For example, double(2, c(4, 5))
declares a \(4 \times 5\) matrix. If sizes
are omitted, they will be set either by assignment or by setSize
.
In the case of scalar arguments only, a default value can be provided. For example, to provide 1.2 as a default:
Functions with return values must have their return type explicitly declared using returnType
, which can occur anywhere in the run code. For example returnType(integer(2))
declares the return type to be a matrix of integers. A return type of void()
means there is
no return value, which is the default if no returnType
statement
is included.
Note that because all values in models are stored as doubles and because of some limitations in NIMBLE’s automatic casting, non-scalar return values of user-defined distributions must be doubles.
11.5 Compiled nimbleFunctions pass arguments by reference
Uncompiled nimbleFunctions pass arguments like R does, by copy. If
x
is passed as an argument to function foo
, and foo
modifies x
internally, it is modifying its copy of x
, not
the original x
that was passed to it.
Compiled nimbleFunctions pass arguments to other compiled
nimbleFunctions by reference (or pointer). This is very different.
Now if foo
modifies x
internally, it is modifying the same
x
that was passed to it. This allows much faster execution but
is obviously a fundamentally different behavior.
Uncompiled execution of nimbleFunctions is primarily intended for debugging. However, debugging of how nimbleFunctions interact via arguments requires testing the compiled versions.
11.6 Calling external compiled code
If you have a function in your own compiled C or C++ code and an appropriate header file, you can generate a nimbleFunction that wraps access to that function, which can then be used in other nimbleFunctions. See help(nimbleExternalCall)
for an example. This also contains an example of using an externally compiled function in the BUGS code of a model.
11.7 Calling uncompiled R functions from compiled nimbleFunctions
Sometimes one may want to combine R functions with compiled nimbleFunctions. Obviously a compiled nimbleFunction can be called from R. An R function with numeric inputs and output can be called from compiled nimbleFunctions. The call to the R function is wrapped in a nimbleFunction returned by nimbleRcall
. See help(nimbleRcall)
for an example, including an example of using the resulting function in the BUGS code of a model.
Of course, in general, explicitly calculating the inverse is not the recommended numerical recipe for least squares.↩︎
On the machine this is being written on, the compiled version runs a few times faster than the uncompiled version. However we refrain from formal speed tests.↩︎
For the multivariate t, these are more properly termed the ‘inverse scale’ and ‘scale’ matrices↩︎