Chapter 14 Data structures in NIMBLE
NIMBLE provides several data structures useful for programming.
We’ll first describe modelValues, which are containers designed for storing values for models. Then in Section 14.2 we’ll describe nimbleLists, which have a similar purpose to lists in R, allowing you to store heterogeneous information in a single object.
modelValues can be created either in R or in nimbleFunction setup code. nimbleLists can be created in R code, in nimbleFunction setup code, and in nimbleFunction run code, from a nimbleList definition created in R or setup code. Once created, modelValues and nimbleLists can be used either in R or in nimbleFunction setup or run code. If used in run code, they will be compiled along with the nimbleFunction.
14.1 The modelValues data structure
modelValues are containers designed for storing values for models. They may be used for model outputs or model inputs. A modelValues object will contain rows of variables. Each row contains one object of each variable, which may be multivariate. The simplest way to build a modelValues object is from a model object. This will create a modelValues object with the same variables as the model. Although they were motivated by models, one is free to set up a modelValues with any variables one wants.
As with the material in the rest of this chapter, modelValues objects will generally be used in nimbleFunctions that interact with models (see Chapter 15). modelValues objects can be defined either in setup code or separately in R (and then passed as an argument to setup code). The modelValues object can then be used in run code of nimbleFunctions.
14.1.1 Creating modelValues objects
Here is a simple example of creating a modelValues object:
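The code for this example does not appear here. The following is a minimal sketch consistent with the output that follows, assuming pumpModel is the classic BUGS pump example. The model code, constants, data, and initial values in this sketch are assumptions and are not taken from this chapter.

```r
library(nimble)

# Assumed definition of the pump model (classic BUGS example);
# this chapter does not show how pumpModel was built.
pumpCode <- nimbleCode({
  for (i in 1:N) {
    theta[i] ~ dgamma(alpha, beta)
    lambda[i] <- theta[i] * t[i]
    x[i] ~ dpois(lambda[i])
  }
  alpha ~ dexp(1.0)
  beta ~ dgamma(0.1, 1.0)
})
pumpConsts <- list(N = 10,
                   t = c(94.3, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5))
pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))
pumpInits <- list(alpha = 1, beta = 1, theta = rep(0.1, pumpConsts$N))
pumpModel <- nimbleModel(pumpCode, constants = pumpConsts,
                         data = pumpData, inits = pumpInits)

# modelValues with the same variables as the model and m = 2 rows
pumpModelValues <- modelValues(pumpModel, m = 2)

pumpModel$x        # values currently stored in the model
pumpModelValues$x  # one list element per row; rows start out as NA
```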
## [1] 5 1 5 14 3 19 1 1 4 22
## [[1]]
## [1] NA NA NA NA NA NA NA NA NA NA
##
## [[2]]
## [1] NA NA NA NA NA NA NA NA NA NA
In this example, pumpModelValues has the same variables as pumpModel, and we set pumpModelValues to have m = 2 rows. As you can see, the rows are stored as elements of a list.
Alternatively, one can define a modelValues object manually by first defining a modelValues configuration via the modelValuesConf function, and then creating an instance from that configuration, like this:
mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 2)
customMV$a
## [[1]]
## [1] NA NA
##
## [[2]]
## [1] NA NA
The arguments to modelValuesConf are matching lists of variable names, types, and sizes. See help(modelValuesConf) for more details. Note that the types are not enforced in R execution, but they are the types created in C++ code during compilation, so they should be specified carefully.
The object returned by modelValues is an uncompiled modelValues object. When a nimbleFunction is compiled, any modelValues objects it uses are also compiled. A NIMBLE model always contains a modelValues object that it uses as a default location to store the values of its variables.
Here is an example where the customMV created above is used as the setup argument for a nimbleFunction, which is then compiled. Its compiled modelValues is then accessed with $.
# simple nimbleFunction that uses a modelValues object
resizeMV <- nimbleFunction(
    setup = function(mv) {},
    run = function(k = integer()) {
        resize(mv, k)
    })
rResize <- resizeMV(customMV)
cResize <- compileNimble(rResize)
cResize$run(5)
cCustomMV <- cResize$mv
# cCustomMV is a compiled modelValues object
cCustomMV[['a']]
## [[1]]
## [1] NA NA
##
## [[2]]
## [1] NA NA
##
## [[3]]
## [1] 0 0
##
## [[4]]
## [1] 0 0
##
## [[5]]
## [1] 0 0
Compiled modelValues objects can be accessed and altered in all the same ways as uncompiled ones. However, only uncompiled modelValues can be used as arguments to setup code in nimbleFunctions.
In the example above a modelValues object is passed to setup code, but a modelValues configuration can also be passed, with creation of modelValues object(s) from the configuration done in setup code.
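A minimal sketch of that alternative, run uncompiled. The nimbleFunction name makeAndResizeMV and the two-variable configuration are hypothetical and chosen only for illustration.

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'c'),
                          type = c('double', 'double'),
                          size = list(a = 2, c = 1))

# Hypothetical nimbleFunction: setup code receives the configuration and
# builds the modelValues object that run code then uses.
makeAndResizeMV <- nimbleFunction(
  setup = function(conf) {
    mv <- modelValues(conf, m = 1)
  },
  run = function(k = integer()) {
    resize(mv, k)
  })

rMakeAndResize <- makeAndResizeMV(mvConf)
rMakeAndResize$run(4)       # uncompiled execution
getsize(rMakeAndResize$mv)  # number of rows after resizing
```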
14.1.2 Accessing contents of modelValues
The values in a modelValues object can be accessed in several ways from R, and in fewer ways from NIMBLE.
# sets the first row of a to (0, 1). R only.
customMV[['a']][[1]] <- c(0,1)
# sets the second row of a to (2, 3)
customMV['a', 2] <- c(2,3)
# can access subsets of each row
customMV['a', 2][2] <- 4
# accesses all values of 'a'. Output is a list. R only.
customMV[['a']]
## [[1]]
## [1] 0 1
##
## [[2]]
## [1] 2 4
# sets the first row of b to a matrix with values 1. R only.
customMV[['b']][[1]] <- matrix(1, nrow = 2, ncol = 2)
# sets the second row of b. R only.
customMV[['b']][[2]] <- matrix(2, nrow = 2, ncol = 2)
# make sure the size of inputs is correct
# customMV['a', 1] <- 1:10
# problem: size of 'a' is 2, not 10!
# will cause problems when compiling nimbleFunction using customMV
Currently, only the syntax customMV["a", 2] works in the NIMBLE language, not customMV[["a"]][[2]].
We can query and change the number of rows using getsize and resize, respectively. These work in both R and NIMBLE. Note that we don’t specify the variables in this case: all variables in a modelValues object will have the same number of rows.
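The calls that produced the output below are not shown. The following sketch is consistent with that output, assuming customMV holds the values assigned to a earlier in this section.

```r
library(nimble)

# Rebuild customMV with the values of 'a' used in this section
mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 2)
customMV['a', 1] <- c(0, 1)
customMV['a', 2] <- c(2, 4)

getsize(customMV)    # number of rows before resizing
resize(customMV, 3)  # grow to three rows; applies to every variable
getsize(customMV)    # number of rows after resizing
customMV$a           # the new third row has not been assigned yet
```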
## [1] 2
## [1] 3
## [[1]]
## [1] 0 1
##
## [[2]]
## [1] 2 4
##
## [[3]]
## [1] NA NA
Often it is useful to convert a modelValues object to a matrix for use in R. For example, we may want to convert MCMC output into a matrix, for use with the coda package for processing MCMC samples, or into a list. This can be done with the as.matrix and as.list methods for modelValues objects, respectively. as.matrix will generate column names from every scalar element of variables (e.g. “b[1, 1]”, “b[2, 1]”, etc.). The rows of the modelValues will be the rows of the matrix, with any matrices or arrays converted to a vector based on column-major ordering.
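The calls behind the two matrices printed below are not shown. A sketch consistent with that output, assuming customMV has three rows and the values of a and b assigned earlier in this section:

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 3)
customMV['a', 1] <- c(0, 1)
customMV['a', 2] <- c(2, 4)
customMV[['b']][[1]] <- matrix(1, nrow = 2, ncol = 2)
customMV[['b']][[2]] <- matrix(2, nrow = 2, ncol = 2)

as.matrix(customMV, 'a')  # convert only variable 'a'
as.matrix(customMV)       # convert all variables
```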
## a[1] a[2]
## [1,] 0 1
## [2,] 2 4
## [3,] NA NA
## a[1] a[2] b[1, 1] b[2, 1] b[1, 2] b[2, 2] c[1]
## [1,] 0 1 1 1 1 1 NA
## [2,] 2 4 2 2 2 2 NA
## [3,] NA NA NA NA NA NA NA
as.list will return a list with an element for each variable. Each element will be an array of the same size and shape as the variable with an additional index for the iteration (e.g., MCMC iteration when used for MCMC output). By default iteration is the first index, but it can be the last if needed.
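The calls behind the three lists printed below are not shown. A sketch consistent with that output, assuming the same customMV contents as earlier in this section and the iterationAsLastIndex argument of as.list for modelValues:

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 3)
customMV['a', 1] <- c(0, 1)
customMV['a', 2] <- c(2, 4)
customMV[['b']][[1]] <- matrix(1, nrow = 2, ncol = 2)
customMV[['b']][[2]] <- matrix(2, nrow = 2, ncol = 2)

as.list(customMV, 'a')                               # only variable 'a'
as.list(customMV)                                    # all variables
as.list(customMV, 'a', iterationAsLastIndex = TRUE)  # iteration as last index
```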
## $a
## [,1] [,2]
## [1,] 0 1
## [2,] 2 4
## [3,] NA NA
## $a
## [,1] [,2]
## [1,] 0 1
## [2,] 2 4
## [3,] NA NA
##
## $b
## , , 1
##
## [,1] [,2]
## [1,] 1 1
## [2,] 2 2
## [3,] NA NA
##
## , , 2
##
## [,1] [,2]
## [1,] 1 1
## [2,] 2 2
## [3,] NA NA
##
##
## $c
## [,1]
## [1,] NA
## [2,] NA
## [3,] NA
## $a
## [,1] [,2] [,3]
## [1,] 0 2 NA
## [2,] 1 4 NA
If a variable is a scalar, using unlist in R to extract all rows as a vector can be useful.
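The call behind the output below is not shown. A sketch consistent with it, assuming the three rows of the scalar variable c have been set to 1, 2, and 3:

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 3)

# Assumed values for the scalar variable 'c'
customMV['c', 1] <- 1
customMV['c', 2] <- 2
customMV['c', 3] <- 3

unlist(customMV[['c']])  # all rows of 'c' flattened into one numeric vector
```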
## [1] 1 2 3
Once we have a modelValues object, we can see the structure of its contents via the varNames and sizes components of the object.
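A sketch of the two queries, consistent with the output below and using the customMV configuration defined earlier in this section:

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 3)

customMV$varNames  # names of the variables
customMV$sizes     # size of each variable, as a named list
```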
## [1] "a" "b" "c"
## $a
## [1] 2
##
## $b
## [1] 2 2
##
## $c
## [1] 1
As with most NIMBLE objects, modelValues are passed by reference, not by value. That means any modifications of modelValues objects in either R functions or nimbleFunctions will persist outside of the function. This allows for more efficient computation, as stored values are immediately shared among nimbleFunctions.
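The code behind the two lines of output below is not shown. The following sketch is consistent with it. The function name alter_a is an assumption, and the first row of a is assumed to start as (0, 1).

```r
library(nimble)

mvConf <- modelValuesConf(vars = c('a', 'b', 'c'),
                          type = c('double', 'int', 'double'),
                          size = list(a = 2, b = c(2, 2), c = 1))
customMV <- modelValues(mvConf, m = 2)
customMV['a', 1] <- c(0, 1)

# Hypothetical R function that modifies the modelValues it receives
alter_a <- function(mv) {
  mv['a', 1][1] <- 1
}

customMV['a', 1]   # before the call
alter_a(customMV)
customMV['a', 1]   # the change made inside alter_a is visible here
```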
## [1] 0 1
## [1] 1 1
However, when you retrieve a variable from a modelValues object, the result is a standard R list, which is subsequently passed by value, as usual in R.
14.1.2.1 Automating calculation and simulation using modelValues
The nimbleFunctions simNodesMV, calcNodesMV, and getLogProbNodesMV can be used to operate on a model based on rows in a modelValues object. For example, simNodesMV will simulate in the model multiple times and record each simulation in a row of its modelValues. calcNodesMV and getLogProbNodesMV iterate over the rows of a modelValues, copy the nodes into the model, and then do their job of calculating or collecting log probabilities (densities), respectively. Both return a numeric vector with the summed log probabilities of the chosen nodes from each row. calcNodesMV will save the log probabilities back into the modelValues object if saveLP = TRUE, a run-time argument.
Here are some examples:
mv <- modelValues(simpleModel)
rSimManyXY <- simNodesMV(simpleModel, nodes = c('x', 'y'), mv = mv)
rCalcManyXDeps <- calcNodesMV(simpleModel, nodes = 'x', mv = mv)
rGetLogProbMany <- getLogProbNodesMV(simpleModel, nodes = 'x', mv = mv)
cSimManyXY <- compileNimble(rSimManyXY, project = simpleModel)
cCalcManyXDeps <- compileNimble(rCalcManyXDeps, project = simpleModel)
cGetLogProbMany <- compileNimble(rGetLogProbMany, project = simpleModel)
cSimManyXY$run(m = 5)              # simulating 5 times
cCalcManyXDeps$run(saveLP = TRUE)  # calculating
cGetLogProbMany$run()              # collecting the saved log probabilities
## [1] -11.24771 -26.42169 -15.48325 -17.24588 -16.43909
## [1] -11.24771 -26.42169 -15.48325 -17.24588 -16.43909
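The example above relies on simpleModel, which is defined outside this section. A self-contained sketch, assuming a hypothetical two-node model and running the uncompiled nimbleFunctions so that no C++ compilation is needed:

```r
library(nimble)

# Hypothetical stand-in for simpleModel
simpleCode <- nimbleCode({
  x ~ dnorm(0, sd = 2)
  y ~ dnorm(x, sd = 1)
})
simpleModel <- nimbleModel(simpleCode, inits = list(x = 0, y = 0))

mv <- modelValues(simpleModel)
rSimManyXY <- simNodesMV(simpleModel, nodes = c('x', 'y'), mv = mv)
rCalcManyXDeps <- calcNodesMV(simpleModel, nodes = 'x', mv = mv)
rGetLogProbMany <- getLogProbNodesMV(simpleModel, nodes = 'x', mv = mv)

rSimManyXY$run(m = 5)                        # five simulated rows stored in mv
lpCalc <- rCalcManyXDeps$run(saveLP = TRUE)  # calculate and save log probabilities
lpSaved <- rGetLogProbMany$run()             # collect the saved log probabilities
getsize(mv)                                  # one row per simulation
```

Because the values are simulated, the log probabilities differ from run to run unless a seed is set with set.seed.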
14.2 The nimbleList data structure
nimbleLists provide a container for storing different types of objects in NIMBLE, similar to the list data structure in R. Before a nimbleList can be created and used, a definition for that nimbleList must be created that provides the names, types, and dimensions of the elements in the nimbleList. nimbleList definitions must be created in R (either in R’s global environment or in setup code), but the nimbleList instances can be created in run code.
Unlike lists in R, nimbleLists must have the names and types of all list elements provided by a definition before the list can be used. A nimbleList definition can be made by using the nimbleList function in one of two ways. The first is to provide the nimbleList function with a series of expressions of the form name = type(nDim), similar to the specification of run-time arguments to nimbleFunctions. The types allowed for a nimbleList are the same as those allowed as run-time arguments to a nimbleFunction, described in Section 11.4. For example, the line exampleNimListDef <- nimbleList(x = integer(0), Y = double(2)) creates a nimbleList definition with two elements: x, which is a scalar integer, and Y, which is a matrix of doubles.
The second way to create a nimbleList definition is to provide an R list of nimbleType objects to the nimbleList() function. A nimbleType object can be created using the nimbleType function, which must be provided with three arguments: the name of the element being created, the type of the element being created, and the dim of the element being created. For example, the following code creates a list with two nimbleType objects and uses these objects to create a nimbleList definition.
nimbleListTypes <- list(nimbleType(name = 'x', type = 'integer', dim = 0),
                        nimbleType(name = 'Y', type = 'double', dim = 2))
# this nimbleList definition is identical to the one created above
exampleNimListDef2 <- nimbleList(nimbleListTypes)
Creating definitions using a list of nimbleTypes can be useful, as it allows for programmatic generation of nimbleList elements.
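A minimal sketch of such programmatic generation. The element names and the choice of one-dimensional double elements are hypothetical.

```r
library(nimble)

# Hypothetical element names; each becomes a vector of doubles
elementNames <- c('alpha', 'beta', 'gamma')
typeList <- lapply(elementNames, function(nm) {
  nimbleType(name = nm, type = 'double', dim = 1)
})

# Build the definition from the generated list of nimbleType objects
vecListDef <- nimbleList(typeList)

# Create an instance and fill one element
vecList <- vecListDef$new(alpha = c(1, 2))
vecList$alpha
```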
Once a nimbleList definition has been created, new instances of nimbleLists can be made from that definition using the new member function. The new function can optionally take initial values for the list elements as arguments. Below, we create a new nimbleList from our exampleNimListDef and specify values for the two elements of our list:
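The creation code is not shown here. The following is a sketch under stated assumptions: the definition follows the description of exampleNimListDef given earlier, and the initial values (x = 1 and a 2 x 2 identity matrix for Y) are chosen only for illustration.

```r
library(nimble)

# Definition with a scalar integer x and a matrix of doubles Y
exampleNimListDef <- nimbleList(x = integer(0), Y = double(2))

# Assumed initial values, for illustration only
exampleNimList <- exampleNimListDef$new(x = 1, Y = diag(2))

exampleNimList$x  # elements are read with $
exampleNimList$Y
```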
Once created, nimbleList elements can be accessed using the $ operator, just as with lists in R. For example, the value of the x element of our exampleNimList can be set to 7 using exampleNimList$x <- 7.
nimbleList definitions can be created either in R’s global environment or in setup code of a nimbleFunction. Once a nimbleList definition has been made, new instances of nimbleLists can be created using the new function in R’s global environment, in setup code, or in run code of a nimbleFunction.
nimbleLists can also be passed as arguments to run code of nimbleFunctions and returned from nimbleFunctions. To use a nimbleList as a run function argument, the name of the nimbleList definition should be provided as the argument type, followed by a set of parentheses. To return a nimbleList from the run code of a nimbleFunction, the returnType of that function should be the name of the nimbleList definition, again followed by a set of parentheses.
Below, we demonstrate a function that takes the exampleNimList as an argument, modifies its Y element, and returns the nimbleList.
mynf <- nimbleFunction(
    run = function(vals = exampleNimListDef()) {
        onesMatrix <- matrix(value = 1, nrow = 2, ncol = 2)
        vals$Y <- onesMatrix
        returnType(exampleNimListDef())
        return(vals)
    })
# pass exampleNimList as argument to mynf
mynf(exampleNimList)
## nimbleList object of type nfRefClass_25
## Field "x":
## [1] 7
## Field "Y":
## [,1] [,2]
## [1,] 1 1
## [2,] 1 1
nimbleList arguments to run functions are passed by reference: if an element of a nimbleList argument is modified within a function, that element will remain modified when the function has finished running. To see this, we can inspect the value of the Y element of the exampleNimList object (exampleNimList$Y) and see that it has been modified.
## [,1] [,2]
## [1,] 1 1
## [2,] 1 1
In addition to storing basic data types, nimbleLists can also store other nimbleLists. To achieve this, we must create a nimbleList definition that declares the types of nested nimbleLists a nimbleList will store. Below, we create two types of nimbleLists: the first, named innerNimList, will be stored inside the second, named outerNimList:
# first, create definitions for both inner and outer nimbleLists
innerNimListDef <- nimbleList(someText = character(0))
outerNimListDef <- nimbleList(xList = innerNimListDef(),
                              z = double(0))
# then, create outer nimbleList
outerNimList <- outerNimListDef$new(z = 3.14)
# access element of inner nimbleList
outerNimList$xList$someText <- "hello, world"
Note that definitions for inner, or nested, nimbleLists must be created before the definition for an outer nimbleList.
14.2.1 Pre-defined nimbleList types
Several types of nimbleLists are predefined in NIMBLE for use with particular kinds of nimbleFunctions.
eigenNimbleList and svdNimbleList are the return types of the eigen and svd functions (Section 14.2.2).
waicNimbleList is the type provided for WAIC output from an MCMC.
waicDetailsNimbleList is the type provided for detailed WAIC output from an MCMC.
optimResultNimbleList is the return type of nimOptim.
optimControlNimbleList is the type for the control list input to nimOptim.
14.2.2 Using eigen and svd in nimbleFunctions
NIMBLE has two linear algebra functions that return nimbleLists. The eigen function takes a symmetric matrix, x, as an argument and returns a nimbleList of type eigenNimbleList. nimbleLists of type eigenNimbleList have two elements: values, a vector with the eigenvalues of x, and vectors, a square matrix with the same dimension as x whose columns are the eigenvectors of x. The eigen function has two additional arguments: symmetric and only.values. The symmetric argument can be used to specify if x is a symmetric matrix or not. If symmetric = FALSE (the default value), x will be checked for symmetry. Eigendecompositions in NIMBLE for symmetric matrices are both faster and more accurate. Additionally, eigendecompositions of non-symmetric matrices can have complex entries, which are not supported by NIMBLE. If a complex entry is detected, NIMBLE will issue a warning and that entry will be set to NaN. The only.values argument defaults to FALSE. If only.values = TRUE, the eigen function will not calculate the eigenvectors of x, leaving the vectors nimbleList element empty. This can reduce calculation time if only the eigenvalues of x are needed.
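A small sketch of these arguments, called directly from R. It assumes the R-level name nimEigen for the function that is written as eigen inside nimbleFunction run code. The matrix is the same diag(4) + 2 used in the example later in this section.

```r
library(nimble)

symMat <- diag(4) + 2  # symmetric 4 x 4 matrix

# Full decomposition: eigenvalues and eigenvectors
eigFull <- nimEigen(symMat, symmetric = TRUE)
eigFull$values
dim(eigFull$vectors)  # eigenvectors are the columns of this matrix

# Eigenvalues only; the eigenvectors are not computed
eigValsOnly <- nimEigen(symMat, symmetric = TRUE, only.values = TRUE)
eigValsOnly$values
```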
The svd function takes an \(n \times p\) matrix x as an argument, and returns a nimbleList of type svdNimbleList. nimbleLists of type svdNimbleList have three elements: d, a vector with the singular values of x; u, a matrix with the left singular vectors of x; and v, a matrix with the right singular vectors of x. The svd function has an optional argument vectors which defaults to a value of "full". The vectors argument can be used to specify the number of singular vectors that are returned. If vectors = "full", u will be an \(n \times n\) matrix and v will be a \(p \times p\) matrix. If vectors = "thin", u will be an \(n \times m\) matrix, where \(m = \min(n,p)\), and v will be a \(p \times m\) matrix. If vectors = "none", the u and v elements of the returned nimbleList will not be populated.
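A small sketch of the vectors argument, called directly from R. It assumes the R-level name nimSvd for the function that is written as svd inside nimbleFunction run code. The 3 x 2 input matrix is arbitrary.

```r
library(nimble)

x <- matrix(c(1, 2, 3, 4, 5, 7), nrow = 3, ncol = 2)  # n = 3, p = 2

svdFull <- nimSvd(x, vectors = 'full')
svdThin <- nimSvd(x, vectors = 'thin')
svdNone <- nimSvd(x, vectors = 'none')  # only the singular values are computed

length(svdFull$d)  # number of singular values
dim(svdFull$u)     # left singular vectors, full
dim(svdFull$v)     # right singular vectors, full
dim(svdThin$u)     # left singular vectors, thin
dim(svdThin$v)     # right singular vectors, thin

# The thin factors are enough to reconstruct x up to rounding error
reconErr <- max(abs(svdThin$u %*% diag(svdThin$d) %*% t(svdThin$v) - x))
reconErr
```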
nimbleLists created by either eigen or svd can be returned from a nimbleFunction, using returnType(eigenNimbleList()) or returnType(svdNimbleList()) respectively. nimbleLists created by eigen and svd can also be used within other nimbleLists by specifying the nimbleList element types as eigenNimbleList() and svdNimbleList(). The example below demonstrates the use of eigen and svd within a nimbleFunction.
eigenListFunctionGenerator <- nimbleFunction(
    setup = function() {
        demoMatrix <- diag(4) + 2
        eigenAndSvdListDef <- nimbleList(demoEigenList = eigenNimbleList(),
                                         demoSvdList = svdNimbleList())
        eigenAndSvdList <- eigenAndSvdListDef$new()
    },
    run = function() {
        # we will take the eigendecomposition and svd of a symmetric matrix
        eigenAndSvdList$demoEigenList <<- eigen(demoMatrix, symmetric = TRUE,
                                                only.values = TRUE)
        eigenAndSvdList$demoSvdList <<- svd(demoMatrix, vectors = 'none')
        returnType(eigenAndSvdListDef())
        return(eigenAndSvdList)
    })
eigenListFunction <- eigenListFunctionGenerator()
outputList <- eigenListFunction$run()
outputList$demoEigenList$values
## [1] 9 1 1 1
outputList$demoSvdList$d
## [1] 9 1 1 1
The eigenvalues and singular values returned from the above function are the same since the matrix being decomposed was symmetric. However, note that both eigendecompositions and singular value decompositions are numerical procedures, and computed solutions may have slight differences even for a symmetric input matrix.