Chapter 14 Data structures in NIMBLE

NIMBLE provides several data structures useful for programming.

We’ll first describe modelValues, which are containers designed for storing values for models. Then in Section 14.2 we’ll describe nimbleLists, which have a similar purpose to lists in R, allowing you to store heterogeneous information in a single object.

modelValues can be created in either R or in nimbleFunction setup code. nimbleLists can be created in R code, in nimbleFunction setup code, and in nimbleFunction run code, from a nimbleList definition created in R or setup code. Once created, modelValues and nimbleLists can then be used either in R or in nimbleFunction setup or run code. If used in run code, they will be compiled along with the nimbleFunction.

14.1 The modelValues data structure

modelValues are containers designed for storing values for models. They may be used for model outputs or model inputs. A modelValues object will contain rows of variables. Each row contains one object of each variable, which may be multivariate. The simplest way to build a modelValues object is from a model object. This will create a modelValues object with the same variables as the model. Although they were motivated by models, one is free to set up a modelValues with any variables one wants.

As with the material in the rest of this chapter, modelValues objects will generally be used in nimbleFunctions that interact with models (see Chapter 15)24. modelValues objects can be defined either in setup code or separately in R (and then passed as an argument to setup code). The modelValues object can then used in run code of nimbleFunctions.

14.1.1 Creating modelValues objects

Here is a simple example of creating a modelValues object:

pumpModelValues = modelValues(pumpModel, m = 2)
pumpModel$x
##  [1]  5  1  5 14  3 19  1  1  4 22
pumpModelValues$x
## [[1]]
##  [1] NA NA NA NA NA NA NA NA NA NA
## 
## [[2]]
##  [1] NA NA NA NA NA NA NA NA NA NA

In this example, pumpModelValues has the same variables as pumpModel, and we set pumpModelValues to have m = 2 rows. As you can see, the rows are stored as elements of a list.

Alternatively, one can define a modelValues object manually by first defining a modelValues configuration via the modelValuesConf function, and then creating an instance from that configuration, like this:

mvConf = modelValuesConf(vars = c('a', 'b', 'c'), 
                         type = c('double', 'int', 'double'), 
                         size = list(a = 2, b =c(2,2), c = 1) )

customMV = modelValues(mvConf, m = 2)
customMV$a
## [[1]]
## [1] NA NA
## 
## [[2]]
## [1] NA NA

The arguments to modelValuesConf are matching lists of variable names, types, and sizes. See help(modelValuesConf) for more details. Note that in R execution, the types are not enforced. But they will be the types created in C++ code during compilation, so they should be specified carefully.

The object returned by modelValues is an uncompiled modelValues object. When a nimbleFunction is compiled, any modelValues objects it uses are also compiled. A NIMBLE model always contains a modelValues object that it uses as a default location to store the values of its variables.

Here is an example where the customMV created above is used as the setup argument for a nimbleFunction, which is then compiled. Its compiled modelValues is then accessed with $.

# simple nimbleFunction that uses a modelValues object
resizeMV <- nimbleFunction(
  setup = function(mv){},
  run = function(k = integer() ){
    resize(mv,k)})

rResize <- resizeMV(customMV)
cResize <- compileNimble(rResize)
cResize$run(5)
## NULL
cCustomMV <- cResize$mv
# cCustomMV is a compiled modelValues object
cCustomMV[['a']]
## [[1]]
## [1] NA NA
## 
## [[2]]
## [1] NA NA
## 
## [[3]]
## [1] 0 0
## 
## [[4]]
## [1] 0 0
## 
## [[5]]
## [1] 0 0

Compiled modelValues objects can be accessed and altered in all the same ways as uncompiled ones. However, only uncompiled modelValues can be used as arguments to setup code in nimbleFunctions.

In the example above a modelValues object is passed to setup code, but a modelValues configuration can also be passed, with creation of modelValues object(s) from the configuration done in setup code.

14.1.2 Accessing contents of modelValues

The values in a modelValues object can be accessed in several ways from R, and in fewer ways from NIMBLE.

# sets the first row of a to (0, 1).  R only.
customMV[['a']][[1]] <- c(0,1)   

# sets the second row of a to (2, 3)
customMV['a', 2] <- c(2,3)       

# can access subsets of each row
customMV['a', 2][2] <- 4

# accesses all values of 'a'. Output is a list.  R only.
customMV[['a']]                  
## [[1]]
## [1] 0 1
## 
## [[2]]
## [1] 2 4
# sets the first row of b to a matrix with values 1. R only.
customMV[['b']][[1]] <- matrix(1, nrow = 2, ncol = 2)  

# sets the second row of b.  R only.
customMV[['b']][[2]] <- matrix(2, nrow = 2, ncol = 2)  

# make sure the size of inputs is correct
# customMV['a', 1] <- 1:10  "
# problem: size of 'a' is 2, not 10!
# will cause problems when compiling nimbleFunction using customMV

Currently, only the syntax customMV["a", 2] works in the NIMBLE language, not customMV[["a"][[2]].

We can query and change the number of rows using getsize and resize, respectively. These work in both R and NIMBLE. Note that we don’t specify the variables in this case: all variables in a modelValues object will have the same number of rows.

getsize(customMV)
## [1] 2
resize(customMV, 3)
getsize(customMV)
## [1] 3
customMV$a
## [[1]]
## [1] 0 1
## 
## [[2]]
## [1] 2 4
## 
## [[3]]
## [1] NA NA

Often it is useful to convert a modelValues object to a matrix for use in R. For example, we may want to convert MCMC output into a matrix for use with the coda package for processing MCMC samples. This can be done with the as.matrix method for modelValues objects. This will generate column names from every scalar element of variables (e.g. “b[1, 1]” ,“b[2, 1]”, etc.). The rows of the modelValues will be the rows of the matrix, with any matrices or arrays converted to a vector based on column-major ordering.

as.matrix(customMV, 'a')   # convert 'a'
##      a[1] a[2]
## [1,]    0    1
## [2,]    2    4
## [3,]   NA   NA
as.matrix(customMV)        # convert all variables
##      a[1] a[2] b[1, 1] b[2, 1] b[1, 2] b[2, 2] c[1]
## [1,]    0    1       1       1       1       1   NA
## [2,]    2    4       2       2       2       2   NA
## [3,]   NA   NA      NA      NA      NA      NA   NA

If a variable is a scalar, using unlist in R to extract all rows as a vector can be useful.

customMV['c', 1] <- 1
customMV['c', 2] <- 2
customMV['c', 3] <- 3
unlist(customMV['c', ])
## [1] 1 2 3

Once we have a modelValues object, we can see the structure of its contents via the varNames and sizes components of the object.

customMV$varNames
## [1] "a" "b" "c"
customMV$sizes
## $a
## [1] 2
## 
## $b
## [1] 2 2
## 
## $c
## [1] 1

As with most NIMBLE objects, modelValues are passed by reference, not by value. That means any modifications of modelValues objects in either R functions or nimbleFunctions will persist outside of the function. This allows for more efficient computation, as stored values are immediately shared among nimbleFunctions.

alter_a <- function(mv){
  mv['a',1][1] <- 1
}
customMV['a', 1]
## [1] 0 1
alter_a(customMV)
customMV['a',1]
## [1] 1 1

However, when you retrieve a variable from a modelValues object, the result is a standard R list, which is subsequently passed by value, as usual in R.

14.1.2.1 Automating calculation and simulation using modelValues

The nimbleFunctions simNodesMV, calcNodesMV, and getLogProbsMV can be used to operate on a model based on rows in a modelValues object. For example, simNodesMV will simulate in the model multiple times and record each simulation in a row of its modelValues. calcNodesMV and getLogProbsMV iterate over the rows of a modelValues, copy the nodes into the model, and then do their job of calculating or collecting log probabilities (densities), respectively. Each of these returns a numeric vector with the summed log probabilities of the chosen nodes from each row. calcNodesMV will save the log probabilities back into the modelValues object if saveLP = TRUE, a run-time argument.

Here are some examples:

mv <- modelValues(simpleModel)
rSimManyXY <- simNodesMV(simpleModel, nodes = c('x', 'y'), mv = mv)
rCalcManyXDeps <- calcNodesMV(simpleModel, nodes = 'x', mv = mv)
rGetLogProbMany <- getLogProbNodesMV(simpleModel, nodes = 'x', mv = mv)

cSimManyXY <- compileNimble(rSimManyXY, project = simpleModel)
cCalcManyXDeps <- compileNimble(rCalcManyXDeps, project = simpleModel)
cGetLogProbMany <- compileNimble(rGetLogProbMany, project = simpleModel)

cSimManyXY$run(m = 5) # simulating 5 times
## NULL
cCalcManyXDeps$run(saveLP = TRUE) # calculating 
## [1] -12.43447 -17.17085 -16.52450 -16.49514 -12.73522
cGetLogProbMany$run() #
## [1] -12.43447 -17.17085 -16.52450 -16.49514 -12.73522

14.2 The nimbleList data structure

nimbleLists provide a container for storing different types of objects in NIMBLE, similar to the list data structure in R. Before a nimbleList can be created and used, a definition25 for that nimbleList must be created that provides the names, types, and dimensions of the elements in the nimbleList. nimbleList definitions must be created in R (either in R’s global environment or in setup code), but the nimbleList instances can be created in run code.

Unlike lists in R, nimbleLists must have the names and types of all list elements provided by a definition before the list can be used. A nimbleList definition can be made by using the nimbleList function in one of two manners. The first manner is to provide the nimbleList function with a series of expressions of the form name = type(nDim), similar to the specification of run-time arguments to nimbleFunctions. The types allowed for a nimbleList are the same as those allowed as run-time arguments to a nimbleFunction, described in Section 11.4. For example, the following line of code creates a nimbleList definition with two elements: x, which is a scalar integer, and Y, which is a matrix of doubles.

 exampleNimListDef <- nimbleList(x = integer(0), Y = double(2))

The second method of creating a nimbleList definition is by providing an R list of nimbleType objects to the nimbleList() function. A nimbleType object can be created using the nimbleType function, which must be provided with three arguments: the name of the element being created, the type of the element being created, and the dim of the element being created. For example, the following code creates a list with two nimbleType objects and uses these objects to create a nimbleList definition.

nimbleListTypes <- list(nimbleType(name = 'x', type = 'integer', dim = 0),
                        nimbleType(name = 'Y', type = 'double', dim = 2))

 # this nimbleList definition is identical to the one created above
 exampleNimListDef2 <- nimbleList(nimbleListTypes)

Creating definitions using a list of nimbleTypes can be useful, as it allows for programmatic generation of nimbleList elements.

Once a nimbleList definition has been created, new instances of nimbleLists can be made from that definition using the new member function. The new function can optionally take initial values for the list elements as arguments. Below, we create a new nimbleList from our exampleNimListDef and specify values for the two elements of our list:

 exampleNimList <- exampleNimListDef$new(x = 1, Y = diag(2))

Once created, nimbleList elements can be accessed using the $ operator, just as with lists in R. For example, the value of the x element of our exampleNimbleList can be set to 7 using

 exampleNimList$x <- 7

nimbleList definitions can be created either in R’s global environment or in setup code of a nimbleFunction. Once a nimbleList definition has been made, new instances of nimbleLists can be created using the new function in R’s global environment, in setup code, or in run code of a nimbleFunction.

nimbleLists can also be passed as arguments to run code of nimbleFunctions and returned from nimbleFunctions. To use a nimbleList as a run function argument, the name of the nimbleList definition should be provided as the argument type, with a set of parentheses following. To return a nimbleList from the run code of a nimbleFunction, the returnType of that function should be the name of the nimbleList definition, again using a following set of parentheses.

Below, we demonstrate a function that takes the exampleNimList as an argument, modifies its Y element, and returns the nimbleList.

mynf <- nimbleFunction(
  run = function(vals = exampleNimListDef()){
    onesMatrix <- matrix(value = 1, nrow = 2, ncol = 2)
    vals$Y  <- onesMatrix
    returnType(exampleNimListDef())
    return(vals)
  })
 
# pass exampleNimList as argument to mynf
mynf(exampleNimList)
## nimbleList object of type nfRefClass_65
## Field "x":
## [1] 7
## Field "Y":
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    1

nimbleList arguments to run functions are passed by reference – this means that if an element of a nimbleList argument is modified within a function, that element will remain modified when the function has finished running. To see this, we can inspect the value of the Y element of the exampleNimList object and see that it has been modified.

exampleNimList$Y
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    1

In addition to storing basic data types, nimbleLists can also store other nimbleLists. To achieve this, we must create a nimbleList definition that declares the types of nested nimbleLists a nimbleList will store. Below, we create two types of nimbleLists: the first, named innerNimList, will be stored inside the second, named outerNimList:

# first, create definitions for both inner and outer nimbleLists
innerNimListDef <- nimbleList(someText = character(0))
outerNimListDef <- nimbleList(xList = innerNimListDef(),
                              z = double(0))

# then, create outer nimbleList
outerNimList <- outerNimListDef$new(z = 3.14)

# access element of inner nimbleList
outerNimList$xList$someText <- "hello, world"

Note that definitions for inner, or nested, nimbleLists must be created before the definition for an outer nimbleList.

14.2.1 Using eigen and svd in nimbleFunctions

NIMBLE has two linear algebra functions that return nimbleLists. The eigen function takes a symmetic matrix, x, as an argument and returns a nimbleList of type eigenNimbleList. nimbleLists of type eigenNimbleList have two elements: values, a vector with the eigenvalues of x, and vectors, a square matrix with the same dimension as x whose columns are the eigenvectors of x. The eigen function has two additional arguments: symmetric and only.values. The symmetric argument can be used to specify if x is a symmetric matrix or not. If symmetric = FALSE (the default value), x will be checked for symmetry. Eigendecompositions in NIMBLE for symmetric matrices are both faster and more accurate. Additionally, eigendecompostions of non-symmetric matrices can have complex entries, which are not supported by NIMBLE. If a complex entry is detected, NIMBLE will issue a warning and that entry will be set to NaN. The only.values arument defaults to FALSE. If only.values = TRUE, the eigen function will not calculate the eigenvectors of x, leaving the vectors nimbleList element empty. This can reduce calculation time if only the eigenvalues of x are needed.

The svd function takes an \(n \times p\) matrix x as an argument, and returns a nimbleList of type svdNimbleList. nimbleLists of type svdNimbleList have three elements: d, a vector with the singular values of x, u a matrix with the left singular vectors of x, and v, a matrix with the right singular vectors of x. The svd function has an optional argument vectors which defaults to a value of "full". The vectors argument can be used to specify the number of singular vectors that are returned. If vectors = "full", v will be an \(n \times n\) matrix and u will be an \(p \times p\) matrix. If vectors = "thin", v will be an\(n \times m\) matrix, where \(m = \min(n,p)\), and u will be an \(m \times p\) matrix. If vectors = "none", the u and v elements of the returned nimbleList will not be populated.

nimbleLists created by either eigen or svd can be returned from a nimbleFunction, using returnType(eigenNimbleList()) or returnType(svdNimbleList()) respectively. nimbleLists created by eigen and svd can also be used within other nimbleLists by specifying the nimbleList element types as eigenNimbleList() and svdNimbleList(). The below example demonstrates the use of eigen and svd within a nimbleFunction.

eigenListFunctionGenerator <- nimbleFunction(
  setup = function(){
    demoMatrix <- diag(4) + 2
    eigenAndSvdListDef <- nimbleList(demoEigenList = eigenNimbleList(), 
                                     demoSvdList = svdNimbleList())
    eigenAndSvdList <- eigenAndSvdListDef$new()
  },
  run = function(){
    # we will take the eigendecomposition and svd of a symmetric matrix
    eigenAndSvdList$demoEigenList <<- eigen(demoMatrix, symmetric = TRUE, 
                                            only.values = TRUE)
    eigenAndSvdList$demoSvdList <<- svd(demoMatrix, vectors = 'none')
    returnType(eigenAndSvdListDef())
    return(eigenAndSvdList)
  })
eigenListFunction <- eigenListFunctionGenerator()

outputList <-  eigenListFunction$run()
outputList$demoEigenList$values
## [1] 9 1 1 1
outputList$demoSvdList$d
## [1] 9 1 1 1

The eigenvalues and singular values returned from the above function are the same since the matrix being decomposed was symmetric. However, note that both eigendecompositions and singular value decompositions are numerical procedures, and computed solutions may have slight differences even for a symmetric input matrix.


  1. One may want to read this section after an initial reading of Chapter 15.

  2. The configuration for a modelValues object is the same concept as a definition here; in a future release of NIMBLE we may make the usage more consistent between modelValues and nimbleLists.