Item Response Theory (IRT) refers to a family of models that attempt to explain the relationship between latent traits and their manifestation in observations of the individual. The basic assumption is that an individual’s observable behavior is related to this latent trait. Common contexts include educational testing and health status.
In a testing context, the standard setting considers \(I\) items and \(P\) persons for which responses \(y_{pi} \in \{0, 1\}\) are observed with values encoding incorrect and correct answers. The probability of a correct answer for each item/person \(\pi_{pi} = \text{Pr}(y_{pi} = 1)\) is modeled via logistic or probit regression, conditionally on a set of parameters encoding the characteristics of respondents and items. Typically a parameter \(\eta_p\) is defined for each respondent \(p = 1, \ldots, P\) representing the unobserved ability of the respondent. Parameters \(\{\eta_p\}_1^P\) are assumed to be drawn from a common distribution, taking the role of ‘random effects’.
Different models can be defined according to the number of parameters associated with each item. Here we consider two classic models for dichotomous item responses, namely the one-parameter logistic (1PL) model and the two-parameter logistic (2PL) model.
For this example, we use the Law School Admission Test (LSAT) data
from the ltm
package, which is a classic example in
educational testing (Bock and Lieberman
1970). The dataset comprises \(1000\) responses to \(5\) questions.
LSATdata <- ltm::LSAT
The one-parameter logistic model (1PL) assumes the presence of one parameter, \(\beta_i\), for each item \(i = 1, \ldots, I\) encoding the difficulty of the item. We choose a non-informative normal prior \(\mathcal{N}(0, 100)\) for the difficulty parameters, and a uniform distribution for abilities variance.
library(nimble, warn.conflicts = FALSE)
## nimble version 1.3.0 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
##
## Note for advanced users who have written their own MCMC samplers:
## As of version 0.13.0, NIMBLE's protocol for handling posterior
## predictive nodes has changed in a way that could affect user-defined
## samplers in some situations. Please see Section 15.5.1 of the User Manual.
code1PL <- nimbleCode({
for(i in 1:I) {
for(p in 1:P) {
y[p,i] ~ dbern(pi[p,i])
logit(pi[p, i]) <- eta[p] - beta[i]
}
beta[i] ~ dnorm(0, var = 100)
}
for(p in 1:P) {
eta[p] ~ dnorm(0, sd = sd_eta)
}
sd_eta ~ dunif(0, 100) # prior for variance components based on Gelman (2006)
})
constants <- list(I = ncol(LSATdata), P = nrow(LSATdata))
data <- list(y = LSATdata)
set.seed(1)
inits <- list(beta = rnorm(constants$I, 0, 1),
eta = rnorm(constants$P, 0, 1),
sd_eta = 3)
monitors = c("beta", "eta", "sd_eta")
model1PL <- nimbleModel(code1PL, constants, data, inits)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
cModel1PL <- compileNimble(model1PL)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
conf1PL <- configureMCMC(model1PL, monitors = monitors)
## ===== Monitors =====
## thin = 1: beta, eta, sd_eta
## ===== Samplers =====
## RW sampler (1006)
## - beta[] (5 elements)
## - sd_eta
## - eta[] (1000 elements)
model1PLMCMC <- buildMCMC(conf1PL)
cModel1PLMCMC <- compileNimble(model1PLMCMC, project = model1PL)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
Run the model and save posterior samples.
system.time(samples1PL <- runMCMC(cModel1PLMCMC, niter = 20000, nburnin = 10000))
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## user system elapsed
## 26.225 0.083 26.309
We can obtain a summary of the estimates of item difficulties.
betaCols <- grep("beta", colnames(samples1PL))
sd_etaCol <- grep("sd_eta", colnames(samples1PL))
samplesSummary(samples1PL[, c(betaCols, sd_etaCol)])
## Mean Median St.Dev. 95%CI_low 95%CI_upp
## beta[1] -2.7426819 -2.7411894 0.13037380 -2.9963098 -2.4855966
## beta[2] -1.0011783 -1.0009559 0.07833695 -1.1523951 -0.8495846
## beta[3] -0.2433821 -0.2453595 0.07151934 -0.3847355 -0.1014021
## beta[4] -1.3106872 -1.3104853 0.08184534 -1.4721001 -1.1478006
## beta[5] -2.1100184 -2.1103943 0.10782355 -2.3244911 -1.8996280
## sd_eta 0.7672692 0.7669722 0.06783533 0.6371932 0.9018647
Let’s check posterior convergence. In this case, the variance component appears to be the slowest mixing, and we would probably want to run the MCMC for longer.
par(mfrow = c(1, 3), cex = 1.1)
for(i in 1:5)
ts.plot(samples1PL[ , betaCols[i]], xlab = 'iteration', ylab = colnames(samples1PL)[ betaCols[i]])
ts.plot(samples1PL[ , sd_etaCol], xlab = 'iteration', ylab = colnames(samples1PL)[sd_etaCol])
We can use the posterior samples to estimate the posterior Item Characteristic Curve (ICC) for an interval of ability values.
## plot of ICC
abilityGrid <- seq(-8, 5, length = 1000)
iccVals <- matrix(0, 1000, constants$I)
for(i in seq_len(constants$I)) {
tmp <- sapply(abilityGrid, function(x) expit(x - samples1PL[, betaCols[i]]))
iccVals[,i] <- colMeans(tmp)
}
library(ggplot2)
rownames(iccVals) <- abilityGrid
df <- reshape2::melt(iccVals)
colnames(df) <- c("Ability", "Item", "Probability")
df$Item <- factor(df$Item, levels = 1:5, labels = paste0("Item ", 1:5))
ggplot(df, aes(x = Ability, y = Probability, group = Item)) +
geom_line(aes(color = Item)) + theme_bw() +
ggtitle("Item characteristic curves - 1PL model")
A generalization of the 1PL model is the two-parameter logistic model, which considers an additional parameter, \(\lambda_i\), for each item \(i = 1, \ldots, I\), often referred as discrimination parameter, because items with a large \(\lambda_i\) are better at discriminating between subjects with different abilities. Typically the \(\lambda_i\)’s are assumed to be positive, so we choose independent lognormal distributions as priors.
Given the additional parameter, the metric (location and scale) of the individual parameters is only known up to a linear transformation. In the literature, several types of restrictions are proposed to anchor the metric, either acting on the item or ability parameters. A simple restriction is to assume that abilities are drawn from a \(\mathcal{N}(0,1)\) and freely estimate item parameters.
code2PL <- nimbleCode({
for(i in 1:I) {
for(p in 1:P) {
y[p, i] ~ dbern(pi[p, i])
logit(pi[p, i]) <- lambda[i]*eta[p] + beta[i]
}
}
for(i in 1:I) {
beta[i] ~ dnorm(0, var = 100)
lambda[i] ~ dlnorm(meanlog = 0, sdlog = 6)
}
for(p in 1:P) {
eta[p] ~ dnorm(0, 1)
}
})
constants <- list(I = ncol(LSATdata), P = nrow(LSATdata))
data <- list(y = LSATdata)
set.seed(2)
inits <- list(beta = rnorm(constants$I, 0, 3),
eta = rnorm(constants$P, 0, 3),
lambda = rep(2, constants$I))
monitors = c("beta", "lambda", "eta")
model2PL <- nimbleModel(code2PL, constants, data, inits)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
cModel2PL <- compileNimble(model2PL)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
conf2PL <- configureMCMC(model2PL, monitors = monitors)
## ===== Monitors =====
## thin = 1: beta, eta, lambda
## ===== Samplers =====
## RW sampler (1010)
## - beta[] (5 elements)
## - lambda[] (5 elements)
## - eta[] (1000 elements)
model2PLMCMC <- buildMCMC(conf2PL)
cModel2PLMCMC <- compileNimble(model2PLMCMC, project = model2PL)
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
system.time(samples2PL <- runMCMC(cModel2PLMCMC, niter = 20000, nburnin = 10000))
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## user system elapsed
## 34.785 0.076 34.862
Similarly to the previous case, we can obtain posterior summaries and check the convergence and the item characteristic curves. Once again we would want to run the MCMC for longer.
betaCols <- grep("beta", colnames(samples2PL))
lambdaCols <- grep("lambda", colnames(samples2PL))
samplesSummary(samples2PL[, c(betaCols, lambdaCols)])
## Mean Median St.Dev. 95%CI_low 95%CI_upp
## beta[1] 2.7649371 2.7352719 0.22614207 2.40438805 3.2748143
## beta[2] 0.9853351 0.9813277 0.09174312 0.81542706 1.1738045
## beta[3] 0.2538296 0.2513119 0.08101397 0.09858091 0.4177354
## beta[4] 1.2798913 1.2760248 0.10112151 1.09499863 1.4976457
## beta[5] 2.0507451 2.0401487 0.14104010 1.80770396 2.3664747
## lambda[1] 0.7497019 0.7373295 0.31388119 0.09284423 1.4224005
## lambda[2] 0.6829034 0.6634760 0.20829358 0.33010676 1.1345390
## lambda[3] 0.9375830 0.8847020 0.31662639 0.48497486 1.7623309
## lambda[4] 0.6498310 0.6327764 0.19970123 0.30185887 1.0783772
## lambda[5] 0.6004614 0.6008976 0.25028211 0.10374343 1.1119834
par(mfrow = c(1, 5), cex = 1.1)
for(i in 1:5)
ts.plot(samples2PL[ , betaCols[i]], xlab = 'iteration', ylab = colnames(samples2PL)[ betaCols[i]])
par(mfrow = c(1, 5), cex = 1.1)
for(i in 1:5)
ts.plot(samples2PL[ , lambdaCols[i]], xlab = 'iteration', ylab = colnames(samples2PL)[ lambdaCols[i]])
abilityGrid <- seq(-8, 12, length = 1000)
iccVals <- matrix(0, 1000, constants$I)
for(i in seq_len(constants$I)) {
## evaluate probability at each ability point
tmp <- sapply(abilityGrid, function(x) expit(samples2PL[, lambdaCols[i]]*(x - samples2PL[, betaCols[i]])))
iccVals[,i] <- apply(tmp, 2, mean)
}
rownames(iccVals) <- abilityGrid
df <- reshape2::melt(iccVals)
colnames(df) <- c("Ability", "Item", "Probability")
df$Item <- factor(df$Item, levels = 1:5, labels = paste0("Item ", 1:5))
ggplot(df, aes(x = Ability, y = Probability, group = Item)) +
geom_line(aes(color = Item)) + theme_bw() +
ggtitle("Item characteristic curves - 2PL model")