First, let’s load the packages we’ll sue

require(rjags)
require(Rgraphviz)
require(data.table)

Create some simulated data:

set.seed(432104)

Simulated RIT Equating Example

First pass at a model

OK. So let’s define a simple model is the basis for our simulated data and our probabilistic model of inference. First we’ll assume that there is in fact a latent ability attribute, \(\theta_i\), for each student \(i \in {1,\dots,.n}\) in our sample. Further, we’ll assume that this latent trait is accurately measured by an unobserved RIT score, which we’ll denote \(r_{i}\). In other words, RIT scores are a direct measure of \(\theta_i\). Luckily for us, students take a test that gives us an estimate of RIT scores, the MAP. So we assume that the observed RIT, \(\hat{r}\) is a realization from a distribution centered on \(r\) Finally, we assume that all other academic measures (interim assessments, Khan objectives, individual test items, even Do Nows/Exit Tickets). So we have observed realization of a RIT score for a student as well as all the other academic indicators. All other RVs and parameters need to be estimated. And our goal here is to make inferences about \(r_i\) given our observation of other academic indicators.

Here’s a graphical representation of our probability model:

rit.graph <- new("graphNEL", 
                 nodes=c("theta[i]",
                         "r[i]",
                         "rhat[i]",
                         "a[i]",
                         "k[i]",
                         "t[i]",
                         "ahat[i]",
                         "khat[i]",
                         "that[i]"
                         ),
                 edgemode="directed")

rit.graph <- addEdge("theta[i]", "r[i]", rit.graph, 1)
rit.graph <- addEdge("r[i]", "rhat[i]" , rit.graph, 1)
rit.graph <- addEdge("rhat[i]", "a[i]" , rit.graph, 1)
rit.graph <- addEdge("rhat[i]", "k[i]" , rit.graph, 1)
rit.graph <- addEdge("rhat[i]", "t[i]" , rit.graph, 1)
rit.graph <- addEdge("a[i]", "ahat[i]" , rit.graph, 1)
rit.graph <- addEdge("k[i]", "khat[i]" , rit.graph, 1)
rit.graph <- addEdge("t[i]", "that[i]" , rit.graph, 1)

plot(rit.graph)

plot of chunk equat_graph

Now, let’s simulate some data based on this basic probability model. The \(\theta_i\)’s need a prior and I’ll use the school level norms (because I’m on a plane and don’t have the student level norms with me; we can change these later) for fall 5th grade math. Thinking about this, we can collapse the \(\theta_i\)s and \(r_i\), since we are assuming theta exists on the the RIT scale. I’ll use \(\theta_i\) to denote the RIT-scaled latent variable.

set.seed(432104)

n <- 90 # number of students

r<-rnorm(n = n, mean = 211.39, sd =5.83)

The \(\theta_i\)s are then noisily measured by the MAP test. I’ll assume a standard deviation of about 3, but that is completely arbitrary.

rhat<-rnorm(n, r, sd=rep(3, times=n))

head(rhat)
## [1] 211.9 226.3 195.8 217.3 201.2 227.7

OK. let’s make some assumptions about how \(\hat{r}\) is related to each of the three realizations of our academic indicators. Let’s start with some arbitrary assessment that is graded on a zero to one scale (or zero to 100%) a realization of which is denoted \(\hat{a}_i\). We’ll keep it simple and assume that the \(\hat{r}_i\) are linear predictors of \(z.a_{i}\) in the inverse-logistic function. That is,

\[ \begin{aligned} z.a_{i} & = \alpha + \beta \cdot \hat{r}_i \\ \hat{a}_{i} & = \frac{1}{1+\exp^{-z.a{i}}} \end{aligned} \]

Assuming \(\alpha = -1\) and \(\beta = 0.01\), we simulating the data like so:

alpha<- -1
beta<-.01
z.a<- alpha + beta*rhat
head(z.a)
## [1] 1.119 1.263 0.958 1.173 1.012 1.277
a<-1/(1+exp(-z.a))
head(a)
## [1] 0.7538 0.7796 0.7227 0.7637 0.7333 0.7820
ahat <- rnorm(n=n, mean=a,sd=.025)
head(ahat)
## [1] 0.7420 0.7609 0.7291 0.7966 0.6997 0.7730

Let’s pretend, for now, that this the only non-MAP assessment that we have. Consequently, all we have observed are the \(\hat{a}_i\) and the current RIT scores \(\hat{r}_i\). Here’s the simplified model in JAGS/BUGS syntax:

model.jags<-"
  model {
    for(i in 1:n) {
    r[i] ~ dnorm(211.39, 1/5.39)
    rhat[i] ~ dnorm(r[i],tau)
    z[i] <- alpha +  beta*rhat[i]
    a[i] <- 1/(1 + exp(-z[i]))
    ahat[i] ~ dnorm(a[i], tau.a)
    }
 
# priors
  beta ~ dnorm(0,.0001)
  alpha ~ dnorm(0, .0001)
  tau   ~ dgamma(1e-3,pow(15,2)*1e-3)
  tau.a   ~ dgamma(1e-3,pow(15,2)*1e-3)
  sigma.a <- pow(1/tau,1/2)
}
"
model.spec<-textConnection(model.jags)

Now run the mode and take a look at \(\alpha\) and \(\beta\) and see that they are relative the estimated values are relatively close to the acutal values (-1 and 0.01, respectivly):

jags <- jags.model(model.spec,
                   data = list('ahat' = ahat,
                               'rhat' = rhat,
                               'n' = n),
                   n.chains=4,
                   n.adapt=100)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 829
## 
## Initializing model
samps.jags<-jags.samples(jags,
             c('beta', 'alpha'),
             10000)

samps.jags
## $alpha
## mcarray:
## [1] -0.1204
## 
## Marginalizing over: iteration(10000),chain(4) 
## 
## $beta
## mcarray:
## [1] 3.691
## 
## Marginalizing over: iteration(10000),chain(4)

No let’s look at the estimated \(a_i\)’s and \(r_i\)’s (the latter being our variable of interest)

samps.coda <- coda.samples(jags,
             c('a', 'r', 'beta', 'alpha'),
             10000)
summary.coda<-summary(samps.coda)

plot(samps.coda)

plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda plot of chunk coda

Let’s extract the estimated \(r_i\)’s from the coda samples

r.estimated<-as.vector(summary.coda$statistics[93:nrow(summary.coda$statistics),"Mean"]) 

rs<-data.table(Actual=r, Estimated=r.estimated)
rs[,Residual:=Estimated-Actual]
##     Actual Estimated Residual
##  1:  210.5     211.5   0.9988
##  2:  224.7     212.6 -12.0525
##  3:  201.0     210.1   9.0722
##  4:  215.7     211.9  -3.7922
##  5:  203.4     210.6   7.1646
##  6:  229.2     212.7 -16.4674
##  7:  222.6     212.7  -9.9184
##  8:  221.7     212.1  -9.6718
##  9:  222.5     212.2 -10.2730
## 10:  202.0     210.4   8.4501
## 11:  193.5     210.0  16.4241
## 12:  213.0     211.5  -1.4872
## 13:  214.2     211.8  -2.4614
## 14:  213.3     211.4  -1.8927
## 15:  213.1     211.4  -1.6952
## 16:  206.4     211.2   4.7035
## 17:  229.3     213.2 -16.0826
## 18:  209.6     211.0   1.4385
## 19:  216.7     211.9  -4.7118
## 20:  204.7     211.2   6.4459
## 21:  209.2     211.5   2.2283
## 22:  208.2     211.1   2.8972
## 23:  212.6     211.5  -1.0663
## 24:  209.5     211.2   1.7171
## 25:  209.8     211.1   1.2855
## 26:  207.6     211.0   3.4100
## 27:  215.0     212.2  -2.7841
## 28:  208.1     211.0   2.8786
## 29:  208.1     210.8   2.7024
## 30:  208.0     211.1   3.0994
## 31:  207.2     210.9   3.7604
## 32:  215.4     211.8  -3.6098
## 33:  213.8     211.4  -2.3702
## 34:  215.1     212.0  -3.1475
## 35:  215.3     212.4  -2.8601
## 36:  223.7     212.1 -11.5720
## 37:  211.8     211.5  -0.2844
## 38:  212.0     211.7  -0.2179
## 39:  218.1     211.7  -6.4019
## 40:  224.3     212.3 -11.9642
## 41:  211.3     210.9  -0.4545
## 42:  216.6     212.1  -4.5519
## 43:  210.6     211.3   0.7114
## 44:  204.2     210.5   6.2880
## 45:  205.0     210.8   5.7614
## 46:  200.4     210.5  10.0298
## 47:  209.8     211.5   1.6949
## 48:  213.0     211.6  -1.3659
## 49:  218.3     211.9  -6.3417
## 50:  207.9     210.7   2.8495
## 51:  215.0     211.3  -3.6801
## 52:  217.8     212.0  -5.7443
## 53:  211.7     211.6  -0.1424
## 54:  203.4     210.5   7.1612
## 55:  213.2     211.2  -2.0624
## 56:  215.3     211.7  -3.5414
## 57:  215.0     211.4  -3.6403
## 58:  210.6     211.3   0.7520
## 59:  204.3     211.0   6.6555
## 60:  199.7     210.4  10.6870
## 61:  216.5     212.0  -4.4865
## 62:  213.5     211.5  -2.0410
## 63:  216.8     211.9  -4.9349
## 64:  207.0     211.1   4.1449
## 65:  198.1     210.1  11.9984
## 66:  218.8     212.3  -6.4710
## 67:  214.9     212.1  -2.7952
## 68:  209.8     212.0   2.1165
## 69:  214.5     212.1  -2.3816
## 70:  213.6     211.5  -2.0547
## 71:  202.1     210.2   8.0917
## 72:  215.9     211.8  -4.0364
## 73:  206.5     210.7   4.2456
## 74:  215.4     212.0  -3.3714
## 75:  212.4     211.5  -0.9000
## 76:  224.3     212.4 -11.9358
## 77:  211.4     211.3  -0.1713
## 78:  216.2     211.7  -4.4368
## 79:  226.6     212.7 -13.8953
## 80:  215.2     211.6  -3.6171
## 81:  206.2     210.8   4.5578
## 82:  215.6     211.9  -3.6425
## 83:  221.2     212.4  -8.8572
## 84:  215.6     212.0  -3.6298
## 85:  210.8     211.6   0.7580
## 86:  205.4     210.7   5.3451
## 87:  211.9     211.4  -0.4912
## 88:  207.1     211.8   4.7080
## 89:  207.9     210.9   2.9683
## 90:  210.0     211.7   1.7122
##     Actual Estimated Residual
rs
##     Actual Estimated Residual
##  1:  210.5     211.5   0.9988
##  2:  224.7     212.6 -12.0525
##  3:  201.0     210.1   9.0722
##  4:  215.7     211.9  -3.7922
##  5:  203.4     210.6   7.1646
##  6:  229.2     212.7 -16.4674
##  7:  222.6     212.7  -9.9184
##  8:  221.7     212.1  -9.6718
##  9:  222.5     212.2 -10.2730
## 10:  202.0     210.4   8.4501
## 11:  193.5     210.0  16.4241
## 12:  213.0     211.5  -1.4872
## 13:  214.2     211.8  -2.4614
## 14:  213.3     211.4  -1.8927
## 15:  213.1     211.4  -1.6952
## 16:  206.4     211.2   4.7035
## 17:  229.3     213.2 -16.0826
## 18:  209.6     211.0   1.4385
## 19:  216.7     211.9  -4.7118
## 20:  204.7     211.2   6.4459
## 21:  209.2     211.5   2.2283
## 22:  208.2     211.1   2.8972
## 23:  212.6     211.5  -1.0663
## 24:  209.5     211.2   1.7171
## 25:  209.8     211.1   1.2855
## 26:  207.6     211.0   3.4100
## 27:  215.0     212.2  -2.7841
## 28:  208.1     211.0   2.8786
## 29:  208.1     210.8   2.7024
## 30:  208.0     211.1   3.0994
## 31:  207.2     210.9   3.7604
## 32:  215.4     211.8  -3.6098
## 33:  213.8     211.4  -2.3702
## 34:  215.1     212.0  -3.1475
## 35:  215.3     212.4  -2.8601
## 36:  223.7     212.1 -11.5720
## 37:  211.8     211.5  -0.2844
## 38:  212.0     211.7  -0.2179
## 39:  218.1     211.7  -6.4019
## 40:  224.3     212.3 -11.9642
## 41:  211.3     210.9  -0.4545
## 42:  216.6     212.1  -4.5519
## 43:  210.6     211.3   0.7114
## 44:  204.2     210.5   6.2880
## 45:  205.0     210.8   5.7614
## 46:  200.4     210.5  10.0298
## 47:  209.8     211.5   1.6949
## 48:  213.0     211.6  -1.3659
## 49:  218.3     211.9  -6.3417
## 50:  207.9     210.7   2.8495
## 51:  215.0     211.3  -3.6801
## 52:  217.8     212.0  -5.7443
## 53:  211.7     211.6  -0.1424
## 54:  203.4     210.5   7.1612
## 55:  213.2     211.2  -2.0624
## 56:  215.3     211.7  -3.5414
## 57:  215.0     211.4  -3.6403
## 58:  210.6     211.3   0.7520
## 59:  204.3     211.0   6.6555
## 60:  199.7     210.4  10.6870
## 61:  216.5     212.0  -4.4865
## 62:  213.5     211.5  -2.0410
## 63:  216.8     211.9  -4.9349
## 64:  207.0     211.1   4.1449
## 65:  198.1     210.1  11.9984
## 66:  218.8     212.3  -6.4710
## 67:  214.9     212.1  -2.7952
## 68:  209.8     212.0   2.1165
## 69:  214.5     212.1  -2.3816
## 70:  213.6     211.5  -2.0547
## 71:  202.1     210.2   8.0917
## 72:  215.9     211.8  -4.0364
## 73:  206.5     210.7   4.2456
## 74:  215.4     212.0  -3.3714
## 75:  212.4     211.5  -0.9000
## 76:  224.3     212.4 -11.9358
## 77:  211.4     211.3  -0.1713
## 78:  216.2     211.7  -4.4368
## 79:  226.6     212.7 -13.8953
## 80:  215.2     211.6  -3.6171
## 81:  206.2     210.8   4.5578
## 82:  215.6     211.9  -3.6425
## 83:  221.2     212.4  -8.8572
## 84:  215.6     212.0  -3.6298
## 85:  210.8     211.6   0.7580
## 86:  205.4     210.7   5.3451
## 87:  211.9     211.4  -0.4912
## 88:  207.1     211.8   4.7080
## 89:  207.9     210.9   2.9683
## 90:  210.0     211.7   1.7122
##     Actual Estimated Residual

Nicley, the mean residual is -0.7838 and the mean squared residual is 38.6224. Not too shabby!