Specifying a CAT

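We use the R package catR and first explore its capacity to simulate single students. We propose to keep the default values implemented in catR for most of the CAT specification. More precisely, the CAT starts at \(\theta=-1\), selects items by maximum Fisher information (MFI) with randomesque selection among the 10 most informative items, estimates ability with the Bayes modal (BM) estimator, and stops when the SE of the ability estimate falls below SEMTHR or when NMAX items have been administered.

Our code for starting, item selection and stopping, for specific NMAX and SEMTHR values: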

start <- list(theta = -1, startSelect = "MFI", randomesque=10)
test <- list(method = "BM", itemSelect = "MFI", randomesque=10)
NMAX <- 40
SEMTHR <- .2
stop <- list(rule = c("precision","length"), thr = c(SEMTHR, NMAX))
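
To make the combined stopping rule concrete, here is a minimal base R sketch of the decision it encodes. The helper should_stop is hypothetical and not part of catR, and whether catR treats the precision threshold as strict or non-strict at the boundary is an assumption here.

```r
# Hypothetical helper (not a catR function): the test ends as soon as
# either the precision criterion or the length criterion is met.
should_stop <- function(se, n_items, sem_thr = 0.2, n_max = 40) {
  (se < sem_thr) || (n_items >= n_max)
}

should_stop(se = 0.35, n_items = 12)  # FALSE: still imprecise, still short
should_stop(se = 0.19, n_items = 12)  # TRUE: precision reached
should_stop(se = 0.35, n_items = 40)  # TRUE: maximum length reached
```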

A cautionary comment on item bank precision

Note that the SE is computed as if the item parameters in the bank were known population values. In Adaptvurder, however, many items were calibrated on only 200-300 students, so the item parameters themselves are estimated with considerable noise. Because this calibration uncertainty is ignored, the SE reported by the CAT will be biased downward.
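
Schematically, and assuming a Bayes modal estimate with a normal prior of variance \(\sigma^2\), the reported SE has the form below. The symbols \(I_j\) (information of administered item \(j\)) and \(\hat{\xi}_j\) (its estimated parameters) are notation introduced here for illustration.

\[
\mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{\sum_{j} I_j(\hat{\theta};\, \hat{\xi}_j) + 1/\sigma^2}}
\]

The calibrated values \(\hat{\xi}_j\) are plugged in as if they were exact, so the sampling variability from calibrating on 200-300 students does not enter the formula.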

Simulating individual students

Let us simulate three students, at ability levels \(\theta=-1, 0\) and \(+1\). We stop at 40 items, or when the SE of the ability estimate is less than 0.2. The responses of the weakest student may be simulated as follows:

res1 <- randomCAT(trueTheta=-1, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci=TRUE)


The responses of the medium-ability student may be simulated as follows:

res1 <- randomCAT(trueTheta=0, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci=TRUE)


The responses of the strong student may be simulated as follows:

res1 <- randomCAT(trueTheta=+1, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci=TRUE)

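
To compare the three runs numerically rather than through plots, the final estimate, its SE and the number of administered items can be collected in one table. The sketch below is an illustration under assumptions: a synthetic 2PL bank generated with genDichoMatrix stands in for itempool, which is not reproduced here, and summarise_cat is a hypothetical helper. The numbers it prints therefore do not correspond to the Adaptvurder item bank.

```r
library(catR)

# Assumption: a synthetic 2PL bank replaces `itempool` for illustration only.
bank <- as.matrix(genDichoMatrix(items = 300, model = "2PL", seed = 1))

start <- list(theta = -1, startSelect = "MFI", randomesque = 10)
test  <- list(method = "BM", itemSelect = "MFI", randomesque = 10)
stop  <- list(rule = c("precision", "length"), thr = c(0.2, 40))

# Hypothetical helper: run one simulated CAT and keep the key results.
summarise_cat <- function(true_theta) {
  res <- randomCAT(trueTheta = true_theta, itemBank = bank, genSeed = 1,
                   start = start, test = test, stop = stop)
  c(trueTheta = true_theta,
    thFinal   = res$thFinal,             # final ability estimate
    seFinal   = res$seFinal,             # its standard error
    nItems    = length(res$testItems))   # number of items administered
}

out <- t(sapply(c(-1, 0, 1), summarise_cat))
print(round(out, 3))
```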

Students with high achievement and the lack of difficult items

We see that the high-ability student had to answer (correctly) a long sequence of items. This is because the item pool contains few difficult items, so the CAT cannot find items that are informative at high ability levels.

We should therefore avoid this in our CAT by stopping when the confidence interval for the current ability estimate lies entirely above a cut-off, e.g., \(\theta=1\). With such a rule, the strong student above would stop the CAT after approximately 20 items.
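
A minimal base R sketch of such a check follows. The function ci_above_cut is hypothetical, and the 95% level and the normal-approximation interval are assumptions. catR also offers a built-in "classification" stopping rule which, as far as we understand, stops when the interval excludes the threshold on either side, so it is related to but not the same as the one-sided rule sketched here.

```r
# Hypothetical helper: TRUE when the whole confidence interval for the
# current ability estimate lies above the cut-off.
ci_above_cut <- function(theta_hat, se, cut = 1, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  (theta_hat - z * se) > cut        # compare the lower bound with the cut-off
}

ci_above_cut(theta_hat = 1.8, se = 0.35)  # TRUE: lower bound is about 1.11
ci_above_cut(theta_hat = 1.5, se = 0.35)  # FALSE: lower bound is about 0.81
```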

Full scale simulations for stopping rules

We keep the parameters as above and simulate \(n=10^4\) students in four stopping conditions, obtained by crossing the maximum test length (NMAX = 25 or 40) with the SE threshold (0.2 or 0.3).
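
The four conditions that appear in the output and the summary table (NMAX of 25 or 40, SE threshold of 0.2 or 0.3) can be enumerated with expand.grid. This is a sketch of one way to build the corresponding stop lists; the names conditions and stop_rules are introduced here for illustration and are not taken from the original code.

```r
# Cross the two maximum lengths with the two SE thresholds.
conditions <- expand.grid(NMAX = c(25, 40), SEMTHR = c(0.2, 0.3))

# One catR-style stop list per condition; thr follows the order of rule.
stop_rules <- lapply(seq_len(nrow(conditions)), function(i) {
  list(rule = c("precision", "length"),
       thr  = c(conditions$SEMTHR[i], conditions$NMAX[i]))
})

print(conditions)
str(stop_rules[[1]])
```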

Response variables that we are interested in

## catR produced plot, NMAX= 25 SE= 0.2
## catR produced plot, NMAX= 25 SE= 0.3
## catR produced plot, NMAX= 40 SE= 0.2
## catR produced plot, NMAX= 40 SE= 0.3

Plots of true vs. estimated

library(ggplot2)

# True vs. estimated ability, one panel per stopping condition
ggplot(est.df, aes(truetheta, estimatedtheta, color=NMAX)) +
  geom_abline(slope=1) +
  geom_point() +
  facet_wrap(NMAX~SE) +
  xlab("True ability") +
  ylab("Estimated ability")

Overall summaries

knitr::kable(agg.df)
NMAX   SE  totitemsused       corr        bias  testlength
  25  0.2           159  0.9824933  -0.0225624     18.1659
  25  0.3           135  0.9669542  -0.0095131     11.0806
  40  0.2           181  0.9843386  -0.0123449     22.7196
  40  0.3           147  0.9679494  -0.0117515     12.7949
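
For readers who want to see how summaries of this kind relate to the simulated abilities, here is a base R sketch on toy data. The data are generated purely for illustration and are not the simulation results above, and the definition of bias as the mean of estimated minus true ability is an assumption about the convention used.

```r
set.seed(1)
n <- 1000
true_theta <- rnorm(n)                        # toy true abilities
est_theta  <- true_theta + rnorm(n, sd = 0.2) # toy estimates with noise

err <- est_theta - true_theta
summ <- c(corr = cor(true_theta, est_theta),  # agreement between true and estimated
          bias = mean(err),                   # average signed error (assumed sign convention)
          rmse = sqrt(mean(err^2)))           # root mean squared error
print(round(summ, 4))
```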

Conditional Test length

ggplot(cond.df, aes(condTheta, condnItems, color=NMAX, linetype=SE)) +
  geom_line() +
  xlab("True ability") +
  ylab("Mean test length")

Conditional Bias

ggplot(cond.df, aes(condTheta, condBias, color=NMAX, linetype=SE)) +
  geom_line() +
  xlab("True ability") +
  ylab("Bias")

Conditional RMSE

ggplot(cond.df, aes(condTheta, condRMSE, color=NMAX, linetype=SE)) +
  geom_line() +
  xlab("True ability") +
  ylab("RMSE")

Conditional SE

ggplot(cond.df, aes(condTheta, condSE, color=NMAX, linetype=SE)) +
  geom_line() +
  xlab("True ability") +
  ylab("SE")
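
The conditional quantities plotted above are summaries within groups of students with similar true ability. The base R sketch below illustrates the idea on toy data, assuming that the groups are deciles of true ability. The data are generated for illustration only, and the column names simply mirror those of cond.df.

```r
set.seed(1)
true_theta <- rnorm(5000)                          # toy true abilities
est_theta  <- true_theta + rnorm(5000, sd = 0.25)  # toy estimates with noise

# Assign each student to a decile of true ability (assumed grouping).
breaks <- quantile(true_theta, probs = seq(0, 1, by = 0.1))
decile <- cut(true_theta, breaks = breaks, include.lowest = TRUE, labels = FALSE)

err <- est_theta - true_theta
cond <- data.frame(
  condTheta = as.numeric(tapply(true_theta, decile, mean)),  # mean ability per group
  condBias  = as.numeric(tapply(err, decile, mean)),         # mean signed error per group
  condRMSE  = as.numeric(tapply(err, decile, function(e) sqrt(mean(e^2))))
)
print(round(cond, 3))
```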