Specifying a CAT

We use the R package catR and first explore its capacity to simulate single students. We propose to use the default values implemented in catR for most of the CAT specification. More precisely, the CAT is specified as follows.

Start, i.e., choosing the first item
- We start by assuming that the student has rather low ability, setting the initial ability to \(\theta=-1\) on a standard normal scale. This means that most students will start by answering correctly.
- Then, based on the initial \(\theta\), we choose the starting item randomly among the ten most informative items, in terms of the MFI (maximum Fisher information) criterion. This randomesque approach ensures that not all students start with the exact same item.
Selection of next item.
- Ability estimation at each step is done using Bayes modal (BM) estimation (Birnbaum, 1969).
- Then, we use MFI again, and choose the next item randomly among the ten most informative items not yet administered. The randomesque approach ensures exposure to more items in the bank, which should improve the overall validity of our instrument.
Stop criteria. The test stops when either
- the maximum number of items NMAX has been reached, or
- the standard error of the current ability estimate is below a threshold SEMTHR.
Final ability estimate.
- The final ability estimate is computed using BM.
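
The randomesque MFI selection described above can be sketched in base R as follows. This is a minimal illustration and not catR's implementation: it assumes a 2PL model (the model used for the item bank is not stated here), and the names item_info_2pl, select_randomesque and toy_bank are hypothetical.

```r
# 2PL item information at ability theta: a^2 * P * (1 - P)
# (assumed model; used here for illustration only)
item_info_2pl <- function(theta, a, b) {
  p <- plogis(a * (theta - b))
  a^2 * p * (1 - p)
}

# Pick one item at random among the k most informative unused items
select_randomesque <- function(theta, bank, administered = integer(0), k = 10) {
  info <- item_info_2pl(theta, bank$a, bank$b)
  info[administered] <- -Inf                      # exclude items already given
  n_avail <- sum(is.finite(info))
  top_k <- order(info, decreasing = TRUE)[seq_len(min(k, n_avail))]
  top_k[sample.int(length(top_k), 1)]             # uniform draw among the top k
}

# Hypothetical 50-item bank, for illustration only
set.seed(1)
toy_bank <- data.frame(a = runif(50, 0.8, 2), b = rnorm(50))
first_item <- select_randomesque(theta = -1, bank = toy_bank)
```

Drawing uniformly among the top ten, rather than always taking the single best item, trades a little information per item for lower exposure of the most informative items.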

Our code for starting, item selection and stopping, for specific NMAX and SEMTHR values:

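library(catR)

# Starting rule: initial ability -1, first item drawn among the 10 most informative (MFI)
start <- list(theta = -1, startSelect = "MFI", randomesque = 10)
# Testing rule: Bayes modal estimation, next item drawn among the 10 most informative (MFI)
test <- list(method = "BM", itemSelect = "MFI", randomesque = 10)
NMAX <- 40     # maximum test length
SEMTHR <- 0.2  # standard error threshold
# Stop as soon as either the precision or the length criterion is met
stop <- list(rule = c("precision", "length"), thr = c(SEMTHR, NMAX))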
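
To illustrate what Bayes modal estimation computes at each step, here is a small base-R sketch. It is not catR's code: it assumes a 2PL model and a standard normal prior, and the function names log_posterior and bayes_modal are hypothetical.

```r
# Log-posterior of theta: 2PL log-likelihood plus N(0, 1) log-prior
# (both the model and the prior are assumptions for this sketch)
log_posterior <- function(theta, x, a, b) {
  p <- plogis(a * (theta - b))
  sum(x * log(p) + (1 - x) * log(1 - p)) + dnorm(theta, 0, 1, log = TRUE)
}

# Bayes modal estimate: the value of theta that maximises the log-posterior
bayes_modal <- function(x, a, b, range = c(-4, 4)) {
  optimize(log_posterior, interval = range, x = x, a = a, b = b,
           maximum = TRUE)$maximum
}

# Three hypothetical items with unit discrimination
a <- c(1, 1, 1)
b <- c(-1, 0, 1)
theta_low  <- bayes_modal(x = c(0, 0, 0), a = a, b = b)  # all incorrect
theta_high <- bayes_modal(x = c(1, 1, 1), a = a, b = b)  # all correct
```

Because of the prior, the estimate stays finite even when all responses are correct or all are incorrect, which is what makes this estimator usable from the first item onward.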

A cautionary comment on item bank precision

It should be noted that the SE is computed by treating the item parameters in the bank as known population values. However, in Adaptvurder, many items were calibrated on only 200-300 students, so the item parameter estimates are noisy. Hence, the SE calculated by the CAT will be biased downward.
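
To make explicit where the item parameters enter, consider a common approximation to the SE of a Bayes modal estimate. The 2PL model and the standard normal prior are assumptions made for this illustration, and \(S\) denotes the set of administered items:

```latex
\[
\mathrm{SE}(\hat\theta) \approx \Bigl(1 + \sum_{j \in S} I_j(\hat\theta)\Bigr)^{-1/2},
\qquad
I_j(\theta) = a_j^2 \, P_j(\theta)\,\bigl(1 - P_j(\theta)\bigr),
\qquad
P_j(\theta) = \frac{1}{1 + e^{-a_j(\theta - b_j)}} .
\]
```

The expression depends only on \(\hat\theta\) and the bank values \(a_j, b_j\). Calibration error in \(a_j\) and \(b_j\) does not appear anywhere in it, so that source of uncertainty is not reflected in the reported SE.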

Simulating individual students

Let us simulate three students, at ability levels \(\theta=-1, 0\) and \(+1\). We stop at 40 items, or when the SE of the ability estimate is less than 0.2. The responses of the weakest student may be simulated as follows:

res1 <- randomCAT(trueTheta=-1, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci = TRUE)

The responses of the medium student may be simulated as follows:

res1 <- randomCAT(trueTheta=0, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci = TRUE)

The responses of the strong student may be simulated as follows:

res1 <- randomCAT(trueTheta=+1, itemBank = itempool, genSeed=1,
                  start=start, test=test, stop=stop)
plot(res1, ci = TRUE)

Students with high achievement and the lack of difficult items

We see that the high ability student had to answer (correctly) a long sequence of items. This is because item difficulty in the pool is distributed as follows

We should therefore avoid such long sequences in our CAT by also stopping when the confidence interval for the current ability estimate lies entirely above a cut-off, e.g., \(\theta=1\). With this rule, the strong student above would stop the CAT after approximately 20 items.
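
A rule of this kind could be sketched in base R as below. This is only an illustration under stated assumptions: a normal-theory 95% confidence interval is used, and the function names ci_above_cutoff and should_stop, as well as their default values, are hypothetical and not part of catR.

```r
# TRUE when the whole confidence interval lies above the cut-off
ci_above_cutoff <- function(theta_hat, se, cutoff = 1, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  (theta_hat - z * se) > cutoff
}

# Combined rule: precision reached, maximum length reached,
# or confidence interval entirely above the cut-off
should_stop <- function(theta_hat, se, n_items,
                        se_thr = 0.2, n_max = 40, cutoff = 1) {
  se < se_thr || n_items >= n_max || ci_above_cutoff(theta_hat, se, cutoff)
}

should_stop(theta_hat = 2.0, se = 0.4, n_items = 20)  # lower CI bound above 1
should_stop(theta_hat = 1.5, se = 0.4, n_items = 20)  # lower CI bound below 1
```

The extra condition only shortens tests for students whose estimate is clearly above the cut-off. Students near or below it are still governed by the precision and length criteria.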

Full scale simulations for stopping rules

We keep the parameters as above, and simulate \(n=10^4\) students in four stopping conditions, obtained by crossing

- Max test length. We include NMAX=25 and NMAX=40.
- SE precision threshold. We include SEMTHR=0.2 and SEMTHR=0.3.
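
The crossing of the two factors can be written down explicitly. The sketch below only builds the design grid and the stop list for each condition. The object names conditions and stop_rules are hypothetical, and the values are those reported in the simulation output further down.

```r
# Fully crossed design: 2 maximum lengths x 2 SE thresholds = 4 conditions
conditions <- expand.grid(NMAX = c(25, 40), SEMTHR = c(0.2, 0.3))

# One stop list per condition, in the format used for the stop object above
stop_rules <- lapply(seq_len(nrow(conditions)), function(i) {
  list(rule = c("precision", "length"),
       thr  = c(conditions$SEMTHR[i], conditions$NMAX[i]))
})
```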

Response variables that we are interested in:

- Average test length as a function of ability
- Bias as a function of ability
- RMSE as a function of ability
- Item exposure
- Standard error of the ability estimate as a function of true ability
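
For reference, conditional bias and RMSE can be computed from true and estimated abilities as in the following base-R sketch. The data are synthetic and generated only for illustration. The names true_theta, est_theta and cond are hypothetical and are not the objects used in this report.

```r
set.seed(1)
# Synthetic example: true abilities on a grid, estimates with added noise
true_theta <- rep(seq(-2, 2, by = 1), each = 200)
est_theta  <- true_theta + rnorm(length(true_theta), sd = 0.3)

err <- est_theta - true_theta
cond <- data.frame(
  theta = sort(unique(true_theta)),
  # bias: mean estimation error at each true ability level
  bias  = as.numeric(tapply(err, true_theta, mean)),
  # RMSE: root mean squared estimation error at each true ability level
  rmse  = as.numeric(tapply(err, true_theta, function(e) sqrt(mean(e^2))))
)
```

At each ability level the RMSE is at least as large as the absolute bias, since it also includes the sampling variability of the estimate.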

[catR produced plot, NMAX = 25, SE = 0.2: plot not captured]

[catR produced plot, NMAX = 25, SE = 0.3: plot not captured]

[catR produced plot, NMAX = 40, SE = 0.2: plot not captured]

[catR produced plot, NMAX = 40, SE = 0.3: plot not captured]

Plots of true vs. estimated

ggplot(est.df, aes(truetheta, estimatedtheta, color=NMAX))+geom_abline(slope=1)+geom_point()+facet_wrap(NMAX~SE)+xlab("True ability")+ylab("Estimated ability")

Overall summaries

knitr::kable(agg.df)

NMAX	SE	totitemsused	corr	bias	testlength
25	0.2	159	0.9824933	-0.0225624	18.1659
25	0.3	135	0.9669542	-0.0095131	11.0806
40	0.2	181	0.9843386	-0.0123449	22.7196
40	0.3	147	0.9679494	-0.0117515	12.7949

Conditional Test length

ggplot(cond.df, aes(condTheta, condnItems, color=NMAX, linetype=SE))+geom_line()+xlab("True ability")+ylab("Mean test length")

Conditional Bias

ggplot(cond.df, aes(condTheta, condBias, color=NMAX, linetype=SE))+geom_line()+xlab("True ability")+ylab("Bias")

Conditional RMSE

ggplot(cond.df, aes(condTheta, condRMSE, color=NMAX, linetype=SE))+geom_line()+xlab("True ability")+ylab("RMSE")

Conditional SE

ggplot(cond.df, aes(condTheta, condSE, color=NMAX, linetype=SE))+geom_line()+xlab("True ability")+ylab("Standard error")

Simulations for Ordlesing

N Foldnes

28 aug 2020