CTT Failures

Question 1

Use the ctt_failures source on github to explore what I think is a crucial failure of CTT. Note: This problem is challenging. But, if you stick with it I think you can really get some great insight about why the CTT model is, in some sense, incomplete about what we expect from item responses.

##a helper function we'll use along the way.
##kr20 is a function that computes the kr20 version of cronbach's alpha
kr20<-function(resp) {
    k<-ncol(resp)
    p<-colMeans(resp,na.rm=TRUE)
    q<-1-p
    o<-rowSums(resp)
    (k/(k-1))*(1-sum(p*q)/var(o))
}

The goal here is to get ‘under the hood’ of ctt and illustrate some potential shortcomings of ctt. We’ll do that in two ways, the first mainly as a jumping off point for the second.

First, we’ll flip coins. a single person flips an even coin (p=0.5) N.items times. that is their string of item responses then the next person does the same. we generate a large bank of item responses for many people this way what kind of reliability will this item response data have? Before running the below, generate a hypothesis about what you think will happen.

My hypothesis is that the data will be pretty reliable: people should have a 50% chance of the item landed on heads or tails. I would imagine that cronback’s alpha should have some high values?

par(mfrow=c(2,3),mgp=c(2,1,0),mar=c(3,3,2,1))
for (N.items in seq(20,50,by=15)) { 
    for (N.ppl in c(100,1000)) {
        alph<-numeric()
        for (i in 1:25) {
            resp<-matrix(rbinom(N.items*N.ppl,size=1,pr=.5),N.ppl,N.items)
            alph[i]<-kr20(resp)
        }
        boxplot(alph,main=paste(N.items,N.ppl),ylim=c(-.20,1))
        abline(h=mean(alph))
    }
}
mtext(side=1,line=.5,"from coin flips")

Question: What happened? Did this meet your expectations?

Seems that the value are pretty low? (Which is confusing)

Now, we were kind of cheating with the coins. We didn’t really impose any structure on the thing. Thus, this isn’t really demonstrating something too weird about ctt, just that coin flips aren’t reliable measures!

Let’s now give ourselves a stiffer challenge. Let’s generate item response data that has:

A. a pre-specified reliability (in terms of the percentage of observed variation that is due to true score variation) B. and yet weird inter-workings such that reliability estimates fall apart completely.

Please distinguish between A and B. The data will be simulated such that the true reliability is what it should be. It’s just that estimates of reliability from KR20 require more (without these additional requirements being specified in the CTT model). It’s precisely this bit of wiggle room that we’re going to exploit.

C. [i don’t really emphasize this in the below, but you can also (roughly) specify the distribution of item p-values. see the ‘pr’ argument and my comments about it below.]

Below, I’m going to do in a number of different ways just that using this function:

sim_ctt<-function(N.items, #N.items needs to be even
                  N.ppl,
                  s2.true, #variance of true scores
                  s2.error, #variance of error scores
                  check=FALSE #this will produce a figure that you can use to check that things are working ok
                  ) { 
    ##desired reliability
    reliability<-(s2.true)/(s2.error+s2.true)
    ##sample (unobserved) true scores & errors
    T<-round(rnorm(N.ppl,mean=round(N.items/2),sd=sqrt(s2.true)))
    E<-round(rnorm(N.ppl,mean=0,sd=sqrt(s2.error)))
    O<-T+E #the CTT model in all its glory
    O<-ifelse(O<0,0,O) #kind of cheating here, don't want scores below 0
    O<-ifelse(O>N.items,N.items,O) #or above N.items
    ##note that we already know how many items a person got right (O)
    ##CTT doesn't specify how individual item responses are generated <evil grin>
    ##so we have carte blanche in terms of coming up with item responses. here's how i'm going to do it. 
    pr<-runif(N.items,min=.25,max=.85)
    pr.interval<-c(0,cumsum(pr))/sum(pr)
    resp<-list()
    for (i in 1:N.ppl) { #for each person
        resp.i<-rep(0,N.items) #start them with all 0s
        if (O[i]>0) { #only do the below for people that got at least 1 item right
            for (j in 1:O[i]) { #now for each of the observed correct responses (for those who got at least 1 item right) i'll pick an item to mark correct
                init.val<-1
                while (init.val==1) {
                    ##what comes next is key
                    ##i first randomly pick an item to get the correct response.
                    ##so as to match the imposed distribution of p-values (see 'pr'), i weight the items by the "pr.interval" argument that is random here but could be pre-specified by the user
                    index<-cut(runif(1),pr.interval,labels=FALSE)
                    ##but i want to make sure that i have picked a new item for this person to get right, so the next line checks that (and doesn't break out of the loop until it is a new item)
                    init.val<-resp.i[index]
                    ##what i've done that is a problem for CTT is that i picked an item in a way that imposes no structure.
                    ##i picked responses so that p-value distributions for items match up, but people with relatively few responses are just as likely to get hard items right as people with large numbers of responses. CTT doesn't demand that this happen, just kind of assumes it will.
                }
                resp.i[index]<-1 #assign a correct response to the 'index' item
            }
        }
        resp[[i]]<-resp.i
    }
    resp<-do.call("rbind",resp)
    ##note that we are getting both the p-values and the true score/observed score relationships right!
    if (check) {
        par(mfrow=c(2,1))
        plot(T,rowSums(resp),xlab="true score",ylab="observed scores"); abline(0,1)
        plot(pr,colMeans(resp),xlab="pr",ylab="observed p-values"); abline(0,1)
    }
    obs.rel<-cor(T,O)^2 #we can use this to make sure that the true and observed score variances are right.
    print(c(reliability,obs.rel)) ##for real time checking, these are the pre-specified reliability and what we get from correlation our true and observed scores
    kr<-kr20(resp)
    c(obs.rel,kr) #the function returns true reliability (obs.rel) and kr20 estimate (kr)
}

To illustrate the basic idea here, we’ll generate 10 sets of item response data. For each iteration, we’ll use the same parameters, these are below.

N.items<-50
N.ppl<-500
s2.true<-10
s2.error<-3
alph<-list()
alph[[1]]<-sim_ctt(N.items=N.items,N.ppl=N.ppl,s2.true=s2.true,s2.error=s2.error,check=TRUE) #note, the check option is going to produce a figure that allows us to check the marginals to make sure things are looking ok

## [1] 0.7692308 0.7628387

for (i in 2:10) {
    alph[[i]]<-sim_ctt(N.items=N.items,N.ppl=N.ppl,s2.true=s2.true,s2.error=s2.error,check=TRUE) #note, the check option is going to produce a figure that allows us to check the marginals to make sure things are looking ok
}

## [1] 0.7692308 0.7743875

## [1] 0.7692308 0.7426650

## [1] 0.7692308 0.7522319

## [1] 0.7692308 0.7777324

## [1] 0.7692308 0.7668569

## [1] 0.7692308 0.7991082

## [1] 0.7692308 0.7514431

## [1] 0.7692308 0.7657118

## [1] 0.7692308 0.7611820

alph<-do.call("rbind",alph)
plot(alph,xlim=c(0,1),ylim=c(-.2,1),pch=19,xlab="true correlation",ylab="kr20"); abline(0,1)
abline(v=s2.true/(s2.true+s2.error),col="red")

what you can see here is that the item response datasets have the right property when it comes to the observed correlation between O and T (these correlations are the dots; the red line is the pre-specified reliability) but, kr20 is producing reliability estimates that are way low

Now let’s look at reliability for many different datasets generated by the above wherein we vary the true and error variances

par(mfrow=c(3,3),mgp=c(2,1,0),mar=c(3,3,2,1))
N.ppl<-500
N.items<-50
for (s2.true in N.items*c(.1,.5,.9)) {
    for (s2.error in N.items*c(.1,.25,.5)) {
        alph<-list()
        for (i in 1:10) {
            alph[[i]]<-sim_ctt(N.items=N.items,N.ppl=N.ppl,s2.true=s2.true,s2.error=s2.error,check=FALSE)
        }
        plot(do.call("rbind",alph),xlim=c(-1,1),ylim=c(-1,1),pch=19,xlab="true correlation",ylab="kr20")
        mtext(side=3,line=0,paste("s2.true",round(s2.true,2),"; s2.error",round(s2.error,2)),cex=.7)
        abline(0,1)
        abline(h=(s2.true)/(s2.error+s2.true),col="red",lty=2)
        abline(v=(s2.true)/(s2.error+s2.true),col="red",lty=2)
    }
}

## [1] 0.5000000 0.4425112
## [1] 0.5000000 0.4925531
## [1] 0.5000000 0.4977432
## [1] 0.5000000 0.4946885
## [1] 0.5000000 0.4768846
## [1] 0.5000000 0.5730116
## [1] 0.5000000 0.5125154
## [1] 0.5000000 0.4913843
## [1] 0.5000000 0.4749807
## [1] 0.5000000 0.5243508

## [1] 0.2857143 0.3005898
## [1] 0.2857143 0.3258713
## [1] 0.2857143 0.2677898
## [1] 0.2857143 0.2637229
## [1] 0.2857143 0.2769540
## [1] 0.2857143 0.2792184
## [1] 0.2857143 0.2849275
## [1] 0.2857143 0.2502152
## [1] 0.2857143 0.2807031
## [1] 0.2857143 0.3642264

## [1] 0.1666667 0.1407724
## [1] 0.1666667 0.1500403
## [1] 0.1666667 0.1872684
## [1] 0.1666667 0.1737596
## [1] 0.1666667 0.1669215
## [1] 0.1666667 0.1709464
## [1] 0.1666667 0.1580920
## [1] 0.1666667 0.1714803
## [1] 0.1666667 0.2035769
## [1] 0.1666667 0.1764712

## [1] 0.8333333 0.8296693
## [1] 0.8333333 0.8246529
## [1] 0.8333333 0.8112068
## [1] 0.8333333 0.8625176
## [1] 0.8333333 0.8398710
## [1] 0.8333333 0.8349608
## [1] 0.8333333 0.7919830
## [1] 0.8333333 0.8192472
## [1] 0.8333333 0.8534260
## [1] 0.8333333 0.8408587

## [1] 0.6666667 0.6890174
## [1] 0.6666667 0.6617273
## [1] 0.6666667 0.6653066
## [1] 0.6666667 0.6708947
## [1] 0.6666667 0.6594193
## [1] 0.6666667 0.6851435
## [1] 0.6666667 0.6416509
## [1] 0.6666667 0.6782722
## [1] 0.6666667 0.6701390
## [1] 0.6666667 0.6632063

## [1] 0.5000000 0.5130183
## [1] 0.5000000 0.4808484
## [1] 0.5000000 0.5444271
## [1] 0.5000000 0.4973229
## [1] 0.5000000 0.5084737
## [1] 0.5000000 0.5140099
## [1] 0.5000000 0.4998438
## [1] 0.5000000 0.5323424
## [1] 0.5000000 0.4201369
## [1] 0.5000000 0.5432854

## [1] 0.9000000 0.9010755
## [1] 0.9000000 0.9019645
## [1] 0.9000000 0.9059819
## [1] 0.9000000 0.8882639
## [1] 0.9000000 0.9109764
## [1] 0.9000000 0.9016721
## [1] 0.9000000 0.9060487
## [1] 0.9000000 0.8929188
## [1] 0.9000000 0.8914882
## [1] 0.9000000 0.8958848

## [1] 0.7826087 0.8187778
## [1] 0.7826087 0.7784783
## [1] 0.7826087 0.7905329
## [1] 0.7826087 0.7860616
## [1] 0.7826087 0.7916301
## [1] 0.7826087 0.7881418
## [1] 0.7826087 0.7868941
## [1] 0.7826087 0.7942631
## [1] 0.7826087 0.7744592
## [1] 0.7826087 0.7794992

## [1] 0.6428571 0.6394125
## [1] 0.6428571 0.6418013
## [1] 0.6428571 0.6463038
## [1] 0.6428571 0.6101220
## [1] 0.6428571 0.6797682
## [1] 0.6428571 0.6505415
## [1] 0.6428571 0.6266310
## [1] 0.6428571 0.6766457
## [1] 0.6428571 0.6758072
## [1] 0.6428571 0.6350470

how does the kr20 estimate of reliability behave as a function of the true and error variances?
what does this make you think of the CTT model?

IRT

Question 2

We’ve simulated data using IRT models a few times although I didn’t say much about it. Consider lines 23-38 here.

##let's start with an empirical dataset
resp1<-read.table("https://github.com/ben-domingue/252L/raw/master/data/emp-rasch.txt",header=FALSE)

##code to simulate item response data. just run this block for a moment, don't feel the need to look at it in detail
#############################################################
##feel free to ignore between the above line and the similar below line
ni<-ncol(resp1)
np<-nrow(resp1)
set.seed(12311) #this is to ensure that we all draw the same random numbers downstream
th<-rnorm(np)
th.mat<-matrix(th,np,ni,byrow=FALSE) #these are the true abilities. we don't observe them, which will be a real problem for us downstream. but we'll not worry about that today. 
a<-1 ##for reasons we'll get into in the coming weeks, let's start with a slope equal to 1
b<-rnorm(ni)
b.mat<-matrix(b,np,ni,byrow=TRUE) #these are the item difficulties, also something we don't typically observe with empirical data
inv_logit<-function(x) exp(x)/(1+exp(x))
pr<-inv_logit(a*th.mat+b.mat) ##the probability of a correct response
##watch closely now
test<-matrix(runif(ni*np),np,ni) #what do we have here?
resp2<-ifelse(pr>test,1,0) #what bit of magic was this?
#############################################################

In line 34, we are making a specific assumption about the parametric form linking theta to item response probabilities function. One can imagine other choices being made. Studies have been done on sensitivity to the normal CDF being used, for example. What about other CDFs? Consider the degree to which generating data via something other than the inverse logit link might lead to bias or other abnormalities in parameter estimation. You want to consider links with heavier tails, non-symmetric CDFs (skewed normal), or even non-monotonic links. I’ve started to illustrate some examples in the different_links.R source, but feel free to push past this.

The key questions:

Is there anything magical about the logistic as the function we use to transform theta-b_i onto the unit interval of probabilities?

What happens if there is a discrepancy between the data generating model and the model assumed in parameter recovery (i.e., estimation)?

sim_data<-function( #this will simulate item response data using whatever link you send it
                   b, #item difficulties
                   np=1000,
                   link=function(x) exp(x)/(1+exp(x))
                   )
{
    ni<-length(b)
    th<-rnorm(np)
    th.mat<-matrix(th,np,ni,byrow=FALSE) 
    a<-1
    b.mat<-matrix(b,np,ni,byrow=TRUE) # these are the item difficulties
    ## now the probability of a correct response is:
    pr<-link(a*th.mat+b.mat) ## look very carefully here at how b.mat is being included. is this what you expect? if not, take a look at ?mirt (specifically the form used for the Rash item type).
    test<-matrix(runif(ni*np),np,ni)
    resp<-ifelse(pr>test,1,0)
    colnames(resp)<-paste("i",1:ncol(resp))
    resp
}


##first, the default
b<-rnorm(50)
resp<-sim_data(b=b) # note that this will rely on the default logistic link 
library(mirt)

## Loading required package: stats4

## Loading required package: lattice

m<-mirt(resp,1,itemtype="Rasch") #if you check the mirt documentation, itemtype="Rasch" ends up meaning that we are assuming the correct model for estimation

## 
Iteration: 1, Log-Lik: -27622.731, Max-Change: 0.05273
Iteration: 2, Log-Lik: -27622.282, Max-Change: 0.00145
Iteration: 3, Log-Lik: -27622.281, Max-Change: 0.00051
Iteration: 4, Log-Lik: -27622.281, Max-Change: 0.00015
Iteration: 5, Log-Lik: -27622.281, Max-Change: 0.00007

get_coef<-function(mod) {
    coef(mod)->co
    co[-length(co)]->co
    do.call("rbind",co)
}
plot(b,get_coef(m)[,2],xlab="true diff",ylab="est diff"); abline(0,1) #now we are going to compare true and estimated item parameters

##now, the normal cdf
b<-rnorm(50)
resp<-sim_data(b=b,link=pnorm) #note we are sending a different link. what is this sigmoid?
m<-mirt(resp,1,itemtype="Rasch")

## 
Iteration: 1, Log-Lik: -21426.993, Max-Change: 0.88674
Iteration: 2, Log-Lik: -21106.421, Max-Change: 0.58861
Iteration: 3, Log-Lik: -21045.715, Max-Change: 0.28275
Iteration: 4, Log-Lik: -21035.112, Max-Change: 0.12195
Iteration: 5, Log-Lik: -21033.261, Max-Change: 0.05074
Iteration: 6, Log-Lik: -21032.920, Max-Change: 0.02083
Iteration: 7, Log-Lik: -21032.855, Max-Change: 0.00705
Iteration: 8, Log-Lik: -21032.830, Max-Change: 0.00428
Iteration: 9, Log-Lik: -21032.820, Max-Change: 0.00179
Iteration: 10, Log-Lik: -21032.810, Max-Change: 0.00100
Iteration: 11, Log-Lik: -21032.806, Max-Change: 0.00077
Iteration: 12, Log-Lik: -21032.802, Max-Change: 0.00070
Iteration: 13, Log-Lik: -21032.787, Max-Change: 0.00051
Iteration: 14, Log-Lik: -21032.785, Max-Change: 0.00047
Iteration: 15, Log-Lik: -21032.784, Max-Change: 0.00044
Iteration: 16, Log-Lik: -21032.778, Max-Change: 0.00033
Iteration: 17, Log-Lik: -21032.777, Max-Change: 0.00030
Iteration: 18, Log-Lik: -21032.777, Max-Change: 0.00028
Iteration: 19, Log-Lik: -21032.775, Max-Change: 0.00052
Iteration: 20, Log-Lik: -21032.774, Max-Change: 0.00023
Iteration: 21, Log-Lik: -21032.774, Max-Change: 0.00018
Iteration: 22, Log-Lik: -21032.773, Max-Change: 0.00031
Iteration: 23, Log-Lik: -21032.773, Max-Change: 0.00012
Iteration: 24, Log-Lik: -21032.773, Max-Change: 0.00011
Iteration: 25, Log-Lik: -21032.773, Max-Change: 0.00008

plot(b,get_coef(m)[,2],xlab="true diff",ylab="est diff"); abline(0,1) 
abline(0,1.7,col="red") ##what is this? [https://journals.sagepub.com/doi/10.3102/10769986019003293]

##something with heavy tails
b<-rnorm(50)
resp<-sim_data(b=b,link=function(x) pt(x,df=50)) #an even differenter link!
m<-mirt(resp,1,itemtype="Rasch")

## 
Iteration: 1, Log-Lik: -21122.521, Max-Change: 0.86176
Iteration: 2, Log-Lik: -20798.492, Max-Change: 0.56462
Iteration: 3, Log-Lik: -20733.139, Max-Change: 0.27871
Iteration: 4, Log-Lik: -20720.532, Max-Change: 0.12678
Iteration: 5, Log-Lik: -20718.056, Max-Change: 0.05625
Iteration: 6, Log-Lik: -20717.543, Max-Change: 0.02477
Iteration: 7, Log-Lik: -20717.430, Max-Change: 0.00954
Iteration: 8, Log-Lik: -20717.385, Max-Change: 0.00550
Iteration: 9, Log-Lik: -20717.363, Max-Change: 0.00234
Iteration: 10, Log-Lik: -20717.337, Max-Change: 0.00178
Iteration: 11, Log-Lik: -20717.326, Max-Change: 0.00119
Iteration: 12, Log-Lik: -20717.318, Max-Change: 0.00112
Iteration: 13, Log-Lik: -20717.279, Max-Change: 0.00080
Iteration: 14, Log-Lik: -20717.276, Max-Change: 0.00072
Iteration: 15, Log-Lik: -20717.273, Max-Change: 0.00069
Iteration: 16, Log-Lik: -20717.258, Max-Change: 0.00047
Iteration: 17, Log-Lik: -20717.257, Max-Change: 0.00045
Iteration: 18, Log-Lik: -20717.256, Max-Change: 0.00042
Iteration: 19, Log-Lik: -20717.250, Max-Change: 0.00029
Iteration: 20, Log-Lik: -20717.249, Max-Change: 0.00027
Iteration: 21, Log-Lik: -20717.249, Max-Change: 0.00026
Iteration: 22, Log-Lik: -20717.247, Max-Change: 0.00018
Iteration: 23, Log-Lik: -20717.247, Max-Change: 0.00017
Iteration: 24, Log-Lik: -20717.247, Max-Change: 0.00018
Iteration: 25, Log-Lik: -20717.246, Max-Change: 0.00013
Iteration: 26, Log-Lik: -20717.246, Max-Change: 0.00011
Iteration: 27, Log-Lik: -20717.246, Max-Change: 0.00010
Iteration: 28, Log-Lik: -20717.245, Max-Change: 0.00012
Iteration: 29, Log-Lik: -20717.245, Max-Change: 0.00008

plot(b,get_coef(m)[,2],xlab="true diff",ylab="est diff"); abline(0,1)

##skew
library(sn)

## 
## Attaching package: 'sn'

## The following object is masked from 'package:stats':
## 
##     sd

x<-seq(-3,3,length.out=100)
plot(x,dsn(x,alpha=3),type="l") #note the skew

plot(x,psn(x,alpha=3),type="l") #corresponding icc

b<-rnorm(50)
resp<-sim_data(b=b,link=function(x) psn(x,alpha=3)) #now we've really gone overboard ;)
m<-mirt(resp,1,itemtype="Rasch")

## 
Iteration: 1, Log-Lik: -15424.716, Max-Change: 1.62940
Iteration: 2, Log-Lik: -14441.068, Max-Change: 1.57221
Iteration: 3, Log-Lik: -14126.712, Max-Change: 1.07148
Iteration: 4, Log-Lik: -14026.895, Max-Change: 0.63615
Iteration: 5, Log-Lik: -13993.498, Max-Change: 0.36617
Iteration: 6, Log-Lik: -13981.507, Max-Change: 0.20739
Iteration: 7, Log-Lik: -13976.813, Max-Change: 0.11704
Iteration: 8, Log-Lik: -13974.784, Max-Change: 0.06512
Iteration: 9, Log-Lik: -13973.813, Max-Change: 0.03623
Iteration: 10, Log-Lik: -13973.197, Max-Change: 0.03086
Iteration: 11, Log-Lik: -13972.821, Max-Change: 0.00587
Iteration: 12, Log-Lik: -13972.684, Max-Change: 0.00511
Iteration: 13, Log-Lik: -13972.258, Max-Change: 0.00511
Iteration: 14, Log-Lik: -13972.177, Max-Change: 0.00366
Iteration: 15, Log-Lik: -13972.173, Max-Change: 0.00321
Iteration: 16, Log-Lik: -13972.025, Max-Change: 0.00861
Iteration: 17, Log-Lik: -13972.083, Max-Change: 0.00218
Iteration: 18, Log-Lik: -13972.063, Max-Change: 0.00199
Iteration: 19, Log-Lik: -13971.990, Max-Change: 0.00155
Iteration: 20, Log-Lik: -13971.984, Max-Change: 0.00137
Iteration: 21, Log-Lik: -13971.984, Max-Change: 0.00128
Iteration: 22, Log-Lik: -13971.959, Max-Change: 0.00202
Iteration: 23, Log-Lik: -13971.974, Max-Change: 0.00086
Iteration: 24, Log-Lik: -13971.971, Max-Change: 0.00081
Iteration: 25, Log-Lik: -13971.959, Max-Change: 0.00089
Iteration: 26, Log-Lik: -13971.961, Max-Change: 0.00056
Iteration: 27, Log-Lik: -13971.962, Max-Change: 0.00053
Iteration: 28, Log-Lik: -13971.958, Max-Change: 0.00058
Iteration: 29, Log-Lik: -13971.962, Max-Change: 0.00036
Iteration: 30, Log-Lik: -13971.962, Max-Change: 0.00034
Iteration: 31, Log-Lik: -13971.960, Max-Change: 0.00030
Iteration: 32, Log-Lik: -13971.961, Max-Change: 0.00023
Iteration: 33, Log-Lik: -13971.962, Max-Change: 0.00023
Iteration: 34, Log-Lik: -13971.961, Max-Change: 0.00021
Iteration: 35, Log-Lik: -13971.963, Max-Change: 0.00015
Iteration: 36, Log-Lik: -13971.963, Max-Change: 0.00014
Iteration: 37, Log-Lik: -13971.963, Max-Change: 0.00012
Iteration: 38, Log-Lik: -13971.963, Max-Change: 0.00010

plot(b,get_coef(m)[,2],xlab="true diff",ylab="est diff"); abline(0,1) #hm, what is going on here? can we make sense of why parameter estimation is quite bad when b is large

Question 3

What if you simulate data via IRT (e.g., via a change to the sim_data function of nonzero_floor source) so that the probability never goes below a certain value (i.e., the ogive has a lower bound, remember also problem 2 from ps1).

What kind of real life human behavior might this capture? What would you expect it to do to estimates of item parameters and abilities? Can you verify whether estimates are indeed affected as hypothesized?

sim_data<-function( #this will simulate item response data using whatever link you send it
                   b, #item difficulties
                   np=1000,
                   link=function(x) exp(x)/(1+exp(x))
                   )
{
    ni<-length(b)
    th<-rnorm(np)
    th.mat<-matrix(th,np,ni,byrow=FALSE) 
    a<-1
    b.mat<-matrix(b,np,ni,byrow=TRUE) #these are the item difficulties
    ##now the probability of a correct response is:
    pr<-link(a*th.mat+b.mat) ##look very carefully here at how b.mat is being included. is this what you expect? if not, take a look at ?mirt (specifically the form used for the Rash itemtype).
    test<-matrix(runif(ni*np),np,ni)
    resp<-ifelse(pr>test,1,0)
    colnames(resp)<-paste("i",1:ncol(resp))
    resp
}

b<-rnorm(50)
link<-function(x) .1+(1-.1)*(exp(x)/(1+exp(x)))
resp<-sim_data(b=b,link=link) #note that this will rely on the default logistic link 
library(mirt)
m<-mirt(resp,1,itemtype="Rasch") #if you check the mirt documentation, itemtype="Rasch" ends up meaning that we are assuming the correct model for estimation

## 
Iteration: 1, Log-Lik: -29506.015, Max-Change: 0.21209
Iteration: 2, Log-Lik: -29482.477, Max-Change: 0.05736
Iteration: 3, Log-Lik: -29480.355, Max-Change: 0.01747
Iteration: 4, Log-Lik: -29480.147, Max-Change: 0.00527
Iteration: 5, Log-Lik: -29480.109, Max-Change: 0.00213
Iteration: 6, Log-Lik: -29480.104, Max-Change: 0.00071
Iteration: 7, Log-Lik: -29480.103, Max-Change: 0.00028
Iteration: 8, Log-Lik: -29480.102, Max-Change: 0.00025
Iteration: 9, Log-Lik: -29480.102, Max-Change: 0.00021
Iteration: 10, Log-Lik: -29480.100, Max-Change: 0.00013
Iteration: 11, Log-Lik: -29480.100, Max-Change: 0.00004

get_coef<-function(mod) {
    coef(mod)->co
    co[-length(co)]->co
    do.call("rbind",co)
}
plot(-1*b,-1*get_coef(m)[,2],xlab="true difficulties",ylab="estimated difficulties",pch=19); abline(0,1) #now we are going to compare true and estimated item parameters

EDUC 252L - Problem Set #2

Gabriel Reyes

1/30/2022

CTT Failures

Question 1

IRT

Question 2

Question 3