1 Introduction

The data are the responses of a sample of 150 individuals to 9 multiple-choice items from a General Certificate of Education O-level mathematics paper containing 60 items in all. The responses are coded 1 for correct answer and 0 for incorrect.

The items appear to test ability in two-dimensional Euclidean geometry. The data have one column with 1,350 records ordered by item within individuals. There are no missing data.

Source: Goldstein, H., & Wood, R. (1989). Five decades of item response modelling, British Journal of Mathematical and Statistical Psychology, 42, 139-67.

Column 1: Subject ID Column 2: Item ID Column 3: Answer, 1 = Correct, 0 = Incorrect

2 Data mamagement

#load packages
pacman::p_load(dplyr, tidyr, ltm, eRm, lme4)

#input data
dta <- read.table("C:/Users/Ching-Fang Wu/Documents/data/starter.txt", h=T)
head(dta,10)
##    sid item resp
## 1    1    1    1
## 2    1    2    1
## 3    1    3    1
## 4    1    4    1
## 5    1    5    1
## 6    1    6    1
## 7    1    7    1
## 8    1    8    1
## 9    1    9    0
## 10   2    1    1
# 重新命名
names(dta) <- c("ID", "Item", "Answer")
#轉換資料形式
dtaW <- dta %>%
        spread(key = Item, value = Answer) #spread()將長資料轉換成寬資料
names(dtaW) <- c("ID","Item 1","Item 2","Item 3","Item 4 ", "Item 5", "Item 6","Item 7","Item 8", "Item 9")
head(dtaW)
##   ID Item 1 Item 2 Item 3 Item 4  Item 5 Item 6 Item 7 Item 8 Item 9
## 1  1      1      1      1       1      1      1      1      1      0
## 2  2      1      1      1       0      0      0      1      0      0
## 3  3      1      1      1       1      0      1      1      0      1
## 4  4      1      1      1       1      1      0      1      0      0
## 5  5      1      1      1       0      0      1      1      0      0
## 6  6      0      1      1       1      0      0      0      0      0
dtaW <- subset(dtaW, select = -1)

3 Correlated binary responses

knitr::kable(cor(dtaW))
Item 1 Item 2 Item 3 Item 4 Item 5 Item 6 Item 7 Item 8 Item 9
Item 1 1.0000000 0.0982946 0.0542415 0.0444554 0.1769309 0.0611080 0.1176913 0.0236666 0.0521286
Item 2 0.0982946 1.0000000 0.2069345 0.2261335 0.3180732 0.1572070 0.1561738 0.2742122 0.0353553
Item 3 0.0542415 0.2069345 1.0000000 0.2380182 0.2442222 0.2247635 0.1659775 0.1390151 0.0541944
Item 4 0.0444554 0.2261335 0.2380182 1.0000000 0.1874462 0.1357355 0.1883526 0.2601335 0.2025407
Item 5 0.1769309 0.3180732 0.2442222 0.1874462 1.0000000 0.1326535 0.1399923 0.3144546 0.0715628
Item 6 0.0611080 0.1572070 0.2247635 0.1357355 0.1326535 1.0000000 0.1993889 0.0550560 0.1515848
Item 7 0.1176913 0.1561738 0.1659775 0.1883526 0.1399923 0.1993889 1.0000000 0.0898275 0.1840525
Item 8 0.0236666 0.2742122 0.1390151 0.2601335 0.3144546 0.0550560 0.0898275 1.0000000 0.1702513
Item 9 0.0521286 0.0353553 0.0541944 0.2025407 0.0715628 0.1515848 0.1840525 0.1702513 1.0000000

item2和item5相關係數為0.318; item9和item5相關係數為0.314, 相關係數值介於0.3~0.7之間,為中度相關。

plot(descript(dtaW), 
     ylim=c(0,1), 
     type='b',
     main="Item response curves")
grid()

得分6分的人,答對第九題的比率最低,所以第九題的難度最高。

但是得分1分的人,答對第九題(難度高的題目)的比率高於第五題(難度低的題目),顯示這份試題無法區辦受試者在two-dimensional Euclidean geometry的能力。

4 Item response probabilties

knitr::kable(descript(dtaW)$perc[c(1, 7, 2, 3, 4, 5, 6, 8, 9),])
0 1 logit
Item 1 0.0800000 0.9200000 2.4423470
Item 7 0.1800000 0.8200000 1.5163475
Item 2 0.2000000 0.8000000 1.3862944
Item 3 0.2533333 0.7466667 1.0809127
Item 4 0.2666667 0.7333333 1.0116009
Item 5 0.3066667 0.6933333 0.8157495
Item 6 0.3200000 0.6800000 0.7537718
Item 8 0.4600000 0.5400000 0.1603427
Item 9 0.6666667 0.3333333 -0.6931472
dta$Item <- factor(dta$Item)

5 Parameter estimates

sjPlot::tab_model(m0 <- glmer(Answer ~ -1 + Item + (1 | ID), data=dta, family=binomial), show.obs=F, show.ngroups=F, transform=NULL, show.se=T, show.r2=F,show.icc=F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.28301 (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
  Answer
Predictors Log-Odds std. Error CI p
Item [1] 2.90 0.00 2.90 – 2.91 <0.001
Item [2] 1.70 0.00 1.70 – 1.71 <0.001
Item [3] 1.34 0.00 1.34 – 1.34 <0.001
Item [4] 1.21 0.21 0.79 – 1.63 <0.001
Item [5] 1.00 0.21 0.60 – 1.41 <0.001
Item [6] 0.92 0.20 0.52 – 1.33 <0.001
Item [7] 1.86 0.00 1.86 – 1.87 <0.001
Item [8] 0.20 0.00 0.20 – 0.21 <0.001
Item [9] -0.88 0.21 -1.28 – -0.48 <0.001
Random Effects
σ2 3.29
τ00 ID 1.16
unlist(summary(m0)$varcor)/(unlist(summary(m0)$varcor)+pi^2/3)
##        ID 
## 0.2614461

6 Fit the Rasch model

coef(rm0 <- ltm::rasch(dtaW, constraint=cbind(ncol(dtaW)+1, 1)))[,1]
##     Item 1     Item 2     Item 3    Item 4      Item 5     Item 6     Item 7 
## -2.8480443 -1.6661044 -1.3091179 -1.2271704 -0.9939141 -0.9196116 -1.8160398 
##     Item 8     Item 9 
## -0.1993647  0.8417333
plot(rm0)
grid()

7 Item characteristic curves

rm0 <- RM(dtaW)
eRm::plotICC(rm0, item.subset=1:4, ask=F, empICC=list("raw"), empCI=list(lty="solid"))

8 The End