1 Description

The data are the responses of a sample of 150 individuals to 9 multiple-choice items from a General Certificate of Education O-level mathematics paper containing 60 items in all. The responses are coded 1 for correct answer and 0 for incorrect. The items appear to test ability in two-dimensional Euclidean geometry. The data have one column with 1,350 records ordered by item within individuals. There are no missing data.

Source: Goldstein, H., & Wood, R. (1989). Five decades of item response modelling, British Journal of Mathematical and Statistical Psychology, 42, 139-67.

Column 1: Subject ID
Column 2: Item ID
Column 3: Answer, 1 = Correct, 0 = Incorrect

2 load package

library(pacman)

3 Input Data

dta <- read.table("C:/Users/HANK/Desktop/HOMEWORK/starter.txt", h=T)

str(dta)

## 'data.frame':    1350 obs. of  3 variables:
##  $ sid : int  1 1 1 1 1 1 1 1 1 2 ...
##  $ item: int  1 2 3 4 5 6 7 8 9 1 ...
##  $ resp: int  1 1 1 1 1 1 1 1 0 1 ...

head(dta)

##   sid item resp
## 1   1    1    1
## 2   1    2    1
## 3   1    3    1
## 4   1    4    1
## 5   1    5    1
## 6   1    6    1

4 Data Visualization

dtaW <- dta %>%
  gather(key, value, -item, -sid) %>%
  unite(col, key, item) %>%
  spread(col, value) %>%
  select(-sid)

plot(descript(dtaW), 
     ylim=c(0,1), 
     type='b',
     main="Item response curves")

將各題目的正確比例以圖示呈現。
本圖顯示，題目1應較簡單(答對比例最高)；題目9應較難(答對比例低)
其中，題目6與題目8較不太合適(在總分只有3分時，題目6與題目8答對率差距大，但其實兩題的難度應是差不多)

5 Item response probabilties

descript(dtaW)$perc[c(1,7,2,3,4,5,6,8,9), ]

##                0         1      logit
## resp_1 0.0800000 0.9200000  2.4423470
## resp_7 0.1800000 0.8200000  1.5163475
## resp_2 0.2000000 0.8000000  1.3862944
## resp_3 0.2533333 0.7466667  1.0809127
## resp_4 0.2666667 0.7333333  1.0116009
## resp_5 0.3066667 0.6933333  0.8157495
## resp_6 0.3200000 0.6800000  0.7537718
## resp_8 0.4600000 0.5400000  0.1603427
## resp_9 0.6666667 0.3333333 -0.6931472

6 Correlated binary responses

cor(dtaW)

##            resp_1     resp_2     resp_3     resp_4     resp_5     resp_6
## resp_1 1.00000000 0.09829464 0.05424148 0.04445542 0.17693086 0.06110804
## resp_2 0.09829464 1.00000000 0.20693453 0.22613351 0.31807321 0.15720704
## resp_3 0.05424148 0.20693453 1.00000000 0.23801815 0.24422223 0.22476353
## resp_4 0.04445542 0.22613351 0.23801815 1.00000000 0.18744615 0.13573552
## resp_5 0.17693086 0.31807321 0.24422223 0.18744615 1.00000000 0.13265351
## resp_6 0.06110804 0.15720704 0.22476353 0.13573552 0.13265351 1.00000000
## resp_7 0.11769133 0.15617376 0.16597755 0.18835264 0.13999231 0.19938887
## resp_8 0.02366657 0.27421222 0.13901505 0.26013352 0.31445456 0.05505601
## resp_9 0.05212860 0.03535534 0.05419437 0.20254068 0.07156282 0.15158477
##            resp_7     resp_8     resp_9
## resp_1 0.11769133 0.02366657 0.05212860
## resp_2 0.15617376 0.27421222 0.03535534
## resp_3 0.16597755 0.13901505 0.05419437
## resp_4 0.18835264 0.26013352 0.20254068
## resp_5 0.13999231 0.31445456 0.07156282
## resp_6 0.19938887 0.05505601 0.15158477
## resp_7 1.00000000 0.08982753 0.18405254
## resp_8 0.08982753 1.00000000 0.17025131
## resp_9 0.18405254 0.17025131 1.00000000

題目彼此間的相關性不高。

7 m0

dta$item <- factor(dta$item)
sjPlot::tab_model(m0 <- glmer(resp ~ -1 + item + (1 | sid), data=dta, family=binomial), show.obs=F, show.ngroups=F, transform=NULL, show.se=T, show.r2=F,show.icc=F)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.28301 (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?

	resp
Predictors	Log-Odds	std. Error	CI	p
item [1]	2.90	0.00	2.90 – 2.91	<0.001
item [2]	1.70	0.00	1.70 – 1.71	<0.001
item [3]	1.34	0.00	1.34 – 1.34	<0.001
item [4]	1.21	0.21	0.79 – 1.63	<0.001
item [5]	1.00	0.21	0.60 – 1.41	<0.001
item [6]	0.92	0.20	0.52 – 1.33	<0.001
item [7]	1.86	0.00	1.86 – 1.87	<0.001
item [8]	0.20	0.00	0.20 – 0.21	<0.001
item [9]	-0.88	0.21	-1.28 – -0.48	<0.001
Random Effects
σ²	3.29
τ₀₀ _sid	1.16

unlist(summary(m0)$varcor)/(unlist(summary(m0)$varcor)+pi^2/3)

##       sid 
## 0.2614461

8 Fit the Rasch model with rasch{ltm}

coef(rm0 <- ltm::rasch(LSAT, constraint=cbind(ncol(LSAT)+1, 1)))[,1]

##     Item 1     Item 2     Item 3     Item 4     Item 5 
## -2.8719712 -1.0630294 -0.2576109 -1.3880588 -2.2187785

plot(rm0)

9 Item characteristic curves

library(eRm)
rm0 <- RM(LSAT)
eRm::plotICC(rm0, item.subset=1:5, ask=F, empICC=list("raw"), empCI=list(lty="solid"))

10 The item-person map

plotPImap(rm0)

結果算是不錯的，因為各個item間的距離幾乎呈現等距，表示能很平均地區別出不同程度的受試者，亦可說具有鑑別度。

1123_In Class Exercise 1

Pei-Yun,Li

2020-11-26