If the aim is to reduce the size of the item bank, consider removing poorly fitting items from among those with similar difficulty levels. Alternatively, to decrease answering time without excluding items, you can limit the number of questions each subject receives, as was done in this article (each person answered approximately 25% of all items).
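As an illustration of the second option, here is a minimal sketch of how each respondent could be assigned a random ~25% subset of the bank. The object names and the purely random assignment are assumptions for illustration, not necessarily the scheme used to collect these data.

# Hypothetical sketch: give each of the 1584 respondents a random 25% of the 64 items
set.seed(123)
item_ids   <- 1:64
n_per_form <- ceiling(0.25 * length(item_ids))   # 16 items per person
forms      <- replicate(1584, sample(item_ids, n_per_form), simplify = FALSE)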

# Packages used in this section
library(mirt)       # IRT model fitting and item fit statistics
library(tidyverse)  # pivot_wider(), arrange(), %>%, ggplot()

mars <- read.csv('mars_calibration.csv')
# data: N = 1584; Nitems = 64; each person answered approximately 16 items.

data_wide <- mars %>%
  pivot_wider(id_cols = subject, names_from = item, values_from = accuracy)


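Before fitting the model, a quick sanity check (not part of the original analysis) confirms the sparsity noted above:

# Each person should have roughly 16 of the 64 items non-missing
items_per_person <- rowSums(!is.na(data_wide[, -1]))
summary(items_per_person)
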
# Fit the 2PL model
model_2PL <- mirt(data_wide[,-1], model = 1, itemtype = '2PL')
## 
## Iteration: 1, Log-Lik: -18410.929, Max-Change: 1.89856
## Iteration: 2, Log-Lik: -16148.070, Max-Change: 1.36334
## Iteration: 3, Log-Lik: -15349.501, Max-Change: 0.67197
## Iteration: 4, Log-Lik: -14988.870, Max-Change: 0.32785
## Iteration: 5, Log-Lik: -14823.037, Max-Change: 0.20700
## Iteration: 6, Log-Lik: -14753.658, Max-Change: 0.13657
## Iteration: 7, Log-Lik: -14722.402, Max-Change: 0.11298
## Iteration: 8, Log-Lik: -14708.199, Max-Change: 0.08112
## Iteration: 9, Log-Lik: -14701.507, Max-Change: 0.05804
## Iteration: 10, Log-Lik: -14698.261, Max-Change: 0.03956
## Iteration: 11, Log-Lik: -14696.655, Max-Change: 0.02895
## Iteration: 12, Log-Lik: -14695.863, Max-Change: 0.02436
## Iteration: 13, Log-Lik: -14695.102, Max-Change: 0.00871
## Iteration: 14, Log-Lik: -14695.061, Max-Change: 0.00472
## Iteration: 15, Log-Lik: -14695.038, Max-Change: 0.00231
## Iteration: 16, Log-Lik: -14695.023, Max-Change: 0.00256
## Iteration: 17, Log-Lik: -14695.018, Max-Change: 0.00124
## Iteration: 18, Log-Lik: -14695.017, Max-Change: 0.00079
## Iteration: 19, Log-Lik: -14695.015, Max-Change: 0.00032
## Iteration: 20, Log-Lik: -14695.014, Max-Change: 0.00015
## Iteration: 21, Log-Lik: -14695.014, Max-Change: 0.00014
## Iteration: 22, Log-Lik: -14695.014, Max-Change: 0.00011
## Iteration: 23, Log-Lik: -14695.014, Max-Change: 0.00009
plot(model_2PL, type = 'trace')  # item characteristic (trace) curves for all items

params_2PL <- coef(model_2PL, IRTpars = TRUE, simplify=TRUE)$items
params_2PL <- data.frame(Item = rownames(params_2PL), 
                         Difficulty = params_2PL[, "b"],
                         Discrimination = params_2PL[, "a"])
print(arrange(params_2PL, Difficulty))
##    Item  Difficulty Discrimination
## 70   70 -2.19189691      1.3129979
## 62   62 -1.78220482      1.8025815
## 49   49 -1.69464612      2.0055408
## 22   22 -1.57887607      0.8705519
## 25   25 -1.51211902      1.2205723
## 16   16 -1.50076160      1.0529234
## 6     6 -1.41585553      0.8707243
## 51   51 -1.41308846      1.3244522
## 10   10 -1.39321190      0.9148399
## 69   69 -1.37486634      1.2215672
## 77   77 -1.27684988      1.1661944
## 50   50 -1.24767983      1.2239997
## 20   20 -1.24414736      1.0844711
## 71   71 -1.18466839      1.8222926
## 65   65 -1.17463888      1.5258446
## 23   23 -1.17405243      1.2340458
## 61   61 -1.14321824      0.9718075
## 67   67 -1.05079990      1.1064733
## 28   28 -0.97893907      0.7967182
## 58   58 -0.89350705      1.5541590
## 30   30 -0.76695879      1.2473463
## 15   15 -0.75659442      1.0821100
## 19   19 -0.75255189      1.2193325
## 55   55 -0.73184392      0.7633925
## 45   45 -0.68956644      1.1714637
## 31   31 -0.67996008      0.8278018
## 56   56 -0.67221391      0.5130391
## 40   40 -0.65207751      0.4796388
## 39   39 -0.63192576      0.7587799
## 27   27 -0.48858472      0.9426140
## 29   29 -0.43918294      1.1262326
## 73   73 -0.42765670      0.9964999
## 11   11 -0.42596807      0.6103020
## 18   18 -0.38167812      0.8951739
## 13   13 -0.35240718      1.0031640
## 17   17 -0.29271797      0.8338692
## 42   42 -0.27954647      1.0543078
## 64   64 -0.20178996      1.0442980
## 35   35 -0.14940166      0.8251580
## 72   72 -0.09897074      0.8525807
## 66   66 -0.07234274      0.7648120
## 36   36 -0.06613105      0.5690134
## 60   60 -0.00455273      0.6830374
## 53   53 -0.00130876      0.5453776
## 46   46  0.02922168      0.3946747
## 76   76  0.10105419      0.7402127
## 75   75  0.15136482      0.6574479
## 52   52  0.18059882      0.6275071
## 80   80  0.19289313      0.9929688
## 74   74  0.21860842      0.9891272
## 79   79  0.26938163      0.7984311
## 34   34  0.27072962      0.8202199
## 21   21  0.27895730      0.7370795
## 24   24  0.32712164      0.8305408
## 47   47  0.34516517      0.5514635
## 14   14  0.41662788      0.4405986
## 63   63  0.47977244      1.2326436
## 12   12  0.51703275      0.7084649
## 44   44  0.80286133      0.8959740
## 78   78  0.80374104      0.3542823
## 54   54  1.05560439      0.7715168
## 59   59  1.14382737      0.5983133
## 26   26  1.24460496      0.5454429
## 37   37  1.34020300      0.3763875
ggplot(params_2PL, aes(x = Difficulty, y = Discrimination, label = Item)) +
  geom_text(check_overlap = FALSE) +  # plot item numbers as labels instead of points
  theme_classic() +
  labs(x = "Difficulty", y = "Discrimination")

# Get item fit information
item_fit <- itemfit(model_2PL, fit_stats=c('X2', 'G2', 'infit'))
## Warning in sqrt(colSums(pf$C - pf$W^2)/colSums(pf$W)^2): NaNs produced
print(item_fit[, !names(item_fit) %in% 'z.infit'])
##    item outfit z.outfit infit     X2 df.X2 RMSEA.X2  p.X2     G2 df.G2 RMSEA.G2
## 1    22  0.893   -1.137 0.235 14.622     8    0.023 0.067 16.056     8    0.025
## 2    28  0.933   -1.272 0.217  9.976     8    0.012 0.267 10.641     8    0.014
## 3    70  0.865   -0.398 0.273 15.119     8    0.024 0.057 16.487     5    0.038
## 4    51  0.753   -1.629 0.229 10.267     8    0.013 0.247 11.835     7    0.021
## 5    11  0.944   -2.476 0.218 11.518     8    0.017 0.174 12.423     8    0.019
## 6    31  0.900   -2.203 0.235 17.647     8    0.028 0.024 20.490     8    0.031
## 7    17  0.903   -2.551 0.223 18.761     8    0.029 0.016 20.682     8    0.032
## 8    62  0.719   -0.687 0.222 15.217     8    0.024 0.055 18.308     4    0.048
## 9    42  0.855   -2.735 0.227 11.607     8    0.017 0.170 13.280     8    0.020
## 10   60  0.931   -3.274 0.216 16.721     8    0.026 0.033 18.839     8    0.029
## 11   52  0.944   -1.699 0.247  7.381     8    0.000 0.496  7.768     8    0.000
## 12   63  0.813   -2.728 0.204 13.721     8    0.021 0.089 16.969     8    0.027
## 13   46  0.976   -1.188 0.243 13.335     8    0.021 0.101 13.959     8    0.022
## 14   54  0.922   -1.355 0.223 14.894     8    0.023 0.061 16.028     8    0.025
## 15   78  0.984   -0.661 0.243 19.095     8    0.030 0.014 20.251     8    0.031
## 16   30  0.836   -1.994 0.219 16.361     8    0.026 0.037 19.873     8    0.031
## 17   39  0.919   -2.149 0.218 16.733     8    0.026 0.033 17.826     8    0.028
## 18   65  0.735   -1.828 0.213 10.414     8    0.014 0.237 14.023     6    0.029
## 19   61  0.891   -1.284 0.229 15.254     8    0.024 0.054 18.066     8    0.028
## 20   18  0.893   -2.321 0.224  9.855     8    0.012 0.275 10.800     8    0.015
## 21   34  0.901   -2.781 0.221 14.108     8    0.022 0.079 15.909     8    0.025
## 22   58  0.731   -2.178 0.217 12.544     8    0.019 0.129 16.020     7    0.029
## 23   16  0.874   -1.090 0.228 12.316     8    0.018 0.138 12.755     8    0.019
## 24   23  0.801   -1.711 0.238 11.442     8    0.016 0.178 14.527     7    0.026
## 25   79  0.914   -2.512 0.212 14.810     8    0.023 0.063 16.443     8    0.026
## 26   59  0.942   -1.243 0.227 13.085     8    0.020 0.109 13.511     8    0.021
## 27   64  0.862   -2.764 0.214 11.321     8    0.016 0.184 12.465     8    0.019
## 28   76  0.927   -2.149 0.234 10.879     8    0.015 0.209 11.373     8    0.016
## 29   35  0.907   -2.389 0.234 12.988     8    0.020 0.112 13.955     8    0.022
## 30   25  0.857   -0.945 0.230 19.427     8    0.030 0.013 20.715     7    0.035
## 31   71  0.586   -1.975 0.205 13.899     8    0.022 0.084 18.928     4    0.049
## 32   49  0.922   -0.053 0.214 17.134     8    0.027 0.029 18.605     4    0.048
## 33   77  0.837   -1.431 0.242 27.742     8    0.039 0.001 27.150     8    0.039
## 34   40  0.968   -1.493 0.230  9.237     8    0.010 0.323  9.535     8    0.011
## 35   14  0.971   -1.235 0.245  8.235     8    0.004 0.411  8.599     8    0.007
## 36   44  0.909   -1.655 0.217 25.330     8    0.037 0.001 29.246     8    0.041
## 37   29  0.835   -2.855 0.214 14.995     8    0.024 0.059 17.241     8    0.027
## 38   20  0.841   -1.583 0.243  8.290     8    0.005 0.406  9.352     8    0.010
## 39   15  0.840   -2.242 0.227 19.543     8    0.030 0.012 21.739     8    0.033
## 40   69  0.877   -0.929 0.227 15.350     8    0.024 0.053 16.542     8    0.026
## 41   72  0.902   -2.386 0.234  9.279     8    0.010 0.319  9.969     8    0.012
## 42   45  0.814   -2.500 0.224 13.054     8    0.020 0.110 14.607     8    0.023
## 43   19  0.818   -2.260 0.220 18.480     8    0.029 0.018 19.587     8    0.030
## 44   50  0.862   -1.218 0.228 16.622     8    0.026 0.034 17.715     8    0.028
## 45   10  0.888   -1.268 0.231 10.389     8    0.014 0.239 13.561     7    0.024
## 46   37  0.978   -0.820 0.232  7.875     8    0.000 0.446  7.987     8    0.000
## 47   73  0.861   -2.766 0.229 19.217     8    0.030 0.014 22.136     8    0.033
## 48   80  0.863   -3.370 0.209 15.305     8    0.024 0.053 17.860     8    0.028
## 49   75  0.941   -1.867 0.243 15.447     8    0.024 0.051 16.308     8    0.026
## 50    6  0.930   -0.824 0.227 15.418     8    0.024 0.052 16.503     8    0.026
## 51   74  0.867   -3.054 0.213 10.549     8    0.014 0.229 11.926     8    0.018
## 52   36  0.953   -2.113 0.232 15.380     8    0.024 0.052 16.702     8    0.026
## 53   66  0.919   -2.907 0.217 19.220     8    0.030 0.014 24.852     8    0.036
## 54   47  0.957   -1.566 0.235  8.916     8    0.009 0.349  9.376     8    0.010
## 55   56  0.960   -1.420 0.235  8.859     8    0.008 0.354  8.990     8    0.009
## 56   53  0.955   -3.892 0.217 14.520     8    0.023 0.069 15.622     8    0.025
## 57   67  0.824   -1.902 0.239 16.065     8    0.025 0.041 20.945     7    0.035
## 58   55  0.917   -1.796 0.232 10.477     8    0.014 0.233 11.425     8    0.016
## 59   24  0.910   -2.062 0.235 17.650     8    0.028 0.024 19.303     8    0.030
## 60   27  0.874   -2.469 0.219  9.439     8    0.011 0.307 11.680     8    0.017
## 61   26  0.963   -0.886 0.222 13.022     8    0.020 0.111 13.388     8    0.021
## 62   12  0.925   -2.049 0.228 13.436     8    0.021 0.098 14.772     8    0.023
## 63   13  0.868   -2.781 0.224 15.598     8    0.024 0.049 17.372     8    0.027
## 64   21  0.920   -2.439 0.225 18.271     8    0.028 0.019 21.251     8    0.032
##     p.G2
## 1  0.042
## 2  0.223
## 3  0.006
## 4  0.106
## 5  0.133
## 6  0.009
## 7  0.008
## 8  0.001
## 9  0.103
## 10 0.016
## 11 0.456
## 12 0.030
## 13 0.083
## 14 0.042
## 15 0.009
## 16 0.011
## 17 0.023
## 18 0.029
## 19 0.021
## 20 0.213
## 21 0.044
## 22 0.025
## 23 0.121
## 24 0.043
## 25 0.036
## 26 0.095
## 27 0.132
## 28 0.181
## 29 0.083
## 30 0.004
## 31 0.001
## 32 0.001
## 33 0.001
## 34 0.299
## 35 0.377
## 36 0.000
## 37 0.028
## 38 0.314
## 39 0.005
## 40 0.035
## 41 0.267
## 42 0.067
## 43 0.012
## 44 0.023
## 45 0.060
## 46 0.435
## 47 0.005
## 48 0.022
## 49 0.038
## 50 0.036
## 51 0.155
## 52 0.033
## 53 0.002
## 54 0.312
## 55 0.343
## 56 0.048
## 57 0.004
## 58 0.179
## 59 0.013
## 60 0.166
## 61 0.099
## 62 0.064
## 63 0.026
## 64 0.007
# z.infit has many missing values (hence the NaN warning above) because each person
# answered only a subset of the items; the unstandardized infit and outfit statistics
# can still be used

Chi-Squared Statistics (p.X2 and p.G2): These p-values indicate whether the discrepancies between observed and expected responses are statistically significant. A p-value below 0.05 usually suggests poor fit.
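For example, one quick way to list the items flagged by these tests, using the item_fit object created above (the 0.05 cutoff is a convention rather than a hard rule):

# Items whose X2 or G2 p-value falls below .05
flagged <- subset(item_fit, p.X2 < 0.05 | p.G2 < 0.05)
flagged$item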

Outfit: This is an outlier-sensitive fit statistic. It is the mean of the squared standardized residuals and is especially sensitive to unexpected responses on items far from the person's ability level. Values substantially greater than 1 indicate noise or outliers, while values substantially less than 1 suggest overfit, possibly due to redundancy.

Infit: This is an information-weighted fit statistic that is less sensitive to outliers than outfit, giving more weight to items near the person's ability level. As with outfit, values substantially greater than 1 indicate noise and values substantially less than 1 suggest overfit. Values above roughly 1.3 are commonly taken to signal underfit, meaning the item behaves unpredictably and does not conform to model expectations.
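Putting these pieces together, the sketch below (assuming the params_2PL and item_fit objects from above, with illustrative cutoffs) shows one way to short-list poorly fitting items, which can then be weighed against well-fitting items of similar difficulty as suggested at the start of this section:

# Join the 2PL parameters with the fit statistics, keeping item IDs as characters
fit_by_item <- params_2PL %>%
  mutate(Item = as.character(Item)) %>%
  inner_join(item_fit %>% mutate(item = as.character(item)),
             by = c("Item" = "item"))

# Flag items with significant misfit or extreme outfit (cutoffs are illustrative),
# then sort by difficulty so candidates can be compared with similar-difficulty items
candidates <- fit_by_item %>%
  filter(p.X2 < 0.05 | outfit < 0.7 | outfit > 1.3) %>%
  arrange(Difficulty) %>%
  select(Item, Difficulty, Discrimination, outfit, infit, p.X2, p.G2)
candidates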