To demonstrate the estimation of polytomous IRT models in R, we will use the science data set. This data set comes from the Consumer Protection and Perceptions of Science and Technology section of the 1992 Euro-Barometer Survey (Karlheinz & Melich, 1992), based on a sample from England. The data set consists of seven items (Comfort, Environment, Work, Future, Technology, Industry, and Benefit) asking respondents about their attitudes toward science and technology.
All of the items are measured on a four-point scale with response categories “1=strongly disagree”, “2=disagree to some extent”, “3=agree to some extent”, and “4=strongly agree”.
We will first read the science data set into R. Because the science data set is a .csv file, I am using the read.csv function in R to read the data. In the command below, the data are read directly from a URL, so there is no need to download the file first; if the file were saved in your working directory instead, you could supply just the file name without a full path. I use header=TRUE in the command because the first row of the data set includes the variable names (you can open the data in Microsoft Excel and see the variable names). Using the head function, we can view the first six rows of the data set to check whether it has been imported into R properly.
science <- read.csv("https://raw.githubusercontent.com/okanbulut/myrfunctions/master/science.csv", header=TRUE)
head(science)
Comfort Environment Work Future Technology Industry Benefit
1 4 4 4 3 4 3 2
2 3 4 3 3 3 3 3
3 3 2 2 2 4 4 3
4 3 3 2 2 4 4 3
5 3 1 4 4 2 3 1
6 4 3 4 3 3 4 3
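Before fitting any IRT model, it can also be useful to confirm that every item actually uses the four response categories described above. A minimal sketch with base R (an extra check, not part of the original workflow):
# Count how often each response category (1-4) is used for each item
sapply(science, function(x) table(factor(x, levels = 1:4)))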
Next, we will use the mirt function in the mirt package (Chalmers et al., 2012) to estimate item parameters for the Partial Credit Model. In the estimation process, we first define the latent trait (I called it “liking.science”) based on the seven items in the science data set. Then we pass this model to the mirt function to estimate the item parameters (you can use ?mirt to read the help page). Although we set itemtype="Rasch", the mirt function recognizes that the items have more than two response categories and therefore uses the Partial Credit Model to estimate the item parameters. If you haven’t installed the mirt package before, you need to run the following command first: install.packages("mirt").
library("mirt")
model.pcm <- 'liking.science = 1-7'
results.pcm <- mirt(data=science, model=model.pcm, itemtype="Rasch", SE=TRUE, verbose=FALSE)
coef.pcm <- coef(results.pcm, IRTpars=TRUE, simplify=TRUE)
We can save the estimated item parameters as a new data frame called items.pcm and print the results using the following commands:
items.pcm <- as.data.frame(coef.pcm$items)
print(items.pcm)
a b1 b2 b3
Comfort 1 -2.276123 -2.2819490 1.1949025
Environment 1 -1.472595 -0.5901824 0.2536060
Work 1 -1.387293 -0.8031010 1.5158989
Future 1 -2.004430 -1.1972516 0.9102602
Technology 1 -1.978474 -0.6714361 0.3400289
Industry 1 -1.972769 -1.4914126 0.1268971
Benefit 1 -1.887187 -0.7457849 1.0711518
In the output, we see a data frame of the estimated item parameters. The names of the items in the science data set are printed as the row names in the output. The first column shows the discrimination parameter, which is equal to “1” for all items because the Partial Credit Model, like the Rasch model, constrains the discrimination parameter to be “1”. In the following columns (b1 to b3), we see the estimated threshold (or step) parameters. Because the items in the science data set have four response categories, the Partial Credit Model estimates three threshold parameters (b1 to b3) for each item.
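Because we requested standard errors with SE=TRUE, we can also inspect how precisely the parameters are estimated. A minimal sketch, assuming the printSE argument of mirt's coef method (the values are reported in mirt's default slope-intercept parameterization rather than as IRT thresholds):
# Parameter estimates together with their standard errors
coef(results.pcm, printSE = TRUE)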
Using the same commands as above, we can also estimate item parameters for the Generalized Partial Credit Model, in which item discrimination is freely estimated for each item. Instead of itemtype="Rasch", we need to use itemtype="gpcm" to tell R that we want to estimate the “Generalized” Partial Credit Model this time.
model.gpcm <- 'liking.science = 1-7'
results.gpcm <- mirt(data=science, model=model.gpcm, itemtype="gpcm", SE=TRUE, verbose=FALSE)
coef.gpcm <- coef(results.gpcm, IRTpars=TRUE, simplify=TRUE)
items.gpcm <- as.data.frame(coef.gpcm$items)
print(items.gpcm)
a b1 b2 b3
Comfort 0.87538955 -3.250347 -2.8567249 1.5202484
Environment -0.03501584 32.393912 13.6362126 -3.5801431
Work 0.83702216 -2.037048 -1.0364378 2.0640667
Future 2.18522588 -2.086705 -0.9827835 0.8367083
Technology -0.03801980 42.679630 14.3641172 -5.8043006
Industry 0.12953982 -12.168178 -10.1545248 0.5402662
Benefit 0.73353673 -2.875516 -1.0956900 1.6135753
In the printed output, we can see that the column “a” is no longer fixed to 1, indicating that the items have different discrimination parameters. However, some of the threshold parameters have turned into very extreme values (see, for example, Environment and Technology). This might be an indication that the Generalized Partial Credit Model is not a good option for the science data set.
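Because the Partial Credit Model is a constrained version of the Generalized Partial Credit Model (all discrimination parameters fixed to 1), the two fitted models can be compared directly. A minimal sketch using mirt's anova method, which reports a likelihood ratio test along with AIC and BIC:
# Compare the nested models: PCM (constrained) vs. GPCM (free discriminations)
anova(results.pcm, results.gpcm)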
This time we will estimate item parameters for the science data set using the Rating Scale Model. All of the items in the data set have four categories, which allows us to estimate a common set of category threshold parameters for the items. Unlike the Partial Credit Model, which we requested with itemtype="Rasch", the Rating Scale Model requires itemtype="rsm" in the mirt function. We save the estimated item parameters in a data frame named items.rsm and print the output using the print function.
model.rsm <- 'liking.science = 1-7'
results.rsm <- mirt(data=science, model=model.rsm, itemtype="rsm", verbose=FALSE)
coef.rsm <- coef(results.rsm, simplify=TRUE)
items.rsm <- as.data.frame(coef.rsm$items)
print(items.rsm)
a1 b1 b2 b3 c
Comfort 1 -2.105444 -1.343347 0.5124721 0.0000000
Environment 1 -2.105444 -1.343347 0.5124721 -0.3244602
Work 1 -2.105444 -1.343347 0.5124721 -0.7136978
Future 1 -2.105444 -1.343347 0.5124721 -0.2499954
Technology 1 -2.105444 -1.343347 0.5124721 -0.2358478
Industry 1 -2.105444 -1.343347 0.5124721 0.2555616
Benefit 1 -2.105444 -1.343347 0.5124721 -0.5215846
In the output, a1 is the item discrimination parameter, which is fixed to 1 for all items because the RSM, as a polytomous form of the Rasch model, requires all items to have the same discrimination parameter (i.e., \(a1=1\)). The columns b1 through b3 are the category threshold parameters, which are the same for all of the items, and “c” is the location parameter estimated uniquely for each item in the science data set. Note that the location parameter of the first item (Comfort) is 0, so the remaining estimates can be read relative to it: Work has the lowest estimated location and Industry has the highest, indicating that these two items differ the most in how easy they are to endorse.
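To check whether the Rating Scale Model's common thresholds are reasonable for each item, we can look at item-level fit statistics. A minimal sketch, assuming mirt's itemfit function with its default S-X2 statistic:
# Item fit statistics for the Rating Scale Model
itemfit(results.rsm)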
To estimate the Graded Response Model, we need to change the itemtype to itemtype="graded" in the mirt function so that R estimates the item parameters for the Graded Response Model. In the output, some item parameters are extremely large, suggesting that the Graded Response Model does not seem to fit the data well. This result is not surprising, given that we already saw some extreme estimates when we tried to estimate item parameters with the Generalized Partial Credit Model.
model.grm <- 'liking.science = 1-7'
results.grm <- mirt(data=science, model=model.grm, itemtype="graded", SE=TRUE, verbose=FALSE)
coef.grm <- coef(results.grm, IRTpars=TRUE, simplify=TRUE)
items.grm <- as.data.frame(coef.grm$items)
print(items.grm)
a b1 b2 b3
Comfort 1.11185062 -4.438832 -2.4181197 1.3462905
Environment 0.06229055 -40.634260 -13.4096480 11.5906085
Work 1.17055741 -2.463243 -0.7617422 1.9050371
Future 2.14668567 -2.342952 -0.9922735 0.8778317
Technology 0.05882352 -51.633868 -16.2840234 12.6841326
Industry 0.35143873 -10.532348 -5.1825245 1.0036080
Benefit 1.13606611 -2.981573 -0.8887580 1.5019365
Although the Nominal Response Model may not be appropriate for data sets like science, we will use the mirt function again to demonstrate the estimation of item parameters according to the Nominal Response Model. We will use itemtype="nominal" in the command to request the Nominal Response Model.
model.nrm <- 'liking.science = 1-7'
results.nrm <- mirt(data=science, model=model.nrm, itemtype="nominal", SE=TRUE, verbose=FALSE)
coef.nrm <- coef(results.nrm, IRTpars=TRUE, simplify=TRUE)
items.nrm <- as.data.frame(coef.nrm$items)
print(items.nrm)
a1 a2 a3 a4 c1
Comfort 0.2479192 -0.4393110 -0.3188852 0.5102770 -2.1664458
Environment 0.6834829 -0.8405704 -0.7274799 0.8845673 -1.0695872
Work 0.2551751 -0.6666461 -0.3092425 0.7207136 -0.8110968
Future 0.2005983 -0.4946839 -0.4579307 0.7520163 -1.5027516
Technology 0.6254204 -0.7270765 -0.6622835 0.7639396 -1.4773630
Industry 0.9480691 -0.9591746 -1.1987801 1.2098856 -1.7992223
Benefit 0.2973991 -0.5917818 -0.4148605 0.7092432 -1.2440319
c2 c3 c4
Comfort -0.28557185 1.8583455 0.5936722
Environment 0.12548444 0.6538681 0.2902346
Work 0.27818439 1.1068668 -0.5739544
Future 0.13920603 1.2190123 0.1445333
Technology 0.26284009 0.8337212 0.3808017
Industry -0.09618228 1.0822432 0.8131614
Benefit 0.35709035 1.0589349 -0.1719934
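With all five models fitted, we can put the earlier impressions about model fit on a common footing by comparing information criteria. A minimal sketch, assuming mirt's extract.mirt function and the model objects created above (lower AIC and BIC values indicate better relative fit):
# Compare AIC and BIC across the fitted polytomous models
models <- list(PCM = results.pcm, GPCM = results.gpcm, RSM = results.rsm,
               GRM = results.grm, NRM = results.nrm)
data.frame(AIC = sapply(models, extract.mirt, what = "AIC"),
           BIC = sapply(models, extract.mirt, what = "BIC"))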
We will use the plot function in the mirt package to examine the items visually. Unlike dichotomous IRT models, polytomous IRT models, such as the Partial Credit Model, have option characteristic curves (OCCs), which can be considered an extension of item characteristic curves (ICCs) to polytomous items. Because the items have more than two response categories, OCCs consist of multiple curves in the same plot. Each curve represents the probability of selecting a particular response option as a function of the latent trait.
In the plot function, we first enter the mirt-class object that stores the item parameters (i.e., results.pcm) and specify the type of plot using type = 'trace'. Assuming we want to see the OCCs for items 1 and 3, we include which.items = c(1, 3) in the script. By eliminating this option from the plot function, we can draw the OCCs for all items in a single plot (a sketch of this is shown further below). main = "" suppresses the title above the plot. Next, we choose the settings for the plot inside par.settings. To make the curves thicker, we use the option lwd = 2; assigning 3 or a higher value to lwd makes the OCCs even thicker. In addition, we use lty = 1:4 to request a different line type for each OCC. The auto.key argument adjusts the legend settings; we want the legend to show which lines represent which OCCs. The following figure shows the OCCs for item 1 (Comfort) and item 3 (Work) in the science data.
plot(results.pcm, type = 'trace', which.items = c(1, 3),
main = "", par.settings = simpleTheme(lty=1:4,lwd=2),
auto.key=list(points=FALSE,lines=TRUE, columns=4))
In the figure, each item has four OCCs representing the four response options, labeled “P1” to “P4”. For both items, the OCCs follow the same order as the response categories: the OCC for the first response option (P1) is on the far left side of the plot, the OCC for the last response option (P4) is on the right side, and the OCCs in the middle are also ordered properly.
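As noted above, dropping the which.items argument draws the OCCs for all seven items in a single multi-panel plot; the other settings stay the same:
plot(results.pcm, type = 'trace', main = "",
     par.settings = simpleTheme(lty = 1:4, lwd = 2),
     auto.key = list(points = FALSE, lines = TRUE, columns = 4))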
Next, we will draw a plot of the item information functions (IIFs) to show the amount of information that each item provides across different levels of the latent trait. We use the same plot function again, but this time we request the IIFs by changing type to “infotrace”. In the par.settings option, we only specify the thickness of the IIFs.
plot(results.pcm, type = 'infotrace', which.items = c(1, 3),
main = "", par.settings = simpleTheme(lwd=2))
In addition to IIFs for particular items, we can show the total amount of information available from the items (i.e., the test information function, TIF) and the distribution of the conditional standard error of measurement (cSEM) in the science data set. In the following example, we draw plots of the TIF and the cSEM using the range of -4 to 4 for the latent trait. We use type = 'info' for the TIF plot and type = 'SE' for the cSEM plot.
plot(results.pcm, type = 'info', theta_lim = c(-4,4), lwd=2)
plot(results.pcm, type = 'SE', theta_lim = c(-4,4), lwd=2)
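The TIF and cSEM plots are two views of the same quantity, because the conditional standard error of measurement is the reciprocal of the square root of the test information. A minimal sketch that computes both over a grid of theta values, assuming mirt's testinfo function:
# Test information and conditional SEM across theta = -4 to 4
Theta <- matrix(seq(-4, 4, by = 0.1))
tif <- testinfo(results.pcm, Theta)
csem <- 1 / sqrt(tif)
head(data.frame(theta = Theta, info = tif, cSEM = csem))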