Not long ago, I was asked an interesting question: imagine you have a dataset with 1000 rows—each row representing a test taker—and their answers to a 10-question test, recorded as either correct (1) or incorrect (0). How would you analyze this data?
A natural first step might be to score each person by counting their total correct answers, and then, for each question, calculate the fraction of test takers who got it right. But we can go further. With this kind of information, we can ask deeper questions: for each test item, what is the probability that someone answers it correctly? Thinking about this brought me back to when I first learned how logistic regression could be used to study binary outcomes in a course on categorical data analysis.
This idea—using logistic regression to study test responses—forms the foundation of a psychometric framework known as Item Response Theory (IRT). At its core, IRT assumes that each person has an underlying trait, such as knowledge or ability, that we cannot observe directly. What we can observe are their answers to test questions, and IRT provides a way to model how this hidden trait influences the probability of getting each question right.
For an arbitrary question, one popular model choice is the two-parameter logistic (2PL) model, which can be expressed as

$$\Pr(\text{Correct answer}) = \frac{1}{1 + \exp[-discrimination\,(z - difficulty)]}$$
where \(z\) is a latent trait (for example, latent knowledge or ability) and \(discrimination\) and \(difficulty\) are two model parameters to be estimated from the data. In typical applications, \(z\) is assumed to follow a standard Normal distribution. Once these two parameters are estimated, we can let \(z\) vary across a range of values (for example, from -6 to 6) to obtain a visualization known as the Item Characteristic Curve (ICC).
Note that when \(z = difficulty\), the probability of a correct answer becomes 0.5. Therefore, the \(difficulty\) parameter (usually denoted by \(\beta\)) can be interpreted as the ability level required to obtain a 0.5 probability of answering correctly. The \(discrimination\) parameter (usually denoted by \(\alpha\)) is directly related to the steepness of the ICC and measures how effectively an item (i.e., a question) differentiates between examinees with different ability levels. In practice, high discrimination levels are preferred because the probability of answering correctly changes rapidly with small changes in ability around the item’s difficulty level.
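To make these two interpretations concrete, here is a minimal R sketch of the 2PL formula; the function name p2PL and the parameter values are purely illustrative and are not part of the analysis below.

```r
# Minimal sketch of the 2PL model (illustrative values only)
p2PL = function(z, discrimination, difficulty) {
  1 / (1 + exp(-discrimination * (z - difficulty)))
}
p2PL(z = 0.8, discrimination = 1.5, difficulty = 0.8)            # exactly 0.5 when z equals the difficulty
p2PL(z = c(-1, 0.8, 3), discrimination = 1.5, difficulty = 0.8)  # probability rises with ability
p2PL(z = c(-1, 0.8, 3), discrimination = 0.3, difficulty = 0.8)  # lower discrimination: a flatter curve
```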
Does IRT make sense in genomics or other research areas?
In genetics, a similar data structure can be found:
Each item = a genetic marker (e.g., SSRs, SNPs, mutations, gene variants).
Each test taker = a study participant.
The latent trait = an underlying biological risk, susceptibility, or expression profile.
Just like test items vary in difficulty and discrimination, genetic variants differ in how strongly they associate with traits or diseases. For example:
An SSR or a SNP with a high “discrimination parameter” could sharply separate high-risk individuals from low-risk ones.
An SSR or a SNP with a high “difficulty parameter” might represent a rare allele, only present in people with a high level of genetic susceptibility.
Similar analogies can be established with other research areas.
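To make the analogy concrete, here is a small hypothetical sketch (simulated marker data, not a real study): a participants-by-markers matrix of 0/1 calls has the same shape as a test-response matrix, so the same 2PL machinery can be fitted to it.

```r
# Hypothetical sketch: 0/1 marker data generated from a latent "susceptibility" trait
library(ltm)
set.seed(1)
nParticipants = 500
nMarkers = 5
susceptibility = rnorm(nParticipants)   # latent trait (unobserved in practice)
alpha = runif(nMarkers, 0.5, 2)         # "discrimination" of each marker
beta  = rnorm(nMarkers)                 # "difficulty" of each marker
markerMatrix = sapply(1:nMarkers, function(j)
  rbinom(nParticipants, 1, plogis(alpha[j] * (susceptibility - beta[j]))))
colnames(markerMatrix) = paste0("Marker", 1:nMarkers)
# The same 2PL fit used for test items, applied to the marker matrix
coef(ltm(as.data.frame(markerMatrix) ~ z1))
```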
Data simulation from a two-parameter logistic model
Now, let me show you one possible way to simulate data from a two-parameter logistic model using R. We start by making sure a folder called “images” exists (used to save the script-generated *.png files). We also load some of the tidyverse R packages, together with bslib (for Bootstrap styling and theming), plotly (for interactive visualizations), and ltm/mirt to perform IRT analyses. A colorblind-friendly color palette is also defined.
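Show code

```r
# Create the images folder (if needed) and load the required packages
if (!dir.exists("images")) dir.create("images")
library(dplyr)
library(ggplot2)
library(purrr)
library(bslib)
library(plotly)
library(ltm)
library(mirt)

# Specify colorblind tones for plotting
Colors = c("#33A02C", "#66C2A5", "#1F78B4", "#7570B3", "#666666",
           "#FFD92F", "#6A3D9A", "#E7298A", "#FF7F00", "#E31A1C")
```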
Let’s simulate data for an instrument with 10 independent items (nItems = 10) and 1000 examinees (nPersons = 1000). A set of 10 discrimination values is drawn from a Uniform distribution between 0.25 and 2. Likewise, 10 difficulty values are drawn from a standard Normal distribution.
Let’s take a look at the simulated difficulty and discrimination parameters for each of the 10 items.
Show code
```r
##### Create objects for simulation
# Simulate data for a 2PL model
set.seed(37573)
nItems = 10
nPersons = 1000

# Generate item parameters
discrimination = round(runif(nItems, 0.25, 2), 3)  # Discrimination a
difficulty = round(rnorm(nItems, 0, 1), 3)         # Difficulty b
print(data.frame(difficulty, discrimination))
```
Next, we simulate 1000 ability values that also follow a standard Normal distribution. Two ways of simulating data are illustrated: the first uses the simdata function from mirt; the second uses base R functions and purrr to generate a data frame called baseSim, which is then used to illustrate the model-fitting functions (mirt and ltm).
Parameter estimates using mirt are shown below.
Show code
```r
# Generate person abilities
ability = round(rnorm(nPersons, 0, 1), 3)

# Simulate response data using mirt's simdata
mirtSim = simdata(a = discrimination, d = difficulty, itemtype = "2PL", Theta = ability)

# Simulate response data using base R
# Compute probabilities and simulate responses
myfxGetResponses = function(eaITEM) {
  data.frame(
    response = rbinom(nPersons, size = 1,
                      prob = 1 / (1 + exp(-discrimination[eaITEM] * (ability - difficulty[eaITEM]))))
  ) %>%
    set_names(paste0("Item", eaITEM))
}
baseSim = map_dfc(1:nItems, ~myfxGetResponses(.))

##### Model fitting for base R simulated data
mirtMod = mirt(baseSim, 1, "2PL", verbose = F, SE = T)
ltmMod = ltm(baseSim ~ z1)
coefMirt = coef(mirtMod, simplify = TRUE)$items
coefLtm = coef(ltmMod, prob = T, order = F)
data.frame(difficulty = -coefMirt[,2] / coefMirt[,1], discrimination = coefMirt[, 1]) %>%
  mutate(across(everything(), ~round(.x, 3)))
```
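And here are the nearly identical parameter estimates obtained with ltm.

Show code

```r
coefLtm %>%
  as.data.frame() %>%
  mutate(across(everything(), ~round(.x, 3)))
```

Sm🄰rt IRT model visualizations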
We can use the parameter estimates from ltm to do some data wrangling: we build a data frame called parameters to store the item annotations, and a pointsDF data frame with the columns needed to overlay points on top of a faceted ggplot2 visualization.
Show code
```r
##### Calculate predicted 2PL functions and build elegant visualizations
# Define the 2-Parameter Logistic function
myfx2PL = function(x, a, b) { 1 / (1 + exp(-a * (x - b))) }

# Create a data frame with two sets of parameters estimated using ltm
parameters = data.frame(
  Item = 1:nItems,
  a = as.numeric(coefLtm[,2]),  # Discrimination
  b = as.numeric(coefLtm[,1])   # Difficulty
) %>%
  arrange(b) %>%
  mutate(Color = Colors,
         Label = paste0("Item ", Item, " [a = ", round(a, 2), "; b = ", round(b, 2), "]"))

# Define the ability range
xValues = seq(-6, 6, by = 0.1)

# Evaluate the function for each item
myfxCalc = function(eaITEM) {
  data.frame(Item = parameters$Item[eaITEM],
             Color = parameters$Color[eaITEM],
             Label = parameters$Label[eaITEM],
             a = parameters$a[eaITEM],
             b = parameters$b[eaITEM],
             x = xValues,
             y = myfx2PL(xValues, a = parameters$a[eaITEM], b = parameters$b[eaITEM]))
}

# Generate line plotting data
plotDF = map_df(1:nrow(parameters), ~myfxCalc(.))
plotDF = plotDF %>%
  mutate(tooltipText = paste0("Item ", Item, " [Ability = ", round(x, 2), "; Prob = ", round(y, 2), "]"))

# Generate point plotting data
pointsDF = parameters %>%
  dplyr::select(Item, Color, b, Label) %>%
  mutate(x = b,
         y = 0.5,
         tooltipText = paste0("Item ", Item, " [Ability = ", round(x, 2), "; Prob = ", round(y, 2), "]"))

# Static item-faceted visualization
facetPlot = ggplot(plotDF, aes(x = x, y = y, color = I(Color), text = tooltipText, group = 1)) +
  geom_line(linewidth = 1.2) +
  facet_wrap(~factor(Label, levels = unique(Label)), ncol = 5) +
  # facet_wrap(~factor(Label, levels=unique(Label)), ncol = 5, as.table=F, scales="free_x") +
  geom_vline(aes(xintercept = b), linewidth = 0.15, colour = "gray50") +
  geom_hline(aes(yintercept = 0.5), linewidth = 0.15, colour = "gray50") +
  geom_point(data = pointsDF, aes(color = I(Color), shape = intToUtf8(9679)), size = 6) +
  geom_label(data = pointsDF, aes(label = Item, fill = I("white")), color = "black", size = 3.25,
             label.size = NA, label.padding = unit(0.01, "lines"), fontface = "bold") +
  ylab(paste0("Probability of correct answer")) +
  xlab("Ability") +
  theme_classic() +
  theme(axis.line.x = element_line(color = "gray20", linewidth = 1.0),
        axis.line.y = element_line(color = "gray20", linewidth = 0.5),
        axis.text.x = element_text(hjust = 1),
        # panel.grid.major.x=element_line(size=0.05, color="#B3B3B3"),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(colour = "gray50", fill = NA, linewidth = 1.0),
        panel.background = element_rect(colour = "gray50", linewidth = 1.0),
        panel.spacing = unit(0.2, "lines"),
        text = element_text(size = 14),
        strip.text = element_text(size = 11, face = "bold"),
        legend.position = "none")
ggsave("images/facetedPlot.png", facetPlot, width = 11, height = 11/1.618, units = "in", dpi = 320)
```
Once the facetedPlot.png file is saved inside the images folder, we can display it within our document.
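In a Quarto/R Markdown chunk, one possible way to do this is with knitr:

```r
# One possible way to embed the saved figure in the rendered document
knitr::include_graphics("images/facetedPlot.png")
```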
For ease of interpretation, the item panels are ordered in ascending order of the estimated \(difficulty\) parameter. Reading left to right and top to bottom, we can see that item ➆ was the least difficult (\(difficulty\) = -1.13), while item ➅ was the most difficult (\(difficulty\) = 1.74). We can also see that the ICC for item ➆ was the flattest, indicating that it was the least discriminating item, while the ICC for item ➃ was the steepest, making it a highly discriminating item.
The facet-annotated plot above is a static visualization. With a few minor modifications to the facetPlot object, we can generate a version that can be displayed as an interactive visualization (called facetPlotly) within the HTML document.
Show code
```r
# Interactive item-faceted visualization
# wrapped in suppressWarnings because geom_label has not been implemented in plotly
facetPlot = ggplot(plotDF, aes(x = x, y = y, color = I(Color), text = tooltipText, group = 1)) +
  geom_line(linewidth = 0.75) +
  facet_wrap(~factor(Label, levels = unique(Label)), ncol = 5) +
  # facet_wrap(~factor(Label, levels=unique(Label)), ncol = 5, as.table=F, scales="free_x") +
  geom_vline(aes(xintercept = b), linewidth = 0.15, colour = "gray50") +
  geom_hline(aes(yintercept = 0.5), linewidth = 0.15, colour = "gray50") +
  geom_point(data = pointsDF, aes(color = I(Color), shape = intToUtf8(9679)), size = 2) +
  geom_label(data = pointsDF, aes(label = Item, fill = I("white")), color = "black", size = 3.25,
             label.size = NA, label.padding = unit(0.01, "lines"), fontface = "bold") +
  expand_limits(y = c(0, 1)) +
  ylab(paste0("Probability of correct answer")) +
  xlab("Ability") +
  theme_classic() +
  theme(axis.line.x = element_line(color = "gray20", linewidth = 1.0),
        axis.line.y = element_line(color = "gray20", linewidth = 0.5),
        axis.text.x = element_text(hjust = 1),
        # panel.grid.major.x=element_line(size=0.05, color="#B3B3B3"),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(colour = "gray50", fill = NA, linewidth = 1.0),
        panel.background = element_rect(colour = "gray50", linewidth = 1.0),
        panel.spacing = unit(0.2, "lines"),
        text = element_text(size = 10),
        strip.text = element_text(size = 7.75, face = "bold"),
        legend.position = "none")

facetPlotly = ggplotly(facetPlot, tooltip = "text") %>% config(displayModeBar = F)

card(align = "center",
     style = "border: 4px double #b3b3b3; background: mintcream; border-radius: 15px; padding: 2px; margin: 2.5%",
     card_header(style = "background: aquamarine",
                 "Interactive Visualization of Faceted Item Characteristic Curves"),
     full_screen = F, height = 750, facetPlotly)
```
Interactive Visualization of Faceted Item Characteristic Curves
We can also generate a static visualization that displays all the ICCs on a single plot, with an annotated legend, as shown below.
Show code
```r
# Static visualization of all models fitted
fitsPlot = ggplot(plotDF, aes(x = x, y = y, color = factor(Label, levels = unique(Label)), data_id = Item)) +
  geom_line(linewidth = 1.2) +
  scale_color_manual(name = "Parameter estimates", values = Colors) +
  geom_hline(aes(yintercept = 0.5), linewidth = 0.15, colour = "gray50") +
  geom_vline(aes(xintercept = b), linewidth = 0.15, colour = "gray50") +
  geom_point(data = pointsDF, aes(color = I(Color), shape = intToUtf8(9679)), size = 3, show.legend = F) +
  ylab(paste0("Probability of correct answer")) +
  xlab("Ability") +
  theme_classic() +
  theme(axis.line.x = element_line(color = "gray20", linewidth = 1.0),
        axis.line.y = element_line(color = "gray20", linewidth = 0.5),
        # panel.grid.major.x=element_line(size=0.05, color="#B3B3B3"),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(colour = "gray50", fill = NA, linewidth = 1.0),
        panel.background = element_rect(colour = "gray50", linewidth = 1.0),
        panel.spacing = unit(0.2, "lines"),
        text = element_text(size = 14),
        strip.text = element_text(size = 12, face = "bold"),
        legend.position = "right",
        legend.key = element_rect(fill = NA, color = NA))
ggsave("images/fitsPlot.png", fitsPlot, width = 11, height = 11/1.618, units = "in", dpi = 320)
```
With a few changes to the ggplot2 code that generated the previous visualization, we can pass the result to ggplotly to get the interactive version shown below.
Show code
```r
# Interactive visualization of all Item Characteristic Curves
# wrapped in suppressWarnings because geom_label has not been implemented in plotly
fitsPlotly = ggplot(plotDF, aes(x = x, y = y, color = factor(Label, levels = unique(Label)),
                                data_id = Item, group = 1, text = tooltipText)) +
  geom_line(linewidth = 0.75) +
  geom_point(data = pointsDF, aes(color = factor(Label, levels = unique(Label)),
                                  shape = intToUtf8(9679)), size = 2, show.legend = F) +
  scale_color_manual(name = "Parameter estimates", values = Colors) +
  geom_hline(aes(yintercept = 0.5), linewidth = 0.15, colour = "gray50") +
  ylab(paste0("Probability of correct answer")) +
  xlab("Ability") +
  theme_classic() +
  theme(axis.line.x = element_line(color = "gray20", linewidth = 1.0),
        axis.line.y = element_line(color = "gray20", linewidth = 0.5),
        # panel.grid.major.x=element_line(size=0.05, color="#B3B3B3"),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(colour = "gray50", fill = NA, linewidth = 1.0),
        panel.background = element_rect(colour = "gray50", linewidth = 1.0),
        panel.spacing = unit(0.2, "lines"),
        text = element_text(size = 12),
        strip.text = element_text(size = 12, face = "bold"),
        legend.position = "right",
        legend.key = element_rect(fill = NA, color = NA))

fitsggPlotly = ggplotly(fitsPlotly, tooltip = "text", height = 420) %>%
  style(showlegend = FALSE, traces = 1:nItems) %>%
  config(displayModeBar = F)

card(align = "center",
     style = "border: 4px double #b3b3b3; background: mintcream; border-radius: 20px; padding: 2px; margin: 2.5%",
     card_header(style = "background: aquamarine", "Interactive Item Characteristic Curves"),
     full_screen = F, fitsggPlotly)
```
Interactive Item Characteristic Curves
A few more utilities from ltm
When we simulated the dataset of responses (i.e., baseSim), we assumed that the probability of a correct answer was driven by a single latent trait (ability). The unidimTest() function from ltm provides a way to check whether that assumption is reasonable for the simulated data (which, by construction, it should be). P-values above the chosen significance level (say, 0.05) are consistent with the null hypothesis of a single latent trait. For the baseSim data the function returns a p-value of 0.505, so the unidimensionality assumption is not rejected.
Show code
```r
##### Some other features available in ltm and mirt
# Test for unidimensionality
# Ho: All items can be explained by a single latent ability factor
# P-values higher than 0.05 support Ho
unidimTest(ltmMod)
```
```
Unidimensionality Check using Modified Parallel Analysis

Call:
ltm(formula = baseSim ~ z1)

Matrix of tertachoric correlations
        Item1  Item2  Item3  Item4  Item5  Item6  Item7  Item8  Item9 Item10
Item1  1.0000 0.2055 0.2099 0.2815 0.2369 0.2222 0.1247 0.1865 0.1569 0.0188
Item2  0.2055 1.0000 0.4501 0.5121 0.4021 0.4334 0.2405 0.5421 0.2488 0.2798
Item3  0.2099 0.4501 1.0000 0.4737 0.5497 0.4183 0.1737 0.4163 0.4202 0.3176
Item4  0.2815 0.5121 0.4737 1.0000 0.4907 0.4247 0.1890 0.5080 0.3439 0.3643
Item5  0.2369 0.4021 0.5497 0.4907 1.0000 0.4160 0.1063 0.5321 0.3476 0.3096
Item6  0.2222 0.4334 0.4183 0.4247 0.4160 1.0000 0.2522 0.4833 0.1958 0.3032
Item7  0.1247 0.2405 0.1737 0.1890 0.1063 0.2522 1.0000 0.1985 0.1437 0.1482
Item8  0.1865 0.5421 0.4163 0.5080 0.5321 0.4833 0.1985 1.0000 0.2077 0.2771
Item9  0.1569 0.2488 0.4202 0.3439 0.3476 0.1958 0.1437 0.2077 1.0000 0.3151
Item10 0.0188 0.2798 0.3176 0.3643 0.3096 0.3032 0.1482 0.2771 0.3151 1.0000

Alternative hypothesis: the second eigenvalue of the observed data is substantially larger
than the second eigenvalue of data under the assumed IRT model

Second eigenvalue in the observed data: 0.576
Average of second eigenvalues in Monte Carlo samples: 0.5807
Monte Carlo samples: 100
p-value: 0.505
```
ltm comes with a plot() function that allows us to build different kinds of visualizations. If we pass the output of descript(baseSim) to the plot function and use custom Unicode symbols as plotting characters, we obtain a visualization of the proportion of correct responses to each item at every Total Score level. The following visualization shows that, at every Total Score level, items ➆ and ➈ (the “easy” ones) have higher proportions than items ➁ and ➅ (the “difficult” ones).
Show code
```r
# Get fancy Unicode characters for plotting
unicodePC = data.frame(Item = 1:10, unicodePC = map_chr(10122:10131, intToUtf8))
pointsDF = pointsDF %>%
  left_join(unicodePC) %>%
  arrange(Item)

# Plotting observed total scores
invisible(capture.output({
  png("images/plotDescript.png", width = 7.75, height = 7.75/1.618, units = "in", res = 320,
      type = "cairo-png", family = "Segoe UI Symbol")
  plot(descript(baseSim), pc = pointsDF$unicodePC, col = pointsDF$Color, cex = 1.25,
       main = "Proportion of Item Correct Responses at each Total Score Level")
  dev.off()
}))
```
By default, the ltm::plot() function produces a visualization of all the ICCs. Using base R, this visualization can be enhanced with Unicode symbols and a legend to make it more elegant, as illustrated in the following static visualization. The drawback is that, currently, there is no direct way to convert base R plots into interactive visualizations.
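Show code

```r
# Plotting all Item Characteristic Curves with ltm::plot(), then overlaying Unicode markers
invisible(capture.output({
  png("images/ltmFits.png", width = 7.75, height = 7.75/1.618, units = "in", res = 320,
      type = "cairo-png", family = "Segoe UI Symbol")
  par(mar = c(5, 4, 4, 12.5) + 0.1)
  plot(ltmMod, col = pointsDF$Color, lwd = 2.5, zrange = c(-6, 6), labels = rep(NA, nItems))
  abline(h = 0.5, col = "gray50", lwd = 1)
  points(pointsDF$x, pointsDF$y, pch = rep(intToUtf8(9679), nItems),
         col = rep("white", nItems), cex = 1.25)
  points(pointsDF$x, pointsDF$y, pch = pointsDF$unicodePC, col = pointsDF$Color, cex = 1.25)
  legend(6.9, 1.1, legend = pointsDF$Label, col = pointsDF$Color, pt.cex = 1.25, cex = 1,
         bty = "n", pch = pointsDF$unicodePC, xpd = T, x.intersp = 1, y.intersp = 1.5)
  dev.off()
}))
# Reset default graphical margins
par(mar = c(5, 4, 4, 2) + 0.1)
```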
Finally, the ltm::plot() function can be used to visualize an Item Information Curve (IIC). An IIC shows how much statistical information (precision) a test item provides about a person’s latent trait (ability) at each ability level. The higher the curve at a given ability level, the more precisely the item measures ability around that level.
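For the 2PL model in the logistic metric used here, the item information has a simple closed form,

$$I(z) = \alpha^{2}\,P(z)\,[1 - P(z)],$$

where \(P(z)\) is the ICC. Information therefore peaks at \(z = \beta\) (where \(P(z) = 0.5\)), and the height of the peak grows with \(\alpha^{2}\).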
The following visualization shows the performance of the two most extreme items (namely, items ➆ and ➅). Between these, we can see that only item ➅ is highly informative around the estimated \(difficulty\) value of ~1.74.
Show code
```r
# Specify the item(s) to plot
# Use itemNumber = c(7,9) to plot items 7 and 9
# Use itemNumber = 7:9 to plot items 7 to 9
# The following line selects the items with the minimum and maximum difficulties
itemNumber = c(which(pointsDF$b == min(pointsDF$b)), which(pointsDF$b == max(pointsDF$b)))

# Set up a 1x2 plotting layout
invisible(capture.output({
  png("images/itemFits.png", width = 7.75, height = 7.75/1.618, units = "in", res = 320,
      type = "cairo-png", family = "Segoe UI Symbol")
  par(mar = c(4.25, 4.25, 2, 0.5))
  par(mfrow = c(1, 2))

  # Plot the Item Characteristic Curve (ICC)
  plot(ltmMod, type = "ICC", items = itemNumber, zrange = c(-6, 6),
       main = "Item Characteristic Curve for selected Item(s)", cex.main = 0.9,
       col = pointsDF$Color[itemNumber], lwd = 2, labels = rep(NA, length(itemNumber)))
  abline(h = 0.5, col = "gray50", lwd = 1)
  abline(v = pointsDF$x[itemNumber], col = "gray50", lwd = 1)
  points(pointsDF$x[itemNumber], pointsDF$y[itemNumber],
         pch = rep(intToUtf8(9679), length(itemNumber)),
         col = rep("white", length(itemNumber)), cex = 1.25)
  points(pointsDF$x[itemNumber], pointsDF$y[itemNumber],
         pch = pointsDF$unicodePC[itemNumber], col = pointsDF$Color[itemNumber], cex = 1.25)

  # Plot the Item Information Curve (IIC)
  plot(ltmMod, type = "IIC", items = itemNumber, zrange = c(-6, 6),
       main = "Item Information Curves", cex.main = 0.9,
       col = pointsDF$Color[itemNumber], lwd = 2, labels = rep(NA, length(itemNumber)))
  abline(v = pointsDF$x[itemNumber], col = "gray50", lwd = 1)
  dev.off()
}))

# Reset default 1x1 frame and graphical margins
par(mfrow = c(1, 1))
par(mar = c(5, 4, 4, 2) + 0.1)
```
Closing remarks
I invite you to keep learning about the essence of IRT and to extend its application to your research area. Should you need additional assistance with the ideas captured in this short publication, please do not hesitate to reach out to me.
Enjoy🛸!