Simulating and Assessing DIF in Items (via 2PL Model)

Introduction

This is an RMarkdown document displaying R code for simulating and assessing differential item functioning (DIF). Item responses for three tests are simulated, along with the item parameters and respondent abilities that generate them. Each test has 20 items and 4,000 respondents. There are two DIF comparisons, one by group membership G1 and another by group membership G2. In each comparison there are 2,000 respondents per group, and G1 and G2 are fully crossed (1,000 respondents in each of the four G1-by-G2 combinations). The three tests differ in the DIF they contain:

Test 1: No DIF
Test 2: DIF due to G1 for three items
Test 3: DIF due to G2 for three items

For Tests 2 and 3, one item will have uniform DIF only, one item will have nonuniform DIF only, and one item will have both.

The second phase of code assesses and organizes DIF results using a logistic regression approach. Three different models are fit:

Model 1: Total score as a predictor of item performance.
Model 2: Total score + group membership as predictors of item performance.
Model 3: Total score + group membership + an interaction term as predictors of item performance.

By comparing chi-squared test statistics and Nagelkerke R-squared values between models, three types of DIF are assessed: uniform (Model 2 vs. Model 1), nonuniform (Model 3 vs. Model 2), and overall (Model 3 vs. Model 1). A function is created that fits the three models for a single item, extracts the results, and compares them. A letter grade is then assigned reflecting the severity of DIF: A denotes little to no DIF, B moderate DIF, and C large DIF. For grades B and C, a plus or minus indicates which group is favored.
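The grading rule can be sketched as a small standalone helper. This is only an illustration: the name grade_dif and its arguments are hypothetical, and the cutoffs (p > .05 or an R-squared difference below .035 for A, below .07 for B, otherwise C) are the ones used by the function defined later in this document. It covers the uniform and nonuniform grades; the overall grade combines the signs of two estimates.

```r
# Hypothetical helper: grade one DIF comparison from three numbers.
#   p_value  : p-value of the chi-squared difference between two nested models
#   r2_diff  : difference in Nagelkerke R-squared between the same two models
#   estimate : coefficient whose sign indicates the favored group
grade_dif <- function(p_value, r2_diff, estimate) {
  if (p_value > 0.05 || r2_diff < 0.035) {
    return("A")                      # not significant, or negligible effect size
  }
  letter <- if (r2_diff < 0.07) "B" else "C"
  sign   <- if (estimate < 0) "-" else "+"
  paste0(letter, sign)
}

grade_dif(0.200, 0.050,  0.4)   # not significant
grade_dif(0.001, 0.010,  0.4)   # significant, but negligible effect size
grade_dif(0.001, 0.050, -0.4)   # moderate DIF, negative estimate
grade_dif(0.001, 0.090,  0.4)   # large DIF, positive estimate
```

Requiring both a significant test and a minimum effect size keeps the very large sample (4,000 respondents) from flagging trivially small differences.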

Then, some code is provided to organize and clean the results, as well as filter the results into two sets of items that contain DIF (one for G1 and one for G2).

Simulating 2PL Item Characteristics and Respondent Abilities for Three Tests

The first block of code accomplishes the following:

Load appropriate R packages.
Simulate respondent abilities.
Simulate item characteristics (both discrimination and difficulty).
Create an indicator code for both group membership variables (G1 and G2).

In the interest of brevity, the code is only shown for one of the three tests.

library(dplyr)
library(rcompanion)

abil <- rnorm(4000, 0, 1)
itm_a <- runif(20, 1, 2.5)
itm_b <- rnorm(20, 0, 0.9)

abil <- as.data.frame(abil)
itm_a <- as.data.frame(itm_a)
itm_b <- as.data.frame(itm_b)

Test <- rep(1, 4000)
G1 <- c(rep(1, 2000), rep(2, 2000))
G2 <- rep(c(rep(1, 1000), rep(2, 1000)), 2)

Test <- as.data.frame(Test)
G1 <- as.data.frame(G1)
G2 <- as.data.frame(G2)

Info_set <- cbind(Test, G1, G2)

Simulating item responses (for Test 1; No DIF)

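Using the simulated abilities and item characteristics, responses are simulated under the 2PL IRT model. For each respondent and item, the probability of a correct response is 1 / (1 + exp(-a(θ - b))), where θ is the respondent's ability, a the item discrimination, and b the item difficulty. The response is scored 1 when a uniform random draw between 0 and 1 falls below that probability, and 0 otherwise.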

resp <- matrix(0, 4000, 21)
rand_p <- matrix(0, 4000, 20)
corr_p <- matrix(0, 4000, 20)
resp <- as.data.frame(resp)
rand_p <- as.data.frame(rand_p)
corr_p <- as.data.frame(corr_p)

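# Loop over respondents (i) and items (j)
for (i in 1:4000) {
    for (j in 1:20) {
        # Uniform draw to compare against the model probability
        rand_p[i, j] = runif(1, 0, 1)
        # 2PL probability of a correct response
        corr_p[i, j] = (1/(1+exp(-itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1]))))
        # Correct (1) when the draw falls below the probability, else incorrect (0)
        resp[i, j] = ifelse(rand_p[i, j] < corr_p[i, j], 1, 0)
    }
}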

resp <- resp %>% mutate(Total = select(., V1:V20) %>% rowSums(na.rm = TRUE))

sim_set1 <- cbind(Info_set, resp)

head(sim_set1)

##   Test G1 G2 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17
## 1    1  1  1  0  0  0  1  1  1  0  1  1   1   0   1   1   1   0   1   1
## 2    1  1  1  1  0  1  1  1  1  0  1  1   1   0   1   1   1   0   1   1
## 3    1  1  1  1  0  1  1  1  1  1  1  1   1   0   1   0   1   1   1   1
## 4    1  1  1  1  1  1  1  1  1  0  1  1   1   0   1   1   1   1   1   1
## 5    1  1  1  1  0  1  1  0  1  0  0  1   1   0   1   1   1   1   1   1
## 6    1  1  1  0  0  0  0  0  1  0  0  1   1   0   1   0   1   0   1   1
##   V18 V19 V20 V21 Total
## 1   1   1   0   0    13
## 2   1   1   1   0    16
## 3   0   1   1   0    16
## 4   0   1   1   0    17
## 5   1   1   1   0    15
## 6   0   1   0   0     8
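The element-by-element loop above is slow on a 4,000 by 20 data frame. As a possible alternative, here is a minimal vectorized sketch of the same 2PL simulation in base R. The object names (theta, a, b, resp_v) and the seed are assumptions for illustration and are not used elsewhere in this document.

```r
set.seed(1)                      # assumption: a seed, for reproducibility only

n_resp <- 4000
n_item <- 20
theta <- rnorm(n_resp, 0, 1)     # abilities
a     <- runif(n_item, 1, 2.5)   # discriminations
b     <- rnorm(n_item, 0, 0.9)   # difficulties

# outer() builds the 4000 x 20 matrix of theta[i] - b[j];
# sweep() then multiplies column j by a[j].
logit  <- sweep(outer(theta, b, "-"), 2, a, "*")
p_corr <- plogis(logit)          # 1 / (1 + exp(-logit))

# One uniform draw per cell; a response is correct when the draw falls below p.
u      <- matrix(runif(n_resp * n_item), n_resp, n_item)
resp_v <- (u < p_corr) * 1L

dim(resp_v)
range(rowSums(resp_v))
```

Working on whole matrices avoids 80,000 separate data frame assignments, and the result can be converted with as.data.frame() if the V1 to V20 column names are needed.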

Simulating item responses (for Tests 2 (DIF for G1) and 3 (DIF for G2))

Item responses for Tests 2 and 3 are simulated in nearly the same manner as for Test 1, except that three predetermined items in each test are given DIF. For the second group of the relevant grouping variable, the item difficulty is increased by 0.7 (uniform DIF), the item discrimination is tripled (nonuniform DIF), or both. The items and the types of DIF intended for them are:

Test 2:
Item 8 (Uniform DIF)
Item 15 (Nonuniform DIF)
Item 17 (Uniform and Nonuniform DIF)

Test 3:
Item 10 (Uniform DIF)
Item 15 (Nonuniform DIF)
Item 19 (Uniform and Nonuniform DIF)

resp2 <- matrix(0, 4000, 21)
rand_p <- matrix(0, 4000, 20)
corr_p <- matrix(0, 4000, 20)
resp2 <- as.data.frame(resp2)
rand_p <- as.data.frame(rand_p)
corr_p <- as.data.frame(corr_p)

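for (i in 1:4000) {
    for (j in 1:20) {
        rand_p[i, j] = runif(1, 0, 1)
        # Baseline 2PL probability (overwritten below for the DIF items)
        corr_p[i, j] = (1/(1+exp(-itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1]))))
        # Respondents 2001-4000 form the second G1 group
        # Item 8, uniform DIF: difficulty raised by .7
        if (i > 2000 & j == 8) {
            corr_p[i, j] = (1/(1+exp(-itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1] - .7))))
        }
        # Item 15, nonuniform DIF: discrimination tripled
        if (i > 2000 & j == 15) {
            corr_p[i, j] = (1/(1+exp(-3*itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1]))))
        }
        # Item 17, both: discrimination tripled and difficulty raised by .7
        if (i > 2000 & j == 17) {
            corr_p[i, j] = (1/(1+exp(-3*itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1] - .7))))
        }
        resp2[i, j] = ifelse(rand_p[i, j] < corr_p[i, j], 1, 0)
    }
}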

resp2 <- resp2 %>% mutate(Total = select(., V1:V20) %>% rowSums(na.rm = TRUE))

sim_set2 <- cbind(Info_set, resp2)

resp3 <- matrix(0, 4000, 21)
rand_p <- matrix(0, 4000, 20)
corr_p <- matrix(0, 4000, 20)
resp3 <- as.data.frame(resp3)
rand_p <- as.data.frame(rand_p)
corr_p <- as.data.frame(corr_p)

for (i in 1:4000) {
    for (j in 1:20) {
        rand_p[i, j] = runif(1, 0, 1)
        # Baseline 2PL probability (overwritten below for the DIF items)
        corr_p[i, j] = (1/(1+exp(-itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1]))))
        # Respondents 1001-2000 and 3001-4000 form the second G2 group
        # Item 10, uniform DIF: difficulty raised by .7
        if (i %in% c(seq(1001,2000,1),seq(3001,4000,1)) & j == 10) {
            corr_p[i, j] = (1/(1+exp(-itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1] - .7))))
        }
        # Item 15, nonuniform DIF: discrimination tripled
        if (i %in% c(seq(1001,2000,1),seq(3001,4000,1)) & j == 15) {
            corr_p[i, j] = (1/(1+exp(-3*itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1]))))
        }
        # Item 19, both: discrimination tripled and difficulty raised by .7
        if (i %in% c(seq(1001,2000,1),seq(3001,4000,1)) & j == 19) {
            corr_p[i, j] = (1/(1+exp(-3*itm_a[j, 1]*(abil[i, 1] - itm_b[j, 1] - .7))))
        }
        resp3[i, j] = ifelse(rand_p[i, j] < corr_p[i, j], 1, 0)
    }
}

resp3 <- resp3 %>% mutate(Total = select(., V1:V20) %>% rowSums(na.rm = TRUE))

sim_set3 <- cbind(Info_set, resp3)
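To see what these parameter changes do to an item, the following sketch evaluates the 2PL curve at a few ability values under each condition. The reference parameters (a = 1.5, b = 0) and the names p_2pl and icc are hypothetical choices for illustration; only the shift of .7 and the factor of 3 come from the simulation code above.

```r
# 2PL probability of a correct response
p_2pl <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

theta <- c(-2, -1, 0, 1, 2)
a <- 1.5                                       # hypothetical discrimination
b <- 0                                         # hypothetical difficulty

icc <- data.frame(
  theta      = theta,
  reference  = p_2pl(theta, a, b),
  uniform    = p_2pl(theta, a, b + 0.7),       # difficulty raised by .7
  nonuniform = p_2pl(theta, 3 * a, b),         # discrimination tripled
  both       = p_2pl(theta, 3 * a, b + 0.7)
)
round(icc, 3)
```

Under uniform DIF the second group's curve lies below the reference curve at every ability level. Under nonuniform DIF the two curves cross at theta = b, so the disadvantaged group depends on ability, which is why an interaction term is needed to detect it.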

Creating Functions for Fitting Logistic Regression Models and Extracting/Storing Results

Next, a set of functions is created for running the necessary logistic regression models and extracting their results. Each function returns a vector containing the following: model parameter estimates, chi-squared differences between models, p-values corresponding to the chi-squared differences, Nagelkerke R-squared differences, and a letter grade for the presence and direction of DIF. This is done for uniform, nonuniform, and overall DIF.

The function is shown only for the analysis of G1 DIF on the first test; five other functions with the same structure are suppressed for brevity.

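lrdif_G1_test1 <- function(dv1) {
# Model 1: total score; Model 2: adds group; Model 3: adds the interaction
difmod1 <- glm(formula = dv1 ~ Total, family="binomial", data = sim_set1)
difmod2 <- glm(formula = dv1 ~ Total + G1, family="binomial", data = sim_set1)
difmod3 <- glm(formula = dv1 ~ Total + G1 + Total*G1, family="binomial", data = sim_set1)
sumdif1 <- summary(difmod1)
sumdif2 <- summary(difmod2)
sumdif3 <- summary(difmod3)
nk1 <- nagelkerke(difmod1)
nk2 <- nagelkerke(difmod2)
nk3 <- nagelkerke(difmod3)

# Numeric results and grades are kept in separate vectors so that the
# comparisons below are always made on numbers, never on character strings
temp <- rep(0,11)
temp2 <- rep(0,6)
grade <- rep("A",3)

# Group (uniform) and interaction (nonuniform) estimates from Model 3;
# column 1 of the coefficient table is the estimate (column 4 is the p-value)
temp[1] <- sumdif3$coefficients[3,1]
temp[2] <- sumdif3$coefficients[4,1]

# Residual deviances of the three models
temp2[1] <- sumdif1$deviance
temp2[2] <- sumdif2$deviance
temp2[3] <- sumdif3$deviance

# Nagelkerke R-squared (third entry of the pseudo R-squared table)
temp2[4] <- nk1$Pseudo.R.squared.for.model.vs.null[3]
temp2[5] <- nk2$Pseudo.R.squared.for.model.vs.null[3]
temp2[6] <- nk3$Pseudo.R.squared.for.model.vs.null[3]

# Chi-squared differences: uniform, nonuniform, overall
temp[3] <- temp2[1] - temp2[2]
temp[4] <- temp2[2] - temp2[3]
temp[5] <- temp2[1] - temp2[3]

# Upper-tail p-values (1 df for uniform and nonuniform, 2 df for overall)
temp[6] <- pchisq(temp[3], 1, lower.tail = FALSE)
temp[7] <- pchisq(temp[4], 1, lower.tail = FALSE)
temp[8] <- pchisq(temp[5], 2, lower.tail = FALSE)

# Nagelkerke R-squared differences: uniform, nonuniform, overall
temp[9] <- temp2[5] - temp2[4]
temp[10] <- temp2[6] - temp2[5]
temp[11] <- temp2[6] - temp2[4]

# Uniform DIF grade
grade[1] <-     ifelse(temp[6] > .05, "A",
        ifelse(temp[9] < .035, "A",
        ifelse(temp[9] < .07,
        ifelse(temp[1] < 0, "B-", "B+"),
        ifelse(temp[1] < 0, "C-", "C+"))))

# Nonuniform DIF grade
grade[2] <-     ifelse(temp[7] > .05, "A",
        ifelse(temp[10] < .035, "A",
        ifelse(temp[10] < .07,
        ifelse(temp[2] < 0, "B-", "B+"),
        ifelse(temp[2] < 0, "C-", "C+"))))

# Overall DIF grade: signed only when both estimates agree in sign
grade[3] <-     ifelse(temp[8] > .05, "A",
        ifelse(temp[11] < .035, "A",
        ifelse(temp[11] < .07,

        ifelse(temp[1] < 0,
        ifelse(temp[2] < 0, "B-", "B"),
        ifelse(temp[2] < 0, "B", "B+")),

        ifelse(temp[1] < 0,
        ifelse(temp[2] < 0, "C-", "C"),
        ifelse(temp[2] < 0, "C", "C+")))))

# Character vector of length 14: 11 numeric results followed by 3 grades
return(c(temp, grade))
}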
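The chi-squared differences computed inside the function are likelihood-ratio tests between nested models. As a cross-check, the following minimal sketch on toy data shows that base R's anova() reproduces the hand-computed deviance difference and p-value. The variables y, total, and group, the seed, and the effect sizes are assumptions made for illustration; they are not the simulated tests above.

```r
set.seed(2)                                   # assumption: seed for reproducibility
n     <- 2000
group <- rep(c(0, 1), each = n / 2)           # hypothetical grouping variable
total <- rnorm(n)                             # stand-in for the total score
# Item with uniform DIF: the second group is shifted down on the logit scale.
y     <- rbinom(n, 1, plogis(1.2 * total - 0.8 * group))

m1 <- glm(y ~ total,         family = binomial)
m2 <- glm(y ~ total + group, family = binomial)
m3 <- glm(y ~ total * group, family = binomial)

# Sequential likelihood-ratio tests: row 2 is uniform DIF, row 3 nonuniform DIF.
lrt <- anova(m1, m2, m3, test = "Chisq")
lrt

# The same uniform-DIF statistic computed by hand from the deviances.
chi_unif <- deviance(m1) - deviance(m2)
p_unif   <- pchisq(chi_unif, df = 1, lower.tail = FALSE)
c(chi_unif = chi_unif, p_unif = p_unif)
```

The overall test (Model 3 vs. Model 1, 2 df) is not a row of this table, but it can be obtained the same way with anova(m1, m3, test = "Chisq").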

Generation and Extraction of DIF Results via Pre-defined Functions

The functions created in the previous step are then used to build an array of DIF results for each of G1 and G2, with one row per item per test (60 rows in total). The code below generates the G1 results; the G2 results (g2_array) are built in the same way.

g1_array <- lrdif_G1_test1(sim_set1$V1)  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V2))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V3))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V4))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V5))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V6))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V7))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V8))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V9))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V10))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V11))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V12))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V13))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V14))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V15))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V16))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V17))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V18))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V19))  
g1_array <- rbind(g1_array, lrdif_G1_test1(sim_set1$V20))  

g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V1))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V2))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V3))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V4))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V5))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V6))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V7))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V8))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V9))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V10))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V11))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V12))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V13))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V14))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V15))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V16))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V17))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V18))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V19))  
g1_array <- rbind(g1_array, lrdif_G1_test2(sim_set2$V20))  

g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V1))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V2))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V3))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V4))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V5))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V6))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V7))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V8))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V9))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V10))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V11))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V12))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V13))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V14))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V15))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V16))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V17))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V18))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V19))  
g1_array <- rbind(g1_array, lrdif_G1_test3(sim_set3$V20))
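The sixty rbind() calls above follow a single pattern, so they could be generated by iterating over column names. The sketch below shows the pattern on toy data; summarise_item is a hypothetical stand-in for one of the DIF functions, and the data frame toy is invented for illustration.

```r
# Hypothetical stand-in for one of the per-test DIF functions: it takes a
# response vector and returns a fixed-length vector of results.
summarise_item <- function(x) c(mean = mean(x), sd = sd(x))

set.seed(3)                                   # assumption: seed for reproducibility
toy <- as.data.frame(matrix(rbinom(5 * 50, 1, 0.6), nrow = 50))  # columns V1..V5

# Apply the function to each item column, then stack the results row-wise.
item_cols <- paste0("V", 1:5)
rows      <- lapply(item_cols, function(nm) summarise_item(toy[[nm]]))
toy_array <- do.call(rbind, rows)
rownames(toy_array) <- item_cols
toy_array
```

With the objects in this document, the first twenty rows would come from do.call(rbind, lapply(paste0("V", 1:20), function(nm) lrdif_G1_test1(sim_set1[[nm]]))), and the same call would be repeated with the Test 2 and Test 3 functions and data sets.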

Test and Item Variables Created and Data Frame Creation

Vectors are then created for the Test and Item variables and combined with the two arrays of DIF results, yielding a single data frame named dif_results.

vec1 <- c(rep(1, 20), rep(2, 20), rep(3, 20))
vec2 <- rep(seq(1, 20, 1), 3)
test_results <- cbind(vec1, vec2)
test_results <- as.data.frame(test_results)
names(test_results) <- c("Test", "Item")

g1_array <- as.data.frame(g1_array)
g2_array <- as.data.frame(g2_array)

names(g1_array) <- c("G1_Unif_Est", "G1_Nonun_Est", "G1_Unif_ChiDiff", "G1_Nonun_ChiDiff", "G1_Ovr_ChiDiff", "G1_Unif_Pval", "G1_Nonun_Pval", "G1_Ovr_Pval", "G1_Unif_NagDiff", "G1_Nonun_NagDiff", "G1_Ovr_NagDiff", "G1_Unif_Grade", "G1_Nonun_Grade", "G1_Ovr_Grade")

names(g2_array) <- c("G2_Unif_Est", "G2_Nonun_Est", "G2_Unif_ChiDiff", "G2_Nonun_ChiDiff", "G2_Ovr_ChiDiff", "G2_Unif_Pval", "G2_Nonun_Pval", "G2_Ovr_Pval", "G2_Unif_NagDiff", "G2_Nonun_NagDiff", "G2_Ovr_NagDiff", "G2_Unif_Grade", "G2_Nonun_Grade", "G2_Ovr_Grade")

dif_results <- cbind(test_results, g1_array, g2_array)

Class Manipulation and Reformatting

Because the DIF functions return character vectors, the result columns are not yet numeric and must be coerced to the correct class. Afterwards, the numeric variables are rounded and formatted to three decimal places for display.

dif_results$Test <- as.factor(dif_results$Test)
dif_results$Item <- as.factor(dif_results$Item)
dif_results$G1_Unif_Grade <- as.factor(dif_results$G1_Unif_Grade)
dif_results$G1_Nonun_Grade <- as.factor(dif_results$G1_Nonun_Grade)
dif_results$G1_Ovr_Grade <- as.factor(dif_results$G1_Ovr_Grade)
dif_results$G2_Unif_Grade <- as.factor(dif_results$G2_Unif_Grade)
dif_results$G2_Nonun_Grade <- as.factor(dif_results$G2_Nonun_Grade)
dif_results$G2_Ovr_Grade <- as.factor(dif_results$G2_Ovr_Grade)

dif_results$G1_Unif_Est <- as.numeric(as.character(dif_results$G1_Unif_Est))
dif_results$G1_Nonun_Est <- as.numeric(as.character(dif_results$G1_Nonun_Est))
dif_results$G2_Unif_Est <- as.numeric(as.character(dif_results$G2_Unif_Est))
dif_results$G2_Nonun_Est <- as.numeric(as.character(dif_results$G2_Nonun_Est))

dif_results$G1_Unif_Est <- format(round(dif_results$G1_Unif_Est, 3), nsmall = 3)
dif_results$G1_Nonun_Est <- format(round(dif_results$G1_Nonun_Est, 3), nsmall = 3)
dif_results$G2_Unif_Est <- format(round(dif_results$G2_Unif_Est, 3), nsmall = 3)
dif_results$G2_Nonun_Est <- format(round(dif_results$G2_Nonun_Est, 3), nsmall = 3)


dif_results$G1_Unif_ChiDiff <- as.numeric(as.character(dif_results$G1_Unif_ChiDiff))
dif_results$G1_Nonun_ChiDiff <- as.numeric(as.character(dif_results$G1_Nonun_ChiDiff))
dif_results$G1_Ovr_ChiDiff <- as.numeric(as.character(dif_results$G1_Ovr_ChiDiff))
dif_results$G2_Unif_ChiDiff <- as.numeric(as.character(dif_results$G2_Unif_ChiDiff))
dif_results$G2_Nonun_ChiDiff <- as.numeric(as.character(dif_results$G2_Nonun_ChiDiff))
dif_results$G2_Ovr_ChiDiff <- as.numeric(as.character(dif_results$G2_Ovr_ChiDiff))

dif_results$G1_Unif_ChiDiff <- format(round(dif_results$G1_Unif_ChiDiff, 3), nsmall = 3)
dif_results$G1_Nonun_ChiDiff <- format(round(dif_results$G1_Nonun_ChiDiff, 3), nsmall = 3)
dif_results$G1_Ovr_ChiDiff <- format(round(dif_results$G1_Ovr_ChiDiff, 3), nsmall = 3)
dif_results$G2_Unif_ChiDiff <- format(round(dif_results$G2_Unif_ChiDiff, 3), nsmall = 3)
dif_results$G2_Nonun_ChiDiff <- format(round(dif_results$G2_Nonun_ChiDiff, 3), nsmall = 3)
dif_results$G2_Ovr_ChiDiff <- format(round(dif_results$G2_Ovr_ChiDiff, 3), nsmall = 3)

dif_results$G1_Unif_Pval <- as.numeric(as.character(dif_results$G1_Unif_Pval))
dif_results$G1_Nonun_Pval <- as.numeric(as.character(dif_results$G1_Nonun_Pval))
dif_results$G1_Ovr_Pval <- as.numeric(as.character(dif_results$G1_Ovr_Pval))
dif_results$G2_Unif_Pval <- as.numeric(as.character(dif_results$G2_Unif_Pval))
dif_results$G2_Nonun_Pval <- as.numeric(as.character(dif_results$G2_Nonun_Pval))
dif_results$G2_Ovr_Pval <- as.numeric(as.character(dif_results$G2_Ovr_Pval))

dif_results$G1_Unif_Pval <- format(round(dif_results$G1_Unif_Pval, 3), nsmall = 3)
dif_results$G1_Nonun_Pval <- format(round(dif_results$G1_Nonun_Pval, 3), nsmall = 3)
dif_results$G1_Ovr_Pval <- format(round(dif_results$G1_Ovr_Pval, 3), nsmall = 3)
dif_results$G2_Unif_Pval <- format(round(dif_results$G2_Unif_Pval, 3), nsmall = 3)
dif_results$G2_Nonun_Pval <- format(round(dif_results$G2_Nonun_Pval, 3), nsmall = 3)
dif_results$G2_Ovr_Pval <- format(round(dif_results$G2_Ovr_Pval, 3), nsmall = 3)

dif_results$G1_Unif_NagDiff <- as.numeric(as.character(dif_results$G1_Unif_NagDiff))
dif_results$G1_Nonun_NagDiff <- as.numeric(as.character(dif_results$G1_Nonun_NagDiff))
dif_results$G1_Ovr_NagDiff <- as.numeric(as.character(dif_results$G1_Ovr_NagDiff))
dif_results$G2_Unif_NagDiff <- as.numeric(as.character(dif_results$G2_Unif_NagDiff))
dif_results$G2_Nonun_NagDiff <- as.numeric(as.character(dif_results$G2_Nonun_NagDiff))
dif_results$G2_Ovr_NagDiff <- as.numeric(as.character(dif_results$G2_Ovr_NagDiff))

dif_results$G1_Unif_NagDiff <- format(round(dif_results$G1_Unif_NagDiff, 3), nsmall = 3)
dif_results$G1_Nonun_NagDiff <- format(round(dif_results$G1_Nonun_NagDiff, 3), nsmall = 3)
dif_results$G1_Ovr_NagDiff <- format(round(dif_results$G1_Ovr_NagDiff, 3), nsmall = 3)
dif_results$G2_Unif_NagDiff <- format(round(dif_results$G2_Unif_NagDiff, 3), nsmall = 3)
dif_results$G2_Nonun_NagDiff <- format(round(dif_results$G2_Nonun_NagDiff, 3), nsmall = 3)
dif_results$G2_Ovr_NagDiff <- format(round(dif_results$G2_Ovr_NagDiff, 3), nsmall = 3)

head(dif_results)

##          Test Item G1_Unif_Est G1_Nonun_Est G1_Unif_ChiDiff
## g1_array    1    1       0.128        0.222           2.075
## X           1    2       0.727        0.602           1.118
## X.1         1    3       0.141        0.098           0.214
## X.2         1    4       0.087        0.123           0.561
## X.3         1    5       0.581        0.943           3.518
## X.4         1    6       0.846        0.923           0.135
##          G1_Nonun_ChiDiff G1_Ovr_ChiDiff G1_Unif_Pval G1_Nonun_Pval
## g1_array            1.494          3.569        0.150         0.222
## X                   0.272          1.390        0.290         0.602
## X.1                 2.750          2.964        0.644         0.097
## X.2                 2.390          2.951        0.454         0.122
## X.3                 0.005          3.523        0.061         0.943
## X.4                 0.009          0.144        0.713         0.923
##          G1_Ovr_Pval G1_Unif_NagDiff G1_Nonun_NagDiff G1_Ovr_NagDiff
## g1_array       0.168           0.000            0.000          0.001
## X              0.499           0.000            0.000          0.000
## X.1            0.227           0.000            0.001          0.001
## X.2            0.229           0.000            0.001          0.001
## X.3            0.172           0.001            0.000          0.001
## X.4            0.930           0.000            0.000          0.000
##          G1_Unif_Grade G1_Nonun_Grade G1_Ovr_Grade G2_Unif_Est
## g1_array             A              A            A       0.717
## X                    A              A            A       0.358
## X.1                  A              A            A       0.489
## X.2                  A              A            A       0.508
## X.3                  A              A            A       0.609
## X.4                  A              A            A       0.583
##          G2_Nonun_Est G2_Unif_ChiDiff G2_Nonun_ChiDiff G2_Ovr_ChiDiff
## g1_array        0.916           1.245            0.011          1.256
## X               0.396           0.270            0.724          0.994
## X.1             0.576           0.329            0.313          0.642
## X.2             0.678           0.541            0.173          0.714
## X.3             0.890           2.151            0.019          2.170
## X.4             0.661           0.213            0.192          0.405
##          G2_Unif_Pval G2_Nonun_Pval G2_Ovr_Pval G2_Unif_NagDiff
## g1_array        0.265         0.916       0.534           0.000
## X               0.603         0.395       0.608           0.000
## X.1             0.566         0.576       0.725           0.000
## X.2             0.462         0.678       0.700           0.000
## X.3             0.143         0.890       0.338           0.001
## X.4             0.645         0.661       0.817           0.000
##          G2_Nonun_NagDiff G2_Ovr_NagDiff G2_Unif_Grade G2_Nonun_Grade
## g1_array            0.000          0.000             A              A
## X                   0.000          0.000             A              A
## X.1                 0.000          0.000             A              A
## X.2                 0.000          0.000             A              A
## X.3                 0.000          0.001             A              A
## X.4                 0.000          0.000             A              A
##          G2_Ovr_Grade
## g1_array            A
## X                   A
## X.1                 A
## X.2                 A
## X.3                 A
## X.4                 A
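The repeated conversion lines above could be condensed with across() from dplyr. The following is a sketch on a small toy data frame rather than on dif_results itself; the toy values are invented, and only the column-name suffixes mirror the ones used in this document.

```r
library(dplyr)

# Toy stand-in for dif_results: every column arrives as character.
toy <- data.frame(
  Test          = c("1", "1"),
  G1_Unif_Pval  = c("0.150123", "0.000004"),
  G1_Unif_Grade = c("A", "B+"),
  stringsAsFactors = FALSE
)

toy_clean <- toy %>%
  mutate(
    # Identifier and grade columns become factors.
    across(c(Test, ends_with("_Grade")), as.factor),
    # Numeric columns are converted, rounded, and formatted to 3 decimals.
    across(ends_with("_Pval"), ~ format(round(as.numeric(.x), 3), nsmall = 3))
  )
str(toy_clean)
```

For the full data frame, the second selector could be widened to matches("_(Est|ChiDiff|Pval|NagDiff)$") so that all four kinds of numeric column are handled in one step.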

Filtering into Data Frames Containing only Items with DIF

Finally, to summarize only the items that exhibited some form of DIF, the results are filtered to rows in which at least one of the three grades is not A.

G1_dif_items <- dif_results %>% filter(G1_Unif_Grade != "A" | G1_Nonun_Grade != "A" | G1_Ovr_Grade != "A")

G2_dif_items <- dif_results %>% filter(G2_Unif_Grade != "A" | G2_Nonun_Grade != "A" | G2_Ovr_Grade != "A")

G1_dif_items

##   Test Item G1_Unif_Est G1_Nonun_Est G1_Unif_ChiDiff G1_Nonun_ChiDiff
## 1    2    8       0.001        0.471         152.168            0.520
## 2    2   17       0.000        0.000         426.448          136.070
##   G1_Ovr_ChiDiff G1_Unif_Pval G1_Nonun_Pval G1_Ovr_Pval G1_Unif_NagDiff
## 1        152.688        0.000         0.471       0.000           0.037
## 2        562.518        0.000         0.000       0.000           0.075
##   G1_Nonun_NagDiff G1_Ovr_NagDiff G1_Unif_Grade G1_Nonun_Grade
## 1            0.000          0.037            B+              A
## 2            0.022          0.098            C+              A
##   G1_Ovr_Grade G2_Unif_Est G2_Nonun_Est G2_Unif_ChiDiff G2_Nonun_ChiDiff
## 1           B+       0.809        0.240           6.201            1.379
## 2           C+       0.119        0.228           2.001            1.460
##   G2_Ovr_ChiDiff G2_Unif_Pval G2_Nonun_Pval G2_Ovr_Pval G2_Unif_NagDiff
## 1          7.580        0.013         0.240       0.023           0.002
## 2          3.461        0.157         0.227       0.177           0.000
##   G2_Nonun_NagDiff G2_Ovr_NagDiff G2_Unif_Grade G2_Nonun_Grade
## 1            0.000          0.002             A              A
## 2            0.000          0.001             A              A
##   G2_Ovr_Grade
## 1            A
## 2            A

G2_dif_items

##   Test Item G1_Unif_Est G1_Nonun_Est G1_Unif_ChiDiff G1_Nonun_ChiDiff
## 1    3   10       0.550        0.405           0.133            0.694
## 2    3   15       0.413        0.608           0.722            0.263
## 3    3   19       0.375        0.296           0.057            1.093
##   G1_Ovr_ChiDiff G1_Unif_Pval G1_Nonun_Pval G1_Ovr_Pval G1_Unif_NagDiff
## 1          0.827        0.715         0.405       0.661           0.000
## 2          0.985        0.395         0.608       0.611           0.000
## 3          1.150        0.811         0.296       0.563           0.000
##   G1_Nonun_NagDiff G1_Ovr_NagDiff G1_Unif_Grade G1_Nonun_Grade
## 1            0.000          0.000             A              A
## 2            0.000          0.000             A              A
## 3            0.000          0.000             A              A
##   G1_Ovr_Grade G2_Unif_Est G2_Nonun_Est G2_Unif_ChiDiff G2_Nonun_ChiDiff
## 1            A       0.000        0.081         207.447            3.084
## 2            A       0.000        0.000          43.595          134.435
## 3            A       0.000        0.000         241.306          189.246
##   G2_Ovr_ChiDiff G2_Unif_Pval G2_Nonun_Pval G2_Ovr_Pval G2_Unif_NagDiff
## 1        210.530        0.000         0.079       0.000           0.046
## 2        178.030        0.000         0.000       0.000           0.010
## 3        430.552        0.000         0.000       0.000           0.051
##   G2_Nonun_NagDiff G2_Ovr_NagDiff G2_Unif_Grade G2_Nonun_Grade
## 1            0.001          0.046            B+              A
## 2            0.029          0.038             A              A
## 3            0.038          0.089            B+             B+
##   G2_Ovr_Grade
## 1           B+
## 2           B+
## 3           C+

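As seen here, every flagged item is one that was designated as a DIF item in the simulation conditions, and no Test 1 item is flagged. For G1, items 8 and 17 of Test 2 are flagged, but item 15 (nonuniform DIF only) is not. For G2, all three designated items of Test 3 (10, 15, and 19) are flagged, although item 15 is graded above A only for overall DIF.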