In statistical data analysis, we often deal with wide-format structures where persons represent rows and variables represent columns. Below we read the “wideformat.csv” data set into R and print the first six rows of the data using the head
function. This data set hasID, gender, age, and Item.1 through Item.5 for each person.
widedata <- read.csv("https://raw.githubusercontent.com/okanbulut/myrfunctions/master/wideformat.csv", header = TRUE)
head(widedata, 10)
ID gender age Item.1 Item.2 Item.3 Item.4 Item.5
1 person1 Male 40 0 0 0 0 0
2 person2 Female 27 0 0 0 0 0
3 person3 Male 13 0 0 0 0 0
4 person4 Female 17 0 0 0 0 1
5 person5 Female 30 0 0 0 0 1
6 person6 Female 46 0 0 0 0 1
7 person7 Male 3 0 0 0 0 1
8 person8 Male 5 0 0 0 0 1
9 person9 Female 17 0 0 0 0 1
10 person10 Male 27 0 0 0 1 0
nrow(widedata)
[1] 1000
ncol(widedata)
[1] 8
Sometimes we prefer to reorganize our wide-format data sets in a long-format where variables are nested within persons. Such transformations between wide and long format structures can be done using the reshape2 package in R. The package has the melt
function that can transform wide-format data to long-format data very easily.
In the following example, we first install and activate the reshape2 package. Then, we will transform widedata
from that wide format to long format and save it as longdata
. We will use the melt
function for this transformation.
install.packages("reshape2")
library("reshape2")
longdata <- melt(widedata, id.vars=c("ID","gender","age"), variable.name = "items",
value.name = "responses")
In the code above, id.vars
allows us to specify person-related variables that we DON’T want to be nested in the data. All other variables except for those listed under id.vars
will be transformed into the long structure. We can also specify which variables should be nested using measure.vars
. You can see the melt
function’s help page for more details.
?melt.data.frame
When we print the first 10 rows of the new data set (called longdata
), we see that item 1 has been repeated 10 items (i.e., one for each person).
head(longdata, 10)
ID gender age items responses
1 person1 Male 40 Item.1 0
2 person2 Female 27 Item.1 0
3 person3 Male 13 Item.1 0
4 person4 Female 17 Item.1 0
5 person5 Female 30 Item.1 0
6 person6 Female 46 Item.1 0
7 person7 Male 3 Item.1 0
8 person8 Male 5 Item.1 0
9 person9 Female 17 Item.1 0
10 person10 Male 27 Item.1 0
nrow(longdata)
[1] 5000
ncol(longdata)
[1] 5
Let’s re-order the data set by ID
and items
and view the data set again.
longdata <- longdata[order(longdata$ID, longdata$items),]
head(longdata, 5)
ID gender age items responses
1 person1 Male 40 Item.1 0
1001 person1 Male 40 Item.2 0
2001 person1 Male 40 Item.3 0
3001 person1 Male 40 Item.4 0
4001 person1 Male 40 Item.5 0
In the longdata
data set, each person has five rows (corresponding to five items in widedata
). We don’t have separate columns for items. Instead, we have a longer data set listing all the items as individual rows.
For estimating explanatory IRT models, we will use the lme4 package in R. lme4 stands for linear mixed-effects models using Eigen and S4. For more details about this package, you can check out the package manual (http://cran.r-project.org/web/packages/lme4/lme4.pdf) and the vignette (http://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf).
The lme4 package is capable of estimating generalized linear mixed models (GLMMs) with various types of variables. Because explanatory IRT models are based on GLMMs, we will use this package to estimate explanatory IRT models that we talked about last week. Below we install and activate the lme4 package. In addition, we install and activate the optimx package (to accelerate the estimation in lme4).
install.packages("lme4")
install.packages("optimx")
library("lme4")
library("optimx")
De Boeck et al. (2011) provides a detailed description of the IRT estimation using lme4 (see http://www.jstatsoft.org/v39/i12). We will use glmer
function to estimate the explanatory IRT models.
The structure of the glmer function
Below Figure 1 shows the structure of the glmer
function. The arguments necessary for the glmer
function are:
In the following example, we will use the verbal aggression dataset, which is called VerbAgg
in lme4, to demonstrate the estimation of explanatory IRT models. This data set consists of responses of 316 subjects (243 women and 73 men) to a questionnaire of 24 items about verbal aggression. All items describe a frustrating situation together with a verbal aggression response. A correct answer responses is coded as 0 and 1, a value of one meaning that the subject would respond to the frustrating situation in an aggressive way.
There are four scenarios shown to each subject:
For each scenario, subjects decide:
The VerbAgg
data set consists of the following variables:
Anger: The subject’s Trait Anger score as measured on the State-Trait Anger Expression Inventory
Gender: The subject’s gender - a factor with levels M for male and F for female
item: The item on the questionnaire, as a factor
resp: The subject’s polytomous response to the items (an ordered factor with levels no, perhaps, yes)
id: The subject identifier, as a factor
btype: Behavior type - a factor with levels curse, scold, or shout
situ: Situation type - a factor with levels other or self indicating other-to-blame and self-to-blame
mode: Behavior mode - a factor with levels want and do
r2: The subject’s dichotomous response to the items with N as No amd Y as Yes (recoded as Yes or perhaps as Yes, No as No)
The VerbAgg
data set comes with the lme4 package. We can activate this data set and print the first six rows using the following code:
data("VerbAgg")
head(VerbAgg)
Anger Gender item resp id btype situ mode r2
1 20 M S1WantCurse no 1 curse other want N
2 11 M S1WantCurse no 2 curse other want N
3 17 F S1WantCurse perhaps 3 curse other want Y
4 21 F S1WantCurse perhaps 4 curse other want Y
5 17 F S1WantCurse perhaps 5 curse other want Y
6 21 F S1WantCurse yes 6 curse other want Y
str(VerbAgg)
'data.frame': 7584 obs. of 9 variables:
$ Anger : int 20 11 17 21 17 21 39 21 24 16 ...
$ Gender: Factor w/ 2 levels "F","M": 2 2 1 1 1 1 1 1 1 1 ...
$ item : Factor w/ 24 levels "S1WantCurse",..: 1 1 1 1 1 1 1 1 1 1 ...
$ resp : Ord.factor w/ 3 levels "no"<"perhaps"<..: 1 1 2 2 2 3 3 1 1 3 ...
$ id : Factor w/ 316 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ btype : Factor w/ 3 levels "curse","scold",..: 1 1 1 1 1 1 1 1 1 1 ...
$ situ : Factor w/ 2 levels "other","self": 1 1 1 1 1 1 1 1 1 1 ...
$ mode : Factor w/ 2 levels "want","do": 1 1 1 1 1 1 1 1 1 1 ...
$ r2 : Factor w/ 2 levels "N","Y": 1 1 2 2 2 2 2 1 1 2 ...
Before we begin the analysis, we need to adjust some settings first. These are not vital but very helpful especially for complex models. Therefore, we will include it in all of the models.
control=glmerControl(optimizer = "optimx", calc.derivs = FALSE,
optCtrl = list(method = "nlminb", starttests = FALSE, kkt = FALSE))
Next, we will estimate the Rasch model using the glmer
function as described earlier. This model has no covariates. It only estimates item easiness parameters for the items.
model.rasch <- glmer(r2~-1+item+(1|id),family=binomial("logit"), data=VerbAgg,
control=control)
We can see the results using the summary function. This output shows some fit statistics, a summary of random effects, and the estimated fixed effects (i.e., item parameters).
summary(model.rasch)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + item + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8128.5 8301.9 -4039.3 8078.5 7559
Scaled residuals:
Min 1Q Median 3Q Max
-6.2631 -0.6090 -0.1621 0.6131 15.4853
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.9 1.378
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
itemS1WantCurse 1.22107 0.16110 7.579 3.47e-14 ***
itemS1WantScold 0.56477 0.15251 3.703 0.000213 ***
itemS1WantShout 0.08009 0.15046 0.532 0.594502
itemS2WantCurse 1.74879 0.17378 10.063 < 2e-16 ***
itemS2WantScold 0.70772 0.15378 4.602 4.18e-06 ***
itemS2WantShout 0.01172 0.15045 0.078 0.937909
itemS3WantCurse 0.52947 0.15224 3.478 0.000506 ***
itemS3WantScold -0.68637 0.15422 -4.451 8.56e-06 ***
itemS3WantShout -1.52694 0.16924 -9.022 < 2e-16 ***
itemS4wantCurse 1.08204 0.15867 6.819 9.15e-12 ***
itemS4WantScold -0.34938 0.15150 -2.306 0.021098 *
itemS4WantShout -1.04402 0.15908 -6.563 5.28e-11 ***
itemS1DoCurse 1.22107 0.16110 7.579 3.47e-14 ***
itemS1DoScold 0.38962 0.15137 2.574 0.010056 *
itemS1DoShout -0.87122 0.15647 -5.568 2.58e-08 ***
itemS2DoCurse 0.87264 0.15565 5.606 2.07e-08 ***
itemS2DoScold -0.05668 0.15050 -0.377 0.706488
itemS2DoShout -1.48186 0.16810 -8.815 < 2e-16 ***
itemS3DoCurse -0.21104 0.15087 -1.399 0.161876
itemS3DoScold -1.50431 0.16867 -8.919 < 2e-16 ***
itemS3DoShout -2.97500 0.23336 -12.748 < 2e-16 ***
itemS4DoCurse 0.70772 0.15378 4.602 4.18e-06 ***
itemS4DoScold -0.38422 0.15170 -2.533 0.011316 *
itemS4DoShout -1.99947 0.18386 -10.875 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the output, the fixed effects refer to the estimated item (easiness) parameters for the items. Based on the output, we can say that item itemS2WantCurse
is the easiest item and item itemS3DoShout
is the most difficult item. We can extract the item parameters and plot them using the following code:
rasch_params <- data.frame(easiness = fixef(model.rasch))
hist(rasch_params$easiness, xlab = "Item Easiness")
We can also extract the random effects (i.e., person ability parameters) from the same model.
rasch_ability <- coef(model.rasch)$id[, 1]
hist(rasch_ability, xlab = "Person Ability")
This time, we will use behavior type, mode, and situation to estimate item parameters instead of estimating unique item parameters for each item. In the first model, we use btype
, mode
, and situ
as the predictors of item difficulty (i.e., easiness).
model.lltm1 <- glmer(r2 ~ -1 + btype + mode + situ + (1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.lltm1)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + btype + mode + situ + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8249.9 8291.5 -4119.0 8237.9 7578
Scaled residuals:
Min 1Q Median 3Q Max
-7.6685 -0.6333 -0.1965 0.6357 9.3272
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.793 1.339
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
btypecurse 1.74370 0.10145 17.188 < 2e-16 ***
btypescold 0.68853 0.09753 7.060 1.67e-12 ***
btypeshout -0.29850 0.09741 -3.065 0.00218 **
modedo -0.67158 0.05621 -11.948 < 2e-16 ***
situself -1.02789 0.05689 -18.067 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
btypcr btypsc btypsh modedo
btypescold 0.767
btypeshout 0.732 0.749
modedo -0.325 -0.300 -0.266
situself -0.344 -0.301 -0.248 0.062
The output shows that btypecurse
is the easiest item type (referring to behavior type=curse), whereas situself
is the most difficult item type (referring to situation=self-blame). Compared to the Rasch model, this model only involves five parameters in total to explain item easiness in the VerbAgg
data set.
In the second model, we use btype
, mode
, and their interaction as the predictors of item difficulty (i.e., easiness). btype*mode
gives us unique effects for btype
and mode
as well as their interaction.
model.lltm2 <- glmer(r2 ~ -1 + btype*mode + (1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.lltm2)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + btype * mode + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8553.5 8602.0 -4269.8 8539.5 7577
Scaled residuals:
Min 1Q Median 3Q Max
-5.2173 -0.6608 -0.2130 0.6982 7.3223
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.617 1.271
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
btypecurse 1.08311 0.09958 10.876 < 2e-16 ***
btypescold 0.05209 0.09586 0.543 0.587
btypeshout -0.56907 0.09694 -5.870 4.36e-09 ***
modedo -0.47569 0.09429 -5.045 4.54e-07 ***
btypescold:modedo 0.07200 0.13021 0.553 0.580
btypeshout:modedo -0.61934 0.13782 -4.494 6.99e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
btypcr btypsc btypsh modedo btypsc:
btypescold 0.541
btypeshout 0.528 0.557
modedo -0.501 0.002 0.004
btypscld:md 0.360 -0.322 -0.002 -0.723
btypsht:mdd 0.333 -0.005 -0.317 -0.682 0.495
The output shows again that btypecurse
is the easiest item type. From the estimated interactions, btypescold:modedo
is not statistically significant (see the p-value=0.584). However, btypeshout:modedo
is statistically significant, indicating that when behaviour type is shouting and behaviour mode is doing, the items tend to be more difficult than the items related to cursing.
To estimate a latent regression model, we will use person-related variables as predictors. In the first model, we use Gender
as a predictor for persons in addition to item
for predicting item easiness parameters.
model.lrm1 <- glmer(r2~-1+item+Gender+(1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.lrm1)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + item + Gender + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8128.0 8308.3 -4038.0 8076.0 7558
Scaled residuals:
Min 1Q Median 3Q Max
-6.2099 -0.6080 -0.1629 0.6150 15.5626
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.883 1.372
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
itemS1WantCurse 1.149734 0.167062 6.882 5.90e-12 ***
itemS1WantScold 0.493327 0.158883 3.105 0.00190 **
itemS1WantShout 0.008617 0.156976 0.055 0.95622
itemS2WantCurse 1.677588 0.179246 9.359 < 2e-16 ***
itemS2WantScold 0.636293 0.160088 3.975 7.05e-05 ***
itemS2WantShout -0.059758 0.156970 -0.381 0.70343
itemS3WantCurse 0.458019 0.158631 2.887 0.00389 **
itemS3WantScold -0.757814 0.160657 -4.717 2.39e-06 ***
itemS3WantShout -1.598240 0.175182 -9.123 < 2e-16 ***
itemS4wantCurse 1.010670 0.164741 6.135 8.52e-10 ***
itemS4WantScold -0.420862 0.158012 -2.663 0.00773 **
itemS4WantShout -1.115426 0.165356 -6.746 1.52e-11 ***
itemS1DoCurse 1.149734 0.167062 6.882 5.90e-12 ***
itemS1DoScold 0.318162 0.157812 2.016 0.04379 *
itemS1DoShout -0.942647 0.162831 -5.789 7.08e-09 ***
itemS2DoCurse 0.801233 0.161863 4.950 7.42e-07 ***
itemS2DoScold -0.128155 0.157029 -0.816 0.41443
itemS2DoShout -1.553179 0.174079 -8.922 < 2e-16 ***
itemS3DoCurse -0.282517 0.157399 -1.795 0.07267 .
itemS3DoScold -1.575620 0.174623 -9.023 < 2e-16 ***
itemS3DoShout -3.045729 0.237741 -12.811 < 2e-16 ***
itemS4DoCurse 0.636293 0.160088 3.975 7.05e-05 ***
itemS4DoScold -0.455696 0.158209 -2.880 0.00397 **
itemS4DoShout -2.070639 0.189360 -10.935 < 2e-16 ***
GenderM 0.307790 0.195476 1.575 0.11536
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The output shows that the gender effect for M (i.e., males) is positive but it is not statistically significant. Therefore, we can conclude that there is no gender difference in the data set.
In the second model, we use both Anger
and Gender
as predictors.
model.lrm2 <- glmer(r2~-1+item+Anger+Gender+(1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.lrm2)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + item + Anger + Gender + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8118.5 8305.7 -4032.3 8064.5 7557
Scaled residuals:
Min 1Q Median 3Q Max
-6.3576 -0.6114 -0.1647 0.6151 15.8165
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.811 1.346
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
itemS1WantCurse -0.006699 0.375521 -0.018 0.985767
itemS1WantScold -0.663222 0.372493 -1.780 0.074995 .
itemS1WantShout -1.147896 0.372039 -3.085 0.002033 **
itemS2WantCurse 0.521366 0.380617 1.370 0.170752
itemS2WantScold -0.520244 0.372898 -1.395 0.162974
itemS2WantShout -1.216254 0.372085 -3.269 0.001080 **
itemS3WantCurse -0.698531 0.372413 -1.876 0.060698 .
itemS3WantScold -1.914135 0.374130 -5.116 3.12e-07 ***
itemS3WantShout -2.754460 0.381239 -7.225 5.01e-13 ***
itemS4wantCurse -0.145811 0.374614 -0.389 0.697106
itemS4WantScold -1.577302 0.372773 -4.231 2.32e-05 ***
itemS4WantShout -2.271676 0.376426 -6.035 1.59e-09 ***
itemS1DoCurse -0.006700 0.375521 -0.018 0.985766
itemS1DoScold -0.838408 0.372170 -2.253 0.024274 *
itemS1DoShout -2.098922 0.375198 -5.594 2.22e-08 ***
itemS2DoCurse -0.355242 0.373532 -0.951 0.341586
itemS2DoScold -1.284619 0.372157 -3.452 0.000557 ***
itemS2DoShout -2.709417 0.380693 -7.117 1.10e-12 ***
itemS3DoCurse -1.438952 0.372419 -3.864 0.000112 ***
itemS3DoScold -2.731815 0.380962 -7.171 7.45e-13 ***
itemS3DoShout -4.203096 0.415489 -10.116 < 2e-16 ***
itemS4DoCurse -0.520245 0.372898 -1.395 0.162974
itemS4DoScold -1.612128 0.372880 -4.323 1.54e-05 ***
itemS4DoShout -3.226973 0.388440 -8.308 < 2e-16 ***
Anger 0.057653 0.016841 3.423 0.000619 ***
GenderM 0.321547 0.192220 1.673 0.094366 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the last model, we use Anger
, Gender
, and their interaction as predictors.
model.lrm3 <- glmer(r2~-1+item+Anger*Gender+(1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.lrm3)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + item + Anger * Gender + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8120.5 8314.7 -4032.3 8064.5 7556
Scaled residuals:
Min 1Q Median 3Q Max
-6.3612 -0.6113 -0.1647 0.6149 15.8230
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.811 1.346
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
itemS1WantCurse -0.027227 0.425506 -0.064 0.948980
itemS1WantScold -0.683758 0.422901 -1.617 0.105916
itemS1WantShout -1.168422 0.422531 -2.765 0.005687 **
itemS2WantCurse 0.500873 0.429941 1.165 0.244026
itemS2WantScold -0.540753 0.423246 -1.278 0.201379
itemS2WantShout -1.236797 0.422573 -2.927 0.003424 **
itemS3WantCurse -0.719052 0.422833 -1.701 0.089026 .
itemS3WantScold -1.934679 0.424376 -4.559 5.14e-06 ***
itemS3WantShout -2.774981 0.430589 -6.445 1.16e-10 ***
itemS4wantCurse -0.166306 0.424721 -0.392 0.695380
itemS4WantScold -1.597808 0.423186 -3.776 0.000160 ***
itemS4WantShout -2.292211 0.426383 -5.376 7.62e-08 ***
itemS1DoCurse -0.027218 0.425506 -0.064 0.948997
itemS1DoScold -0.858917 0.422629 -2.032 0.042121 *
itemS1DoShout -2.119473 0.425310 -4.983 6.25e-07 ***
itemS2DoCurse -0.375806 0.423789 -0.887 0.375200
itemS2DoScold -1.305166 0.422639 -3.088 0.002014 **
itemS2DoShout -2.729916 0.430111 -6.347 2.20e-10 ***
itemS3DoCurse -1.459514 0.422873 -3.451 0.000558 ***
itemS3DoScold -2.752337 0.430347 -6.396 1.60e-10 ***
itemS3DoShout -4.223533 0.460955 -9.163 < 2e-16 ***
itemS4DoCurse -0.540752 0.423246 -1.278 0.201379
itemS4DoScold -1.632647 0.423280 -3.857 0.000115 ***
itemS4DoShout -3.247477 0.436909 -7.433 1.06e-13 ***
Anger 0.058675 0.019577 2.997 0.002725 **
GenderM 0.399802 0.786397 0.508 0.611175
Anger:GenderM -0.003936 0.038370 -0.103 0.918295
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interaction models can be used in multiple ways. First, we begin with a model where we look at the interaction between Gender
and item
. This model will show us whether the items have any differential item functioning (i.e., bias) against any gender groups.
model.dif <- glmer(r2 ~ -1 + Gender*item + (1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.dif)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + Gender * item + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8103.1 8442.8 -4002.5 8005.1 7535
Scaled residuals:
Min 1Q Median 3Q Max
-6.6339 -0.6039 -0.1618 0.6095 16.6449
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.931 1.389
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
GenderF 1.258876 0.185443 6.788 1.13e-11 ***
GenderM 1.123529 0.329350 3.411 0.000646 ***
itemS1WantScold -0.669164 0.221027 -3.028 0.002466 **
itemS1WantShout -1.147887 0.219398 -5.232 1.68e-07 ***
itemS2WantCurse 0.589378 0.242380 2.432 0.015031 *
itemS2WantScold -0.503917 0.222314 -2.267 0.023409 *
itemS2WantShout -1.080384 0.219445 -4.923 8.51e-07 ***
itemS3WantCurse -0.761938 0.220472 -3.456 0.000548 ***
itemS3WantScold -2.117190 0.225297 -9.397 < 2e-16 ***
itemS3WantShout -2.726871 0.236100 -11.550 < 2e-16 ***
itemS4wantCurse -0.131704 0.226714 -0.581 0.561290
itemS4WantScold -1.690581 0.221150 -7.645 2.10e-14 ***
itemS4WantShout -2.216603 0.226643 -9.780 < 2e-16 ***
itemS1DoCurse -0.157509 0.226337 -0.696 0.486490
itemS1DoScold -1.080385 0.219445 -4.923 8.51e-07 ***
itemS1DoShout -2.141847 0.225617 -9.493 < 2e-16 ***
itemS2DoCurse -0.645806 0.221185 -2.920 0.003503 **
itemS2DoScold -1.598958 0.220585 -7.249 4.21e-13 ***
itemS2DoShout -2.845375 0.238972 -11.907 < 2e-16 ***
itemS3DoCurse -1.736716 0.221476 -7.842 4.45e-15 ***
itemS3DoScold -3.033340 0.244120 -12.426 < 2e-16 ***
itemS3DoShout -4.405625 0.311778 -14.131 < 2e-16 ***
itemS4DoCurse -0.715681 0.220734 -3.242 0.001186 **
itemS4DoScold -1.829766 0.222223 -8.234 < 2e-16 ***
itemS4DoShout -3.236601 0.250577 -12.917 < 2e-16 ***
GenderM:itemS1WantScold 0.044559 0.449711 0.099 0.921072
GenderM:itemS1WantShout 0.011538 0.447493 0.026 0.979430
GenderM:itemS2WantCurse -0.238552 0.480779 -0.496 0.619769
GenderM:itemS2WantScold -0.046046 0.451039 -0.102 0.918687
GenderM:itemS2WantShout -0.572978 0.451905 -1.268 0.204827
GenderM:itemS3WantCurse 0.287322 0.450969 0.637 0.524045
GenderM:itemS3WantScold 0.835131 0.451049 1.852 0.064093 .
GenderM:itemS3WantShout -0.152407 0.497954 -0.306 0.759554
GenderM:itemS4wantCurse -0.031800 0.458955 -0.069 0.944760
GenderM:itemS4WantScold 0.481476 0.448615 1.073 0.283159
GenderM:itemS4WantShout -0.263673 0.476046 -0.554 0.579659
GenderM:itemS1DoCurse 0.700367 0.480480 1.458 0.144939
GenderM:itemS1DoScold 1.080352 0.459045 2.353 0.018599 *
GenderM:itemS1DoShout 0.176745 0.460550 0.384 0.701149
GenderM:itemS2DoCurse 1.396279 0.488022 2.861 0.004222 **
GenderM:itemS2DoScold 1.355944 0.454460 2.984 0.002848 **
GenderM:itemS2DoShout 0.545368 0.475983 1.146 0.251889
GenderM:itemS3DoCurse 1.262109 0.451460 2.796 0.005180 **
GenderM:itemS3DoScold 1.147931 0.468259 2.451 0.014227 *
GenderM:itemS3DoShout 0.754579 0.587328 1.285 0.198874
GenderM:itemS4DoCurse 0.886420 0.464307 1.909 0.056246 .
GenderM:itemS4DoScold 0.911434 0.448789 2.031 0.042268 *
GenderM:itemS4DoShout 0.007611 0.525079 0.014 0.988435
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The second model is similar to first one, but it is a more parsimonious analysis because we check the relationship between item types (rather than individual items) and Gender
. This type of analysis is called differential facet functioning (rather than differential item functioning).
dff <- with(VerbAgg, factor(0 + (Gender == "F" & mode == "do" & btype != "shout")))
model.dff <- glmer(r2 ~ -1 + item + dff + Gender + (1|id), family=binomial("logit"),
data=VerbAgg, control=control)
summary(model.dff)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: r2 ~ -1 + item + dff + Gender + (1 | id)
Data: VerbAgg
Control: control
AIC BIC logLik deviance df.resid
8081.2 8268.4 -4013.6 8027.2 7557
Scaled residuals:
Min 1Q Median 3Q Max
-6.4638 -0.6067 -0.1653 0.6019 15.0872
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 1.913 1.383
Number of obs: 7584, groups: id, 316
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
itemS1WantCurse 1.23268 0.16804 7.336 2.21e-13 ***
itemS1WantScold 0.57548 0.15985 3.600 0.000318 ***
itemS1WantShout 0.08991 0.15789 0.569 0.569080
itemS2WantCurse 1.76075 0.18018 9.772 < 2e-16 ***
itemS2WantScold 0.71866 0.16106 4.462 8.12e-06 ***
itemS2WantShout 0.02140 0.15788 0.136 0.892199
itemS3WantCurse 0.54012 0.15959 3.384 0.000714 ***
itemS3WantScold -0.67821 0.16148 -4.200 2.67e-05 ***
itemS3WantShout -1.52080 0.17590 -8.646 < 2e-16 ***
itemS4wantCurse 1.09350 0.16572 6.598 4.15e-11 ***
itemS4WantScold -0.34048 0.15888 -2.143 0.032110 *
itemS4WantShout -1.03669 0.16613 -6.240 4.37e-10 ***
itemS1DoCurse 2.02581 0.21138 9.584 < 2e-16 ***
itemS1DoScold 1.17809 0.20204 5.831 5.51e-09 ***
itemS1DoShout -0.86348 0.16363 -5.277 1.31e-07 ***
itemS2DoCurse 1.67066 0.20644 8.093 5.84e-16 ***
itemS2DoScold 0.72277 0.20025 3.609 0.000307 ***
itemS2DoShout -1.47561 0.17480 -8.442 < 2e-16 ***
itemS3DoCurse 0.56526 0.20013 2.824 0.004736 **
itemS3DoScold -0.75390 0.21074 -3.577 0.000347 ***
itemS3DoShout -2.97240 0.23830 -12.474 < 2e-16 ***
itemS4DoCurse 1.50252 0.20464 7.342 2.10e-13 ***
itemS4DoScold 0.38853 0.20030 1.940 0.052416 .
itemS4DoShout -1.99453 0.19002 -10.496 < 2e-16 ***
dff1 -1.00401 0.14374 -6.985 2.85e-12 ***
GenderM -0.03618 0.20301 -0.178 0.858539
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note: Explanatory IRT modeling is also possible with polytomous items; however, we need to make some adjustments in the code to make the models work. The script for Week 5 (R Script - Week 5.R) will have the details about this process. We will discuss polytomous explanatory IRT models in class.