library(tidyverse)
d <- read_csv("/Users/joshuarosenberg/Google Drive/SCHMIDTLAB/PSE/data/STEM-IE/STEM-IE-esm.csv")
The basic rule of processing the data for use in multi-level (or mixed effects) models is that R
will figure out the nesting and cross automatically, as long as the levels of the random effect are uniquely identified. Thus, if students are named 1 to ~25 in each of 10 programs, then they need to be nested explicitly (R
will not figure out the nesting and crossing automatically in this case). I find it similar to simply use unique levels of each random effect and so that is the focus here.
We have three random effects:
Our solution is to use paste()
to create a distinct identifier - how readable it is is not as important as it being coded correctly and uniquely.
d$beep_ID_new <- paste0(d$program_ID, d$sociedad_class, d$signal_number, d$response_date, sep = "-")
Here is how we specify a random intercept model. Note in the output the random parts.
library(lme4)
library(sjPlot)
library(sjstats)
m1 <- lmer(interest ~ 1 +
(1|program_ID) +
(1|participant_ID) +
(1|beep_ID_new),
data = d)
summary(m1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: interest ~ 1 + (1 | program_ID) + (1 | participant_ID) + (1 |
## beep_ID_new)
## Data: d
##
## REML criterion at convergence: 7870.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2430 -0.5421 0.1237 0.6004 3.1235
##
## Random effects:
## Groups Name Variance Std.Dev.
## beep_ID_new (Intercept) 0.04139 0.2034
## participant_ID (Intercept) 0.36295 0.6025
## program_ID (Intercept) 0.02182 0.1477
## Residual 0.68733 0.8291
## Number of obs: 2970, groups:
## beep_ID_new, 248; participant_ID, 203; program_ID, 9
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.86630 0.06913 41.46
icc(m1)
## Linear mixed model
## Family: gaussian (identity)
## Formula: interest ~ 1 + (1 | program_ID) + (1 | participant_ID) + (1 | beep_ID_new)
##
## ICC (beep_ID_new): 0.037172
## ICC (participant_ID): 0.325959
## ICC (program_ID): 0.019599
Here is how we specify a random intercept and random slopes model, with challenge as a predictor. Note in the output the random parts. Also note the output for the ICC
.
m2 <- lmer(interest ~ challenge +
(challenge|program_ID) +
(challenge|participant_ID) +
(challenge|beep_ID_new),
data = d)
summary(m2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: interest ~ challenge + (challenge | program_ID) + (challenge |
## participant_ID) + (challenge | beep_ID_new)
## Data: d
##
## REML criterion at convergence: 7659.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.3843 -0.5346 0.0857 0.5825 2.9600
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## beep_ID_new (Intercept) 0.2327947 0.48249
## challenge 0.0200481 0.14159 -0.98
## participant_ID (Intercept) 0.8357892 0.91422
## challenge 0.0692396 0.26313 -0.85
## program_ID (Intercept) 0.0007757 0.02785
## challenge 0.0023119 0.04808 1.00
## Residual 0.6071912 0.77922
## Number of obs: 2970, groups:
## beep_ID_new, 248; participant_ID, 203; program_ID, 9
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.44521 0.08731 28.005
## challenge 0.17907 0.03259 5.495
##
## Correlation of Fixed Effects:
## (Intr)
## challenge -0.708
icc(m2)
## Linear mixed model
## Family: gaussian (identity)
## Formula: interest ~ challenge + (challenge | program_ID) + (challenge | participant_ID) + (challenge | beep_ID_new)
##
## ICC (beep_ID_new): 0.138853
## ICC (participant_ID): 0.498517
## ICC (program_ID): 0.000463