1. Loading data

library(tidyverse)
d <- read_csv("/Users/joshuarosenberg/Google Drive/SCHMIDTLAB/PSE/data/STEM-IE/STEM-IE-esm.csv")

2. Processing data

The basic rule of processing the data for use in multi-level (or mixed effects) models is that R will figure out the nesting and cross automatically, as long as the levels of the random effect are uniquely identified. Thus, if students are named 1 to ~25 in each of 10 programs, then they need to be nested explicitly (R will not figure out the nesting and crossing automatically in this case). I find it similar to simply use unique levels of each random effect and so that is the focus here.

We have three random effects:

Our solution is to use paste() to create a distinct identifier - how readable it is is not as important as it being coded correctly and uniquely.

d$beep_ID_new <- paste0(d$program_ID, d$sociedad_class, d$signal_number, d$response_date, sep = "-")

3. Modeling data using lme4

Here is how we specify a random intercept model. Note in the output the random parts.

library(lme4)
library(sjPlot)
library(sjstats)

m1 <- lmer(interest ~ 1 +
               (1|program_ID) + 
               (1|participant_ID) + 
               (1|beep_ID_new),
           data = d)

summary(m1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: interest ~ 1 + (1 | program_ID) + (1 | participant_ID) + (1 |  
##     beep_ID_new)
##    Data: d
## 
## REML criterion at convergence: 7870.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2430 -0.5421  0.1237  0.6004  3.1235 
## 
## Random effects:
##  Groups         Name        Variance Std.Dev.
##  beep_ID_new    (Intercept) 0.04139  0.2034  
##  participant_ID (Intercept) 0.36295  0.6025  
##  program_ID     (Intercept) 0.02182  0.1477  
##  Residual                   0.68733  0.8291  
## Number of obs: 2970, groups:  
## beep_ID_new, 248; participant_ID, 203; program_ID, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.86630    0.06913   41.46
icc(m1)
## Linear mixed model
##  Family: gaussian (identity)
## Formula: interest ~ 1 + (1 | program_ID) + (1 | participant_ID) + (1 | beep_ID_new)
## 
##      ICC (beep_ID_new): 0.037172
##   ICC (participant_ID): 0.325959
##       ICC (program_ID): 0.019599

Here is how we specify a random intercept and random slopes model, with challenge as a predictor. Note in the output the random parts. Also note the output for the ICC.

m2 <- lmer(interest ~ challenge +
               (challenge|program_ID) + 
               (challenge|participant_ID) + 
               (challenge|beep_ID_new),
           data = d)

summary(m2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: interest ~ challenge + (challenge | program_ID) + (challenge |  
##     participant_ID) + (challenge | beep_ID_new)
##    Data: d
## 
## REML criterion at convergence: 7659.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3843 -0.5346  0.0857  0.5825  2.9600 
## 
## Random effects:
##  Groups         Name        Variance  Std.Dev. Corr 
##  beep_ID_new    (Intercept) 0.2327947 0.48249       
##                 challenge   0.0200481 0.14159  -0.98
##  participant_ID (Intercept) 0.8357892 0.91422       
##                 challenge   0.0692396 0.26313  -0.85
##  program_ID     (Intercept) 0.0007757 0.02785       
##                 challenge   0.0023119 0.04808  1.00 
##  Residual                   0.6071912 0.77922       
## Number of obs: 2970, groups:  
## beep_ID_new, 248; participant_ID, 203; program_ID, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.44521    0.08731  28.005
## challenge    0.17907    0.03259   5.495
## 
## Correlation of Fixed Effects:
##           (Intr)
## challenge -0.708
icc(m2)
## Linear mixed model
##  Family: gaussian (identity)
## Formula: interest ~ challenge + (challenge | program_ID) + (challenge | participant_ID) + (challenge | beep_ID_new)
## 
##      ICC (beep_ID_new): 0.138853
##   ICC (participant_ID): 0.498517
##       ICC (program_ID): 0.000463