- Mock exam next week (15 Q, 50 mins)
- Main exam 20th March
- In CHR2602 unless you receive an email saying otherwise
- Linear mixed effect models
# Load packages library(tidyverse) library(modelr) library(lme4) # multilevel models library(lmerTest) # add p-values for mlm models
lme4::sleepstudy data published in Belenky et al. (2003).
glimpse(sleepstudy)
Rows: 180 Columns: 3 $ Reaction <dbl> 249.5600, 258.7047, 250.8006, 321.4398, 356.8519, 414.6901, 3… $ Days <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0… $ Subject <fct> 308, 308, 308, 308, 308, 308, 308, 308, 308, 308, 309, 309, 3…
m_sleep <- lmer(Reaction ~ Days + ( Days | Subject ), data = sleepstudy)
Reaction ~ Days: linear regression returning coefficients that represent the behaviour of the “average individual”(Days|Subject): dependency within the data; variance associated with individual participants
# 10 example subjects coef(m_sleep)$Subject
(Intercept) Days 308 253.6637 19.6662617 309 211.0064 1.8476053 310 212.4447 5.0184295 330 275.0957 5.6529356 331 273.6654 7.3973743 332 260.4447 10.1951090 333 268.2456 10.2436499 334 244.1725 11.5418676 335 251.0714 -0.2848792 337 286.2956 19.0955511
fixef(m_sleep)
(Intercept) Days 251.40510 10.46729
See Chapter 12 in Andrews (2021)
Normal distribution over the intercept and the slopes of the population is a distribution of the participant distributions. These normal distributions are defined by mean \(\phi\) and standard deviation \(\tau\).
The population mean \(\phi\) is a function of the fixed effects where \(\phi_i = \beta_0 + \beta_1 \times \text{Days}_i\)
fixef(m_sleep)
(Intercept) Days 251.40510 10.46729
The standard deviation \(\tau\) is the intercept of random effects term for Subject indicating the variability of the intercept terms in the population.
VarCorr(m_sleep)
Groups Name Std.Dev. Corr
Subject (Intercept) 24.7407
Days 5.9221 0.066
Residual 25.5918
Distribution over reaction times with mean \(\hat\phi\) of 251.405 msecs at days 0 and standard deviation \(\hat\tau\) of 24.741 msecs. Increments show change in reaction time for slope \(\hat\beta_1\) of 10.467 msecs for each day.
Distribution over reaction times with mean \(\hat\phi\) of 251.405 msecs at days 0 and standard deviation \(\hat\tau\) of 24.741 msecs. Increments show change in reaction time for slope \(\hat\beta_1\) of 10.467 msecs for each day.
VarCorr(m_sleep)
Groups Name Std.Dev. Corr
Subject (Intercept) 24.7407
Days 5.9221 0.066
Residual 25.5918
Four parameters
Corr is the correlation between random intercepts and random slopes:Random-intercept only model
m_sleep_ri <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)
1 in (1|Subject)? Compare to flat linear model.These models are identical:
m_lm <- lm(Reaction ~ Days, data = sleepstudy) coef(m_lm)
(Intercept) Days 251.40510 10.46729
m_lm_1 <- lm(Reaction ~ 1 + Days, data = sleepstudy) coef(m_lm_1)
(Intercept) Days 251.40510 10.46729
You don’t have to tell lm to include an intercept term using 1. Why is it 1 and not another number? Because any multiple of 1 remains the same value (it’s constant)!
anova(m_lm, m_lm_1)
Analysis of Variance Table Model 1: Reaction ~ Days Model 2: Reaction ~ 1 + Days Res.Df RSS Df Sum of Sq F Pr(>F) 1 178 405252 2 178 405252 0 0
What if you didn’t want an intercept term? You can remove the intercept by using 0 + (intercept is multiplied by 0). This means that we assume that the reaction time at day 0 is always 0 msecs.
# simple linear regression with no intercept m_lm_no_intercept <- lm(Reaction ~ 0 + Days, data = sleepstudy) coef(m_lm_no_intercept)
Days 50.16283
Same logic applies for random effects models.
m_sleep_1a <- lmer(Reaction ~ Days + (Days|Subject), data = sleepstudy)
is equivalent to
#random slope, random intercept, and correlation model m_sleep_1b <- lmer(Reaction ~ Days + (1 + Days|Subject), data = sleepstudy)
If you don’t want an intercept, you have to explicitly remove it using 0 + (later).
As before, we can test if including participant-specific sleep deprivation effects (1 | Days) improve our model fit over and above modelling average by-participant reaction times (1 | Subject). Using likelihood-ratio test
anova(m_sleep_ri, m_sleep)
Data: sleepstudy
Models:
m_sleep_ri: Reaction ~ Days + (1 | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik -2*log(L) Chisq Df Pr(>Chisq)
m_sleep_ri 4 1802.1 1814.8 -897.04 1794.1
m_sleep 6 1763.9 1783.1 -875.97 1751.9 42.139 2 7.072e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
npar: number of parameters for each model (can you name them?)AIC/BIC: Information criteria derived from deviancelogLik: log likelihood indicating how badely the models predict the data; the higher the value (closer to 0), the better the model predicts an outcome-2*log(L) is the devianceChisq: the difference in deviance (their sampling distribution is approximately a \(\chi^2\) distribution)Pr(>Chisq): p-value evaluating the null that these models predict the data equally wellrefitting model(s) with ML (instead of REML)
lmer uses restricted maximum likelihood estimation to get their estimatesanova does it for us!logLik from a REML model will be different to an ML modellogLik(m_sleep)
'log Lik.' -871.8141 (df=6)
If you need to run a ML model specifically to get the logLik or deviance you will need to run
m_sleep_ml <- lmer(Reaction ~ Days + (Days|Subject), REML = FALSE, data = sleepstudy)
Compare
logLik(m_sleep_ml)
'log Lik.' -875.9697 (df=6)
vs
logLik(m_sleep)
'log Lik.' -871.8141 (df=6)
Or use the anova table which returns ML estimated log likelihoods!
To remove the intercept, we specified that the intercept was 0. We can do the same with random effects.
m_sleep_rs <- lmer(Reaction ~ Days +
(0 + Days | Subject),
data = sleepstudy)
VarCorr(m_sleep_rs)
Groups Name Std.Dev. Subject Days 7.260 Residual 29.018
This model assumes that every participant starts at the same reaction time speed, but are differently affected by sleep deprivation.
Task: Use nested model comparisons using likelihood-ratio tests to test which of our three models is the best (most parsimonious) fit to our data.
Given your model comparison results, which intercept and slope coefficient are the best approximations of the population effect?
Why can’t we just do anova(m_sleep, m_sleep_ri, m_sleep_rs)? Think nesting! What happens if compare the _ri and _rs model?
Comparison of models with and without random slopes:
Data: sleepstudy
Models:
m_sleep_ri: Reaction ~ Days + (1 | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik -2*log(L) Chisq Df Pr(>Chisq)
m_sleep_ri 4 1802.1 1814.8 -897.04 1794.1
m_sleep 6 1763.9 1783.1 -875.97 1751.9 42.139 2 7.072e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Comparison of models with and without random intercepts:
Data: sleepstudy
Models:
m_sleep_rs: Reaction ~ Days + (0 + Days | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik -2*log(L) Chisq Df Pr(>Chisq)
m_sleep_rs 4 1782.1 1794.8 -887.04 1774.1
m_sleep 6 1763.9 1783.1 -875.97 1751.9 22.141 2 1.557e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How do we force a model to assume that the correlation between random slopes and random intercepts is zero:
m_no_corr_1 <- lmer(Reaction ~ Days + (1|Subject) + (0 + Days|Subject), data = sleepstudy)
Short-hand
m_no_corr_2 <- lmer(Reaction ~ Days + (Days||Subject), data = sleepstudy)
With correlation:
VarCorr(m_sleep)
Groups Name Std.Dev. Corr
Subject (Intercept) 24.7407
Days 5.9221 0.066
Residual 25.5918
Without correlation:
VarCorr(m_no_corr_1)
Groups Name Std.Dev. Subject (Intercept) 25.0513 Subject.1 Days 5.9882 Residual 25.5653
Task: Use likelihood ratio test to check if including corrections improves your model fit. What are the best candidate models for such a comparison?
Such a comparison shows us if the correlation in
VarCorr(m_sleep)
Groups Name Std.Dev. Corr
Subject (Intercept) 24.7407
Days 5.9221 0.066
Residual 25.5918
is statistically meaningful?
Comparison for correlation of random intercepts and slopes:
Data: sleepstudy
Models:
m_no_corr_1: Reaction ~ Days + (1 | Subject) + (0 + Days | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik -2*log(L) Chisq Df Pr(>Chisq)
m_no_corr_1 5 1762.0 1778.0 -876.00 1752.0
m_sleep 6 1763.9 1783.1 -875.97 1751.9 0.0639 1 0.8004
# Load data classroom_df <- read_csv( "https://raw.githubusercontent.com/mark-andrews/ntupsychology-data/main/data-sets/classroom.csv") # View variables glimpse(classroom_df)
Rows: 1,190 Columns: 5 $ mathscore <dbl> 480, 569, 567, 532, 478, 515, 503, 509, 510, 473, 562, 500, … $ ses <dbl> 0.46, -0.27, -0.03, -0.38, -0.03, 0.76, -0.03, 0.20, 0.64, 0… $ classid <dbl> 160, 160, 160, 217, 217, 217, 217, 217, 217, 217, 217, 197, … $ schoolid <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, … $ classid2 <dbl> 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 3, …
Implement this hierarchy using collapsibleTree: School -> Class -> mathscores
library(collapsibleTree)
collapsibleTree(
df = classroom_df,
hierarchy = c("--", "--", "---"),
root = "Data"
)
Task: Implement a model to test to what extent does SES predict math scores. In this model assume that the effect of SES on math scores varies by school. Ignore class.
The intercept is represents the math score of someone with an average SES (because SES was standardised).
fixef(m_school)
(Intercept) ses 522.94641 11.31036
How much does the SES effect on mathscores vary by school?
VarCorr(m_school)
Groups Name Std.Dev. Corr
schoolid (Intercept) 16.1803
ses 7.4147 0.474
Residual 33.6065
Does SES matter to the same extent for all schools?
Task: Extend your model and add another assumption: assume that average math scores varies not just by school but also by class.
+ (1 | grouping_1) + (1 | grouping_2)fixef(m_school_class)
(Intercept) ses 522.82488 11.20439
The average relationship of SES and mathscores is now representative not just for the average school (as before) but also the average classroom (within the average school).
VarCorr(m_school_class)
Groups Name Std.Dev. Corr
classid (Intercept) 8.3945
schoolid (Intercept) 15.4393
ses 7.3042 0.470
Residual 32.8954
schoolid: random effects represent between-school variability in both the average outcome (intercepts) and the effect of SES (slopes), meaning that schools differ in their baseline outcomes and in how strongly SES relates to the outcome.classid: random effect represents between-class variability in the average outcome, meaning that classes differ in their baseline outcomes around the overall population (across school) average.classid is \(\tau\) of parent model.Load data published in Martin et al. (2010):
d_martin <- read_csv("https://raw.githubusercontent.com/jensroes/bristol-ws-2025/refs/heads/main/data/martin-etal-2010-exp3a.csv")
glimpse(d_martin)
Rows: 1,036 Columns: 4 $ item <dbl> 11, 11, 11, 11, 11, 11, 25, 25, 25, 25, 25, 37, 37, 37, 37, 5, … $ ppt <dbl> 10, 9, 11, 6, 7, 2, 10, 9, 11, 7, 2, 10, 9, 6, 2, 10, 9, 11, 6,… $ rt <dbl> 1055, 2010, 461, 977, 1152, 1079, 1194, 1267, 1160, 684, 1367, … $ nptype <chr> "conjoined", "conjoined", "conjoined", "conjoined", "conjoined"…
nptype is short for noun phrase type, so a language unit like “The cat and the hat” (conjoined) vs “The cat” (simple).
Random effects that are not nested. Observations \(r\) are nested within two variables (e.g. subjects \(s\) and items \(w\)):
See Chapter 12 in Andrews (2021)
So a response depends on both the person giving the response and the item (stimulus) they responded to.
Model reaction times rt with predictor for nptype and
ppt and items item using the format (1|randomeffect_1) + (1|randomeffect_2)nptype slopes for both random intercept terms (with correlations).nptype slopes for both random intercept terms (without correlations).nptype slopes but no random intercepts.Which models make theoretically sense? One doesn’t.
Use model comparisons to check (1) if adding random slopes improved model fit and (2) if adding correlations affected model fit?
Compare (visually) fixed and random effects. Do you notice anything suspicious (check the correlations)?
lme4: Corr = 1?(nptype || ppt) removes correlation between random intercept and slopeVarCorr() sometimes shows Corr = 1 despite |||| works as intended, but Corr = 1 is a numerical artifactAndrews, M. (2021). Doing data science in R: An Introduction for Social Scientists. SAGE Publications Ltd.
Belenky, G., Wesensten, N. J., Thorne, D. R., Thomas, M. L., Sing, H. C., Redmond, D. P., Russo, M. B., & Balkin, T. J. (2003). Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: A sleep dose-response study. Journal of Sleep Research, 12(1), 1–12.
Martin, R. C., Crowther, J. E., Knight, M., Tamborello II, F. P., & Yang, C.-L. (2010). Planning in sentence production: Evidence for the phrase as a default planning scope. Cognition, 116(2), 177–192.