Multi-level linear models 2

Overview

Mock exam next week (15 Q, 50 mins)
Main exam 20th March
In CHR2602 unless you receive an email saying otherwise
Linear mixed effect models

Packages

# Load packages
library(tidyverse)
library(modelr)
library(lme4) # multilevel models
library(lmerTest) # add p-values for mlm models

Reaction time and sleep deprivation

lme4::sleepstudy data published in Belenky et al. (2003).

glimpse(sleepstudy)

Rows: 180
Columns: 3
$ Reaction <dbl> 249.5600, 258.7047, 250.8006, 321.4398, 356.8519, 414.6901, 3…
$ Days     <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0…
$ Subject  <fct> 308, 308, 308, 308, 308, 308, 308, 308, 308, 308, 309, 309, 3…

18 ppts sleep deprived for 9 days
Reaction times recorded on days 0 - 9 (10 measurements per ppt)
To what extent does sleep deprivation affects affect reaction times?

Data structure

Full random-effects model

m_sleep <- lmer(Reaction ~ Days + ( Days | Subject ), data = sleepstudy)

Reaction ~ Days: linear regression returning coefficients that represent the behaviour of the “average individual”
(Days|Subject): dependency within the data; variance associated with individual participants
- Each participant has their own average reaction time.
- Each participant has their own sleep deprivation affect (change in reaction over days).
- Random intercepts and slopes are correlated
Both are modeled deviations from the average behaviour (represented by intercept, slope).

Full random-effects model

# 10 example subjects
coef(m_sleep)$Subject

    (Intercept)       Days
308    253.6637 19.6662617
309    211.0064  1.8476053
310    212.4447  5.0184295
330    275.0957  5.6529356
331    273.6654  7.3973743
332    260.4447 10.1951090
333    268.2456 10.2436499
334    244.1725 11.5418676
335    251.0714 -0.2848792
337    286.2956 19.0955511

Full random-effects model

fixef(m_sleep)

(Intercept)        Days 
  251.40510    10.46729

Model of models

See Chapter 12 in Andrews (2021)

Model of models

Normal distribution over the intercept and the slopes of the population is a distribution of the participant distributions. These normal distributions are defined by mean \(\phi\) and standard deviation \(\tau\).

The population mean \(\phi\) is a function of the fixed effects where \(\phi_i = \beta_0 + \beta_1 \times \text{Days}_i\)

fixef(m_sleep)

(Intercept)        Days 
  251.40510    10.46729

The standard deviation \(\tau\) is the intercept of random effects term for Subject indicating the variability of the intercept terms in the population.

VarCorr(m_sleep)

 Groups   Name        Std.Dev. Corr 
 Subject  (Intercept) 24.7407       
          Days         5.9221  0.066
 Residual             25.5918

Model of models

Distribution over reaction times with mean \(\hat\phi\) of 251.405 msecs at days 0 and standard deviation \(\hat\tau\) of 24.741 msecs. Increments show change in reaction time for slope \(\hat\beta_1\) of 10.467 msecs for each day.

Model of models

The mean and standard deviation form a probability distribution: some observations are more likely than others.
We assume, plausibly, that our 18 subjects have been sampled randomly from this distribution.
All other (unobserved) people of this population would also come from this distribution.
We can work out the range of possible reaction time values (intercept and slope) which lie within 95% of the area under the curve.

What’s left for random effects?

VarCorr(m_sleep)

 Groups   Name        Std.Dev. Corr 
 Subject  (Intercept) 24.7407       
          Days         5.9221  0.066
 Residual             25.5918

Four parameters

Standard deviation of random intercepts
Standard deviation of random slopes
Standard deviation of residuals, i.e. individual subject distributions
Corr is the correlation between random intercepts and random slopes:

Is the effect of sleep deprivation correlated with the average speed of a participant?
Slower participants have a stronger sleep deprivation slowdown (\(>0\)).
Slower participants have a weaker sleep deprivation slowdown (\(<0\))

Constant and random intercepts and slopes

Random-intercept only model

m_sleep_ri <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)

Intercept varies by Subject but all have the same slope
No random slopes, and, therefore, no correlation
What’s the 1 in (1|Subject)? Compare to flat linear model.

Simple linear regression

These models are identical:

m_lm <- lm(Reaction ~ Days, data = sleepstudy)
coef(m_lm)

(Intercept)        Days 
  251.40510    10.46729

m_lm_1 <- lm(Reaction ~ 1 + Days, data = sleepstudy)
coef(m_lm_1)

(Intercept)        Days 
  251.40510    10.46729

You don’t have to tell lm to include an intercept term using 1. Why is it 1 and not another number? Because any multiple of 1 remains the same value (it’s constant)!

Simple linear regression

anova(m_lm, m_lm_1)

Analysis of Variance Table

Model 1: Reaction ~ Days
Model 2: Reaction ~ 1 + Days
  Res.Df    RSS Df Sum of Sq F Pr(>F)
1    178 405252                      
2    178 405252  0         0

Simple linear regression

What if you didn’t want an intercept term? You can remove the intercept by using 0 + (intercept is multiplied by 0). This means that we assume that the reaction time at day 0 is always 0 msecs.

# simple linear regression with no intercept
m_lm_no_intercept <- lm(Reaction ~ 0 + Days, data = sleepstudy)
coef(m_lm_no_intercept)

    Days 
50.16283

Constant and random intercepts and slopes

Same logic applies for random effects models.

m_sleep_1a <- lmer(Reaction ~ Days + (Days|Subject), data = sleepstudy)

is equivalent to

#random slope, random intercept, and correlation model
m_sleep_1b <- lmer(Reaction ~ Days + (1 + Days|Subject), data = sleepstudy)

If you don’t want an intercept, you have to explicitly remove it using 0 + (later).

Nested model comparison

As before, we can test if including participant-specific sleep deprivation effects (1 | Days) improve our model fit over and above modelling average by-participant reaction times (1 | Subject). Using likelihood-ratio test

anova(m_sleep_ri, m_sleep)

Data: sleepstudy
Models:
m_sleep_ri: Reaction ~ Days + (1 | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
           npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)    
m_sleep_ri    4 1802.1 1814.8 -897.04    1794.1                         
m_sleep       6 1763.9 1783.1 -875.97    1751.9 42.139  2  7.072e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Nested model comparison

npar: number of parameters for each model (can you name them?)
AIC/BIC: Information criteria derived from deviance
logLik: log likelihood indicating how badely the models predict the data; the higher the value (closer to 0), the better the model predicts an outcome
-2*log(L) is the deviance
Chisq: the difference in deviance (their sampling distribution is approximately a \(\chi^2\) distribution)
Pr(>Chisq): p-value evaluating the null that these models predict the data equally well

Nested model comparison

refitting model(s) with ML (instead of REML)

By default lmer uses restricted maximum likelihood estimation to get their estimates
To compare models, we need maximum likelihood estimation
anova does it for us!
Pulling logLik from a REML model will be different to an ML model

logLik(m_sleep)

'log Lik.' -871.8141 (df=6)

Maximum likelihood

If you need to run a ML model specifically to get the logLik or deviance you will need to run

m_sleep_ml <- lmer(Reaction ~ Days + (Days|Subject), REML = FALSE, data = sleepstudy)

Compare

logLik(m_sleep_ml)

'log Lik.' -875.9697 (df=6)

logLik(m_sleep)

'log Lik.' -871.8141 (df=6)

Or use the anova table which returns ML estimated log likelihoods!

Random slopes only model

To remove the intercept, we specified that the intercept was 0. We can do the same with random effects.

m_sleep_rs <- lmer(Reaction ~ Days + 
                     (0 + Days | Subject), 
                   data = sleepstudy)

VarCorr(m_sleep_rs)

 Groups   Name Std.Dev.
 Subject  Days  7.260  
 Residual      29.018

This model assumes that every participant starts at the same reaction time speed, but are differently affected by sleep deprivation.

Random slopes only model

Task: Use nested model comparisons using likelihood-ratio tests to test which of our three models is the best (most parsimonious) fit to our data.

Random slopes only
Random intercepts only
Random slopes + random intercepts + correlation

Given your model comparison results, which intercept and slope coefficient are the best approximations of the population effect?

Why can’t we just do anova(m_sleep, m_sleep_ri, m_sleep_rs)? Think nesting! What happens if compare the _ri and _rs model?

Which model is better?

Comparison of models with and without random slopes:

Data: sleepstudy
Models:
m_sleep_ri: Reaction ~ Days + (1 | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
           npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)    
m_sleep_ri    4 1802.1 1814.8 -897.04    1794.1                         
m_sleep       6 1763.9 1783.1 -875.97    1751.9 42.139  2  7.072e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparison of models with and without random intercepts:

Data: sleepstudy
Models:
m_sleep_rs: Reaction ~ Days + (0 + Days | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
           npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)    
m_sleep_rs    4 1782.1 1794.8 -887.04    1774.1                         
m_sleep       6 1763.9 1783.1 -875.97    1751.9 22.141  2  1.557e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Random slopes, intercepts, but no correlation?

How do we force a model to assume that the correlation between random slopes and random intercepts is zero:

m_no_corr_1 <- lmer(Reaction ~ Days + (1|Subject) + (0 + Days|Subject), data = sleepstudy)

Short-hand

m_no_corr_2 <- lmer(Reaction ~ Days + (Days||Subject), data = sleepstudy)

Random slopes, intercepts, but no correlation?

With correlation:

VarCorr(m_sleep)

 Groups   Name        Std.Dev. Corr 
 Subject  (Intercept) 24.7407       
          Days         5.9221  0.066
 Residual             25.5918

Without correlation:

VarCorr(m_no_corr_1)

 Groups    Name        Std.Dev.
 Subject   (Intercept) 25.0513 
 Subject.1 Days         5.9882 
 Residual              25.5653

Which model is better?

Task: Use likelihood ratio test to check if including corrections improves your model fit. What are the best candidate models for such a comparison?

Such a comparison shows us if the correlation in

VarCorr(m_sleep)

 Groups   Name        Std.Dev. Corr 
 Subject  (Intercept) 24.7407       
          Days         5.9221  0.066
 Residual             25.5918

is statistically meaningful?

Which model is better?

Comparison for correlation of random intercepts and slopes:

Data: sleepstudy
Models:
m_no_corr_1: Reaction ~ Days + (1 | Subject) + (0 + Days | Subject)
m_sleep: Reaction ~ Days + (Days | Subject)
            npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)
m_no_corr_1    5 1762.0 1778.0 -876.00    1752.0                     
m_sleep        6 1763.9 1783.1 -875.97    1751.9 0.0639  1     0.8004

More than one grouping variables

# Load data
classroom_df <- read_csv(
  "https://raw.githubusercontent.com/mark-andrews/ntupsychology-data/main/data-sets/classroom.csv")
# View variables
glimpse(classroom_df)

Rows: 1,190
Columns: 5
$ mathscore <dbl> 480, 569, 567, 532, 478, 515, 503, 509, 510, 473, 562, 500, …
$ ses       <dbl> 0.46, -0.27, -0.03, -0.38, -0.03, 0.76, -0.03, 0.20, 0.64, 0…
$ classid   <dbl> 160, 160, 160, 217, 217, 217, 217, 217, 217, 217, 217, 197, …
$ schoolid  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ classid2  <dbl> 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 3, …

Math score, SES, class ID, school ID
SES is standardised (average = 0)
Each row is a different pupil nested within a school and a classroom.
Each classroom is nested within a school.

Nesting hierarchy

Implement this hierarchy using collapsibleTree: School -> Class -> mathscores

library(collapsibleTree)

collapsibleTree(
  df = classroom_df,
  hierarchy = c("--", "--", "---"),
  root = "Data"  
)

By-school nesting

Task: Implement a model to test to what extent does SES predict math scores. In this model assume that the effect of SES on math scores varies by school. Ignore class.

By-school nesting

The intercept is represents the math score of someone with an average SES (because SES was standardised).

fixef(m_school)

(Intercept)         ses 
  522.94641    11.31036

How much does the SES effect on mathscores vary by school?

VarCorr(m_school)

 Groups   Name        Std.Dev. Corr 
 schoolid (Intercept) 16.1803       
          ses          7.4147  0.474
 Residual             33.6065

Does SES matter to the same extent for all schools?

By-school and by class nesting

Task: Extend your model and add another assumption: assume that average math scores varies not just by school but also by class.

For simplicity, don’t assume that the SES effect on mathscores varies by class (not enough data for this), only that the average math score varies across classrooms.
You can add another random effect by adding it to your formula using + (1 | grouping_1) + (1 | grouping_2)

By-school and by class nesting

fixef(m_school_class)

(Intercept)         ses 
  522.82488    11.20439

The average relationship of SES and mathscores is now representative not just for the average school (as before) but also the average classroom (within the average school).

By-school and by class nesting

VarCorr(m_school_class)

 Groups   Name        Std.Dev. Corr 
 classid  (Intercept)  8.3945       
 schoolid (Intercept) 15.4393       
          ses          7.3042  0.470
 Residual             32.8954

schoolid: random effects represent between-school variability in both the average outcome (intercepts) and the effect of SES (slopes), meaning that schools differ in their baseline outcomes and in how strongly SES relates to the outcome.
classid: random effect represents between-class variability in the average outcome, meaning that classes differ in their baseline outcomes around the overall population (across school) average.
The standard deviation of classid is \(\tau\) of parent model.

Crossed random effects

Load data published in Martin et al. (2010):

d_martin <- read_csv("https://raw.githubusercontent.com/jensroes/bristol-ws-2025/refs/heads/main/data/martin-etal-2010-exp3a.csv")

glimpse(d_martin)

Rows: 1,036
Columns: 4
$ item   <dbl> 11, 11, 11, 11, 11, 11, 25, 25, 25, 25, 25, 37, 37, 37, 37, 5, …
$ ppt    <dbl> 10, 9, 11, 6, 7, 2, 10, 9, 11, 7, 2, 10, 9, 6, 2, 10, 9, 11, 6,…
$ rt     <dbl> 1055, 2010, 461, 977, 1152, 1079, 1194, 1267, 1160, 684, 1367, …
$ nptype <chr> "conjoined", "conjoined", "conjoined", "conjoined", "conjoined"…

nptype is short for noun phrase type, so a language unit like “The cat and the hat” (conjoined) vs “The cat” (simple).

Crossed random effects

Random effects that are not nested. Observations \(r\) are nested within two variables (e.g. subjects \(s\) and items \(w\)):

See Chapter 12 in Andrews (2021)

So a response depends on both the person giving the response and the item (stimulus) they responded to.

Exercise

Model reaction times rt with predictor for nptype and

random intercepts for participants ppt and items item using the format (1|randomeffect_1) + (1|randomeffect_2)
random nptype slopes for both random intercept terms (with correlations).
random nptype slopes for both random intercept terms (without correlations).
random nptype slopes but no random intercepts.

Which models make theoretically sense? One doesn’t.

Use model comparisons to check (1) if adding random slopes improved model fit and (2) if adding correlations affected model fit?

Compare (visually) fixed and random effects. Do you notice anything suspicious (check the correlations)?

`lme4: Corr = 1`?

(nptype || ppt) removes correlation between random intercept and slope
VarCorr() sometimes shows Corr = 1 despite ||
This usually indicates a singular fit: One variance component is estimated near zero
Implications:
- || works as intended, but Corr = 1 is a numerical artifact
- Consider removing zero-variance components or simplifying random effects
Takeaway:
- Corr = 1 in output ≠ correlation ignored
- It signals insufficient data to estimate separate variance components

References

Andrews, M. (2021). Doing data science in R: An Introduction for Social Scientists. SAGE Publications Ltd.

Belenky, G., Wesensten, N. J., Thorne, D. R., Thomas, M. L., Sing, H. C., Redmond, D. P., Russo, M. B., & Balkin, T. J. (2003). Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: A sleep dose-response study. Journal of Sleep Research, 12(1), 1–12.

Martin, R. C., Crowther, J. E., Knight, M., Tamborello II, F. P., & Yang, C.-L. (2010). Planning in sentence production: Evidence for the phrase as a default planning scope. Cognition, 116(2), 177–192.

Overview

Packages

Reaction time and sleep deprivation

Data structure

Full random-effects model

Full random-effects model

Full random-effects model

Model of models

Model of models

Model of models

Model of models

Model of models

What’s left for random effects?

Constant and random intercepts and slopes

Constant and random intercepts and slopes

Simple linear regression

Simple linear regression

Simple linear regression

Constant and random intercepts and slopes

Nested model comparison

Nested model comparison

Nested model comparison

Maximum likelihood

Random slopes only model

Random slopes only model

Which model is better?

Random slopes, intercepts, but no correlation?

Random slopes, intercepts, but no correlation?

Which model is better?

Which model is better?

More than one grouping variables

Nesting hierarchy

By-school nesting

By-school nesting

By-school and by class nesting

By-school and by class nesting

By-school and by class nesting

Crossed random effects

Crossed random effects

Exercise

lme4: Corr = 1?

References

`lme4: Corr = 1`?