Welcome to this R demo session! Here, I will demonstrate how to use R to conduct multi-level modeling (MLM).
In this R Markdown file, we will explore a dataset titled
MLM_example_data.xlsx
from Hox (2002).
SCHOOL
); 16-26 students per teacherPOPULAR
: popularity rating; students’
popularity measured by a self-rating scale that ranges from 0 (very
unpopular) to 10 (very popular).SEX
: gender of student (0 = boy; 1 =
girl)TEXP
: teacher’s experience in
years#read in data from Excel file
library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
example<-read_excel("MLM_example_data.xlsx")
head(example)
## # A tibble: 6 × 7
## PUPIL SCHOOL POPULAR SEX TEXP CONST TEACHPOP
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 8 1 24 1 7
## 2 2 1 7 0 24 1 7
## 3 3 1 7 1 24 1 6
## 4 4 1 9 1 24 1 6
## 5 5 1 8 1 24 1 7
## 6 6 1 7 0 24 1 7
summary(example) #some variables should be coded as cateogorical
## PUPIL SCHOOL POPULAR SEX
## Min. : 1.00 Min. : 1.00 Min. :2.000 Min. :0.000
## 1st Qu.: 6.00 1st Qu.: 25.00 1st Qu.:4.000 1st Qu.:0.000
## Median :11.00 Median : 51.00 Median :5.000 Median :0.000
## Mean :10.65 Mean : 50.37 Mean :5.308 Mean :0.487
## 3rd Qu.:16.00 3rd Qu.: 76.00 3rd Qu.:6.000 3rd Qu.:1.000
## Max. :26.00 Max. :100.00 Max. :9.000 Max. :1.000
## TEXP CONST TEACHPOP
## Min. : 2.00 Min. :1 Min. :2.000
## 1st Qu.: 8.00 1st Qu.:1 1st Qu.:4.000
## Median :15.00 Median :1 Median :4.000
## Mean :14.26 Mean :1 Mean :4.484
## 3rd Qu.:20.00 3rd Qu.:1 3rd Qu.:5.000
## Max. :25.00 Max. :1 Max. :7.000
example <- example %>%
mutate(PUPIL = as.factor(PUPIL),
SCHOOL = as.factor(SCHOOL),
SEX = as.factor(SEX))
summary(example) #looks better now
## PUPIL SCHOOL POPULAR SEX TEXP
## 2 : 100 17 : 26 Min. :2.000 0:1026 Min. : 2.00
## 4 : 100 63 : 25 1st Qu.:4.000 1: 974 1st Qu.: 8.00
## 5 : 100 10 : 24 Median :5.000 Median :15.00
## 6 : 100 15 : 24 Mean :5.308 Mean :14.26
## 7 : 100 4 : 23 3rd Qu.:6.000 3rd Qu.:20.00
## 8 : 100 21 : 23 Max. :9.000 Max. :25.00
## (Other):1400 (Other):1855
## CONST TEACHPOP
## Min. :1 Min. :2.000
## 1st Qu.:1 1st Qu.:4.000
## Median :1 Median :4.000
## Mean :1 Mean :4.484
## 3rd Qu.:1 3rd Qu.:5.000
## Max. :1 Max. :7.000
##
In this example, we are interested in knowing whether the popularity ratings collected from students are related to students’ gender and teacher’s experience.
Let’s visualize the relationship between popularity ratings and teacher experience, taking into account potential variations across gender. This time, however, let’s exclude the grouping effect of the teacher.
library(ggplot2)
ggplot(data = example, aes(x = TEXP, y = POPULAR, color = SEX)) +
geom_point(position = position_jitter(), alpha = 0.3) +
stat_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'
From the graph, it seems that the relationships between popularity ratings and teacher experience look quite similar for boys and girls: students taught by more experienced teachers tend to give higher popularity ratings compared to the peers taught by less experienced teachers.
However, it this the full story?
Let’s start by looking at popularity predicted by sex
MLM is a lot like running a regression separately for each group (level 1)
Then looking at the distribution of intercept and regression coefficient estimates across groups (level 2)
library(dplyr)
library(broom)
#regression for school 1 only
summary(lm(POPULAR ~ SEX, data = example[which (example$SCHOOL == 1),]))
##
## Call:
## lm(formula = POPULAR ~ SEX, data = example[which(example$SCHOOL ==
## 1), ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8 -0.3 -0.3 0.2 1.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.3000 0.2321 31.447 <2e-16 ***
## SEX1 0.5000 0.3283 1.523 0.145
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7341 on 18 degrees of freedom
## Multiple R-squared: 0.1142, Adjusted R-squared: 0.06494
## F-statistic: 2.32 on 1 and 18 DF, p-value: 0.1451
#regression for school 2 only
summary(lm(POPULAR ~ SEX, data = example[which (example$SCHOOL == 2),]))
##
## Call:
## lm(formula = POPULAR ~ SEX, data = example[which(example$SCHOOL ==
## 2), ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.9 -0.3 0.1 0.1 0.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.9000 0.1291 30.209 <2e-16 ***
## SEX1 0.4000 0.1826 2.191 0.0419 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4082 on 18 degrees of freedom
## Multiple R-squared: 0.2105, Adjusted R-squared: 0.1667
## F-statistic: 4.8 on 1 and 18 DF, p-value: 0.04186
#regression for all schools at once
#uses dplyr package for grouping and broom to "sweep" up the results into a data frame
by_school <- group_by(example,SCHOOL)
output <- do(by_school,glance(lm(POPULAR ~ SEX, data=.)))
summary(output$r.squared)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.1850 0.3777 0.3555 0.5269 0.8453
output <- do(by_school,tidy(lm(POPULAR ~ SEX, data=.)))
sexcoeff <- output[which(output$term == "SEX"),]
summary(sexcoeff$estimate)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
##
This is a summary of the estimated regression coefficients
Note that the effect of sex on popularity is positive on average
For the average school, a one unit difference in sex is associated with a .8418 unit difference in popularity
boy = 0, girl = 1, so one unit difference in sex is girl vs. boy
Being a girl is associated with .8418 higher score on popularity compared to a boy for the average school
However, please note that the min is -.6000, which means there is a school where the effect goes the other way
The max is 2.28, an effect more than 3 times as big as the mean so there appears to be a lot of variability across schools in the effect of sex
We can also look at school differences in the intercept
The intercept is the predicted popularity when the predictors are all
= 0 which in this case is boy (i.e., SEX
= 0)
So the intercept is the predicted level of popularity for a boy in the school
intcept <- output[which(output$term == "(Intercept)"),]
summary(intcept$estimate)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.167 4.212 4.833 4.887 5.414 7.857
Boy at mean school is predicted to have popularity of 4.887 but varies from 3.167 to 7.857
We can also look at the correlation between the intercept and effect of sex across schools
estimates<-cbind(sexcoeff$estimate,intcept$estimate)
cor(estimates)
## [,1]
## [1,] 1
Correlation is -.31
Schools with higher intercepts tend to have lower effects of sex
Put another way, taking into account the sign of the effect of sex above for most schools:
Schools with higher predicted popularity for boys tend to have lower differences in popularity between sexes
Let’s now put this into a MLM analysis. We need two packages
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
library(lmerTest)
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
Start with an intercept-only model with random intercept across schools
An intercept-only model has no predictor variables
Random intercept allows schools to vary in their intercept
output <- lmer(POPULAR ~ (1|SCHOOL),data=example)
summary(output)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ (1 | SCHOOL)
## Data: example
##
## REML criterion at convergence: 5115.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.88825 -0.63376 -0.05155 0.71091 3.00393
##
## Random effects:
## Groups Name Variance Std.Dev.
## SCHOOL (Intercept) 0.8798 0.9380
## Residual 0.6387 0.7992
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 5.3076 0.0955 98.9510 55.58 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We end up with estimates of random effects and fixed effects
Only fixed effect is the intercept:
The intercept for the average school is 5.3076 with a significance test indicating that this intercept is significantly different from 0
This is not an important null hypothesis to consider- like most intercepts
Two random effects:
SCHOOL
(Intercept) is the variance in the intercept
across schools
Residual is the variance across individuals within schools
Does this look like anything familiar?
It is exactly equivalent to an ANOVA (with a random factor instead of a more common fixed factor)
No direct statistical test for the significance of the school intercept variance
Because of statistical issues, there is no really valid test
The best (easiest and good enough) way is to run a model with no intercept variance
This involves first tricking lmer
with a school variable
that is the same for everyone
example$SCHOOLCONSTANT <- rep(1,length(example$SCHOOL))
output2 <- lmer(POPULAR ~ (1|SCHOOLCONSTANT),data=example,control=lmerControl(check.nlev.gtr.1="ignore")) # telling R to proceed with fitting the mixed-effects model without checking if the random effects factors have more than one level.
#then use the anova function to compare the models
anova(output,output2)
## refitting model(s) with ML (instead of REML)
## Data: example
## Models:
## output: POPULAR ~ (1 | SCHOOL)
## output2: POPULAR ~ (1 | SCHOOLCONSTANT)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## output 3 5118.7 5135.5 -2556.4 5112.7
## output2 3 6495.5 6512.3 -3244.8 6489.5 0 0
The model with the lower AIC and BIC is the better description of the data
Here, the model with random school intercept is better
Now let’s add in sex as a predictor but as a fixed effect with no variance in the effect across schools
output3 <- lmer(POPULAR ~ SEX + (1|SCHOOL),data=example)
summary(output3)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ SEX + (1 | SCHOOL)
## Data: example
##
## REML criterion at convergence: 4492.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.3184 -0.6892 0.0018 0.5961 3.8239
##
## Random effects:
## Groups Name Variance Std.Dev.
## SCHOOL (Intercept) 0.8622 0.9286
## Residual 0.4599 0.6782
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 4.897e+00 9.530e-02 1.041e+02 51.39 <2e-16 ***
## SEX1 8.437e-01 3.096e-02 1.903e+03 27.25 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## SEX1 -0.158
Now the intercept fixed effect (4.897 but in scientific notation) is the predicted level for a boy in the typical or average school
The fixed effect of SEX
says that girls are predicted to
score .8437 points higher in popularity
This is a statistically significant effect
Now allow for variance in the effect of sex across schools
output4 <- lmer(POPULAR ~ SEX + (1 + SEX|SCHOOL),data=example)
summary(output4)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ SEX + (1 + SEX | SCHOOL)
## Data: example
##
## REML criterion at convergence: 4336.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9478 -0.6598 -0.0047 0.5335 3.5286
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SCHOOL (Intercept) 0.9403 0.9697
## SEX1 0.2725 0.5220 -0.28
## Residual 0.3924 0.6265
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 4.89009 0.09902 98.61054 49.39 <2e-16 ***
## SEX1 0.84311 0.05963 97.93491 14.14 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## SEX1 -0.307
Same interpretation of the fixed effect for intercept
Now fixed effect for SEX
(.84311) is the effect of sex
for the typical or average school
Three random effects:
SCHOOL
(Intercept) is variability across schools in
intercept (predicted level for boys)SCHOOL SEX
is variability across schools in effect of
sex
SEX
within the grouping variable
(SCHOOL
). It quantifies how the effect of sex varies
alongside the baseline level (intercept) across different schools.cor(estimates)
although the
two values are not exactly the same.Let’s compare the last two models
The model in
output3 <- lmer(POPULAR ~ SEX + (1|SCHOOL),data=example)
had no variance in the effect of sex
So a comparison of these models is the same as a significance test for the variance
anova(output3,output4)
## refitting model(s) with ML (instead of REML)
## Data: example
## Models:
## output3: POPULAR ~ SEX + (1 | SCHOOL)
## output4: POPULAR ~ SEX + (1 + SEX | SCHOOL)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## output3 4 4492.9 4515.3 -2242.4 4484.9
## output4 6 4341.6 4375.2 -2164.8 4329.6 155.27 2 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Notice the error message here
This analysis is only valid if using the ML
estimator
instead of the default REML
estimator
This is not something you really need to worry about because the program re-estimates the models with the ML estimator
output2
- with school variance in the effect of sex -
has lower AIC and BIC plus we get a significance test using a
chi-square
There are 2 df in this test because it is equivalent to setting the
variance to 0 (1st df) but also setting the covariance/correlation
between the school random intercept and school SEX
effect
to 0 (2nd df)
Within a school, the value for teacher experience is the same for all students
Because it’s the same teacher (1 classroom per school in the data)
output5 <- lmer(POPULAR ~ SEX + TEXP + (1 + SEX|SCHOOL),data=example)
summary(output5)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ SEX + TEXP + (1 + SEX | SCHOOL)
## Data: example
##
## REML criterion at convergence: 4275.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9379 -0.6156 0.0023 0.5304 3.5167
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SCHOOL (Intercept) 0.4116 0.6415
## SEX1 0.2733 0.5228 0.06
## Residual 0.3925 0.6265
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.33995 0.16079 97.44554 20.77 <2e-16 ***
## SEX1 0.84315 0.05969 97.89193 14.13 <2e-16 ***
## TEXP 0.10835 0.01022 97.47243 10.61 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) SEX1
## SEX1 -0.020
## TEXP -0.908 0.000
Fixed effects
Intercept
is predicted level of popularity for boy
(SEX
= 0) in class with teacher experience = 0 so intercept
is no longer really interpretable unless we rescale teacher experience
to have a meaningful 0 point (One way to do this would be to center
teacher experience)SEX
fixed effect is the same interpretation (and almost
identical value) as beforeTEXP
fixed effect means a teacher with 1 additional
year of experience will tend to have students with .10835 higher points
of popularity (this is a statistically significant effect)Random effects
Random effects for school are now harder to interpret: Variance across school in intercept and effect of sex after controlling for teacher experience
output6 <- lmer(POPULAR ~ SEX * TEXP + (1 + SEX|SCHOOL),data=example)
summary(output6)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ SEX * TEXP + (1 + SEX | SCHOOL)
## Data: example
##
## REML criterion at convergence: 4268.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9337 -0.6519 0.0216 0.5307 3.4883
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SCHOOL (Intercept) 0.4120 0.6419
## SEX1 0.2264 0.4758 0.08
## Residual 0.3924 0.6264
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.313521 0.161014 97.351210 20.579 < 2e-16 ***
## SEX1 1.329594 0.133050 97.111558 9.993 < 2e-16 ***
## TEXP 0.110235 0.010232 97.439205 10.774 < 2e-16 ***
## SEX1:TEXP -0.034035 0.008457 97.299743 -4.024 0.000113 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) SEX1 TEXP
## SEX1 -0.046
## TEXP -0.909 0.042
## SEX1:TEXP 0.042 -0.908 -0.046
Fixed effects
Intercept
, SEX
, and
TEXP
are similarRandom effects
In our example, the selection of a grouping effect is guided by theoretical considerations. However, in other instances, the decision to include a random effect might not be as clear-cut. In such cases, understanding the proportion of total variance due to within-group variation, as well as the variance that exists between groups, can be critical for informed decision-making. This measure is known as the intraclass correlation, also referred to as the variance partition coefficient (VPC).
We calculate the ICC on the variance estimates from the
lmer
model. We can extract the variance estimates from the
VarCorr
function. Let’s revisit our intercept-only
model:
output <- lmer(POPULAR ~ (1|SCHOOL),data=example)
summary(output)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: POPULAR ~ (1 | SCHOOL)
## Data: example
##
## REML criterion at convergence: 5115.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.88825 -0.63376 -0.05155 0.71091 3.00393
##
## Random effects:
## Groups Name Variance Std.Dev.
## SCHOOL (Intercept) 0.8798 0.9380
## Residual 0.6387 0.7992
## Number of obs: 2000, groups: SCHOOL, 100
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 5.3076 0.0955 98.9510 55.58 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
VarCorr(output)
## Groups Name Std.Dev.
## SCHOOL (Intercept) 0.93798
## Residual 0.79917
And we can test the variance parameter using the rand()
function:
rand(output)
## ANOVA-like table for random-effects: Single term deletions
##
## Model:
## POPULAR ~ (1 | SCHOOL)
## npar logLik AIC LRT Df Pr(>Chisq)
## <none> 3 -2557.8 5121.6
## (1 | SCHOOL) 2 -3247.4 6498.9 1379.3 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Helpfully, if we convert the result of VarCorr
to a
dataframe, we are provided with the columns vcov
which
stands for variance or covariance, as well as the sdcor
(standard deviation or correlation) which is provided in the printed
summary:
VarCorr(output) %>%
as_tibble()
## # A tibble: 2 × 5
## grp var1 var2 vcov sdcor
## <chr> <chr> <chr> <dbl> <dbl>
## 1 SCHOOL (Intercept) <NA> 0.880 0.938
## 2 Residual <NA> <NA> 0.639 0.799
The ICC is simply the variance at a given level of the model, divided by the total variance (the sum of the variance parameters). So we can write:
VarCorr(output) %>%
as_tibble() %>%
mutate(icc=vcov/sum(vcov)) %>%
select(grp, icc)
## # A tibble: 2 × 2
## grp icc
## <chr> <dbl>
## 1 SCHOOL 0.579
## 2 Residual 0.421
The value of icc (the amount of variance due to cluster) is 0.58,
which suggests the grouping effect of SCHOOL
should be
considered when building a MLM.
It’s common, when variances and covariances are close to zero, that
lmer
has trouble fitting your model. The solution is to
simplify complex models, removing of constraining some random
effects.
You can list the random effects from the model using the
VarCorr
function. If these covariances are very close to
zero though, as is often the case, this can cause convergence issues,
especially if insufficient data are available. If this occurs, you might
want to simplify the model.