Linear Mixed Effects Tutorial

2/9/2021

“Of course, the most rewarding part is the ‘Aha’ moment, the excitement of discovery and enjoyment of understanding something new – the feeling of being on top of a hill and having a clear view. But most of the time, doing mathematics for me is like being on a long hike with no trail and no end in sight.” -Maryam Mirzakhani

“I am in a charming state of confusion.” -Ada Lovelace

Material Covered in this Tutorial

Section 1: Introduction to Linear Mixed Effects Models
Section 2: The Mathematical Model Explained
Section 3: Practical Examples
- 3.1: Hierarchical Example
- 3.2: Repeated Measures Example
Section 4: Closing Remarks and Resources for Further Education

Motivating Images

[Figure: randeff1]

[Figure: randeff]

ref: Harrison et al 2018

1.0. Introduction to Linear Mixed Effects Models

Linear regression models with added components that account for variation in the intercept and/or slope parameters across units
Units may be individuals (repeated measures context) or groups (multilevel/hierarchical context)
An extension of the general linear model (GLM)
Whereas the basic GLM contains only fixed effects (i.e., an intercept and a mean slope), the LME approach includes both fixed effects and random effects, hence the name “mixed effects modeling”

What is a Fixed Effect?

The simplest linear regression model includes an intercept parameter and a slope parameter
Both of these parameters are what we call “fixed,” meaning that they do not vary across units; e.g., every individual in the population has the same expected value for the intercept and slope
Of course, the expected value won’t perfectly predict the true value for each individual or unit. The difference between the expected value and the true value for each unit/person is called the “residual error”

What is a Random Effect?

A parameter that is allowed to vary across units or individuals
Rather than taking a single fixed value, random effects are assumed to follow a distribution
Can be added to a model to account for variation about an intercept or slope parameter
Result: Each randomly varying parameter has a mean (the fixed effect) and a variance
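
To make the idea of "a parameter that follows a distribution" concrete, here is a minimal Python sketch. It assumes a Normal distribution for the random effects, and the mean intercept (50) and between-unit standard deviation (5) are made-up values for illustration only.

```python
import random
import statistics

random.seed(1)

GAMMA_00 = 50.0   # hypothetical fixed effect: mean intercept across units
TAU_0 = 5.0       # hypothetical SD of the random intercepts
N_UNITS = 2000    # hypothetical number of units (individuals or groups)

# Each unit's random effect u_0j is a draw from Normal(0, TAU_0^2)
u0 = [random.gauss(0.0, TAU_0) for _ in range(N_UNITS)]

# Each unit's own intercept is the fixed effect plus its random effect
beta0 = [GAMMA_00 + u for u in u0]

mean_intercept = statistics.fmean(beta0)
var_intercept = statistics.variance(beta0)

print(f"mean of unit intercepts:     {mean_intercept:.2f}")
print(f"variance of unit intercepts: {var_intercept:.2f}")
```

The printed mean should land near the fixed effect and the printed variance near TAU_0 squared, which is the sense in which the intercept "has a mean and a variance."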

What is a Random Effect?

Example: a random intercept may be used in a repeated measures context to account for variation across individuals in their baseline levels of y. Each individual gets their own estimated random effect (from the distribution of random effects), representing their unique adjustment from the mean (fixed effect)
Each individual’s line has an intercept:

\[ \beta_{0j} = \gamma_{00} + u_{0j}\]

Poll 1

What is a Random Effect?

[Figure: randeff]

ref: Harrison et al 2018

How do I know whether I should use a mixed-effects modeling approach?

You should consider this approach if your data violate the GLM assumption of independence. For example, if repeated measures are nested within individuals (or if individuals are nested within groups), we expect observations belonging to the same individual and/or the same group to be correlated
Ignoring the correlated error structure or “clustering” in the data may lead to biased standard errors (typically too small), and therefore to overconfident inferences
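
The effect of clustering on standard errors can be checked by simulation. The sketch below is an illustration under assumed settings (20 clusters of 10 observations, equal between-cluster and residual variance), not an analysis of real data. It compares the naive standard error of the overall mean, which treats every observation as independent, with the actual variability of that mean across many simulated data sets.

```python
import math
import random
import statistics

random.seed(2)

N_GROUPS = 20      # hypothetical number of clusters
GROUP_SIZE = 10    # hypothetical observations per cluster
TAU = 1.0          # SD of the cluster-level random intercepts
SIGMA = 1.0        # SD of the residual error
N_REPS = 2000      # number of simulated data sets

naive_ses = []
grand_means = []
for _rep in range(N_REPS):
    y = []
    for _group in range(N_GROUPS):
        u = random.gauss(0.0, TAU)   # shared by everyone in this cluster
        y.extend(u + random.gauss(0.0, SIGMA) for _obs in range(GROUP_SIZE))
    grand_means.append(statistics.fmean(y))
    # Naive SE: pretends all 200 observations are independent
    naive_ses.append(statistics.stdev(y) / math.sqrt(len(y)))

avg_naive_se = statistics.fmean(naive_ses)
true_se = statistics.stdev(grand_means)   # actual sampling variability of the mean

print(f"average naive SE:        {avg_naive_se:.3f}")
print(f"empirical SE of the mean: {true_se:.3f}")
```

With these settings the intraclass correlation is 0.5, so the usual design-effect formula, 1 + (m - 1) * ICC with cluster size m = 10, gives 5.5. The naive standard error should therefore come out too small by a factor of roughly the square root of 5.5, or about 2.3.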

How do I know whether I should use a mixed-effects modeling approach?

You may also approach this decision from a more theoretical perspective
In a repeated measures context, you may be interested in individual differences in change across time, which would lead you to a random slope model
Or, perhaps you are a researcher doing group-level research and you have reason to believe that the effect of \(x\) on \(y\) varies across groups. This would also lead you to a random slope model
Poll 2

Convince me that it matters!

Example 1: National median wages over time
Example 2: Berkeley gender bias
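
As a rough illustration of why grouping can matter, the following Python sketch uses purely synthetic data. It does not use the wage or Berkeley data, and all group sizes and effect values are invented. In it, the pooled regression slope has the opposite sign from the slope within every group.

```python
import random

random.seed(3)

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical groups: within each group y falls as x rises (slope -1),
# but groups with larger x values also have much higher baselines.
groups = []
for j in range(5):
    x = [10 * j + random.uniform(0, 5) for _ in range(40)]
    y = [20 * j - 1.0 * (xi - 10 * j) + random.gauss(0, 1) for xi in x]
    groups.append((x, y))

within_slopes = [ols_slope(x, y) for x, y in groups]

# Pool all groups together and ignore group membership
pooled_x = [xi for x, _ in groups for xi in x]
pooled_y = [yi for _, y in groups for yi in y]
pooled_slope = ols_slope(pooled_x, pooled_y)

print("within-group slopes:", [round(s, 2) for s in within_slopes])
print("pooled slope:", round(pooled_slope, 2))
```

The within-group slopes should all be close to -1 while the pooled slope is clearly positive, because the pooled fit is dominated by differences between groups.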

2.0. The Mathematical Model Defined

The Mathematical Model Defined

As mentioned in Section 1, the LME model is an extension of a basic linear model. Recall the model for a simple linear regression:

\[ Y_{i} = \beta_{0} + \beta_{1}x_{i} + \epsilon_{i} \]

The outcome \(Y_{i}\) has a subscript “\(i\)”, indicating that there is one outcome value for each individual. \(Y_{i}\) is predicted by the individual’s value on the predictor variable \(x\), written \(x_{i}\); that is, \(x\) is a linear predictor of \(Y\)
The model includes an intercept (\(\beta_{0}\)) and a slope (\(\beta_{1}\)) parameter, as well as a residual error term (\(\epsilon_{i}\)), typically assumed to follow a Normal distribution with mean \(0\) and variance \(\sigma^2\)
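
A short Python sketch of this simple regression model may help. It simulates data from the equation above using hypothetical parameter values (intercept 2, slope 0.5, residual SD 1) and then recovers them with the closed-form least-squares formulas.

```python
import random

random.seed(4)

BETA_0, BETA_1, SIGMA = 2.0, 0.5, 1.0   # hypothetical true values
N = 500                                  # hypothetical sample size

# Simulate Y_i = beta_0 + beta_1 * x_i + epsilon_i
x = [random.uniform(0, 10) for _ in range(N)]
y = [BETA_0 + BETA_1 * xi + random.gauss(0, SIGMA) for xi in x]

# Closed-form least-squares estimates of the slope and intercept
x_bar = sum(x) / N
y_bar = sum(y) / N
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residuals and the residual variance estimate (N - 2 degrees of freedom)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e ** 2 for e in residuals) / (N - 2)

print(f"intercept = {b0:.2f}, slope = {b1:.2f}, residual variance = {sigma2_hat:.2f}")
```

Note that there is a single intercept and a single slope for everyone; the only thing that differs by individual is the residual.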

The Mathematical Model Defined

Now let’s take a look at the most basic LME model in comparison to the regression model:

\[ Y_{ij} = \beta_{0j} + \beta_{1j}x_{ij} + \epsilon_{ij} \]

Look familiar? The only difference between this equation and the linear regression equation is that this one contains more subscripts. (But this simple update will give us so much more information, as you’ll see!)

The Mathematical Model Defined

\[ Y_{ij} = \beta_{0j} + \beta_{1j}x_{ij} + \epsilon_{ij} \]

For now we’ll assume that subscript \(j\) represents a group of individuals.
\(i\) can take on any value in \((1, ..., N)\), where \(N\) is the number of individuals, and \(j\) may take on values in \((1, ..., J)\), where \(J\) is the number of groups.
“The outcome value for person \(i\) in group \(j\) is equal to the intercept for group \(j\), plus the slope for group \(j\) multiplied by the x-value for person \(i\) in group \(j\), plus some error that cannot be explained by the model for person \(i\) in group \(j\).”
The errors \(\epsilon_{ij}\) are typically assumed to be independent and identically distributed (iid): \(\epsilon_{ij} \sim N(0, \sigma^2)\).

The Mathematical Model Defined

\[ Y_{ij} = \beta_{0j} + \beta_{1j}x_{ij} + \epsilon_{ij} \]

If we think of the above expression as Level 1 of the LME model, we could define Level 2 as:

\[ \beta_{0j} = \gamma_{00} + u_{0j}\] \[ \beta_{1j} = \gamma_{10} + u_{1j}\]

This LME model has two fixed effects, \(\gamma_{00}\) and \(\gamma_{10}\)
\(u_{0j}\) is the random intercept, and \(u_{1j}\) is the random slope
The \(\gamma\)s define the average line across all groups, and the random effects allow for group-level adjustments to that line
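
Substituting the Level 2 equations into Level 1 gives the combined (single-equation) form of the same model, which separates the fixed part from the random part:

\[ Y_{ij} = \gamma_{00} + \gamma_{10}x_{ij} + u_{0j} + u_{1j}x_{ij} + \epsilon_{ij} \]

A common assumption, stated here as an assumption rather than something this tutorial has defined, is that the random effects are jointly Normal with mean zero. The \(\tau\) symbols below are notation introduced only for this illustration:

\[ \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right) \]

Here \(\tau_{00}\) is the variance of the random intercepts, \(\tau_{11}\) is the variance of the random slopes, and \(\tau_{01}\) is their covariance.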

3.0. Practical Examples

In this section, I provide a couple of brief fictional scenarios in which a researcher may want to use the LME approach.

Hierarchical Design Example

Example: Study on the effect of perceived administrative support (PAS) on teacher burnout in junior high schools
Sample: The sample includes 200 (\(N\)=200) teachers from 7 different schools (\(J\)=7) from the Sacramento region of Northern California.
Design: Teachers across different schools in the district each complete a survey of perceived administrative support (PAS) (one time)

Teacher Burnout

Teachers within a school have more shared variance than teachers across schools. One solution would be to account for the clustering using an LME model.

Level 1 of the random-intercept random-slope model would look like this:

\[ Burn_{teach|sch} = \beta_{0sch} + \beta_{1sch}PAS_{teach|sch} + \epsilon_{teach|sch} \]

and Level 2 may be:

\[ \beta_{0sch} = \gamma_{00} + u_{0sch}\] \[ \beta_{1sch} = \gamma_{10} + u_{1sch}\]

Teacher Burnout

\(Burn_{teach|sch}\) = The burnout score for teacher \(i\) in school \(j\)
\(\gamma_{00}\) = The average teacher burnout level (at PAS = 0) for the entire population of schools
\(u_{0sch}\) = The adjustment to or deviation from the average burnout level unique to school \(j\)
\(\gamma_{10}\) = The average effect of PAS on burnout across all teachers in all schools
\(u_{1sch}\) = The adjustment to or deviation from the average effect of PAS on burnout that is unique to school \(j\)
\(PAS_{teach|sch}\) = The PAS score for teacher \(i\) in school \(j\)
\(\epsilon_{teach|sch}\) = The part of the burnout level for teacher \(i\) in school \(j\) that is not explained by the model
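
The two levels can be seen at work by generating data from them. The Python sketch below is illustrative only: every numeric value is hypothetical, the split of 200 teachers across 7 schools is invented, and the random intercepts and slopes are drawn independently (zero covariance) to keep the code short.

```python
import random

random.seed(5)

# All numeric values below are hypothetical, chosen only for illustration
GAMMA_00 = 60.0    # average burnout when PAS = 0
GAMMA_10 = -2.0    # average effect of PAS on burnout
TAU_0 = 4.0        # SD of the school-level intercept deviations u_0sch
TAU_1 = 0.5        # SD of the school-level slope deviations u_1sch
SIGMA = 3.0        # SD of the teacher-level residual error
SCHOOL_SIZES = [30, 30, 30, 30, 30, 25, 25]   # 7 schools, 200 teachers in total

rows = []           # one (school, pas, burnout) tuple per teacher
school_lines = {}   # school -> (beta_0sch, beta_1sch)
for sch, n_teachers in enumerate(SCHOOL_SIZES):
    # Level 2: each school gets its own intercept and slope
    beta_0 = GAMMA_00 + random.gauss(0, TAU_0)
    beta_1 = GAMMA_10 + random.gauss(0, TAU_1)
    school_lines[sch] = (beta_0, beta_1)
    for _ in range(n_teachers):
        # Level 1: a teacher's burnout given that school's line
        pas = random.uniform(0, 10)
        burn = beta_0 + beta_1 * pas + random.gauss(0, SIGMA)
        rows.append((sch, pas, burn))

for sch, (b0, b1) in school_lines.items():
    print(f"school {sch}: intercept = {b0:.1f}, slope = {b1:.2f}")
```

Each school's printed line differs a little from the average line (intercept 60, slope -2), which is exactly what the \(u_{0sch}\) and \(u_{1sch}\) terms represent.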

Repeated Measures Example

Example: Effectiveness of on-campus pet therapy in decreasing stress levels of college students
Sample: 150 college students
Design: A 5-week program during which participants spend 4 hours per week in pet therapy. Participants each complete a stress inventory at the start of the study and again during each week of the intervention program, resulting in a total of 6 observations per individual.

Repeated Measures Example

In this example, we may reasonably hypothesize that pet therapy has a similar effect on all individuals
However, we may want to account for the fact that students will vary in their baseline levels of stress (imagine parallel decreasing lines representing decrease in stress across time for each individual).
Poll 3

Repeated Measures Example

Random intercept model, Level 1:

\[ Stress_{time|student} = \beta_{0student} + \beta_{1}Time_{time} + \epsilon_{time|student} \]

Level 2:

\[ \beta_{0student} = \gamma_{00} + u_{0student}\]

Repeated Measures Example

yielding the full model:

\[ Stress_{time|student} = (\gamma_{00} + u_{0student}) + \beta_{1}Time_{time} + \epsilon_{time|student} \]

\(\gamma_{00}\) is the fixed intercept, or the population mean stress level at Time=0.
The random intercept \(u_{0student}\) is each student’s deviation from that population mean
You could think about adding a random slope to this model, as well. A random slope would account for individual differences in change across the course of the pet therapy intervention program.
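
One way to write the random slope version, following the same pattern as the Level 2 equations in Section 2, is sketched below. The symbols \(\beta_{1student}\), \(\gamma_{10}\), and \(u_{1student}\) are introduced here by analogy and are not part of the random intercept model above.

Level 1:

\[ Stress_{time|student} = \beta_{0student} + \beta_{1student}Time_{time} + \epsilon_{time|student} \]

Level 2:

\[ \beta_{0student} = \gamma_{00} + u_{0student}\] \[ \beta_{1student} = \gamma_{10} + u_{1student}\]

In this version \(\gamma_{10}\) is the average change in stress per week, and \(u_{1student}\) is a student's deviation from that average rate of change. The random intercept model is the special case in which every \(u_{1student}\) is zero, so that \(\beta_{1student} = \gamma_{10} = \beta_{1}\).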

Model Parameters

So what exactly gets estimated in these models? Good question! When you run these models, you’ll get the following estimates:

Each fixed effect parameter
The variance of each random effect
The variance/covariance matrix of the random effects (if there is more than one)
Residual variance
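
To show what these estimates look like for the random intercept pet therapy model, here is a rough Python sketch. It simulates data with hypothetical true values and then recovers the parameters with simple moment-based formulas that rely on the design being balanced. This is an assumption-laden teaching shortcut, not how mixed-model software works; packages such as lme4 estimate these quantities by maximum likelihood or REML.

```python
import random
import statistics

random.seed(6)

# Hypothetical true values, chosen only for illustration
GAMMA_00 = 30.0   # fixed intercept: mean stress at Time = 0
BETA_1 = -1.5     # fixed slope: change in stress per week
TAU_0 = 4.0       # SD of the student random intercepts
SIGMA = 2.0       # SD of the residual error
N_STUDENTS = 150
TIMES = [0, 1, 2, 3, 4, 5]   # baseline plus 5 weekly measurements

# Simulate: one random intercept per student, shared across that student's scores
data = []   # data[s] holds the 6 stress scores for student s
for _ in range(N_STUDENTS):
    u0 = random.gauss(0, TAU_0)
    data.append([GAMMA_00 + u0 + BETA_1 * t + random.gauss(0, SIGMA) for t in TIMES])

t_bar = statistics.fmean(TIMES)
student_means = [statistics.fmean(scores) for scores in data]
grand_mean = statistics.fmean(student_means)

# Fixed effects: least squares on time (with a balanced design the
# student intercepts cancel out of the slope's numerator)
sxx = N_STUDENTS * sum((t - t_bar) ** 2 for t in TIMES)
sxy = sum((t - t_bar) * y for scores in data for t, y in zip(TIMES, scores))
slope_hat = sxy / sxx
intercept_hat = grand_mean - slope_hat * t_bar

# Residual variance: within-student variation left after removing the time trend
ss_within = sum(
    ((y - m) - slope_hat * (t - t_bar)) ** 2
    for scores, m in zip(data, student_means)
    for t, y in zip(TIMES, scores)
)
df_within = N_STUDENTS * (len(TIMES) - 1) - 1
sigma2_hat = ss_within / df_within

# Random intercept variance: variance of student means minus the part due to noise
tau2_hat = statistics.variance(student_means) - sigma2_hat / len(TIMES)

print(f"fixed intercept = {intercept_hat:.2f}, fixed slope = {slope_hat:.2f}")
print(f"random intercept variance = {tau2_hat:.2f}, residual variance = {sigma2_hat:.2f}")
```

The four printed numbers correspond to the list above: two fixed effects, one random effect variance, and the residual variance. There is no covariance to report because this model has only one random effect.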

Model Output

4.0. Closing Remarks and Resources for Further Education

This was a very brief introduction to linear mixed-effects models. There are so many more complex applications of this method, such as

adding more levels (time nested within student, students nested within schools, schools nested within districts, etc.),
adding predictors at Level 2 (i.e. adding predictors to explain some of the variance across groups)
and more!
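
As a sketch of the second extension, a Level 2 predictor can be added to the Level 2 equations from Section 2. The group-level predictor \(W_{j}\) and the coefficients \(\gamma_{01}\) and \(\gamma_{11}\) are hypothetical names introduced here for illustration:

\[ \beta_{0j} = \gamma_{00} + \gamma_{01}W_{j} + u_{0j}\] \[ \beta_{1j} = \gamma_{10} + \gamma_{11}W_{j} + u_{1j}\]

Under this setup, \(\gamma_{01}\) describes how group intercepts change with \(W_{j}\), \(\gamma_{11}\) describes how the effect of \(x\) on \(y\) changes with \(W_{j}\) (a cross-level interaction), and \(u_{0j}\) and \(u_{1j}\) capture the group-level variation that \(W_{j}\) does not explain.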

I hope that you have learned something today that will set you on your journey to continue learning about LME models.

If you have any questions specific to this tutorial, please feel free to send me an email at melissa.mcternan@bc.edu.

Resources

Bates, Mächler, Bolker, & Walker, Fitting Linear Mixed-Effects Models Using lme4
Raudenbush & Bryk, Hierarchical Linear Models
Singer & Willett, Applied Longitudinal Data Analysis