author: Corey Sparks date: Fall 2018
Names
- Mixed effects models
- Hierarchical linear models
- Random effects models
These all refer to the same suite of models, but different disciplines/authors/applications have managed to confuse us all.
Basic multilevel perspectives
- The Social epidemiology perspective
- General ecological models
- Longitudinal models for change
Each of these situates humans within some higher, contextual level that is assumed to either influence or be influenced by their actions/behaviors.
And we would ideally like to be able to incorporate covariates at both the individual and the higher level that could influence behaviors.
Random Intercepts
Figures from Kawachi and Berkman (2003)
Here’s a picture of this:
Multistage Sampling
When we have a research statement that involves individuals within some context, this is a multi-level proposition. In this sense, we are interested in questions that relate variables at different levels, the micro and the macro. This also holds in general if a sample was collected with a multi-stage sampling scheme.
In a multilevel proposition, variables are present at two different levels, and we are interested in how both the micro and the macro level variables are associated with our outcome, y.
Multi-level
This can be contrasted with a purely micro level proposition, where all our observed variables are at the level of the individual.
micro-level
Likewise, if we are only interested in the relationship between macro level variables, we have this situation:
macro-level
We commonly encounter the situation where a macro level variable affects a micro level outcome. This can happen in several different ways.
macro-micro
The first case is a macro to micro proposition, which may be exemplified by a statement such as: “Living in areas with high environmental contamination leads to higher risk of death for individuals”.
The second frame illustrates a more specific case, where there is a macro level effect net of the individual level predictor. It may be stated as: “For individuals with a given level of education, living in areas with high environmental contamination leads to higher risk of death”.
The last panel illustrates what is known as a cross level interaction, or a macro-micro interaction. This is where the relationship between x and y is dependent on Z. This leads to the statement “Individuals with low levels of education, living in areas with high environmental contamination have higher risk of death”.
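As a rough sketch, the three macro-micro cases above can be written as regression equations. The coefficient names \(\beta_0\), \(\beta_1\), \(\gamma\) and \(\delta\) are assumptions introduced here for illustration, with \(x_{ij}\) the individual level predictor for person \(i\) in area \(j\), and \(Z_j\) the macro level variable for area \(j\).

Macro to micro effect only: \[ y_{ij} = \beta_0 + \gamma Z_j + e_{ij} \]

Macro effect net of the individual level predictor: \[ y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma Z_j + e_{ij} \]

Cross level interaction: \[ y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma Z_j + \delta x_{ij} Z_j + e_{ij} \]

In the last form, the slope of \(x\) is \(\beta_1 + \delta Z_j\), which is one way of saying that the relationship between x and y depends on Z.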
These kinds of models are used when you have observations on the same individuals over multiple time points
* Don’t have to be the same times/ages for each observation
+ More flexibility
* These models treat the individual as the higher level unit, and you are interested in studying
* Change over time within an individual
* Impacts of prior circumstances on later outcomes
This model is of the form: \[ y_{ij} = \mu + u_{j} + e_{ij} \]
where \(\mu\) is the grand mean, \(u_{j}\) is the fixed effect of the \(j^{th}\) group, and \(e_{ij}\) is the residual for individual \(i\) in group \(j\).
This model assumes that you are capturing all variation in y by the group factor differences, \(u_{j}\) alone.
If you have all your groups,
and your only predictor is the group (factor) level,
and if you expect there to be directional differences across groups a priori,
then this is probably the model for you.
You might use this framework if you want to crudely model the effect of region of residence in an analysis.
+ i.e., is the mean different across my regions?
Each cell (group) has its mean defined by: \[ \mu + u_{j} \]
Also, the global F-test will show us if ANY of the means are different from one another
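To make the cell means and the global F-test concrete, here is a minimal sketch in plain Python. The function name `one_way_anova`, the region labels, and the outcome values are all made up for illustration, and the example assumes a balanced design (equal group sizes), so the grand mean of all observations equals the mean of the group means.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (grand_mean, group_effects, F) for a dict of group -> list of y."""
    all_y = [y for ys in groups.values() for y in ys]
    n, k = len(all_y), len(groups)
    mu = mean(all_y)  # grand mean
    # u_j: deviation of each group's mean from the grand mean
    effects = {g: mean(ys) - mu for g, ys in groups.items()}
    # Between-group and within-group sums of squares
    ss_between = sum(len(ys) * effects[g] ** 2 for g, ys in groups.items())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    # Global F statistic: between mean square over within mean square
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return mu, effects, f_stat

# Hypothetical outcome values for three regions (made-up numbers)
regions = {
    "north": [4.0, 5.0, 6.0],
    "south": [6.0, 7.0, 8.0],
    "west": [8.0, 9.0, 10.0],
}
mu, effects, f_stat = one_way_anova(regions)
print(mu, effects, f_stat)
```

Each group's mean is recovered as `mu + effects[g]`. A large F relative to an F distribution with (k - 1, n - k) degrees of freedom suggests that at least one group mean differs from the others, but it does not say which one.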
as a simple example
That all groups have the same \(\beta\) effect on the mean
\[ y_{ij} = \mu + u_{j} + e_{ij} \]
+ The random effects model assumes that each of the group means \(\mu + u_j\) is composed of a grand mean and an iid random effect
+ Generally, this iid random effect, \(u_j\), is assumed to come from: \[u_j \sim N(0, \sigma^2)\]
+ The random effects are centered around the mean \(\mu\)
+ So that’s why there’s a 0 mean, and the variation in the groups is modeled by the estimated variance in the distribution \(\sigma^2\)
+ Basically, if \(\sigma^2 = 0\), then there is no variation between groups!
* This model is called the random intercept model, because only the intercepts are allowed to vary randomly
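The random intercept model can be illustrated with a small simulation. This is only a sketch: the parameter values are assumed, the names `SIGMA_U` and `SIGMA_E` are introduced here to separate the group level variance (the \(\sigma^2\) above) from the residual variance, and the variance components are recovered with a simple method-of-moments (one-way ANOVA) estimator for balanced data rather than the likelihood-based fitting that mixed model software typically uses.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

MU = 10.0        # grand mean (assumed value)
SIGMA_U = 2.0    # sd of the group effects u_j (assumed value)
SIGMA_E = 1.0    # sd of the individual residuals e_ij (assumed value)
N_GROUPS, N_PER_GROUP = 200, 20

# Simulate y_ij = mu + u_j + e_ij with u_j ~ N(0, sigma_u^2)
data = []
for _ in range(N_GROUPS):
    u_j = random.gauss(0.0, SIGMA_U)
    data.append([MU + u_j + random.gauss(0.0, SIGMA_E)
                 for _ in range(N_PER_GROUP)])

group_means = [mean(ys) for ys in data]
grand_mean = mean(group_means)

# Mean squares from a balanced one-way layout
ms_within = sum((y - m) ** 2
                for ys, m in zip(data, group_means)
                for y in ys) / (N_GROUPS * (N_PER_GROUP - 1))
ms_between = (N_PER_GROUP
              * sum((m - grand_mean) ** 2 for m in group_means)
              / (N_GROUPS - 1))

# Method of moments: E[MS_between] = sigma_e^2 + n * sigma_u^2
var_e_hat = ms_within
var_u_hat = max(0.0, (ms_between - ms_within) / N_PER_GROUP)

# Share of total variance that lies between groups
icc = var_u_hat / (var_u_hat + var_e_hat)
print(grand_mean, var_u_hat, var_e_hat, icc)
```

With these assumed values, the estimates should land near a between-group variance of 4 and a residual variance of 1. If the estimated between-group variance were close to 0, the groups would be indistinguishable from one another.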
There are differences between these classic models and the linear mixed model. As a rule, you use the fixed-effects models when:
and you also know all the groups a priori e.g. sex or race