Introduction to Hierarchical Models

Corey Sparks
DEM 7903, Fall 2014

Fundamental problem

Names

  • Mixed effects models
  • Hierarchical linear models
  • Random effects models

These all refer to the same suite of models, but different disciplines/authors/applications have managed to confuse us all.

Perspectives in Multi-level Modeling

Basic multilevel perspectives

  • The Social epidemiology perspective
  • General “ecological” models
  • Longitudinal models for change

Each of these situates humans within some higher, contextual level that is assumed to either influence or be influenced by their actions/behaviors.

Ideally, we would like to incorporate covariates at both the individual and the higher level that could influence behaviors.

Example of a Multi-level problem

[Figure: example of a multi-level problem]

Longitudinal Models

These kinds of models are used when you have observations on the same individuals over multiple time points

  • Observations don't have to be at the same times/ages for each individual
    • More flexibility
  • These models treat the individual as the higher level unit, and you are interested in studying:
    • Change over time within an individual
    • Impacts of prior circumstances on later outcomes

Preface

  • Not all data sets will allow you to do multi-level modeling
    • Many data sets don't have any “higher level” units identified, or the units they do have are not necessarily meaningful
  • Not all problems are multi-level problems
    • Unless your research question asks how some characteristic of a “higher level” structure influences behavior, these models are not for you.

Linear Mixed Model

  • In the traditional linear model, groups are treated as “fixed” effects
    • ANOVA, ANCOVA, MANOVA
  • For instance, the ANOVA model assumes that the group effects are fixed and do not change relative to the reference group
    • Also, the groups in the data are assumed to represent all possible groups in the population
  • This model is of the form: \[ y_{ij} = \mu + u_{j} + e_{ij} \]

  • where \( \mu \) is the grand mean, \( u_{j} \) is the “fixed effect” of the \( j^{th} \) group and \( e_{ij} \) is the residual for each individual

This model assumes that you are capturing all variation in y by the group factor differences, \( u_{j} \) alone.

If you have all your groups,

and your only predictor is the group (factor) level,

and if you expect there to be directional differences across groups a priori,

then this is probably the model for you.

You might use this framework if you want to crudely model the effect of “region of residence” in an analysis.

  • i.e. is the mean different across my “regions”?

ANOVA and ANCOVA

  • The ANOVA and ANCOVA models are extremely useful if:

    • You simply want to test differences in the mean across groups (ANOVA) \[ y_{ij} = \mu + u_{j} + e_{ij} \]
    • Each cell (group) has its mean defined by: \[ \mu + u_{j} \]
    • And we typically set one group as the “reference group”
    • We test whether each \( u_{j} = 0 \) using a t-test, to see whether each group's mean equals the reference group's mean
    • Also, the global F-test will show us if ANY of the means are different from one another
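The global F-test compares the variation between group means with the variation within groups. A minimal sketch in plain Python follows; the function name `one_way_anova_f` and the data are hypothetical, chosen only to illustrate the calculation.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)   # overall mean of all observations
    J = len(groups)                 # number of groups
    N = len(all_values)             # total number of individuals

    # Between-group sum of squares: group means around the overall mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: individuals around their own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between, df_within = J - 1, N - J
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: three "regions" with four individuals each
regions = [[4.0, 5.0, 6.0, 5.0], [7.0, 8.0, 9.0, 8.0], [5.0, 6.0, 7.0, 6.0]]
f_stat, df_b, df_w = one_way_anova_f(regions)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom suggests that at least one group mean differs from the others.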

ANOVA and ANCOVA

  • In the ANCOVA model, you want to examine the effect of a covariate in each group (ANCOVA) \[ y_{ij} = \mu + u_{j}*\beta x_{i}+ e_{ij} \]

    • as a simple example
    • This model contains the interaction between the group factor \( u_{j} \) and \( x_i \), so each group can have its own slope
    • It is often used to test the parallel slopes assumption of the simpler model: \[ y_{ij} = \mu + u_{j}+\beta x_{i}+ e_{ij} \]
    • namely, that all groups have the same \( \beta \) effect on the mean
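One way to spell out the interaction model explicitly is to give each group its own deviation from the common slope. The symbol \( \delta_j \) is notation assumed here for illustration; it does not appear elsewhere in these slides.

\[ y_{ij} = \mu + u_{j} + \beta x_{i} + \delta_{j} x_{i} + e_{ij} \]

With \( \delta_j = 0 \) for the reference group, the slope in group \( j \) is \( \beta + \delta_j \). The parallel slopes assumption corresponds to \( \delta_j = 0 \) for every group, which reduces this to the simpler model with a single \( \beta \).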

Basic Random Effect Models

  • Consider the ANOVA model:

    \[ y_{ij} = \mu + u_{j} + e_{ij} \]

    • The random effects model assumes that each of the group means \( \mu + u_j \) is composed of a grand mean and an iid random effect
    • This differs from the ANOVA model because the \( u_j \)'s are not treated as fixed contrasts defined by setting a comparison group.

Basic Random Effect Models

\[ y_{ij} = \mu + u_{j} + e_{ij} \]

  • Generally, this iid random effect, \( u_j \), is assumed to come from: \[ u_j \sim N(0, \sigma^2_u) \]
    • The random effects are deviations centered around the grand mean \( \mu \)
    • That's why they have a mean of 0; the variation between groups is captured by the estimated variance of the distribution, \( \sigma^2_u \)
    • Basically, if \( \sigma^2_u = 0 \), then there is no variation between groups!
  • This model is called the random intercept model, because only the intercepts are allowed to vary randomly
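To make the random intercept model concrete, here is a small simulation sketch using only the Python standard library. The function name and all parameter values are hypothetical; `sigma_u` and `sigma_e` stand for the standard deviations of \( u_j \) and \( e_{ij} \), and the residuals are assumed to be normal.

```python
import random
from statistics import mean

def simulate_random_intercepts(J, n, mu, sigma_u, sigma_e, seed=0):
    """Simulate y_ij = mu + u_j + e_ij for J groups of n individuals each."""
    rng = random.Random(seed)
    data = []
    for _ in range(J):
        u_j = rng.gauss(0.0, sigma_u)  # one random effect shared by the whole group
        group = [mu + u_j + rng.gauss(0.0, sigma_e) for _ in range(n)]
        data.append(group)
    return data

# Hypothetical settings: 200 groups of 20, grand mean 10
data = simulate_random_intercepts(J=200, n=20, mu=10.0, sigma_u=2.0, sigma_e=1.0)
group_means = [mean(g) for g in data]

# With no between-group and no residual variation, every value equals mu
flat = simulate_random_intercepts(J=3, n=2, mu=5.0, sigma_u=0.0, sigma_e=0.0)
```

The group means scatter around the grand mean, and setting `sigma_u` to 0 removes the between-group differences.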

Choosing...

  • There are differences between these classic models and the linear mixed model. As a rule, you use the fixed-effects models when:

    • 1) You know that each group is regarded as unique, and you want to draw conclusions on each of these specific groups,
    • and you also know all the groups a priori e.g. sex or race

Choosing

  • 2) If the groups can be considered as a sample from some (real or hypothetical) population of groups, and you want to draw conclusions about this population of groups, then the random effects model is appropriate.
    • WHY? Because if you have a LARGE number of groups,
    • say \( J \) > 10, then the odds that you are really interested in all possible differences in the means are probably pretty low

Forms of the random effect model

  • There are 2 basic forms for the mixed model
    • Random Intercepts model
    • Random Slopes model
  • These models may be extended in MANY, MANY, MANY more ways
    • which is why we're here

Random Intercept Model

  • The random intercept model assumes you have:
    • J groups (j = 1 to J)
    • \( n_j \) individuals within group j (i = 1 to \( n_j \))
    • for each individual in each group you have measured \( y_{ij} \) and \( x_{ij} \), and for each group j you may also have measured \( z_j \), a covariate measured at the group level
  • For example, you do a survey on health and you measure:
    • y = the health status of each individual
    • x = SES, race, etc. of each individual
    • j = the county each individual lives in, and
    • z = the poverty rate or median income in the county
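The survey example implies a particular data layout: one row per individual, with the group-level covariate repeated for everyone in the same group. A minimal sketch follows; the class name, field names, and all values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Person:
    y: float      # health status of the individual
    x: float      # individual-level covariate (e.g. an SES score)
    county: str   # j: the group the individual belongs to

# Hypothetical individual-level records
people = [
    Person(y=3.0, x=0.5, county="A"),
    Person(y=4.0, x=1.5, county="A"),
    Person(y=2.0, x=0.2, county="B"),
]

# Hypothetical group-level covariate z_j (e.g. county poverty rate)
poverty_rate = {"A": 0.12, "B": 0.25}

# Long-format rows (y_ij, x_ij, j, z_j): z_j repeats for everyone in county j
rows = [(p.y, p.x, p.county, poverty_rate[p.county]) for p in people]
```

Merging the group-level table onto the individual-level records by the group identifier is usually the first data preparation step for these models.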

Random Intercept Model

  • We write our full model, with k predictors as: \[ y_{ij} = \beta_{0j} + \sum_{k} {\beta_k x_{ik}} + \gamma z_j + e_{ij} \]
    • This model has a few features that we can use or not use, as it suits us
    • e.g. if we don't have a group-level predictor z, then we won't have that component of the model
    • \( \beta_{0j} \) is called the random intercept,
    • We can write the random intercept as: \[ \beta_{0j} = \beta_0 + u_{j} \]
    • i.e. a fixed mean intercept and each group's iid deviation from it
    • and again \( u_j \) is assumed to come from: \[ u_j \sim N(0, \sigma^2_u) \]
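Substituting the expression for the random intercept into the full model gives a single combined equation, which makes the fixed and random parts easy to see:

\[ y_{ij} = \underbrace{\beta_0 + \sum_{k} \beta_k x_{ik} + \gamma z_j}_{\text{fixed part}} + \underbrace{u_j + e_{ij}}_{\text{random part}} \]

Every individual in group \( j \) shares the same \( u_j \), so the regression lines for different groups are shifted up or down by \( u_j \) but keep the same slopes \( \beta_k \).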

Random Intercept Model

  • Graphically the \( \beta_{0j} \) term can be seen as: [figure not available]

Variance components

  • This model also estimates variance components, so you can see how much variability is accounted for by adding the random intercept term.
    • if var(\( y_{ij} \)) is the total variance, \( \sigma^2 \)
    • and var(\( u_j \)) is the higher level variance in the random intercepts, \( \sigma^2 _{u} \)
    • and var(\( e_{ij} \)) is the residual individual level variance, \( \sigma^2 _{e} \)
  • we can write the total variance as: \( \sigma^2 \) = \( \sigma^2 _{e}+\sigma^2 _{u} \)
    • These are called the “variance components” of the model,
    • and separate the variance into differences between individuals and differences between groups
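As a rough illustration, the two variance components can be estimated from balanced data with the classical ANOVA (method-of-moments) estimator. This is a sketch under the assumption of equal group sizes; mixed-model software typically uses maximum likelihood or REML instead, and the function name and data here are hypothetical.

```python
from statistics import mean

def variance_components(groups):
    """ANOVA (method-of-moments) estimates (sigma2_u, sigma2_e), balanced data only."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    J = len(groups)
    grand_mean = mean(y for g in groups for y in g)

    # Mean squares between groups and within groups
    ms_between = n * sum((mean(g) - grand_mean) ** 2 for g in groups) / (J - 1)
    ms_within = sum((y - mean(g)) ** 2 for g in groups for y in g) / (J * (n - 1))

    sigma2_e = ms_within
    # E[MS_between] = sigma2_e + n * sigma2_u, so solve for sigma2_u (floored at 0)
    sigma2_u = max(0.0, (ms_between - ms_within) / n)
    return sigma2_u, sigma2_e

# Hypothetical data: three groups of four individuals each
counties = [[4.0, 5.0, 6.0, 5.0], [7.0, 8.0, 9.0, 8.0], [5.0, 6.0, 7.0, 6.0]]
s2_u, s2_e = variance_components(counties)
```

The estimate of the between-group variance is floored at zero because the moment estimator can go negative when the groups differ less than chance alone would predict.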

Variance components

  • The correlation between any two individuals within a given group is: \[ \rho(y_{ij},y_{i'j}) = \frac{ \sigma^2 _{u} }{ \sigma^2 _{u} + \sigma^2 _{e} } \]
  • This is called the intra-class correlation coefficient. It can be interpreted as the correlation between 2 random individuals in the same randomly chosen group, but I find it more informative to interpret it as the fraction of the total variance that is due to the groups.
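The step behind this formula can be sketched as follows, assuming the \( u_j \) and \( e_{ij} \) are mutually independent. Two individuals \( i \) and \( i' \) in the same group share only the group effect \( u_j \), so:

\[ \mathrm{Cov}(y_{ij}, y_{i'j}) = \mathrm{Cov}(u_j + e_{ij},\; u_j + e_{i'j}) = \mathrm{Var}(u_j) = \sigma^2_{u} \]

Dividing by the total variance of each observation, \( \sigma^2_{u} + \sigma^2_{e} \), gives the correlation:

\[ \rho(y_{ij}, y_{i'j}) = \frac{\sigma^2_{u}}{\sqrt{(\sigma^2_{u} + \sigma^2_{e})(\sigma^2_{u} + \sigma^2_{e})}} = \frac{\sigma^2_{u}}{\sigma^2_{u} + \sigma^2_{e}} \]

As a purely hypothetical example, \( \sigma^2_{u} = 1 \) and \( \sigma^2_{e} = 3 \) give \( \rho = 1/4 \): a quarter of the total variance lies between groups.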