Generalized Linear Mixed Effects models

Alejandro Molina Moctezuma

Review

  • Simple linear model

  • Deterministic: \(E[y_i] = \beta_0 + \beta_1x_i\)

  • Stochastic: \(y_i \sim Normal(E[y_i], \sigma)\)

  • If it’s continuous:

  • If it’s discrete:

  • Where \(x\) = 1 or 0

(Intercept)           x 
   7.087216    3.382339 
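As a minimal sketch of what the two coefficients mean when x is 0 or 1: the least-squares intercept equals the mean of the x = 0 group, and the slope equals the difference between the two group means. The data and the helper `ols_simple` below are made up for illustration; they are not the values behind the output above.

```python
from statistics import mean

# Hypothetical data: x is a 0/1 group indicator, y is the response.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [6.8, 7.4, 7.0, 7.2, 10.1, 10.6, 10.3, 10.8]

def ols_simple(x, y):
    """Closed-form least squares for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = ols_simple(x, y)
mean_group0 = mean([yi for xi, yi in zip(x, y) if xi == 0])
mean_group1 = mean([yi for xi, yi in zip(x, y) if xi == 1])

print(b0, mean_group0)                # intercept vs. mean of group 0
print(b1, mean_group1 - mean_group0)  # slope vs. difference in group means
```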

Review

  • Multiple groups

  • Deterministic: \(E[y_i] = \beta_0 + \beta_1x_{1,i} + \beta_2x_{2,i}\)

  • Stochastic: \(y_i \sim Normal(E[y_i], \sigma)\)

  • Where \(x_1\) = 1 or 0

  • And \(x_2\) = 1 or 0

  • Group   Value of x1   Value of x2
    1       0             0
    2       1             0
    3       0             1
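The coding in the table can be sketched in a few lines. The function names and the coefficient values are hypothetical; group 1 is assumed to be the reference level, as in the table.

```python
def dummy_code(group, reference=1, levels=(1, 2, 3)):
    """Return the 0/1 indicators (x1, x2) for every non-reference level."""
    return tuple(int(group == lvl) for lvl in levels if lvl != reference)

def expected_value(group, b0, b1, b2):
    """E[y] = b0 + b1 * x1 + b2 * x2 for a given group."""
    x1, x2 = dummy_code(group)
    return b0 + b1 * x1 + b2 * x2

# Hypothetical coefficients, chosen only to illustrate the arithmetic.
b0, b1, b2 = 7.0, 3.0, -1.5
for g in (1, 2, 3):
    print(g, dummy_code(g), expected_value(g, b0, b1, b2))
```

Under this coding, \(\beta_0\) is the mean of group 1, and \(\beta_1\) and \(\beta_2\) are the differences of groups 2 and 3 from group 1.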

Review

We can have multiple regression

  • Deterministic: \(E[y_i] = \beta_0 + \beta_1x_{1,i} + \beta_2x_{2,i}\)

  • Stochastic: \(y_i \sim Normal(E[y_i], \sigma)\)

  • x1 = continuous

  • x2 = 1 or 0

Linear models can be very complex

  • As many \(\beta\)s as you want (up to n-1)
  • Continuous and categorical predictors
  • Polynomial terms
  • Multiple continuous variables that interact

Assumptions

  • Linearity

  • Normality

  • Independence

  • Equal variance

GLMs

  • Generalized linear models
  • How many assumptions do they break?

GLMs

What do GLMs do?

  1. Use a link function to make the expected response linear in the predictors
  2. Allow a non-normal distribution for the stochastic part

If normal:

  • \[ \underbrace{E[y_i]}_{\text{expected value}} = \underbrace{\beta_0 + \beta_1x_{1,i} + \dots + \beta_mx_{m,i}}_{\text{deterministic}} \]

  • \[ y_i \sim \underbrace{N(mean=E[y_i], var=\sigma^2)}_{\text{stochastic}} \]

  • Poisson glm:

  • \[ \underbrace{log(\lambda_i)}_{\text{link function}} = \underbrace{\beta_0 + \beta_1x_{1,i} + \dots + \beta_mx_{m,i}}_{\text{deterministic}} \]

    \[ y_i \sim \underbrace{Poisson(\lambda_i)}_{\text{stochastic}} \]

  • Negative Binomial glm:

  • \[ \underbrace{log(\mu_i)}_{\text{link function}} = \underbrace{\beta_0 + \beta_1x_{1,i} + \dots + \beta_mx_{m,i}}_{\text{deterministic}} \]

    \[ y_i \sim \underbrace{NB(\mu_i,\theta)}_{\text{stochastic}} \]

  • \(variance = \mu_i + \frac{\mu_i^2}{\theta}\)
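A minimal sketch of how the log link turns the linear predictor into a mean, and how the implied variance differs between the two count models. The function names are hypothetical, and the negative binomial is assumed to use the common mean/dispersion parameterization in which the variance is \(\mu + \mu^2/\theta\).

```python
import math

def linear_predictor(betas, xs):
    """beta_0 + beta_1 * x_1 + ... + beta_m * x_m."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))

def poisson_mean_var(betas, xs):
    """Inverse log link gives lambda; a Poisson has mean == variance."""
    lam = math.exp(linear_predictor(betas, xs))
    return lam, lam

def negbin_mean_var(betas, xs, theta):
    """Same mean structure; variance mu + mu^2 / theta (assumed form)."""
    mu = math.exp(linear_predictor(betas, xs))
    return mu, mu + mu ** 2 / theta

betas = (0.5, 0.2)   # hypothetical intercept and slope
xs = (3.0,)          # hypothetical covariate value
print(poisson_mean_var(betas, xs))
print(negbin_mean_var(betas, xs, theta=2.0))
```

As \(\theta\) grows, the extra term \(\mu^2/\theta\) shrinks and the negative binomial variance approaches the Poisson variance.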

GLMs

Hurdle models

  • Split the data in two: an indicator \(Z_i\) of zero vs. non-zero counts, and the non-zero counts themselves

  • \(Z_i \sim Bernoulli (p_i)\)

  • \(logit(p_i) = \beta_0 + ...\)

  • Then fit a zero-truncated Poisson or negative binomial to the non-zero counts
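The two-part idea can be sketched as follows. This is an illustration under assumptions, not a fitting routine: `split_for_hurdle` and `hurdle_poisson_pmf` are hypothetical names, the counts are made up, and the count part is assumed to be a zero-truncated Poisson.

```python
import math

def split_for_hurdle(counts):
    """Return the 0/1 indicator Z_i and the strictly positive counts."""
    z = [int(y > 0) for y in counts]
    positives = [y for y in counts if y > 0]
    return z, positives

def hurdle_poisson_pmf(y, p, lam):
    """P(Y = y) when P(Y > 0) = p and positive counts are zero-truncated Poisson."""
    if y == 0:
        return 1.0 - p
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return p * pois / (1.0 - math.exp(-lam))  # renormalize over y >= 1

counts = [0, 0, 3, 1, 0, 5]  # hypothetical data
z, positives = split_for_hurdle(counts)
print(z, positives)

# The two parts together still form a proper distribution.
print(sum(hurdle_poisson_pmf(y, p=0.4, lam=2.5) for y in range(60)))
```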

Mixed effects

  • Set nets on each of those sites

  • Measure mercury concentration in 50, 43, 67, and 90 fish (one sample per site)

  • What assumptions are we breaking?

We don’t do this

  • \(E(Hg_i) = \beta_0 + \beta_1size_{i}\)

  • Pseudoreplication

  • \(E(Hg_i) = \beta_0 + \beta_1size_{i} + \beta_2 site2_i + \beta_3 site3_i + \beta_4 site4_i\)

  • We don’t care about the specific sites… we want population-wide inference!

What do we want to know?

  • Mercury concentration population-wide

  • Variance introduced by placement of the nets

  • \[ Hg_{ij} = \underbrace{(\beta_0 +\underbrace{\gamma_j}_{\text{Random intercept}})}_{\text{intercept}} + \underbrace{(\beta_1+\underbrace{\psi_j}_{\text{Random slope}})size_{ij}}_{\text{slope}} +\underbrace{\epsilon_{ij}}_{\text{ind var}} \]

  • \(\gamma_j \sim Normal(0,\sigma_\gamma)\)

  • \(\psi_j \sim Normal(0,\sigma_\psi)\)

  • \(\epsilon_{ij} \sim Normal(0,\sigma)\)
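One way to build intuition for the random intercept and random slope is to simulate from the model. The sketch below is an assumption-laden illustration: the function name, the parameter values, and the fish size range are all hypothetical; only the sample sizes per site come from the example above.

```python
import random

def simulate_mercury(n_per_site, beta0, beta1, sd_gamma, sd_psi, sd_eps, seed=1):
    """Simulate (site, size, Hg) rows from a random-intercept, random-slope model."""
    rng = random.Random(seed)
    rows = []
    for j, n in enumerate(n_per_site, start=1):
        gamma_j = rng.gauss(0.0, sd_gamma)  # site-level shift in the intercept
        psi_j = rng.gauss(0.0, sd_psi)      # site-level shift in the slope
        for _ in range(n):
            size = rng.uniform(10.0, 60.0)  # hypothetical fish size
            eps = rng.gauss(0.0, sd_eps)    # individual-level noise
            hg = (beta0 + gamma_j) + (beta1 + psi_j) * size + eps
            rows.append((j, size, hg))
    return rows

rows = simulate_mercury([50, 43, 67, 90], beta0=0.1, beta1=0.02,
                        sd_gamma=0.05, sd_psi=0.005, sd_eps=0.03)
print(len(rows), rows[0])
```

Each site draws its own \(\gamma_j\) and \(\psi_j\) once, so fish from the same site share them; that shared draw is what makes observations within a site non-independent.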

Mixed models

If we set all random effects to zero, we get the population mean, which is also the predicted value for a typical individual

Mixed effects

  • Set nets on each of those sites

  • Measure 50, 43, 67, and 90 fish for number of parasites OR presence of parasites

  • What assumptions are we breaking?

Generalized linear mixed effects models

  • Mixed effects models have a deterministic (linear) component and a stochastic component

  • Count data

  • Binary data (1, 0)

  • Multinomial response variable

Generalized linear mixed effects models

  • We talked last week about overdispersion

  • Poisson, Negative Binomial, Bernoulli and Binomial are not parameterized in terms of separate mean and variance parameters

  • Normal distribution is: \(Normal(\mu, \sigma)\)

  • Mean and variance of Poisson are both \(\lambda\)

  • Mean and variance of Bernoulli are a function of p

  • \(Mean = p\) and \(variance=p(1-p)\)
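Because the variance is tied to the mean in these distributions, a rough first check for overdispersion in counts is the ratio of sample variance to sample mean, which should be near 1 for Poisson data. This is only a sketch with made-up counts and hypothetical function names; it ignores covariates, so it is not a substitute for a model-based check.

```python
from statistics import mean, variance

def bernoulli_mean_var(p):
    """Mean and variance of a Bernoulli are both functions of p."""
    return p, p * (1.0 - p)

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson."""
    return variance(counts) / mean(counts)

even_counts = [2, 3, 1, 4, 2, 3, 2, 3]     # hypothetical, tightly clustered
clumped_counts = [0, 0, 0, 1, 0, 12, 0, 9]  # hypothetical, many zeroes and a few large values

print(bernoulli_mean_var(0.5))
print(dispersion_ratio(even_counts))
print(dispersion_ratio(clumped_counts))
```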

GLMM’s

  • How can we run mixed effects models?

  • Easy way: Add random effects to the linear predictor, leading to generalized linear mixed effect models

  • Essentially, you have two sources of variation

  • One is normally distributed; the other follows a different distribution

Example

A Poisson GLM:

  • \(log(\lambda_i) = \beta_0 + \beta_1x_i\)

  • \(y_i \sim Poisson(\lambda_i)\)

  • A GLMM: Poisson-normal

  • \(log(\lambda_{ij}) = (\beta_0+\gamma_j) + (\beta_1+\psi_j)x_{ij}\)

  • \(\gamma_j \sim N(0,\sigma_\gamma)\) , \(\psi_j\sim N(0,\sigma_\psi)\)

  • \(y_{ij} \sim Poisson(\lambda_{ij})\)
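A simulation sketch of the Poisson-normal model shows the two sources of variation at work. Everything here is an assumption for illustration: the function names, the parameter values, and the covariate drawn uniformly on [0, 1]. The Poisson sampler uses Knuth's multiplication method because the standard library has no Poisson generator.

```python
import math
import random
from statistics import mean, variance

def rpois(rng, lam):
    """Knuth's multiplication method; adequate for the modest rates used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_poisson_normal(n_sites, n_per_site, beta0, beta1,
                            sd_gamma, sd_psi, seed=1):
    """Counts with normal random intercepts and slopes on the log scale."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_sites):
        gamma_j = rng.gauss(0.0, sd_gamma)
        psi_j = rng.gauss(0.0, sd_psi)
        for _ in range(n_per_site):
            x = rng.random()  # hypothetical covariate in [0, 1]
            lam = math.exp((beta0 + gamma_j) + (beta1 + psi_j) * x)
            ys.append(rpois(rng, lam))
    return ys

ys = simulate_poisson_normal(200, 20, beta0=0.5, beta1=0.3,
                             sd_gamma=1.0, sd_psi=0.2)
print(mean(ys), variance(ys))
```

Pooled over sites, the variance of the counts is much larger than their mean, even though each count is Poisson given its site. The normal random effects are the source of that extra variation.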

Example

Log norm

Parameter interpretation

  • Not easy to interpret!

  • In mixed models, if we set all random effects to 0, we estimate the mean for a “typical” individual

  • “Typical” refers to a typical subject, site, individual, etc.

  • The two sources of variance come from different distributions

  • Typical individual does not equal “population average response”
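The gap between the typical individual and the population average can be checked numerically for a logit link. The sketch below assumes a random intercept only, with hypothetical values for the linear predictor and \(\sigma_\gamma\); the population average is obtained by integrating the inverse logit over the normal random effect with a simple trapezoid rule.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_pdf(g, sd):
    return math.exp(-0.5 * (g / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def population_average(eta, sd_gamma, n_grid=2001, width=8.0):
    """Average expit(eta + gamma) over gamma ~ Normal(0, sd_gamma), trapezoid rule."""
    lo, hi = -width * sd_gamma, width * sd_gamma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        g = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * expit(eta + g) * normal_pdf(g, sd_gamma)
    return total * step

eta = 2.0                 # hypothetical fixed-effects linear predictor
typical = expit(eta)      # random effect set to zero
marginal = population_average(eta, sd_gamma=2.0)
print(typical, marginal)
```

With these values the population-average probability is pulled toward 0.5 relative to the typical individual. The two agree only when the linear predictor is 0 or the random-effect variance is 0.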

Example

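  • The two curves do not line up

  • This is due to the nonlinear (inverse-logit) transformation: random effects are normal on the logit scale, so the mean of the back-transformed curves is not the back-transform of the mean curve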

Interpreting data

  • Individual response curves (black), the response curve for a typical individual with random effects set to zero, and the population mean response curve (blue), on the logit and probability scales

Take away

If you fit GLMMs, be very careful about interpretation

  • In general, transforming data can be risky

Solutions?

  • Package GLMMadaptive estimates marginal means

  • How can we run mixed effects models?

  • Easy way: Add random effects to the linear predictor, leading to generalized linear mixed effect models

  • Hard way: Generalized Estimating Equations

  • https://fw8051statistics4ecologists.netlify.app/gee