CNCM 2023 Statistics Bootcamp: Mixed-Effects Models

Author

Zhaoxia Yu, Professor of Statistics, UCI

Published

August 21, 2023

Load Required Packages

Code
knitr::opts_chunk$set(echo = TRUE)
library("dplyr") #a library provides convenient data manipulation
library(ICC) #load the library to conduct ICC analysis with its function ICCbare
library(nlme) #load the nlme library
library(lme4)
library(lmerTest)
library(ggplot2)
library(gridExtra)
library("vioplot") #note, you will receive an error message if the package was not installed

Learning Objectives

Motivating Examples

Example 1

  • 1200 neurons from 24 mice; 5 conditions/groups
  • Which time points are different from the baseline?

pCREB stain intensities measured from 24 mice at 5 conditions

Read data

Code
Ex1=read.csv("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/Example1.txt", head=T)
names(Ex1)=c("res", "trt_idx", "midx")
#Do not forget to convert the treatment IDs and animal IDs to factors.
#This is particularly important for trt_idx;
#otherwise its values will be treated as numbers rather than levels.
Ex1$trt_idx = factor(Ex1$trt_idx,
                              labels=c("baseline", "24h", "48h", "72h", "1week"))
Ex1$midx = factor(Ex1$midx)
#head(Ex1)

Example 1: the “Familiar” analysis

Code
mycoef=summary(lm(res~trt_idx, data=Ex1))$coef
mycoef[,1:3]=round(mycoef[,1:3],2)
mycoef
             Estimate Std. Error t value      Pr(>|t|)
(Intercept)      1.03       0.04   25.67 4.064778e-116
trt_idx24h       0.78       0.06   13.34  6.040147e-38
trt_idx48h       0.81       0.08   10.77  6.760583e-26
trt_idx72h       0.16       0.07    2.19  2.907634e-02
trt_idx1week    -0.36       0.06   -5.75  1.112796e-08

Example 1: the “Familiar” approach for null data

  • Is the familiar approach valid? We evaluate the method using data generated under the null hypothesis.

  • We can generate a null data set by permuting the treatment group labels of the animals

  • We generate 1000 null data sets and check how many times the familiar approach will reject the null hypothesis of no group difference

Example 1: the “Familiar” approach failed

  • We evaluated the familiar approaches (lm, anova) using 1000 data sets simulated under the null hypothesis of no treatment effects.

  • To simulate a data set under the null hypothesis, we shuffled the treatment labels of the animals in the original data.

  • The overall p-value of treatment effects was calculated using each simulated data set.

  • If the ANOVA approach worked, the proportion of p-values less than 0.05 should be around 0.05 (i.e., about 50 of the 1000 p-values).

Code
treatment=as.factor(rep(1:5, c(7,6,3,3,5)))
ncell=sapply(split(Ex1, Ex1$midx), dim)[1,]
#generate 1000 pseudo (permuted) data sets by randomly
#shuffling the treatment labels across mice
pvalues=rep(NA, 1000)#initialize a vector of p-values
for(i in 1:1000) {
  Ex1.perm = data.frame(res=Ex1$res, 
                        trt_idx=rep(sample(treatment),ncell), midx=Ex1$midx)
  pvalues[i]=anova(lm(res~trt_idx, data=Ex1.perm))$"Pr(>F)"[1] }

Example 1: P-values using 1000 null data sets

  • What does the histogram suggest?
Code
hist(pvalues)
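
Under the null hypothesis, the p-values from a valid test should be roughly uniform between 0 and 1. A quick numerical check (a sketch using the pvalues vector computed above) is the empirical type I error rate:

Code
#proportion of the 1000 null data sets rejected at the 0.05 level;
#a valid test should give a proportion close to 0.05
mean(pvalues < 0.05)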

Example 2

  • Research question: determine how in vivo calcium (Ca++) activity of PV cells (measured longitudinally by the genetically encoded Ca++ indicator GCaMP6s) changes over time after ketamine treatment
  • Study: Ca++ event frequencies were measured at 24h, 48h, 72h, and 1 week after ketamine treatment in four mice
  • Want to compare Ca++ event frequency at 24h to the other three time points.
  • In total, Ca++ event frequencies of 1,724 neurons were measured.

Read data:

Code
Ex2=read.csv("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/Example2.txt", head=T)
names(Ex2)=c("res", "trt_idx", "midx")
Ex2$trt_idx=factor(Ex2$trt_idx, labels=c("24h", "48h", "72h", "1week"))
### convert the variable of mouse IDs to a factor
Ex2$midx=factor(Ex2$midx)

Example 2: “Familiar” analysis

Code
lm.obj=lm(res~trt_idx, data=Ex2)
mycoef=summary(lm.obj)$coefficients
mycoef[,1:3]=round(mycoef[,1:3],2)
mycoef
             Estimate Std. Error t value     Pr(>|t|)
(Intercept)      0.71       0.01   57.95 0.000000e+00
trt_idx48h      -0.08       0.02   -4.59 4.835037e-06
trt_idx72h       0.01       0.02    0.53 5.946707e-01
trt_idx1week     0.05       0.02    3.04 2.369903e-03

Example 2: “Familiar” analysis

  • The LM (including ANOVA, t-test) analysis results indicate
    • significantly reduced Ca++ activity at 48h relative to 24h with \(p=4.8\times 10^{-6}\)
    • significantly increased Ca++ activity at 1 week compared to 24h with \(p=2.4\times 10^{-3}\)
  • However, the figure on the next slide does not support this conclusion

Example 2: Pooling data naively might be misleading

Code
knitr::include_graphics("images/Fig2S.jpg")

The boxplots of Ca++ event frequencies measured at four time points. (A) Boxplot of Ca++ event frequencies using the pooled neurons from four mice. (B) Boxplots of Ca++ event frequencies stratified by individual mice.

Example 2: Pooling data naively might be misleading

  • Consider the change in Ca++ activities from 24h to 48h
  • Pooled data from all mice:
    • The box plots suggest reduction in Ca++ activities
  • Individual mice data:
    • The box plots of Mouse 2 suggest a noticeable reduction
    • However, there was almost no change in Mouse 1
    • Mouse 3 and Mouse 4 might suggest small increases, rather than decreases

Example 2: Pooling data naively might be misleading

  • Why do the pooled data follow the pattern of Mouse 2?
Code
# the number of cells in each animal-time combination
table(Ex2$midx, Ex2$trt_idx)
   
    24h 48h 72h 1week
  1  81 254  88    43
  2 206 101 210   222
  3  33  18  51   207
  4  63  52  58    37
Code
# compute the percent of cells contributed by each mouse
rowSums(table(Ex2$midx, Ex2$trt_idx))/1724
        1         2         3         4 
0.2703016 0.4286543 0.1792343 0.1218097 
  • Mouse 2 contributed 43% of the cells!

Clustered Data Are Dependent

Why does LM fail for Example 1?

  • This is because the observations are not independent
  • We can compute the intra-class correlation (ICC) to quantify the magnitude of clustering due to animal effects.
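
For a random-intercept model such as the LME introduced below, the ICC has a simple form (a standard definition, stated here for reference): \[\mathrm{ICC}=\frac{\sigma_b^2}{\sigma_b^2+\sigma^2},\] i.e., the proportion of the total variance attributable to the between-animal variance \(\sigma_b^2\). An ICC near 0 means observations within an animal are nearly independent; an ICC near 1 means they are highly dependent.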

ICC Analysis of Example 1

Code
### conduct ICC analysis by organizing all the information into a data frame
icc.analysis=data.frame(n=rep(0,5), icc=rep(0,5), design.effect=rep(0,5),
effective.n=rep(0,5), M=rep(0,5), cells=rep(0,5))
treatment.groups=c("baseline", "24h", "48h", "72h", "1week")
for(i in 1:5){
    trt= Ex1[Ex1$trt_idx==treatment.groups[i],]
    trt$midx=factor(trt$midx)
    icc=ICCbare(factor(trt$midx), trt$res) #ICCbare is a function in the ICC package
    icc.analysis$cells[i]=dim(trt)[1]
    M=dim(trt)[1]/length(unique( trt$midx))
    def=1 + icc*(M-1)
    icc.analysis$n[i]=length(unique( trt$midx))
    icc.analysis$icc[i]=icc
    icc.analysis$design.effect[i]=def
    icc.analysis$effective.n[i]=dim(trt)[1]/def
    icc.analysis$M[i]=M
}
Code
tmp=t(icc.analysis[,c(6,2,4)])
row.names(tmp)= c("# of cells", "ICC", "effective n")
knitr::kable(tmp, col.names = c("Saline (7 mice)","24h (6 mice)","48h (3 mice)", "72h (3 mice)","1wk (5 mice)"))
                Saline (7 mice)  24h (6 mice)  48h (3 mice)  72h (3 mice)  1wk (5 mice)
# of cells                  357           309           139           150           245
ICC                       0.621         0.330         0.018         0.628         0.537
effective n               11.14         17.49         76.92          4.72          9.15
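
The “effective n” row is computed from the design effect: with an average of \(M\) cells per mouse, \[\mathrm{DEFF}=1+(M-1)\times\mathrm{ICC}, \qquad \text{effective } n=\frac{\text{\# of cells}}{\mathrm{DEFF}},\] which is exactly what the code above computes ("def=1 + icc*(M-1)").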

  • The ICC indicates that the dependency due to clustering is substantial.
  • Therefore, the 1,200 neurons should not be treated as 1,200 independent cells.
  • When dependence is not adequately accounted for, the type I error rate can be much higher than the pre-chosen level of significance.

From LM to LME

LM (incorrect!) for Example 1

  • Consider Example 1. Let
    • \(Y_{ij}\) indicate the \(j\)th observed response of the \(i\)th mouse.
    • \(x_{ij}\) be the treatment label, with \(x_{ij}=1\) for baseline, \(x_{ij}=2\) for 24 hours, \(x_{ij}=3\) for 48 hours, \(x_{ij}=4\) for 72 hours, and \(x_{ij}=5\) for 1 week after ketamine treatments.
  • Internally, R generates four dummy variables, each taking value 0 or 1: \(x_{ij,1} = 1\) for 24 hours, \(x_{ij,2} = 1\) for 48 hours, \(x_{ij,3} = 1\) for 72 hours, and \(x_{ij,4} = 1\) for 1 week after ketamine treatments, respectively.

\[\begin{aligned} Y_{ij} &= \beta_0 + x_{ij,1}\beta_1 + … + x_{ij,4}\beta_4 + \epsilon_{ij}, \\ & i=1, …, 24; j=1, …, n_i;\end{aligned}\] where \(n_i\) is the number of observations from the \(i\)th mouse.

LM (incorrect!) for Example 1

Code
lm.obj1=lm(res~trt_idx, data=Ex1)
design.mat=model.matrix(lm.obj1)
print(c("intercept", "24h", "48h", "72h", "1week"))
[1] "intercept" "24h"       "48h"       "72h"       "1week"    
Code
image(t(design.mat[nrow(design.mat):1,]), xlab="", ylab="")

LME for Example 1

  • The 1200 observations are clustered by animal. We account for the resulting dependence by adding an animal-specific effect, as follows: \[\begin{aligned} Y_{ij} &= \beta_0 + x_{ij,1}\beta_1 + … + x_{ij,4}\beta_4 + u_i + \epsilon_{ij}, \\ & i=1, …, 24; j=1, …, n_i; \end{aligned}\] where

  • \(u_i\) indicates the deviation of the mean specific to the \(i\)th mouse from the overall intercept \(\beta_0\)

  • \(\epsilon_{ij}\) represents the deviation in pCREB immunoreactivity of observation (cell) \(j\) in mouse \(i\) from the mean pCREB immunoreactivity of mouse \(i\)

  • (\(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4\)) are assumed to be fixed but unknown

  • (\(u_1, \cdots, u_{24}\)) are treated as independent and identically distributed variables from a normal distribution with mean 0 and a variance parameter that reflects the variation across animals.

LME for Example 1

  • As with the treatment variable, users do not need to define dummy variables for the animal ID variable; R generates them automatically in its inner workings.

  • Thus, equivalently, one could write the previous equation by using a vector (\(z_{ij,1}, …, z_{ij,24}\)) of dummy variables for the cluster/animal memberships such that \(z_{ij,k}=1\) for \(i=k\) and 0 otherwise:

\[\begin{aligned} Y_{ij} &= \beta_0 + x_{ij,1}\beta_1 + … + x_{ij,4}\beta_4 + z_{ij,1}u_1 + … + z_{ij,24}u_{24} + \epsilon_{ij},\\ & i =1, …, 24; j=1, …, n_i; \end{aligned}\]
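
Although the \(z_{ij,k}\)’s are generated internally, the implied random-effects design matrix \(Z\) can be visualized, paralleling the fixed-effects design-matrix plot shown earlier. A sketch (the formula ~0+midx is used purely for illustration):

Code
#the 24 animal-membership dummy variables (the Z matrix);
#~0+midx drops the intercept so each column indicates one mouse
Z=model.matrix(~0+midx, data=Ex1)
dim(Z) #1200 observations by 24 mice
image(t(Z[nrow(Z):1,]), xlab="", ylab="")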

LME for Example 1

  • \(Y_{ij}\) is modeled by three components:
    • the fixed-effects from the covariates (\(x_{ij,1}, …, x_{ij,4}\)) and the overall intercept \(\beta_0\), which is the population mean of the reference group in this example
    • the random-effects due to the clustering (\(z_{ij,1}, …, z_{ij,24}\))
    • the random errors \(\epsilon_{ij}\)’s

R Packages for LME

  • Two major packages are ‘nlme’ and ‘lme4’.
  • Syntax:
    • ‘nlme::lme(res~trt_idx, data= Ex1, random = ~ 1|midx)’
    • ‘lme4::lmer(res ~ trt_idx+(1|midx), data=Ex1)’
  • For the fixed-effects variable (trt_idx here), make sure that it is a factor, not numeric, as the levels 1-5 denote different time points
  • For the random-effects variable “midx” (mice), R treats it as a factor with different levels (animals)
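
The two calls fit the same random-intercept model, so their fixed-effects estimates should agree up to numerical precision. A quick sanity check (a sketch, assuming Ex1 is loaded as above):

Code
fit.nlme=nlme::lme(res~trt_idx, data= Ex1, random = ~ 1|midx)
fit.lme4=lme4::lmer(res ~ trt_idx+(1|midx), data=Ex1)
#compare the fixed-effects estimates side by side
cbind(nlme=nlme::fixef(fit.nlme), lme4=lme4::fixef(fit.lme4))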

LM, LME, GLM, and GLMM

LM and LME: Matrix Format

  • LM:
    • a linear predictor \(X\beta\)
    • random errors \(\mathbf \epsilon\) are independent, have a zero mean and a constant variance.
    • \(\mathbf \epsilon \sim N(0, \sigma^2 \mathbf I)\) is used for deriving t- and F-tests. Typically this assumption is not very critical as long as the sample size is not too small
  • LME:
    • fixed-effects: a linear predictor \(X\beta\)
    • random-effects: \(Z\mathbf u\), where \(\mathbf u\sim N(0, G)\). E.g., \(G=\sigma_b^2 \mathbf I\).
    • random errors: \(\mathbf \epsilon \sim N(0, \sigma^2 \mathbf I)\), independent with \(\mathbf u\).
  • In both LM and LME \(E(Y|X)=X\beta\).
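
The two models differ in the covariance of \(Y\): marginally, \[\mathrm{Var}(Y)=ZGZ^\top+\sigma^2\mathbf I\] under the LME, versus \(\sigma^2\mathbf I\) under the LM; the off-diagonal entries of \(ZGZ^\top\) encode the within-cluster dependence that the LM ignores.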

Extend LM to GLM

  • The components of GLM:
    • a linear predictor \(X\beta\)
    • a link function \(g\) to connect \(E(Y|X)\) and \(X\beta\): \(g(E(Y|X))=X\beta\)
    • a distribution for \(Y\) given \(E(Y|X)\)
  • Example: linear model
    • the identity link: \(g(E(Y|X))=E(Y|X)\), therefore, \(E(Y|X)=X\beta\).
    • \(Y\sim N(E(Y|X), \sigma^2)\)
  • Example: logistic regression
    • the logit link: \(g(\pi)=log(\frac{\pi}{1-\pi})\), where \(\pi=E(Y|X)\).
    • \(Y\sim Bernoulli(\pi)\)
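
As a concrete illustration, a logistic regression can be fit in R with glm (a minimal sketch using simulated data; the variables x and y are made up for illustration):

Code
set.seed(1)
x=rnorm(100)
#simulate binary outcomes whose log-odds are linear in x
y=rbinom(100, size=1, prob=plogis(-1+0.5*x))
#family=binomial uses the logit link by default
summary(glm(y~x, family=binomial))$coefficients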

Extend LM to GLMM

  • The components of GLMM:
    • fixed-effects: a linear predictor \(X\beta\)
    • random-effects: \(Z\mathbf u\), where \(\mathbf u\sim N(0, G)\). E.g., \(G=\sigma_b^2 \mathbf I\).
    • a link function \(g\) to connect \(E(Y|X, \mathbf u)\) and \(X\beta\): \(g(E(Y|X, \mathbf u))=X\beta + Z\mathbf u\)
    • a distribution for \(Y\) given \(E(Y|X, \mathbf u)\)

LME Examples: Example 1

Code
# The nlme::lme function specifies the fixed effects in the formula
# (first argument) of the function, and the random effects
# as an optional argument (random=). The vertical bar | denotes that
# clustering is by the animal ID (midx).
obj.lme=lme(res~trt_idx, data= Ex1, random = ~ 1|midx, method="ML")
summary(obj.lme)$tTable
                  Value Std.Error   DF    t-value      p-value
(Intercept)   1.0008500 0.1750995 1176  5.7158919 1.382236e-08
trt_idx24h    0.8191952 0.2577129   19  3.1787124 4.944475e-03
trt_idx48h    0.8427397 0.3200466   19  2.6331777 1.638113e-02
trt_idx72h    0.1896571 0.3197681   19  0.5931081 5.601033e-01
trt_idx1week -0.3202969 0.2713859   19 -1.1802269 2.524757e-01
  • The results from LME are more realistic
Code
summary(obj.lme)
Linear mixed-effects model fit by maximum likelihood
  Data: Ex1 
       AIC      BIC    logLik
  2272.961 2308.592 -1129.481

Random effects:
 Formula: ~1 | midx
        (Intercept)  Residual
StdDev:   0.4545821 0.5995347

Fixed effects:  res ~ trt_idx 
                  Value Std.Error   DF   t-value p-value
(Intercept)   1.0008500 0.1750995 1176  5.715892  0.0000
trt_idx24h    0.8191952 0.2577129   19  3.178712  0.0049
trt_idx48h    0.8427397 0.3200466   19  2.633178  0.0164
trt_idx72h    0.1896571 0.3197681   19  0.593108  0.5601
trt_idx1week -0.3202969 0.2713859   19 -1.180227  0.2525
 Correlation: 
             (Intr) trt_24 trt_48 trt_72
trt_idx24h   -0.679                     
trt_idx48h   -0.547  0.372              
trt_idx72h   -0.548  0.372  0.300       
trt_idx1week -0.645  0.438  0.353  0.353

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-2.5410173 -0.5737059 -0.1133680  0.4733263  8.8578521 

Number of Observations: 1200
Number of Groups: 24 

Examine the overall effect

  • Similar to ANOVA, we can test whether the treatment factor is significant
Code
anova(obj.lme)
            numDF denDF   F-value p-value
(Intercept)     1  1176 179.66421  <.0001
trt_idx         4    19   5.89455  0.0029

LME Examples: Example 2

Example 2: LME

Code
lmer.obj=lmerTest::lmer(res~trt_idx+(1|midx), data= Ex2, REML=FALSE)
summary(lmer.obj)$coefficients
                 Estimate Std. Error          df    t value     Pr(>|t|)
(Intercept)   0.699786009 0.03484986    4.901964 20.0800262 6.756672e-06
trt_idx48h   -0.017490109 0.01726513 1723.485832 -1.0130306 3.111877e-01
trt_idx72h    0.009353984 0.01657856 1720.292658  0.5642219 5.726767e-01
trt_idx1week  0.029448530 0.01656107 1719.621372  1.7781780 7.555129e-02

Example 2: LM vs LME

Estimated changes of Ca++ event frequency (the baseline is 24h after treatment)

Remark: on the minimum number of levels for using random-effects

  • In Example 2, the number of levels in the random-effects variable is four, as there are four mice.
  • According to Gelman and Hill (2006), it does not hurt to use random-effects in this situation.
  • There is no unique answer on the minimum number of levels for using random-effects.

Remark: on the minimum number of levels for using random-effects

  • An alternative is to include the animal ID variable as a factor with fixed animal effects (see the sketch after this list).
  • Neither of the two approaches is the same as fitting an LM to the pooled cells naively.
  • In a more extreme case, for an experiment using only two monkeys for example,
    • naively pooling data (such as neurons) is NOT recommended.
    • a more appropriate approach is to analyze the animals separately and then check whether the results from the two animals are consistent.
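
A sketch of the fixed animal-effects alternative for Example 2 (identifiable here because each mouse was measured at all four time points; the object name lm.fixed is arbitrary):

Code
#animal IDs enter as fixed effects instead of random effects
lm.fixed=lm(res~trt_idx+midx, data=Ex2)
summary(lm.fixed)$coefficients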

LME Examples: Example 3

Example 3: data structure

  • Ca++ event integrated amplitudes are compared between baseline and 24h after ketamine treatment.

  • 1244 cells were sampled from 11 mice

  • each cell was measured twice (baseline and after ketamine treatment)

  • correlation arises from both cells and animals, which creates a three-level structure:

    • measurements within cells and cells within animals.

Read Data

Code
Ex3=read.csv("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/Example3.txt", head=T)

Example 3: LM vs LME

Wrong analysis using lm or t-tests

Code
#### wrong analysis: using the linear model
summary(lm(res~treatment, data=Ex3[!is.na(Ex3$res),])) #0.0036

Call:
lm(formula = res ~ treatment, data = Ex3[!is.na(Ex3$res), ])

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1311 -1.3203 -0.1806  1.1438  6.7518 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.73206    0.10817  25.258   <2e-16 ***
treatment    0.19952    0.06847   2.914   0.0036 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.708 on 2487 degrees of freedom
Multiple R-squared:  0.003403,  Adjusted R-squared:  0.003002 
F-statistic: 8.492 on 1 and 2487 DF,  p-value: 0.0036
Code
#### wrong analysis using t tests (paired or unpaired)
t.test(Ex3[Ex3$treatment==1,"res"], Ex3[Ex3$treatment==2,"res"], var.eq=T)

    Two Sample t-test

data:  Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -2.914, df = 2487, p-value = 0.0036
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.33378231 -0.06525884
sample estimates:
mean of x mean of y 
 2.931585  3.131106 
Code
t.test(Ex3[Ex3$treatment==1,"res"], Ex3[Ex3$treatment==2,"res"])

    Welch Two Sample t-test

data:  Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -2.9121, df = 2347.6, p-value = 0.003624
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.33387620 -0.06516496
sample estimates:
mean of x mean of y 
 2.931585  3.131106 
Code
t.test(Ex3[Ex3$treatment==1,"res"], Ex3[Ex3$treatment==2,"res"], paired=T)

    Paired t-test

data:  Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -3.7067, df = 1240, p-value = 0.0002193
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -0.28671163 -0.08824982
sample estimates:
mean difference 
     -0.1874807 

Analysis using LME

Code
#LME
lme.obj1=lme(res~ treatment, random =~1| midx/cidx, 
             data= Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj1)
Linear mixed-effects model fit by maximum likelihood
  Data: Ex3[!is.na(Ex3$res), ] 
       AIC      BIC    logLik
  9378.498 9407.596 -4684.249

Random effects:
 Formula: ~1 | midx
        (Intercept)
StdDev:    0.404508

 Formula: ~1 | cidx %in% midx
        (Intercept) Residual
StdDev:    1.083418 1.259769

Fixed effects:  res ~ treatment 
                Value  Std.Error   DF   t-value p-value
(Intercept) 2.7983541 0.15017647 1240 18.633772   0e+00
treatment   0.1934755 0.05055295 1240  3.827184   1e-04
 Correlation: 
          (Intr)
treatment -0.504

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-2.69833206 -0.60733714 -0.09362515  0.52748499  3.91394332 

Number of Observations: 2489
Number of Groups: 
          midx cidx %in% midx 
            11           1248 

Example 3: LM vs LME

  • LME and LM produce similar estimates for the fixed-effects coefficients
  • the standard error from LM is larger, and the p-value based on LME is smaller (0.0036 for LM vs 0.0001 for LME).
  • In this example, since the two measures from each cell are positively correlated, the variance of the differences is smaller when treating the data as paired rather than independent.
  • As a result, LME produces a smaller p-value

(Left) the scatter plot of Ca++ event integrated amplitude at baseline vs 24h after treatment for the neurons from four mice (labeled as 1, 2, 3 and 4) indicates that the baseline and after-treatment measures are positively correlated. (Right) boxplot of the baseline and after-treatment correlations of the 11 mice.

A note on “nested” random effects

  • When specifying the nested random effects, we used “random =~1| midx/cidx”.
  • This leads to random effects at two levels: the mouse level and the cells-within-mouse level.
  • This specification is important if the same cell IDs might appear in different mice.
  • When each cell has a unique ID, like the “cidx” variable in Example 3, it does not matter, and “random =list(midx=~1, cidx=~1)” leads to exactly the same model.

A note on “nested” random effects

Verify that the cell IDs are indeed unique

Code
### to verify that the cell IDs are indeed unique
length(unique(Ex3$cidx))
[1] 1248
Code
#lme.obj2 is the same as lme.obj1
lme.obj2=lme(res~ treatment, random =list(midx=~1, cidx=~1),
             data=Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj2)
Linear mixed-effects model fit by maximum likelihood
  Data: Ex3[!is.na(Ex3$res), ] 
       AIC      BIC    logLik
  9378.498 9407.596 -4684.249

Random effects:
 Formula: ~1 | midx
        (Intercept)
StdDev:    0.404508

 Formula: ~1 | cidx %in% midx
        (Intercept) Residual
StdDev:    1.083418 1.259769

Fixed effects:  res ~ treatment 
                Value  Std.Error   DF   t-value p-value
(Intercept) 2.7983541 0.15017647 1240 18.633772   0e+00
treatment   0.1934755 0.05055295 1240  3.827184   1e-04
 Correlation: 
          (Intr)
treatment -0.504

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-2.69833206 -0.60733714 -0.09362515  0.52748499  3.91394332 

Number of Observations: 2489
Number of Groups: 
          midx cidx %in% midx 
            11           1248 

Advanced Topic: More Random Effects

  • The above LME model only involves random intercepts.
  • There might be random effects due to multiple sources.
  • A model with more random-effects might be a better choice.
  • Visualization is a useful exploratory tool to help identify an appropriate model.
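
For instance, a “spaghetti” plot like the one described on the next slide can be drawn with ggplot2 (a sketch, assuming the Ex3 columns res, treatment, cidx, and midx used above, with treatment coded 1 for baseline and 2 for 24h):

Code
#one line per neuron connecting its baseline and post-treatment values,
#one panel per mouse
ggplot(Ex3[!is.na(Ex3$res),],
       aes(x=factor(treatment, labels=c("baseline","24h")), y=res, group=cidx)) +
  geom_line(alpha=0.2) + facet_wrap(~midx) +
  labs(x="", y="Ca++ event integrated amplitude")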

Advanced Topic: More Random Effects

Ca++ event integrated amplitudes at baseline vs 24h after treatment for the neurons from four mice (labeled as A, B, C and D) with each dot representing a neuron. The four plots on the left are “spaghetti” plots of the four animals with each line representing the values at baseline and 24h after treatment for a neuron; the four plots on the right report the before-after scatter plots (with fitted straight lines using a simple linear regression).

Advanced Topic: More Random Effects

Code
#refit the random-intercept model for comparison
lme.obj1=lme(res~ treatment, random =~1| midx/cidx, data= Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj1)
#mouse: random intercepts; neuron: both random intercepts and random slopes
#(independence not assumed)
lme.obj3=lme(res~ treatment, random=list(midx=~1, cidx=~treatment), data=
Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj3)
#mouse: random intercepts and random slopes (independence not assumed); neuron: random intercepts
lme.obj4=lme(res~ treatment, random=list(midx=~treatment, cidx=~1), data= Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj4)
#mouse: random intercepts and random slopes; neuron: random intercepts and random slopes
lme.obj5=lme(res~ treatment, random= ~ 1+treatment | midx/cidx, data= Ex3[!is.na(Ex3$res),], method="ML")
summary(lme.obj5)
#compare the nested models with likelihood ratio tests
anova(lme.obj1, lme.obj3)
anova(lme.obj1, lme.obj4)
anova(lme.obj3, lme.obj5)

On models with more random effects

  • For Example 3, the model I chose has the following random-effects:

“random=list(midx=~1, cidx=~treatment)”

  • It improves on the random-intercept model substantially.

  • Adding more random-effects does not lead to further improvement.

GLMM: Example 4

Generalized Linear Mixed-Effects Model (GLMM)

  • The components of a GLMM:
    • fixed-effects: a linear predictor \(X\beta\)
    • random-effects: \(Z\mathbf u\), where \(\mathbf u\sim N(0, G)\). E.g., \(G=\sigma_b^2 \mathbf I\).
    • a link function to connect \(E(Y|X, \mathbf u)\) and \(X\beta\): \[g(E(Y|X, \mathbf u))=X\beta + Z\mathbf u\]
    • a distribution for \(Y\)

GLMM Examples: a Simulated Data Set

  • The simulation used parameters estimated from real data
  • Eight mice were trained to do a task
  • The behavior outcome is whether the animals make the correct predictions
    • 512 trials in total: 216 correct trials, 296 wrong trials
  • Mean neuronal activity levels (dF/F) were recorded for each trial
  • We would like to model behaviors using neuronal data (decoding)

Read Data

Code
library(pbkrtest)
waterlick=read.table("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/waterlick_sim.txt", head=T)
#summary(waterlick)
#change the mouseID to a factor
waterlick$mouseID=as.factor(waterlick$mouseID)
waterlick$lick=as.factor(waterlick$lick)

Visualize the data

Code
ggplot(waterlick, aes(x=mouseID, y=dff, color=lick)) + geom_boxplot()

Use lme4::glmer to fit a GLMM

Code
obj.glmm=glmer(lick~dff+(1|mouseID),
data=waterlick,family="binomial")
summary(obj.glmm)$coef
               Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -0.63382066 0.17752804 -3.570257 0.0003566318
dff          0.06234582 0.01985968  3.139316 0.0016934268

Compute odds ratio and a 95% CI

Code
exp(c(0.06235, 0.06235-1.96*0.01986, 0.06235+1.96*0.01986))
[1] 1.064335 1.023701 1.106582

Compute the percent increase in odds and a 95% CI

Code
100*(exp(c(0.06235, 0.06235-1.96*0.01986, 0.06235+1.96*0.01986))-1)
[1]  6.433480  2.370091 10.658157
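
The same quantities can be computed without hard-coding the numbers, by pulling the estimate and standard error from the fitted model (a sketch using obj.glmm from above):

Code
est=summary(obj.glmm)$coef["dff","Estimate"]
se=summary(obj.glmm)$coef["dff","Std. Error"]
exp(est+c(0,-1.96,1.96)*se) #odds ratio and 95% CI
100*(exp(est+c(0,-1.96,1.96)*se)-1) #percent change in odds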

Interpret GLMM results

  • The estimated increase in the odds is 6.4%, with a 95% confidence interval of 2.4% to 10.7%
  • The interpretation of the fixed effects for GLMM is complicated by both
    • the random effects and
    • non-linear link functions
  • Among typical mice, the odds of making correct licks increased by 6.4% (95% C.I.: 2.4%-10.7%) with one unit increase in dF/F.

Likelihood Ratio Test (LRT)

  • A likelihood ratio test can be done by comparing the model with and the model without the “dff” variable (neuronal activity). A large-sample approximation is used.
Code
#fit a smaller model, the model with the dff variable removed
obj.glmm.smaller=glmer(lick~(1|mouseID),
data=waterlick,family="binomial")
#use the anova function to compare the likelihoods of the two models
anova(obj.glmm, obj.glmm.smaller)
#alternatively, one can use the “drop1” function to test the effect of dff
drop1(obj.glmm, test="Chisq")

Resampling Methods for More Accurate Results

  • The large-sample approximations in GLMM might not be accurate
  • We show how to conduct a parametric bootstrap test
Code
#The code might take a few minutes
PBmodcomp(obj.glmm, obj.glmm.smaller)
  • By default, 1000 samples were generated to obtain an empirical null distribution of the likelihood ratio statistic

Convergence Issues in GLMM

  • GLMMs have convergence problems more often than LMEs. Possible remedies (see the sketch below):
    • Increase the number of iterations
    • Switch to a different numerical optimization method
    • Simplify the model, e.g., by eliminating some random effects

https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html

https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html

https://m-clark.github.io/posts/2020-03-16-convergence/
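
For instance, with lme4::glmer one can raise the iteration limit and switch to the bobyqa optimizer (a sketch using obj.glmm from Example 4; the control values are illustrative):

Code
obj.glmm.b=glmer(lick~dff+(1|mouseID), data=waterlick, family="binomial",
                 control=glmerControl(optimizer="bobyqa",
                                      optCtrl=list(maxfun=2e5)))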

Convergence Issues in GLMM

  • Consider more robust methods such as generalized estimating equations (GEE), as sketched below
  • Oftentimes, Bayesian approaches have fewer convergence issues
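
A GEE sketch for Example 4 (an assumption: this uses the geepack package, which is not loaded elsewhere in this document, and the 0/1 recoding of lick is for illustration):

Code
library(geepack)
#geeglm expects a numeric 0/1 response; recode the factor
waterlick$lick01=as.numeric(waterlick$lick)-1
gee.obj=geeglm(lick01~dff, id=mouseID, data=waterlick,
               family=binomial, corstr="exchangeable")
summary(gee.obj)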

Conclusion

Code
# two figs side by side
knitr::include_graphics(c("images/Fig1S.jpg",
                   "images/Fig2S.jpg"))

  • Observational units might not be independent
  • Ignoring the dependence in observations might lead to irreproducible results
  • Mixed-effects models are a useful tool for analyzing clustered data

Additional Visualizations for Example 1

Boxplots by R base graphics

Code
#Use base graphics 
mycolors=rep(1:5, c(7,6,3,3,5)) #different colors for different treatment groups

#a basic plot of boxplots by mice
#Mice in the same treatment groups use the same color
par(mfrow=c(2,1))
boxplot(res~midx, data=Ex1, col=mycolors, xaxt="n")
axis(1, at = 1+c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))

#boxplot with jitter
boxplot(res~midx, data=Ex1, col=0, xaxt="n")
axis(1, at = 1+c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))
stripchart(res ~ midx, vertical = TRUE, data = Ex1, 
           method = "jitter", add = TRUE, pch = 20, col = mycolors)

Violin plots generated by the vioplot package

Code
par(mfrow=c(2,1)) #split the plot window into two vertically-stacked panels
vioplot(res~midx, data=Ex1, col=mycolors, xaxt = "n")
axis(1, at = 1+c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))

#violin plot with jitters
vioplot(res~midx, data=Ex1, col=0, xaxt="n")
stripchart(res ~ midx, vertical = TRUE, data = Ex1, 
           method = "jitter", add = TRUE, pch = 20, col = mycolors)
axis(1, at = 1+c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))

Fancy plots generated by ggplot2 package

Code
plot1=ggplot(Ex1, aes(x = midx, y = res, fill=trt_idx)) + 
  geom_violin()
#boxplot within violin plot
plot2=ggplot(Ex1, aes(x = midx, y = res, fill=trt_idx)) + 
  geom_violin()+
  geom_boxplot(width=0.1)
grid.arrange(plot1, plot2, ncol=1, nrow=2) #grid.arrange is from the gridExtra package