knitr::opts_chunk$set(echo = TRUE)
library(dplyr)    # convenient data manipulation
library(ICC)      # ICC analysis via its function ICCbare
library(nlme)     # linear mixed-effects models: lme
library(lme4)     # mixed-effects models: lmer, glmer
library(lmerTest) # adds p-values to lmer output
library(ggplot2)
library(gridExtra)
library(vioplot)  # note: you will receive an error message if the package is not installed
Learning Objectives
Motivating examples
Why do the “familiar” approaches fail?
LM, LME, GLM, and GLMM
LME examples: Examples 1 - 3
Generalized Linear Mixed-Effects Model (GLMM): Example 4
Which time points are different from the baseline?
pCREB stain intensities measured from 24 mice at 5 conditions
Read data
Code
Ex1 = read.csv("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/Example1.txt", head = T)
names(Ex1) = c("res", "trt_idx", "midx")
# Do not forget to factor the treatment IDs and animal IDs.
# This is particularly important for trt_idx; otherwise its values
# will be treated as numerical values rather than levels.
Ex1$trt_idx = factor(Ex1$trt_idx, labels = c("baseline", "24h", "48h", "72h", "1week"))
Ex1$midx = factor(Ex1$midx)
# head(Ex1)
Is the familiar approach valid? We evaluate the method using data generated under the null hypothesis
We can generate a null data set by permuting the treatment group labels of the animals
We generate 1000 null data sets and check how many times the familiar approach will reject the null hypothesis of no group difference
Example 1: the “Familiar” approach failed
We evaluated the familiar approaches (lm, anova) using 1000 data sets simulated under the null hypothesis of no treatment effects.
To simulate a data set under the null hypothesis, we shuffled the treatment labels of the animals in the original data.
The overall p-value of treatment effects was calculated using each simulated data set.
If the ANOVA approach worked, the proportion of p-values less than 0.05 should be around 0.05.
Code
treatment = as.factor(rep(1:5, c(7, 6, 3, 3, 5)))
ncell = sapply(split(Ex1, Ex1$midx), dim)[1, ] # number of cells per mouse
# generate 1000 pseudo (permuted) data sets by randomly
# shuffling the treatment labels across mice
pvalues = rep(NA, 1000) # initialize a vector of p-values
for (i in 1:1000) {
  Ex1.perm = data.frame(res = Ex1$res,
                        trt_idx = rep(sample(treatment), ncell),
                        midx = Ex1$midx)
  pvalues[i] = anova(lm(res ~ trt_idx, data = Ex1.perm))$"Pr(>F)"[1]
}
Example 1: P-values using 1000 null data sets
What does the histogram suggest?
Code
hist(pvalues)
Example 2
Research question: determine how in vivo calcium (Ca++) activity of PV cells (measured longitudinally by the genetically encoded Ca++ indicator GCaMP6s) changes over time after ketamine treatment
Study: Ca++ event frequencies were measured at 24h, 48h, 72h, and 1 week after ketamine treatment in four mice
Want to compare Ca++ event frequency at 24h to the other three time points.
In total, Ca++ event frequencies of 1,724 neurons were measured.
Read data:
Code
Ex2 = read.csv("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/Example2.txt", head = T)
names(Ex2) = c("res", "trt_idx", "midx")
Ex2$trt_idx = factor(Ex2$trt_idx, labels = c("24h", "48h", "72h", "1week"))
### convert the variable of mouse IDs to a factor
Ex2$midx = factor(Ex2$midx)
The LM (including ANOVA, t-test) analysis results indicate
significantly reduced Ca++ activity at 48h relative to 24h with \(p=4.8\times 10^{-6}\)
significantly increased Ca++ activity at 1 week compared to 24h with \(p=2.4\times 10^{-3}\)
However, the figure on the next slide does not support this conclusion
Example 2: Pooling data naively might be misleading
Code
knitr::include_graphics("images/Fig2S.jpg")
The boxplots of Ca++ event frequencies measured at four time points. (A) Boxplot of Ca++ event frequencies using the pooled neurons from four mice. (B) Boxplots of Ca++ event frequencies stratified by individual mice.
Example 2: Pooling data naively might be misleading
Consider the change in Ca++ activities from 24h to 48h
Pooled data from all mice:
The box plots suggest a reduction in Ca++ activities
Individual mice data:
The box plots of Mouse 2 suggest a noticeable reduction
However, there was almost no change in Mouse 1
Mouse 3 and Mouse 4 might suggest small increases, rather than decreases
Example 2: Pooling data naively might be misleading
Why do the pooled data follow the pattern of Mouse 2?
Code
# the number of cells in each animal-time combination
table(Ex2$midx, Ex2$trt_idx)
# compute the percent of cells contributed by each mouse
rowSums(table(Ex2$midx, Ex2$trt_idx)) / 1724
1 2 3 4
0.2703016 0.4286543 0.1792343 0.1218097
Mouse 2 contributed 43% of the cells!
Clustered Data Are Dependent
Why does LM fail for Example 1?
This is because the observations are not independent
We can compute Intra-Class Correlation (ICC) to quantify the magnitude of clustering due to animal effects.
ICC Analysis of Example 1
Code
### conduct ICC analysis by organizing all the information into a data frame
icc.analysis = data.frame(n = rep(0, 5), icc = rep(0, 5), design.effect = rep(0, 5),
                          effective.n = rep(0, 5), M = rep(0, 5), cells = rep(0, 5))
treatment.groups = c("baseline", "24h", "48h", "72h", "1week")
for (i in 1:5) {
  trt = Ex1[Ex1$trt_idx == treatment.groups[i], ]
  trt$midx = factor(trt$midx)
  icc = ICCbare(factor(trt$midx), trt$res) # ICCbare is a function in the ICC package
  icc.analysis$cells[i] = dim(trt)[1]
  M = dim(trt)[1] / length(unique(trt$midx)) # average number of cells per mouse
  def = 1 + icc * (M - 1)                    # design effect
  icc.analysis$n[i] = length(unique(trt$midx))
  icc.analysis$icc[i] = icc
  icc.analysis$design.effect[i] = def
  icc.analysis$effective.n[i] = dim(trt)[1] / def # effective sample size
  icc.analysis$M[i] = M
}
The ICC indicates that the dependency due to clustering is substantial.
Therefore, the 1,200 neurons should not be treated as 1,200 independent cells.
When dependence is not adequately accounted for, the type I error rate can be much higher than the pre-chosen level of significance.
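The code above uses the design effect to translate the ICC into an effective sample size: with an average of \(M\) cells per mouse, \[\text{DEFF} = 1 + \text{ICC}\times(M-1), \qquad n_{\text{effective}} = \frac{n}{\text{DEFF}}.\] As an illustrative (not observed) example, with ICC \(=0.3\) and \(M=50\) cells per mouse, \(\text{DEFF} = 1 + 0.3\times 49 = 15.7\), so 1,200 cells would carry roughly the information of \(1200/15.7 \approx 76\) independent observations.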
From LM to LME
LM (incorrect!) for Example 1
Consider Example 1. Let
\(Y_{ij}\) indicate the \(j\)th observed response of the \(i\)th mouse.
\(x_{ij}\) be the treatment label, with \(x_{ij}=1\) for baseline, \(x_{ij}=2\) for 24 hours, \(x_{ij}=3\) for 48 hours, \(x_{ij}=4\) for 72 hours, and \(x_{ij}=5\) for 1 week after ketamine treatments.
Internally, four dummy variables, each taking value 0 or 1, are generated: \(x_{ij,1} = 1\) for 24 hours, \(x_{ij,2} = 1\) for 48 hours, \(x_{ij,3} = 1\) for 72 hours, and \(x_{ij,4} = 1\) for 1 week after ketamine treatment, respectively.
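To see these dummy variables explicitly, one can inspect the design matrix that R builds; a quick check, assuming the Ex1 data read above:
# the first column is the intercept; the other four are the 0/1 dummy variables
head(model.matrix(~ trt_idx, data = Ex1))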
\[\begin{aligned}
Y_{ij} &= \beta_0 + x_{ij,1}\beta_1 + … + x_{ij,4}\beta_4 + \epsilon_{ij}, \\
& i=1, …, 24; j=1, …, n_i;\end{aligned}\] where \(n_i\) is the number of observations from the \(i\)th mouse.
The 1,200 observations are clustered by animal. We account for the resulting dependence by adding an animal-specific effect, as follows: \[\begin{aligned}
Y_{ij} &= \beta_0 + x_{ij,1}\beta_1 + … + x_{ij,4}\beta_4 + u_i + \epsilon_{ij}, \\
& i=1, …, 24; j=1, …, n_i;
\end{aligned}\] where
\(u_i\) denotes the deviation of the mean specific to the \(i\)th mouse from the overall intercept \(\beta_0\)
\(\epsilon_{ij}\) represents the deviation in pCREB immunoreactivity of observation (cell) \(j\) in mouse \(i\) from the mean pCREB immunoreactivity of mouse \(i\)
(\(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4\)) are assumed to be fixed but unknown
(\(u_1, \cdots, u_{24}\)) are treated as independent and identically distributed variables from a normal distribution with mean 0 and a variance parameter that reflects the variation across animals.
LME for Example 1
As with the treatment variable, users do not need to define dummy variables for the animal IDs; R generates them automatically.
Thus, equivalently, one could write the previous equation using a vector (\(z_{ij,1}, …, z_{ij,24}\)) of dummy variables for the cluster/animal memberships, with \(z_{ij,k}=1\) for \(i=k\) and 0 otherwise: \[Y_{ij} = \beta_0 + \sum_{k=1}^{4} x_{ij,k}\beta_k + \sum_{k=1}^{24} z_{ij,k}u_k + \epsilon_{ij}.\] The model then has three components:
the fixed effects from the covariates (\(x_{ij,1}, …, x_{ij,4}\)) and the overall intercept \(\beta_0\), which is the population mean of the reference group in this example
the random-effects due to the clustering (\(z_{ij,1}, …, z_{ij,24}\))
the random errors \(\epsilon_{ij}\)’s
R Packages for LME
Two major packages are ‘nlme’ and ‘lme4’.
Syntax:
‘nlme::lme(res~trt_idx, data= Ex1, random = ~ 1|midx)’
‘lme4::lmer(res ~ trt_idx+(1|midx), data=Ex1)’
For the fixed effects (trt_idx here), make sure it is a factor, not numeric, as the levels 1-5 denote different time points
For the random effects from "midx" (mice), R treats it as a factor with different levels (animals)
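A minimal sketch fitting the same random-intercept model with both packages, using the Ex1 data from above; the fixed-effect estimates should agree:
obj.nlme = nlme::lme(res ~ trt_idx, data = Ex1, random = ~ 1 | midx)
obj.lme4 = lme4::lmer(res ~ trt_idx + (1 | midx), data = Ex1)
summary(obj.nlme)$tTable # fixed effects with t-tests (nlme)
fixef(obj.lme4)          # fixed-effect estimates (lme4)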
LM, LME, GLM, and GLMM
LM and LME: Matrix Format
LM:
a linear predictor \(X\beta\)
random errors \(\mathbf \epsilon\) are independent, have a zero mean and a constant variance.
\(\mathbf \epsilon \sim N(0, \sigma^2 \mathbf I)\) is used for deriving t- and F-tests. Typically this assumption is not very critical as long as the sample size is not too small
GLM and GLMM additionally specify:
a link function to connect \(E(Y|X, \mathbf u)\) and the linear predictor
a distribution for \(Y\) given \(E(Y|X, \mathbf u)\)
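In matrix notation, the two linear models restate the components above compactly: \[\text{LM: } Y = X\beta + \mathbf\epsilon, \qquad \text{LME: } Y = X\beta + Z\mathbf u + \mathbf\epsilon,\] where \(Z\) collects the cluster-membership dummy variables and \(\mathbf u\) the random effects.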
LME Examples: Example 1
Code
# The nlme::lme function specifies the fixed effects in the formula
# (first argument) of the function, and the random effects
# as an optional argument (random=). The vertical bar | denotes that
# the clustering is done through the animal ID (midx).
obj.lme = lme(res ~ trt_idx, data = Ex1, random = ~ 1 | midx, method = "ML")
summary(obj.lme)$tTable
#### wrong analysis of Example 3: using the linear model (ignoring clustering)
summary(lm(res ~ treatment, data = Ex3[!is.na(Ex3$res), ])) # p = 0.0036
Call:
lm(formula = res ~ treatment, data = Ex3[!is.na(Ex3$res), ])
Residuals:
Min 1Q Median 3Q Max
-3.1311 -1.3203 -0.1806 1.1438 6.7518
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.73206 0.10817 25.258 <2e-16 ***
treatment 0.19952 0.06847 2.914 0.0036 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.708 on 2487 degrees of freedom
Multiple R-squared: 0.003403, Adjusted R-squared: 0.003002
F-statistic: 8.492 on 1 and 2487 DF, p-value: 0.0036
Code
#### wrong analyses using t tests; the pooled, Welch, and paired calls below
#### match the three outputs that follow
t.test(Ex3[Ex3$treatment == 1, "res"], Ex3[Ex3$treatment == 2, "res"], var.eq = T)
t.test(Ex3[Ex3$treatment == 1, "res"], Ex3[Ex3$treatment == 2, "res"])
t.test(Ex3[Ex3$treatment == 1, "res"], Ex3[Ex3$treatment == 2, "res"], paired = T)
Two Sample t-test
data: Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -2.914, df = 2487, p-value = 0.0036
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.33378231 -0.06525884
sample estimates:
mean of x mean of y
2.931585 3.131106
Welch Two Sample t-test
data: Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -2.9121, df = 2347.6, p-value = 0.003624
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.33387620 -0.06516496
sample estimates:
mean of x mean of y
2.931585 3.131106
Paired t-test
data: Ex3[Ex3$treatment == 1, "res"] and Ex3[Ex3$treatment == 2, "res"]
t = -3.7067, df = 1240, p-value = 0.0002193
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-0.28671163 -0.08824982
sample estimates:
mean difference
-0.1874807
Analysis using LME
Code
# LME with random intercepts for mice (midx) and cells nested within mice (cidx)
lme.obj1 = lme(res ~ treatment, random = ~ 1 | midx/cidx,
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj1)
Linear mixed-effects model fit by maximum likelihood
Data: Ex3[!is.na(Ex3$res), ]
AIC BIC logLik
9378.498 9407.596 -4684.249
Random effects:
Formula: ~1 | midx
(Intercept)
StdDev: 0.404508
Formula: ~1 | cidx %in% midx
(Intercept) Residual
StdDev: 1.083418 1.259769
Fixed effects: res ~ treatment
Value Std.Error DF t-value p-value
(Intercept) 2.7983541 0.15017647 1240 18.633772 0e+00
treatment 0.1934755 0.05055295 1240 3.827184 1e-04
Correlation:
(Intr)
treatment -0.504
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.69833206 -0.60733714 -0.09362515 0.52748499 3.91394332
Number of Observations: 2489
Number of Groups:
midx cidx %in% midx
11 1248
Example 3: LM vs LME
LME and LM produce similar estimates for the fixed-effects coefficients
however, the standard error from LM is larger, and the p-value based on LME is smaller (0.0036 for LM vs 0.0001 for LME)
In this example, since the two measures from each cell are positively correlated, the variance of the differences is smaller when treating the data as paired rather than independent.
As a result, LME produces a smaller p-value
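A quick side-by-side comparison of the two fits; a sketch, assuming the lm fit and lme.obj1 from the code above:
# fixed-effect estimate, standard error, and p-value under each model
lm.obj = lm(res ~ treatment, data = Ex3[!is.na(Ex3$res), ])
rbind(LM  = coef(summary(lm.obj))["treatment", c("Estimate", "Std. Error", "Pr(>|t|)")],
      LME = summary(lme.obj1)$tTable["treatment", c("Value", "Std.Error", "p-value")])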
(Left) the scatter plot of Ca++ event integrated amplitude at baseline vs 24h after treatment for the neurons from four mice (labeled as 1, 2, 3 and 4) indicates that the baseline and after-treatment measures are positively correlated. (Right) boxplot of the baseline and after-treatment correlations of the 11 mice.
A note on “nested” random effects
When specifying the nested random effects, we used "random = ~ 1 | midx/cidx".
This leads to random effects at two levels: the mouse level and the cells-within-mouse level.
This specification is important if the same cell IDs might appear in different mice.
When each cell has its own unique ID, as with the "cidx" variable in Example 3, it does not matter, and "random = list(midx = ~1, cidx = ~1)" leads to exactly the same model.
A note on “nested” random effects
Verify that the cell IDs are indeed unique
Code
### to verify that the cell IDs are indeed unique
length(unique(Ex3$cidx))
[1] 1248
Code
# lme.obj2 is the same model as lme.obj1
lme.obj2 = lme(res ~ treatment, random = list(midx = ~1, cidx = ~1),
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj2)
Linear mixed-effects model fit by maximum likelihood
Data: Ex3[!is.na(Ex3$res), ]
AIC BIC logLik
9378.498 9407.596 -4684.249
Random effects:
Formula: ~1 | midx
(Intercept)
StdDev: 0.404508
Formula: ~1 | cidx %in% midx
(Intercept) Residual
StdDev: 1.083418 1.259769
Fixed effects: res ~ treatment
Value Std.Error DF t-value p-value
(Intercept) 2.7983541 0.15017647 1240 18.633772 0e+00
treatment 0.1934755 0.05055295 1240 3.827184 1e-04
Correlation:
(Intr)
treatment -0.504
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.69833206 -0.60733714 -0.09362515 0.52748499 3.91394332
Number of Observations: 2489
Number of Groups:
midx cidx %in% midx
11 1248
Advanced Topic: More Random Effects
The above LME model only involves random intercepts.
There might be random effects due to multiple sources.
A model with more random-effects might be a better choice.
Visualization is a useful exploratory tool to help identify an appropriate model.
Advanced Topic: More Random Effects
Ca++ event integrated amplitudes at baseline vs 24h after treatment for the neurons from four mice (labeled as A, B, C and D) with each dot representing a neuron. The four plots on the left are “spaghetti” plots of the four animals with each line representing the values at baseline and 24h after treatment for a neuron; the four plots on the right report the before-after scatter plots (with fitted straight lines using a simple linear regression).
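A sketch of how such spaghetti plots can be drawn with ggplot2, assuming Ex3 with treatment coded 1 = baseline and 2 = 24h as in the t-test code above:
# one line per neuron (cidx), one panel per mouse (midx)
ggplot(Ex3[!is.na(Ex3$res), ],
       aes(x = factor(treatment, labels = c("baseline", "24h")),
           y = res, group = cidx)) +
  geom_line(alpha = 0.2) +
  facet_wrap(~ midx) +
  labs(x = "", y = "Ca++ event integrated amplitude")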
Advanced Topic: More Random Effects
Tests can be used to compare models with different random effects.
# mouse: random intercepts; neuron: random intercepts
lme.obj1 = lme(res ~ treatment, random = ~ 1 | midx/cidx,
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj1)
# mouse: random intercepts; neuron: both random intercepts and random slopes
# (independence not assumed)
lme.obj3 = lme(res ~ treatment, random = list(midx = ~1, cidx = ~treatment),
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj3)
anova(lme.obj1, lme.obj3)
# mouse: random intercepts and random slopes (independence not assumed); neuron: random intercepts
lme.obj4 = lme(res ~ treatment, random = list(midx = ~treatment, cidx = ~1),
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj4)
# mouse: random intercepts and random slopes; neuron: random intercepts and random slopes
lme.obj5 = lme(res ~ treatment, random = ~ 1 + treatment | midx/cidx,
               data = Ex3[!is.na(Ex3$res), ], method = "ML")
summary(lme.obj5)
anova(lme.obj1, lme.obj4)
anova(lme.obj3, lme.obj5)
On models with more random effects
For Example 3, the model I chose has the following random effects:
"random = list(midx = ~1, cidx = ~treatment)"
It improves the random intercept model substantially.
Adding more random-effects does not lead to further improvement
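One compact way to see this is to compare the models on AIC; a sketch, assuming the fits lme.obj1, lme.obj3, lme.obj4, and lme.obj5 from the previous slide:
# smaller AIC is better; lme.obj3 should be competitive with the richer models
sapply(list(intercepts.only = lme.obj1, chosen = lme.obj3,
            mouse.slopes = lme.obj4, slopes.both = lme.obj5), AIC)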
A GLMM combines both extensions:
a link function to connect \(E(Y|X, \mathbf u)\) and \(X\beta\): \[g(E(Y|X, \mathbf u))=X\beta + Z\mathbf u\]
a distribution for \(Y\) given \(E(Y|X, \mathbf u)\)
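For a binary outcome, such as the correct/wrong licks in the example below, the canonical link is the logit: \[g(p) = \log\frac{p}{1-p}, \qquad \log\frac{P(Y=1|X,\mathbf u)}{P(Y=0|X,\mathbf u)} = X\beta + Z\mathbf u,\] so each exponentiated fixed-effect coefficient is an odds ratio.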
GLMM Examples: a Simulated Data Set
The simulation used parameters estimated from real data
Eight mice were trained to do a task
The behavior outcome is whether the animals make the correct predictions
512 trials in total: 216 correct trials, 296 wrong trials
Mean neuronal activity levels (dF/F) were recorded for each trial
We would like to model behaviors using neuronal data (decoding)
Read Data
Code
library(pbkrtest)
waterlick = read.table("https://www.ics.uci.edu/~zhaoxia/Data/BeyondTandANOVA/waterlick_sim.txt", head = T)
# summary(waterlick)
# change the mouse IDs to a factor
waterlick$mouseID = as.factor(waterlick$mouseID)
waterlick$lick = as.factor(waterlick$lick)
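The full model obj.glmm referenced below is not shown in this excerpt; a minimal sketch of the fit, assuming the neuronal activity column is named dff (as in the drop1 comment later):
# logistic GLMM: behavior outcome modeled from neuronal activity,
# with a random intercept per mouse (dff is an assumed column name)
obj.glmm = glmer(lick ~ dff + (1 | mouseID), data = waterlick, family = "binomial")
summary(obj.glmm)
exp(fixef(obj.glmm)["dff"]) - 1                      # percent change in odds per unit dF/F
exp(confint(obj.glmm, method = "Wald")["dff", ]) - 1 # Wald CI on the same scale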
The estimated effect on the odds is a 6.4% increase, with a 95% confidence interval of 2.3% to 10.7%
The interpretation of the fixed effects for GLMM is complicated by both
the random effects and
non-linear link functions
Among typical mice, the odds of making correct licks increased by 6.4% (95% C.I.: 2.4%-10.7%) with one unit increase in dF/F.
Likelihood Ratio Test (LRT)
A likelihood ratio test can be done by comparing the model with and the model without the "dff" variable (neuronal activity). A large-sample approximation is used.
Code
# fit a smaller model: the model with the dff variable removed
obj.glmm.smaller = glmer(lick ~ (1 | mouseID), data = waterlick, family = "binomial")
# use the anova function to compare the likelihoods of the two models
anova(obj.glmm, obj.glmm.smaller)
# alternatively, one can use the drop1 function to test the effect of dff
drop1(obj.glmm, test = "Chisq")
Resampling Methods for More Accurate Results
The large-sample approximations in GLMM might not be accurate
We show how to conduct a parametric bootstrap test
Code
# The code might take a few minutes
PBmodcomp(obj.glmm, obj.glmm.smaller)
By default, 1000 samples were generated to obtain an empirical null distribution of the likelihood ratio statistic
Convergence Issues in GLMM
GLMM fitting runs into convergence problems more often than LME. Common remedies (see the sketch after this list):
Increase the number of iterations
Switch to a different numerical optimization method
Simplify the model, e.g., by eliminating some random effects
Consider more robust alternatives such as generalized estimating equations (GEE)
Oftentimes, Bayesian approaches have fewer convergence problems
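A minimal sketch of the first two remedies in lme4; the control settings are illustrative, not from the original:
# increase the iteration budget and switch the optimizer via glmerControl
obj.glmm.b = glmer(lick ~ dff + (1 | mouseID), data = waterlick, family = "binomial",
                   control = glmerControl(optimizer = "bobyqa",
                                          optCtrl = list(maxfun = 1e5)))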
Conclusion
Conclusion
Code
# two figures side by side
knitr::include_graphics(c("images/Fig1S.jpg", "images/Fig2S.jpg"))
Observational units might not be independent
Ignoring the dependence in observations might lead to irreproducible results
Mixed-effects models are a useful tool for analyzing clustered data
Additional Visualizations for Example 1
Boxplots by R base graphics
Code
# Use base graphics
mycolors = rep(1:5, c(7, 6, 3, 3, 5)) # different colors for different treatment groups
# a basic plot of boxplots by mouse;
# mice in the same treatment group share the same color
par(mfrow = c(2, 1))
boxplot(res ~ midx, data = Ex1, col = mycolors, xaxt = "n")
axis(1, at = 1 + c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))
# boxplot with jitter
boxplot(res ~ midx, data = Ex1, col = 0, xaxt = "n")
axis(1, at = 1 + c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))
stripchart(res ~ midx, vertical = TRUE, data = Ex1,
           method = "jitter", add = TRUE, pch = 20, col = mycolors)
Violin plots generated by the vioplot package
Code
par(mfrow = c(2, 1)) # split the plot window into two vertically stacked panels
vioplot(res ~ midx, data = Ex1, col = mycolors, xaxt = "n")
axis(1, at = 1 + c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))
# violin plot with jitter
vioplot(res ~ midx, data = Ex1, col = 0, xaxt = "n")
stripchart(res ~ midx, vertical = TRUE, data = Ex1,
           method = "jitter", add = TRUE, pch = 20, col = mycolors)
axis(1, at = 1 + c(1, 8, 14, 17, 20),
     labels = c("baseline", "24h", "48h", "72h", "1wk"))
Fancy plots generated by ggplot2 package
Code
plot1 = ggplot(Ex1, aes(x = midx, y = res, fill = trt_idx)) + geom_violin()
# boxplot within violin plot
plot2 = ggplot(Ex1, aes(x = midx, y = res, fill = trt_idx)) +
  geom_violin() + geom_boxplot(width = 0.1)
grid.arrange(plot1, plot2, ncol = 1, nrow = 2) # grid.arrange is from gridExtra