Linear Mixed Model Practice
This week we practiced building linear mixed models in R. In particular, we worked with nested and crossed random effect structures.
Here is an example of crossed random effects from the \(Penicillin\) data set:
library(lme4)
## Warning: package 'lme4' was built under R version 3.6.3
## Loading required package: Matrix
dat1<-Penicillin
head(dat1)
## diameter plate sample
## 1 27 a A
## 2 23 a B
## 3 26 a C
## 4 23 a D
## 5 23 a E
## 6 21 a F
Here the response variable is the diameter of the growth of the organism. The random effects are defined by the plate and sample variables. Different plates interact with different samples, so these random effects are crossed.
Here is how to make a model for the grand mean of diameter, using crossed random effects. Note that the order of the random effects does not matter in the \(lmer()\) function for crossed effects.
gmean<-lmer(diameter~1+(1|plate)+(1|sample),data = dat1)
summary(gmean)
## Linear mixed model fit by REML ['lmerMod']
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
## Data: dat1
##
## REML criterion at convergence: 330.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.07923 -0.67140 0.06292 0.58377 2.97959
##
## Random effects:
## Groups Name Variance Std.Dev.
## plate (Intercept) 0.7169 0.8467
## sample (Intercept) 3.7311 1.9316
## Residual 0.3024 0.5499
## Number of obs: 144, groups: plate, 24; sample, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 22.9722 0.8086 28.41
Reading this output, we can see that the largest source of variability comes from the random intercept for sample. Its variance is \(3.7311\).
Here is an example of nested random effects from the \(eggs\) data set.
library(faraway)
## Warning: package 'faraway' was built under R version 3.6.3
dat2<-eggs
head(dat2)
## Fat Lab Technician Sample
## 1 0.62 I one G
## 2 0.55 I one G
## 3 0.34 I one H
## 4 0.24 I one H
## 5 0.80 I two G
## 6 0.68 I two G
Fat is the response, the measure of fat content in a given sample. Lab, technician, and sample are all variables that define the random effects. In particular, sample is nested in technician and technicians are nested in labs.
Here is a grand mean model for such a random effect structure.
gmean3<-lmer(Fat~(1|Lab/Technician/Sample),data=dat2)
summary(gmean3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Fat ~ (1 | Lab/Technician/Sample)
## Data: dat2
##
## REML criterion at convergence: -64.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.04098 -0.46576 0.00927 0.59713 1.54276
##
## Random effects:
## Groups Name Variance Std.Dev.
## Sample:(Technician:Lab) (Intercept) 0.003065 0.05536
## Technician:Lab (Intercept) 0.006980 0.08355
## Lab (Intercept) 0.005920 0.07694
## Residual 0.007196 0.08483
## Number of obs: 48, groups:
## Sample:(Technician:Lab), 24; Technician:Lab, 12; Lab, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.38750 0.04296 9.019
Alternatively, we could create a model without the sample random intercept.
gmean4<-lmer(Fat~(1|Lab/Technician),data=dat2)
summary(gmean4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Fat ~ (1 | Lab/Technician)
## Data: dat2
##
## REML criterion at convergence: -62.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.17799 -0.42424 0.08043 0.67361 1.77544
##
## Random effects:
## Groups Name Variance Std.Dev.
## Technician:Lab (Intercept) 0.008002 0.08945
## Lab (Intercept) 0.005920 0.07694
## Residual 0.009239 0.09612
## Number of obs: 48, groups: Technician:Lab, 12; Lab, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.38750 0.04296 9.019
The R-code syntax for nested random effects could have also been written as:
\[gmean4<-lmer(Fat\thicksim{}(1|Lab)+(1|Lab:Technician),data=dat2) \]
These two lines of code would have given the same model.
Note how the grand mean estimate stays the same for these two models at .3875. The variance of the lab random effect stays the same as well. The technician random intercept variance increases from 0.006980 to 0.008002 when we remove the sample random intercept. This makes sense since the sample is nested within the technician. Thus when sample is removed, its variability is now absorbed into the variability of the technicians.