Mixed Effects Models

Import the politesness data from a study that compares pitch with scenario and formality.

Structure of Study:

politeness<-read.csv("/Users/heatherkitada/Desktop/MATH138/politeness.csv", 
                     header=TRUE)
head(politeness)
##   X subject gender scenario attitude frequency
## 1 1      F1      F        1      pol     213.3
## 2 2      F1      F        1      inf     204.5
## 3 3      F1      F        2      pol     285.1
## 4 4      F1      F        2      inf     259.7
## 5 5      F1      F        3      pol     203.9
## 6 6      F1      F        3      inf     286.9
# how many subjects are there
unique(politeness$subject)
## [1] F1 F3 M4 M7 F2 M3
## Levels: F1 F2 F3 M3 M4 M7

Looking at the data

## ── Attaching packages ──────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.0       ✔ purrr   0.3.2  
## ✔ tibble  2.1.1       ✔ dplyr   0.8.0.1
## ✔ tidyr   0.8.3       ✔ stringr 1.4.0  
## ✔ readr   1.3.1       ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

There clearly is variablity across subjects and an overall pattern for gender. Male subject speak at lower frequencies.

# comparing across scenarios
ggplot(data=politeness, aes(as.factor(scenario), frequency))+
  geom_boxplot()
## Warning: Removed 1 rows containing non-finite values (stat_boxplot).

# politeness and pitch by gender
boxplot(frequency ~ attitude*gender,
        col=c("white","lightgray"),politeness)

There is also variability across scarios; however, it is harder to seperate.

Fitting a Mixed Model

library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
## 
##     expand
# attitude with a random intercept for subject
mod1<-lmer(frequency~attitude+(1|subject), data=politeness)
summary(mod1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + (1 | subject)
##    Data: politeness
## 
## REML criterion at convergence: 804.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2953 -0.6018 -0.2005  0.4774  3.1772 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept) 3982     63.10   
##  Residual              851     29.17   
## Number of obs: 83, groups:  subject, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  202.588     26.151   7.747
## attitudepol  -19.376      6.407  -3.024
## 
## Correlation of Fixed Effects:
##             (Intr)
## attitudepol -0.121
# attitude and gender with a random intercept for subject
mod2<-lmer(frequency~attitude+gender+(1|subject), data=politeness)
summary(mod2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + gender + (1 | subject)
##    Data: politeness
## 
## REML criterion at convergence: 786.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3619 -0.5305 -0.1724  0.4647  3.2260 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept) 603.9    24.57   
##  Residual             850.9    29.17   
## Number of obs: 83, groups:  subject, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  256.691     15.226  16.859
## attitudepol  -19.410      6.407  -3.030
## genderM     -108.205     21.063  -5.137
## 
## Correlation of Fixed Effects:
##             (Intr) atttdp
## attitudepol -0.210       
## genderM     -0.692  0.004
# attitude and gender with a random intercept for subject and scenario
mod3<-lmer(frequency~attitude+gender+(1|subject)+(1|scenario), data=politeness)
summary(mod3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
##    Data: politeness
## 
## REML criterion at convergence: 775.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2591 -0.6235 -0.0773  0.5389  3.4795 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  scenario (Intercept) 219.3    14.81   
##  subject  (Intercept) 615.7    24.81   
##  Residual             645.9    25.41   
## Number of obs: 83, groups:  scenario, 7; subject, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  256.846     16.116  15.937
## attitudepol  -19.721      5.584  -3.532
## genderM     -108.516     21.015  -5.164
## 
## Correlation of Fixed Effects:
##             (Intr) atttdp
## attitudepol -0.173       
## genderM     -0.652  0.004

Comparing Models with ANOVA

# reduced model
modRed<-lmer(frequency~gender+(1|subject)+(1|scenario), data=politeness, REML=FALSE)

#full model
modFull<-lmer(frequency~attitude+gender+(1|subject)+(1|scenario), data=politeness, REML=FALSE)

anova(modRed, modFull)
## Data: politeness
## Models:
## modRed: frequency ~ gender + (1 | subject) + (1 | scenario)
## modFull: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
##         Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
## modRed   5 816.72 828.81 -403.36   806.72                             
## modFull  6 807.10 821.61 -397.55   795.10 11.618      1  0.0006532 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Repeated Measures

This data comes from a sleep study in the MASS package.

Let’s first visualize the data

library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
data(sleepstudy)
head(sleepstudy)
##   Reaction Days Subject
## 1 249.5600    0     308
## 2 258.7047    1     308
## 3 250.8006    2     308
## 4 321.4398    3     308
## 5 356.8519    4     308
## 6 414.6901    5     308
ggplot(sleepstudy, aes(Days, Reaction))+
  geom_point()+
  geom_smooth(method="lm")+
  facet_wrap(~Subject)

It appears that both the slope and intercept vary across subjects

# model with random slope
mod<-lmer(Reaction~Days+(1|Subject), data = sleepstudy)
summary(mod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (1 | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1786.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2257 -0.5529  0.0109  0.5188  4.2506 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Subject  (Intercept) 1378.2   37.12   
##  Residual              960.5   30.99   
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 251.4051     9.7467   25.79
## Days         10.4673     0.8042   13.02
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.371
# model with random slope and intercept
mod1<-lmer(Reaction~Days+(Days|Subject), data = sleepstudy)
summary(mod1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4633  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 611.90   24.737       
##           Days         35.08    5.923   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  251.405      6.824  36.843
## Days          10.467      1.546   6.771
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138