Import the politesness data from a study that compares pitch with scenario and formality.
Structure of Study:
politeness<-read.csv("/Users/heatherkitada/Desktop/MATH138/politeness.csv",
header=TRUE)
head(politeness)
## X subject gender scenario attitude frequency
## 1 1 F1 F 1 pol 213.3
## 2 2 F1 F 1 inf 204.5
## 3 3 F1 F 2 pol 285.1
## 4 4 F1 F 2 inf 259.7
## 5 5 F1 F 3 pol 203.9
## 6 6 F1 F 3 inf 286.9
# how many subjects are there
unique(politeness$subject)
## [1] F1 F3 M4 M7 F2 M3
## Levels: F1 F2 F3 M3 M4 M7
## ── Attaching packages ──────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.0 ✔ purrr 0.3.2
## ✔ tibble 2.1.1 ✔ dplyr 0.8.0.1
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
There clearly is variablity across subjects and an overall pattern for gender. Male subject speak at lower frequencies.
# comparing across scenarios
ggplot(data=politeness, aes(as.factor(scenario), frequency))+
geom_boxplot()
## Warning: Removed 1 rows containing non-finite values (stat_boxplot).
# politeness and pitch by gender
boxplot(frequency ~ attitude*gender,
col=c("white","lightgray"),politeness)
There is also variability across scarios; however, it is harder to seperate.
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
##
## expand
# attitude with a random intercept for subject
mod1<-lmer(frequency~attitude+(1|subject), data=politeness)
summary(mod1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + (1 | subject)
## Data: politeness
##
## REML criterion at convergence: 804.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2953 -0.6018 -0.2005 0.4774 3.1772
##
## Random effects:
## Groups Name Variance Std.Dev.
## subject (Intercept) 3982 63.10
## Residual 851 29.17
## Number of obs: 83, groups: subject, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 202.588 26.151 7.747
## attitudepol -19.376 6.407 -3.024
##
## Correlation of Fixed Effects:
## (Intr)
## attitudepol -0.121
# attitude and gender with a random intercept for subject
mod2<-lmer(frequency~attitude+gender+(1|subject), data=politeness)
summary(mod2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + gender + (1 | subject)
## Data: politeness
##
## REML criterion at convergence: 786.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3619 -0.5305 -0.1724 0.4647 3.2260
##
## Random effects:
## Groups Name Variance Std.Dev.
## subject (Intercept) 603.9 24.57
## Residual 850.9 29.17
## Number of obs: 83, groups: subject, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 256.691 15.226 16.859
## attitudepol -19.410 6.407 -3.030
## genderM -108.205 21.063 -5.137
##
## Correlation of Fixed Effects:
## (Intr) atttdp
## attitudepol -0.210
## genderM -0.692 0.004
# attitude and gender with a random intercept for subject and scenario
mod3<-lmer(frequency~attitude+gender+(1|subject)+(1|scenario), data=politeness)
summary(mod3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
## Data: politeness
##
## REML criterion at convergence: 775.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2591 -0.6235 -0.0773 0.5389 3.4795
##
## Random effects:
## Groups Name Variance Std.Dev.
## scenario (Intercept) 219.3 14.81
## subject (Intercept) 615.7 24.81
## Residual 645.9 25.41
## Number of obs: 83, groups: scenario, 7; subject, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 256.846 16.116 15.937
## attitudepol -19.721 5.584 -3.532
## genderM -108.516 21.015 -5.164
##
## Correlation of Fixed Effects:
## (Intr) atttdp
## attitudepol -0.173
## genderM -0.652 0.004
# reduced model
modRed<-lmer(frequency~gender+(1|subject)+(1|scenario), data=politeness, REML=FALSE)
#full model
modFull<-lmer(frequency~attitude+gender+(1|subject)+(1|scenario), data=politeness, REML=FALSE)
anova(modRed, modFull)
## Data: politeness
## Models:
## modRed: frequency ~ gender + (1 | subject) + (1 | scenario)
## modFull: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## modRed 5 816.72 828.81 -403.36 806.72
## modFull 6 807.10 821.61 -397.55 795.10 11.618 1 0.0006532 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This data comes from a sleep study in the MASS package.
Let’s first visualize the data
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
data(sleepstudy)
head(sleepstudy)
## Reaction Days Subject
## 1 249.5600 0 308
## 2 258.7047 1 308
## 3 250.8006 2 308
## 4 321.4398 3 308
## 5 356.8519 4 308
## 6 414.6901 5 308
ggplot(sleepstudy, aes(Days, Reaction))+
geom_point()+
geom_smooth(method="lm")+
facet_wrap(~Subject)
It appears that both the slope and intercept vary across subjects
# model with random slope
mod<-lmer(Reaction~Days+(1|Subject), data = sleepstudy)
summary(mod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (1 | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1786.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2257 -0.5529 0.0109 0.5188 4.2506
##
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 1378.2 37.12
## Residual 960.5 30.99
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.4051 9.7467 25.79
## Days 10.4673 0.8042 13.02
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.371
# model with random slope and intercept
mod1<-lmer(Reaction~Days+(Days|Subject), data = sleepstudy)
summary(mod1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (Days | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1743.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9536 -0.4634 0.0231 0.4633 5.1793
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 611.90 24.737
## Days 35.08 5.923 0.07
## Residual 654.94 25.592
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.405 6.824 36.843
## Days 10.467 1.546 6.771
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.138