3-Way ANOVA
The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.
We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.
We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.
Descriptive statistics
Let’s take a look at the general structure of the data and first few rows of the data frame.
str(headache)## tibble [72 x 5] (S3: tbl_df/tbl/data.frame)
## $ id : int [1:72] 1 2 3 4 5 6 7 8 9 10 ...
## $ gender : Factor w/ 2 levels "male","female": 1 1 1 1 1 1 1 1 1 1 ...
## $ risk : Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
## $ treatment : Factor w/ 3 levels "X","Y","Z": 1 1 1 1 1 1 2 2 2 2 ...
## $ pain_score: num [1:72] 79.3 76.8 70.8 81.2 75.1 ...
#head(headache)Pain score by treatment for each gender separately
ggboxplot(headache, x = "treatment", y = "pain_score", color = "risk", palette = c("red", "black"), facet.by = "gender")ms <- headache %>%
group_by(gender, risk, treatment) %>%
get_summary_stats(pain_score, type = "mean_sd")
library(kableExtra)
library(formattable)
ms$sd <- color_bar("lightgreen")(ms$sd)
ms$mean <- color_tile("yellow", "red")(ms$mean)
ms$gender <- ifelse(
ms$gender == "male",
cell_spec(ms$gender, color = "steelblue", bold = T),
cell_spec(ms$gender, color = "hotpink", bold = T)
)
ms %>%
kbl(escape = F) %>%
kable_material_dark()| gender | risk | treatment | variable | n | mean | sd |
|---|---|---|---|---|---|---|
| male | high | X | pain_score | 6 | 92.7 | 5.12 |
| male | high | Y | pain_score | 6 | 82.3 | 5.00 |
| male | high | Z | pain_score | 6 | 79.7 | 4.05 |
| male | low | X | pain_score | 6 | 76.1 | 3.85 |
| male | low | Y | pain_score | 6 | 73.1 | 4.76 |
| male | low | Z | pain_score | 6 | 74.5 | 4.89 |
| female | high | X | pain_score | 6 | 78.9 | 5.32 |
| female | high | Y | pain_score | 6 | 81.2 | 4.62 |
| female | high | Z | pain_score | 6 | 81.0 | 3.98 |
| female | low | X | pain_score | 6 | 74.2 | 3.69 |
| female | low | Y | pain_score | 6 | 68.4 | 4.08 |
| female | low | Z | pain_score | 6 | 69.8 | 2.72 |
Assumptions
Outliers
tab <- headache %>%
group_by(gender, risk, treatment) %>%
identify_outliers(pain_score)
tab$is.extreme <- cell_spec(tab$is.extreme, color = ifelse(tab$is.extreme == TRUE, "red", "white"))
tab %>%
kbl(caption = "Outliers identification", escape = F) %>%
kable_material_dark("striped")| gender | risk | treatment | id | pain_score | is.outlier | is.extreme |
|---|---|---|---|---|---|---|
| female | high | X | 57 | 68.4 | TRUE | TRUE |
| female | high | Y | 62 | 73.1 | TRUE | FALSE |
| female | high | Z | 67 | 75.0 | TRUE | FALSE |
| female | high | Z | 71 | 87.1 | TRUE | FALSE |
It can be seen that, the data contain one extreme outlier (id = 57, female at high risk of migraine taking drug X) In further analysis I have decided to keep the outlier and see the results of the analysis. Then, analysis could be repeated without the outlier.
Normality
Check normality assumption by analyzing the model residuals. QQ plot and Shapiro-Wilk test of normality are used.
model <- lm(pain_score ~ gender*risk*treatment, data = headache)
# Create a QQ plot of residuals
ggqqplot(residuals(model)) In the QQ plot, as all the points fall approximately along the reference line, we can assume normality.
# Compute Shapiro-Wilk test of normality
shapiro_test(residuals(model))## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 residuals(model) 0.982 0.398
This conclusion is supported by the Shapiro-Wilk test. The p-value is not significant (p = 0.4), so we can assume normality.
Check normality assumption by groups. Computing Shapiro-Wilk test for each combinations of factor levels.
sw <- headache %>%
group_by(gender, risk, treatment) %>%
shapiro_test(pain_score)
sw$p <- cell_spec(round(sw$p, 4), color = ifelse(sw$p < 0.05, "red", "white"))
sw %>% kbl(caption = "Outliers identification", escape = F) %>%
kable_material_dark("striped")| gender | risk | treatment | variable | statistic | p |
|---|---|---|---|---|---|
| male | high | X | pain_score | 0.958 | 0.8076 |
| male | high | Y | pain_score | 0.902 | 0.3843 |
| male | high | Z | pain_score | 0.955 | 0.7843 |
| male | low | X | pain_score | 0.982 | 0.9619 |
| male | low | Y | pain_score | 0.920 | 0.5073 |
| male | low | Z | pain_score | 0.924 | 0.535 |
| female | high | X | pain_score | 0.714 | 0.0087 |
| female | high | Y | pain_score | 0.939 | 0.6538 |
| female | high | Z | pain_score | 0.971 | 0.9006 |
| female | low | X | pain_score | 0.933 | 0.5999 |
| female | low | Y | pain_score | 0.927 | 0.5554 |
| female | low | Z | pain_score | 0.958 | 0.8009 |
The pain scores were normally distributed (p > 0.05) except for one group (female at high risk of migraine taking drug X, p = 0.0086), as assessed by Shapiro-Wilk’s test of normality.
ggqqplot(headache, "pain_score", ggtheme = theme_bw()) +
facet_grid(gender + risk ~ treatment, labeller = "label_both") All the points fall approximately along the reference line, except for one group (female at high risk of migraine taking drug X), where we already identified an extreme outlier. #### Homogeneity of variance This can be checked using the Levene’s test:
headache %>% levene_test(pain_score ~ gender*risk*treatment) %>%
kbl(caption = "Levene's test") %>%
kable_material_dark()| df1 | df2 | statistic | p |
|---|---|---|---|
| 11 | 60 | 0.179 | 0.998 |
The Levene’s test is not significant (p > 0.05). Therefore, we can assume the homogeneity of variances in the different groups.
plot(lm(pain_score~gender*risk*treatment, data=headache), 1) Points 29, 57 and 62 are leverage points which violate the homogenity of variances assumption. However the homogenity was proved in Levene’s test.
Anova
res.aov <- headache %>% anova_test(pain_score ~ gender*risk*treatment)
res.aov %>% kbl(caption = "ANOVA table") %>%
kable_material_dark() %>%
row_spec(c(7), color ="red", bold = T)| Effect | DFn | DFd | F | p | p<.05 | ges |
|---|---|---|---|---|---|---|
| gender | 1 | 60 | 16.196 | 0.000 |
|
0.213 |
| risk | 1 | 60 | 92.699 | 0.000 |
|
0.607 |
| treatment | 2 | 60 | 7.318 | 0.001 |
|
0.196 |
| gender:risk | 1 | 60 | 0.141 | 0.708 | 0.002 | |
| gender:treatment | 2 | 60 | 3.338 | 0.042 |
|
0.100 |
| risk:treatment | 2 | 60 | 0.713 | 0.494 | 0.023 | |
| gender:risk:treatment | 2 | 60 | 7.406 | 0.001 |
|
0.198 |
results <- aov(pain_score ~ gender * risk * treatment, data = headache)
anova(results) %>% kbl %>% kable_material_dark()| Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|
| gender | 1 | 313.36 | 313.36 | 16.196 | 0.000 |
| risk | 1 | 1793.56 | 1793.56 | 92.699 | 0.000 |
| treatment | 2 | 283.17 | 141.58 | 7.318 | 0.001 |
| gender:risk | 1 | 2.73 | 2.73 | 0.141 | 0.708 |
| gender:treatment | 2 | 129.18 | 64.59 | 3.338 | 0.042 |
| risk:treatment | 2 | 27.59 | 13.80 | 0.713 | 0.494 |
| gender:risk:treatment | 2 | 286.60 | 143.30 | 7.406 | 0.001 |
| Residuals | 60 | 1160.89 | 19.35 | NA | NA |
report(results)## The ANOVA (formula: pain_score ~ gender * risk * treatment) suggests that:
##
## - The main effect of gender is statistically significant and large (F(1, 60) = 16.20, p < .001; Eta2 (partial) = 0.21, 95% CI [0.08, 1.00])
## - The main effect of risk is statistically significant and large (F(1, 60) = 92.70, p < .001; Eta2 (partial) = 0.61, 95% CI [0.48, 1.00])
## - The main effect of treatment is statistically significant and large (F(2, 60) = 7.32, p = 0.001; Eta2 (partial) = 0.20, 95% CI [0.06, 1.00])
## - The interaction between gender and risk is statistically not significant and very small (F(1, 60) = 0.14, p = 0.708; Eta2 (partial) = 2.35e-03, 95% CI [0.00, 1.00])
## - The interaction between gender and treatment is statistically significant and medium (F(2, 60) = 3.34, p = 0.042; Eta2 (partial) = 0.10, 95% CI [2.01e-03, 1.00])
## - The interaction between risk and treatment is statistically not significant and small (F(2, 60) = 0.71, p = 0.494; Eta2 (partial) = 0.02, 95% CI [0.00, 1.00])
## - The interaction between gender, risk and treatment is statistically significant and large (F(2, 60) = 7.41, p = 0.001; Eta2 (partial) = 0.20, 95% CI [0.06, 1.00])
##
## Effect sizes were labelled following Field's (2013) recommendations.
Post-hoc tests
If there is a significant 3-way interaction effect, you can decompose it into:
- Simple two-way interaction: run two-way interaction at each level of third variable,
- Simple simple main effect: run one-way model at each level of second variable,
- Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.
If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.
Two-way interactions
We want to evaluate the effect of risk*treatment interaction on pain_score at each level of gender.
model <- lm(pain_score ~ gender*risk*treatment, data = headache)
headache %>%
group_by(gender) %>%
anova_test(pain_score ~ risk*treatment, error = model) %>%
kbl(caption = "Two-way interactions") %>%
kable_material_dark()| gender | Effect | DFn | DFd | F | p | p<.05 | ges |
|---|---|---|---|---|---|---|---|
| male | risk | 1 | 60 | 50.037 | 0.000 |
|
0.455 |
| male | treatment | 2 | 60 | 10.174 | 0.000 |
|
0.253 |
| male | risk:treatment | 2 | 60 | 5.252 | 0.008 |
|
0.149 |
| female | risk | 1 | 60 | 42.803 | 0.000 |
|
0.416 |
| female | treatment | 2 | 60 | 0.482 | 0.620 | 0.016 | |
| female | risk:treatment | 2 | 60 | 2.868 | 0.065 | 0.087 |
For males, this result suggests that the effect of treatment on “pain_score” depends on one’s “risk” of migraine. In other words, the risk moderates the effect of the type of treatment on pain_score.
Main effects
Group the data by gender and risk and analyze the simple simple main effects of treatment on pain_score:
treatment.effect <- headache %>%
group_by(gender, risk) %>%
anova_test(pain_score ~ treatment, error = model)
treatment.effect %>% kbl(caption = "Treatment effect") %>%
kable_material_dark() %>%
row_spec(c(1,2), color ="red", bold = T)| gender | risk | Effect | DFn | DFd | F | p | p<.05 | ges |
|---|---|---|---|---|---|---|---|---|
| male | high | treatment | 2 | 60 | 14.77 | 0.000 |
|
0.330 |
| male | low | treatment | 2 | 60 | 0.66 | 0.521 | 0.022 | |
| female | high | treatment | 2 | 60 | 0.52 | 0.597 | 0.017 | |
| female | low | treatment | 2 | 60 | 2.83 | 0.067 | 0.086 |
This analysis indicates that, the type of treatment taken has a statistically significant effect on pain_score in males who are at high risk.
In other words, the mean pain_score in the treatment X, Y and Z groups was statistically significantly different for males who at high risk, but not for males at low risk.
Pairwise comparisons
Compare the different treatments by gender and risk variables:
pwc <- headache %>%
group_by(gender, risk) %>%
emmeans_test(pain_score ~ treatment, p.adjust.method = "bonferroni") %>%
select(-df, -statistic, -p) # Remove details
# Show comparison results for male at high risk
pwc %>% filter(gender == "male", risk == "high") %>%
kbl(caption = "Pairwise comparisons") %>%
kable_material_dark()| gender | risk | term | .y. | group1 | group2 | p.adj | p.adj.signif |
|---|---|---|---|---|---|---|---|
| male | high | treatment | pain_score | X | Y | 0.000 | *** |
| male | high | treatment | pain_score | X | Z | 0.000 | **** |
| male | high | treatment | pain_score | Y | Z | 0.897 | ns |
There is a significant difference between treatments X and Y as well as between X and Z, however the difference between treatments Z and Y is statistically insignificant.