Statistical Inference
3-way Anova
3-Way ANOVA
The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.
We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.
We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.
Descriptive statistics
First, let’s observe some general things about our dataset.
summary_headache <- headache %>%
group_by(gender, risk, treatment) %>%
summarise(
mean_pain_score = mean(pain_score, na.rm = TRUE),
sd_pain_score = sd(pain_score, na.rm = TRUE),
.groups = 'drop'
)
print(summary_headache)## # A tibble: 12 × 5
## gender risk treatment mean_pain_score sd_pain_score
## <fct> <fct> <fct> <dbl> <dbl>
## 1 male high X 92.7 5.12
## 2 male high Y 82.3 5.00
## 3 male high Z 79.7 4.05
## 4 male low X 76.1 3.85
## 5 male low Y 73.1 4.77
## 6 male low Z 74.5 4.89
## 7 female high X 78.9 5.32
## 8 female high Y 81.2 4.62
## 9 female high Z 81.0 3.98
## 10 female low X 74.2 3.69
## 11 female low Y 68.4 4.08
## 12 female low Z 69.8 2.72
ggplot(headache, aes(x = treatment, y = pain_score, fill = risk)) +
geom_boxplot() +
facet_wrap(gender ~ risk, scales = "free") +
scale_fill_brewer(palette = "Pastel1") +
labs(title = "Pain Score Across Treatments by Gender and Risk of Migraine",
x = "Treatment",
y = "Pain Score",
fill = "Risk of Migraine") +
theme_minimal()As we can see, there are a few outliers, we would check that later on.
Assumptions
Outliers
outliers_test <- headache %>%
group_by(gender, risk, treatment) %>%
identify_outliers(pain_score)
outliers_test## # A tibble: 4 × 7
## gender risk treatment id pain_score is.outlier is.extreme
## <fct> <fct> <fct> <int> <dbl> <lgl> <lgl>
## 1 female high X 57 68.4 TRUE TRUE
## 2 female high Y 62 73.1 TRUE FALSE
## 3 female high Z 67 75.0 TRUE FALSE
## 4 female high Z 71 87.1 TRUE FALSE
1 out of 4 outliers from this dataset is considered extreme.
Normality
To test normality, ggqqplot and Shapiro-Wilk test can be used. We will use both.
ggqqplot(headache, "pain_score") +
facet_grid(gender ~ risk + treatment, scales = "free") + # Note the change in facet formula
theme(legend.position = "none") +
geom_qq_line() +
geom_qq(color = "red") +
labs(
title = "Normal Q-Q Plots for Pain Scores",
subtitle = "Grouped by Gender, Risk, and Treatment"
)As we can see, some outliers are still there and they affect the normality. Let’s use Shapiro-Wilk to prove this “hypothesis” ;)
normality_test <- headache %>%
group_by(gender, risk, treatment) %>%
shapiro_test(pain_score)
normality_test## # A tibble: 12 × 6
## gender risk treatment variable statistic p
## <fct> <fct> <fct> <chr> <dbl> <dbl>
## 1 male high X pain_score 0.958 0.808
## 2 male high Y pain_score 0.902 0.384
## 3 male high Z pain_score 0.955 0.784
## 4 male low X pain_score 0.982 0.962
## 5 male low Y pain_score 0.920 0.507
## 6 male low Z pain_score 0.924 0.535
## 7 female high X pain_score 0.714 0.00869
## 8 female high Y pain_score 0.939 0.654
## 9 female high Z pain_score 0.971 0.901
## 10 female low X pain_score 0.933 0.600
## 11 female low Y pain_score 0.927 0.555
## 12 female low Z pain_score 0.958 0.801
Exactly! The female individual with high risk of migraine on drug X has a p-value less than 0.05. Other groups appear to be normally distributed
Homogeneity of variance
Let’s test homogeneity of variance using levene test.
homogeneity_test <- headache %>%
levene_test(pain_score ~ gender * treatment * risk)
homogeneity_test## # A tibble: 1 × 4
## df1 df2 statistic p
## <int> <int> <dbl> <dbl>
## 1 11 60 0.179 0.998
P-value is higher that 0.05, which means that it is not significant. Conclusion: variances are homogenious
Anova
After checking all of the assumptions, we can finally use ANOVA.
## Analysis of Variance Table
##
## Response: pain_score
## Df Sum Sq Mean Sq F value Pr(>F)
## gender 1 313 313 16.20 0.00016 ***
## treatment 2 283 142 7.32 0.00143 **
## risk 1 1794 1794 92.70 0.000000000000088 ***
## gender:treatment 2 129 65 3.34 0.04220 *
## gender:risk 1 3 3 0.14 0.70849
## treatment:risk 2 28 14 0.71 0.49422
## gender:treatment:risk 2 287 143 7.41 0.00133 **
## Residuals 60 1161 19
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The ANOVA (formula: pain_score ~ gender * treatment * risk) suggests that:
##
## - The main effect of gender is statistically significant and large (F(1, 60) =
## 16.20, p < .001; Eta2 (partial) = 0.21, 95% CI [0.08, 1.00])
## - The main effect of treatment is statistically significant and large (F(2, 60)
## = 7.32, p = 0.001; Eta2 (partial) = 0.20, 95% CI [0.06, 1.00])
## - The main effect of risk is statistically significant and large (F(1, 60) =
## 92.70, p < .001; Eta2 (partial) = 0.61, 95% CI [0.48, 1.00])
## - The interaction between gender and treatment is statistically significant and
## medium (F(2, 60) = 3.34, p = 0.042; Eta2 (partial) = 0.10, 95% CI [2.01e-03,
## 1.00])
## - The interaction between gender and risk is statistically not significant and
## very small (F(1, 60) = 0.14, p = 0.708; Eta2 (partial) = 2.35e-03, 95% CI
## [0.00, 1.00])
## - The interaction between treatment and risk is statistically not significant
## and small (F(2, 60) = 0.71, p = 0.494; Eta2 (partial) = 0.02, 95% CI [0.00,
## 1.00])
## - The interaction between gender, treatment and risk is statistically
## significant and large (F(2, 60) = 7.41, p = 0.001; Eta2 (partial) = 0.20, 95%
## CI [0.06, 1.00])
##
## Effect sizes were labelled following Field's (2013) recommendations.
Based on the results of the ANOVA test, we can see that there that relations between pain score and gender, so as between pain score and risk are statistically significant. 3-way correlation is also statistically significant and large.
Post-hoc tests
If there is a significant 3-way interaction effect, you can decompose it into:
- Simple two-way interaction: run two-way interaction at each level of third variable,
- Simple simple main effect: run one-way model at each level of second variable,
- Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.
If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.
Two-way interactions
l_model <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(gender) %>%
anova_test(pain_score ~ risk * treatment, error = l_model)## # A tibble: 6 × 8
## gender Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male risk 1 60 50.0 0.00000000187 "*" 0.455
## 2 male treatment 2 60 10.2 0.000157 "*" 0.253
## 3 male risk:treatment 2 60 5.25 0.008 "*" 0.149
## 4 female risk 1 60 42.8 0.000000015 "*" 0.416
## 5 female treatment 2 60 0.482 0.62 "" 0.016
## 6 female risk:treatment 2 60 2.87 0.065 "" 0.087
It appears that their is a high significance for male patients with high risk of migraine, since the p-value is so low. Basically, the effectiveness of treatment on pain scores depends on risk.
Main effects
effects <- headache %>%
group_by(gender, risk) %>%
anova_test(pain_score ~ treatment, error = l_model)
effects## # A tibble: 4 × 9
## gender risk Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male high treatment 2 60 14.8 0.0000061 "*" 0.33
## 2 male low treatment 2 60 0.66 0.521 "" 0.022
## 3 female high treatment 2 60 0.52 0.597 "" 0.017
## 4 female low treatment 2 60 2.83 0.067 "" 0.086
From the table we may conclude, that type of treatment is significant only for male individuals at a high risk of migraine.
Pairwise comparisons
pairwise_comp <- headache %>%
group_by(gender, risk) %>%
emmeans_test(pain_score ~ treatment, p.adjust.method = "bonferroni")
pairwise_comp %>%
filter(gender == "male", risk == "high")## # A tibble: 3 × 11
## gender risk term .y. group1 group2 df statistic p p.adj
## <fct> <fct> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 male high treatment pain_sco… X Y 60 4.09 1.29e-4 3.86e-4
## 2 male high treatment pain_sco… X Z 60 5.14 3.14e-6 9.42e-6
## 3 male high treatment pain_sco… Y Z 60 1.05 2.99e-1 8.97e-1
## # ℹ 1 more variable: p.adj.signif <chr>
Difference between Y-Z treatments are statistically insignificant, while X-Y and X-Z show significant difference.
Conclusion
After checking all of the assumptions and performing statistical tests, we may draw a conclusion: indeed, there is a significant 3-way interaction between gender, risk and treatment.