3-way Anova
3-Way ANOVA
The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.
Weâll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.
We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.
Descriptive statistics
## # A tibble: 12 x 7
## gender risk treatment variable n mean sd
## <fct> <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 male high X pain_score 6 92.7 5.12
## 2 male high Y pain_score 6 82.3 5.00
## 3 male high Z pain_score 6 79.7 4.05
## 4 male low X pain_score 6 76.1 3.86
## 5 male low Y pain_score 6 73.1 4.76
## 6 male low Z pain_score 6 74.5 4.89
## 7 female high X pain_score 6 78.9 5.32
## 8 female high Y pain_score 6 81.2 4.62
## 9 female high Z pain_score 6 81.0 3.98
## 10 female low X pain_score 6 74.2 3.69
## 11 female low Y pain_score 6 68.4 4.08
## 12 female low Z pain_score 6 69.8 2.72
Assumptions
Outliers
## # A tibble: 4 x 7
## gender risk treatment id pain_score is.outlier is.extreme
## <fct> <fct> <fct> <int> <dbl> <lgl> <lgl>
## 1 female high X 57 68.4 TRUE TRUE
## 2 female high Y 62 73.1 TRUE FALSE
## 3 female high Z 67 75.0 TRUE FALSE
## 4 female high Z 71 87.1 TRUE FALSE
There is one extreme outlier, which is a female with id = 57, with high risk and treatment X
Normality
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 residuals(model) 0.982 0.398
## # A tibble: 12 x 6
## gender risk treatment variable statistic p
## <fct> <fct> <fct> <chr> <dbl> <dbl>
## 1 male high X pain_score 0.958 0.808
## 2 male high Y pain_score 0.902 0.384
## 3 male high Z pain_score 0.955 0.784
## 4 male low X pain_score 0.982 0.962
## 5 male low Y pain_score 0.920 0.507
## 6 male low Z pain_score 0.924 0.535
## 7 female high X pain_score 0.714 0.00869
## 8 female high Y pain_score 0.939 0.654
## 9 female high Z pain_score 0.971 0.901
## 10 female low X pain_score 0.933 0.600
## 11 female low Y pain_score 0.927 0.555
## 12 female low Z pain_score 0.958 0.801
Only one group is not normally distributed, which is group of females with high risk and treatment X.
As it can be observed on the plot, data in all groups is normally distributed, except for the group of females with high risk and treatment X (as was already identified by Shapiro test and an outlier was found in this group).
Homogeneity of variance
## # A tibble: 1 x 4
## df1 df2 statistic p
## <int> <int> <dbl> <dbl>
## 1 11 60 0.179 0.998
Obtained p-value is bigger than 0.05, hence test is not significant. So, we can assume homogenity of variance in different groups.
Anova
## Analysis of Variance Table
##
## Response: pain_score
## Df Sum Sq Mean Sq F value Pr(>F)
## gender 1 313 313 16.20 0.00016 ***
## risk 1 1794 1794 92.70 0.000000000000088 ***
## treatment 2 283 142 7.32 0.00143 **
## gender:risk 1 3 3 0.14 0.70849
## gender:treatment 2 129 65 3.34 0.04220 *
## risk:treatment 2 28 14 0.71 0.49422
## gender:risk:treatment 2 287 143 7.41 0.00133 **
## Residuals 60 1161 19
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Coefficient covariances computed by hccm()
## ANOVA Table (type II tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 gender 1 60 16.196 0.000163000000000 * 0.213
## 2 risk 1 60 92.699 0.000000000000088 * 0.607
## 3 treatment 2 60 7.318 0.001000000000000 * 0.196
## 4 gender:risk 1 60 0.141 0.708000000000000 0.002
## 5 gender:treatment 2 60 3.338 0.042000000000000 * 0.100
## 6 risk:treatment 2 60 0.713 0.494000000000000 0.023
## 7 gender:risk:treatment 2 60 7.406 0.001000000000000 * 0.198
There is a statistically significant interaction between gender, risk and treatment, where F(2,60) = 7.406 and p = 0.001.
Post-hoc tests
If there is a significant 3-way interaction effect, you can decompose it into:
- Simple two-way interaction: run two-way interaction at each level of third variable,
- Simple simple main effect: run one-way model at each level of second variable,
- Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.
If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.
Two-way interactions
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## # A tibble: 6 x 8
## gender Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male risk 1 60 50.0 0.00000000187 "*" 0.455
## 2 male treatment 2 60 10.2 0.000157 "*" 0.253
## 3 male risk:treatment 2 60 5.25 0.008 "*" 0.149
## 4 female risk 1 60 42.8 0.000000015 "*" 0.416
## 5 female treatment 2 60 0.482 0.62 "" 0.016
## 6 female risk:treatment 2 60 2.87 0.065 "" 0.087
We can observe statistically significant to way interaction between risk and treatment for males, with F(2,60) = 5.252 and p = 0.008.
Main effects
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## # A tibble: 4 x 9
## gender risk Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male high treatment 2 60 14.8 0.0000061 "*" 0.33
## 2 male low treatment 2 60 0.66 0.521 "" 0.022
## 3 female high treatment 2 60 0.52 0.597 "" 0.017
## 4 female low treatment 2 60 2.83 0.067 "" 0.086
There is a statistically significant main effect of treatment for males with high risk, with F(2,60) = 14.77 and p = 0.0000061.
So, for this group the type of treatment has a significant impact on the pain score.
Pairwise comparisons
## # A tibble: 12 x 11
## gender risk term .y. group1 group2 df statistic p p.adj
## * <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 female high treatment pain_s~ X Y 60 -0.910 3.67e-1 1 e+0
## 2 female high treatment pain_s~ X Z 60 -0.855 3.96e-1 1 e+0
## 3 female high treatment pain_s~ Y Z 60 0.0552 9.56e-1 1 e+0
## 4 female low treatment pain_s~ X Y 60 2.28 2.61e-2 7.82e-2
## 5 female low treatment pain_s~ X Z 60 1.72 9.00e-2 2.70e-1
## 6 female low treatment pain_s~ Y Z 60 -0.558 5.79e-1 1 e+0
## 7 male high treatment pain_s~ X Y 60 4.09 1.29e-4 3.86e-4
## 8 male high treatment pain_s~ X Z 60 5.14 3.14e-6 9.42e-6
## 9 male high treatment pain_s~ Y Z 60 1.05 2.99e-1 8.97e-1
## 10 male low treatment pain_s~ X Y 60 1.15 2.56e-1 7.68e-1
## 11 male low treatment pain_s~ X Z 60 0.628 5.32e-1 1 e+0
## 12 male low treatment pain_s~ Y Z 60 -0.519 6.06e-1 1 e+0
## # ... with 1 more variable: p.adj.signif <chr>
## # A tibble: 3 x 9
## gender risk treatment emmean se df conf.low conf.high method
## <fct> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 male high X 92.7 1.80 60 89.1 96.3 Emmeans test
## 2 male high Y 82.3 1.80 60 78.7 85.9 Emmeans test
## 3 male high Z 79.7 1.80 60 76.1 83.3 Emmeans test
There is statistically significant difference between those pairs of treatment: - X-Y: p.adj = 0.00038591 - X-Z: p.adj = 0.00000942
## # A tibble: 3 x 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score X Y 24 24 0.0514 ns 0.154 ns
## 2 pain_score X Z 24 24 0.0505 ns 0.152 ns
## 3 pain_score Y Z 24 24 0.994 ns 1 ns
## # A tibble: 1 x 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score male female 36 36 0.0172 * 0.0172 *
## # A tibble: 1 x 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score high low 36 36 1.24e-10 **** 1.24e-10 ****
#Summary
As it was stated before, the distribution is normal and we can assume homogenity of variances.
There is a statistically significant interaction between gender, risk and treatment, where F(2,60) = 7.406 and p = 0.001.
There is statistically significant two way interaction between risk and treatment for males, with F(2,60) = 5.252 and p = 0.008.
There is a statistically significant main effect of treatment for males with high risk, with F(2,60) = 14.77 and p = 0.0000061.
There was observed one extreme outlier, and it is a female with id = 57, with high risk and treatment X.