Statistical Inference
3-way Anova
3-Way ANOVA
The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.
We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.
We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.
Descriptive statistics
## id gender risk treatment pain_score
## Min. : 1.0 male :36 high:36 X:24 Min. : 63.7
## 1st Qu.:18.8 female:36 low :36 Y:24 1st Qu.: 72.9
## Median :36.5 Z:24 Median : 77.9
## Mean :36.5 Mean : 77.6
## 3rd Qu.:54.2 3rd Qu.: 81.5
## Max. :72.0 Max. :100.0
# Boxplot for each group
headache %>% ggplot(aes(x = interaction(treatment, gender, risk), y = pain_score)) +
geom_boxplot() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))No need to deal with the NA value, jusst use summary
Assumptions
Outliers
## # A tibble: 4 × 7
## gender risk treatment id pain_score is.outlier is.extreme
## <fct> <fct> <fct> <int> <dbl> <lgl> <lgl>
## 1 female high X 57 68.4 TRUE TRUE
## 2 female high Y 62 73.1 TRUE FALSE
## 3 female high Z 67 75.0 TRUE FALSE
## 4 female high Z 71 87.1 TRUE FALSE
Obviously, the first of the 4 outliers is an extreme
Normality
##
## Shapiro-Wilk normality test
##
## data: residuals(model)
## W = 1, p-value = 0.4
Homogeneity of variance
## # A tibble: 1 × 4
## df1 df2 statistic p
## <int> <int> <dbl> <dbl>
## 1 11 60 0.179 0.998
See, the P-value is greater than 0.05, we dont reject the H0 that there
is no difference between each sample group’s variance.
Anova
## Analysis of Variance Table
##
## Response: pain_score
## Df Sum Sq Mean Sq F value Pr(>F)
## gender 1 313 313 16.20 0.00016 ***
## risk 1 1794 1794 92.70 0.000000000000088 ***
## treatment 2 283 142 7.32 0.00143 **
## gender:risk 1 3 3 0.14 0.70849
## gender:treatment 2 129 65 3.34 0.04220 *
## risk:treatment 2 28 14 0.71 0.49422
## gender:risk:treatment 2 287 143 7.41 0.00133 **
## Residuals 60 1161 19
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Obviously, There is a significant 3-way interaction between Gender:Risk:Treatment (p < 0.01),
Post-hoc tests
If there is a significant 3-way interaction effect, you can decompose it into:
- Simple two-way interaction: run two-way interaction at each level of third variable,
- Simple simple main effect: run one-way model at each level of second variable,
- Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.
If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.
Two-way interactions
## # A tibble: 6 × 8
## gender Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male risk 1 30 45.0 0.000000195 "*" 0.6
## 2 male treatment 2 30 9.15 0.000788 "*" 0.379
## 3 male risk:treatment 2 30 4.72 0.016 "*" 0.24
## 4 female risk 1 30 48.2 0.000000104 "*" 0.616
## 5 female treatment 2 30 0.542 0.587 "" 0.035
## 6 female risk:treatment 2 30 3.23 0.054 "" 0.177
Table 1: From table we can conclude that the risk factor significantly influences pain scores for both genders (P << 0.01). Both treatment and the interaction between risk and treatment are significant contributors, while for females, treatment does not significantly affect pain scores (p>0.05)
## # A tibble: 6 × 8
## risk Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 high gender 1 30 8.45 0.007 "*" 0.22
## 2 high treatment 2 30 4.32 0.022 "*" 0.224
## 3 high gender:treatment 2 30 9.01 0.00086 "*" 0.375
## 4 low gender 1 30 7.79 0.009 "*" 0.206
## 5 low treatment 2 30 3.6 0.04 "*" 0.194
## 6 low gender:treatment 2 30 0.485 0.62 "" 0.031
Table 2:The data indicates significant interactions between high risk and both gender and treatment, influencing the dependent variable. For low risk, gender and treatment individually show significance.
## # A tibble: 9 × 8
## treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 X gender 1 20 18.0 0.0004 "*" 0.474
## 2 X risk 1 20 33.1 0.0000124 "*" 0.624
## 3 X gender:risk 1 20 10.4 0.004 "*" 0.342
## 4 Y gender 1 20 2.47 0.132 "" 0.11
## 5 Y risk 1 20 33.9 0.0000107 "*" 0.629
## 6 Y gender:risk 1 20 0.913 0.351 "" 0.044
## 7 Z gender 1 20 1.04 0.32 "" 0.05
## 8 Z risk 1 20 25.6 0.0000593 "*" 0.562
## 9 Z gender:risk 1 20 3.43 0.079 "" 0.147
We can observe significant interactions between treatment and both gender and risk for conditions X and Y, as indicated by the low p-values (p<0.05). Specifically, for X and Y treatments, gender, risk, and their interaction significantly influence the dependent variable. Treatment Z, however, shows no significant effects.
Main effects
## # A tibble: 4 × 9
## gender risk Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male high treatment 2 15 12.7 0.000595 "*" 0.628
## 2 male low treatment 2 15 0.623 0.55 "" 0.077
## 3 female high treatment 2 15 0.461 0.639 "" 0.058
## 4 female low treatment 2 15 4.36 0.032 "*" 0.368
Treatment significantly influences the dependent variable for males with high risk and females with low risk. (p<0.05)
## # A tibble: 6 × 9
## gender treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male X risk 1 10 40.7 0.0000803 "*" 0.803
## 2 male Y risk 1 10 10.6 0.009 "*" 0.516
## 3 male Z risk 1 10 4.07 0.071 "" 0.289
## 4 female X risk 1 10 3.18 0.105 "" 0.241
## 5 female Y risk 1 10 25.9 0.00047 "*" 0.722
## 6 female Z risk 1 10 32.7 0.000194 "*" 0.766
There is significant influence of risk on its dependent variables for males with treatment Z, and females with treatment X
## # A tibble: 6 × 9
## risk treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 high X gender 1 10 21.2 0.000971 "*" 0.68
## 2 low X gender 1 10 0.757 0.405 "" 0.07
## 3 high Y gender 1 10 0.176 0.684 "" 0.017
## 4 low Y gender 1 10 3.48 0.092 "" 0.258
## 5 high Z gender 1 10 0.341 0.572 "" 0.033
## 6 low Z gender 1 10 4.19 0.068 "" 0.295
There is a significant influence of gender on its dependent variables for high risk with treatment Z and Y, and low risk with treatment Y and X
Pairwise comparisons
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
## treatment emmean SE df lower.CL upper.CL
## X 80.5 0.898 60 78.7 82.2
## Y 76.3 0.898 60 74.5 78.1
## Z 76.2 0.898 60 74.4 78.0
##
## Results are averaged over the levels of: risk, gender
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## X - Y 4.20 1.27 60 3.310 0.0040
## X - Z 4.22 1.27 60 3.320 0.0040
## Y - Z 0.02 1.27 60 0.010 1.0000
##
## Results are averaged over the levels of: risk, gender
## P value adjustment: tukey method for comparing a family of 3 estimates
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
## risk emmean SE df lower.CL upper.CL
## high 82.6 0.733 60 81.2 84.1
## low 72.7 0.733 60 71.2 74.1
##
## Results are averaged over the levels of: treatment, gender
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## high - low 9.98 1.04 60 9.630 <.0001
##
## Results are averaged over the levels of: treatment, gender
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
## gender emmean SE df lower.CL upper.CL
## male 79.7 0.733 60 78.3 81.2
## female 75.6 0.733 60 74.1 77.0
##
## Results are averaged over the levels of: treatment, risk
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## male - female 4.17 1.04 60 4.020 0.0002
##
## Results are averaged over the levels of: treatment, risk
There is a significant difference between males and females