Statistical Inference
3-way Anova
3-Way ANOVA
The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.
We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.
We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.
Descriptive statistics
We plot the data in order to check that your assumptions look okay.
Histogram:
ggplot(headache, aes(x = pain_score)) +
geom_histogram(bins = 10, fill = "blue", color = "black", alpha = 0.7) +
facet_grid(treatment ~ risk * gender) +
labs(title = "Pain Score distribution",
x = "Pain Score ",
y = "Frequency") +
theme_classic()Boxplot:
attach(headache)
bxp <- ggboxplot(headache, y = "pain_score", x = "gender", color = "risk", palette = "jco", facet.by = c("treatment")) +
geom_jitter(aes(color = risk)) +
labs(title = "Dependency of pain score in relation to other factors",
x = "Gender",
y = "Pain Score") +
theme_light()
bxp
At first glance we can see how there is a significant difference in the
risk of men who have taken treatment X. Or also the difference in the
risk of women who have taken treatment Y.
## , , gender = male
##
## risk
## treatment high low
## X 6 6
## Y 6 6
## Z 6 6
##
## , , gender = female
##
## risk
## treatment high low
## X 6 6
## Y 6 6
## Z 6 6
Compute the mean and the SD (standard deviation) of the score by groups:
## # A tibble: 12 × 7
## gender risk treatment variable n mean sd
## <fct> <fct> <fct> <fct> <dbl> <dbl> <dbl>
## 1 male high X pain_score 6 92.7 5.12
## 2 male high Y pain_score 6 82.3 5.00
## 3 male high Z pain_score 6 79.7 4.05
## 4 male low X pain_score 6 76.1 3.86
## 5 male low Y pain_score 6 73.1 4.76
## 6 male low Z pain_score 6 74.5 4.89
## 7 female high X pain_score 6 78.9 5.32
## 8 female high Y pain_score 6 81.2 4.62
## 9 female high Z pain_score 6 81.0 3.98
## 10 female low X pain_score 6 74.2 3.69
## 11 female low Y pain_score 6 68.4 4.08
## 12 female low Z pain_score 6 69.8 2.72
ASSUMPTIONS
Outliers
remove_outliers <- function(data, group_vars, outlier_var) {
# Identify outliers
outliers <- data %>%
group_by(across(all_of(group_vars))) %>%
identify_outliers(!!sym(outlier_var))
# Remove outliers from the original dataset
data_clean <- data %>%
anti_join(outliers, by = c(group_vars, outlier_var))
return(data_clean)
}
#Remove outliers as they can distort the estimates and affect interpretation.´Now, let's work with that new data.
headache <- remove_outliers(headache, c("gender", "risk", "treatment"), "pain_score")
headache## # A tibble: 68 × 5
## id gender risk treatment pain_score
## <int> <fct> <fct> <fct> <dbl>
## 1 1 male low X 79.3
## 2 2 male low X 76.8
## 3 3 male low X 70.8
## 4 4 male low X 81.2
## 5 5 male low X 75.1
## 6 6 male low X 73.1
## 7 7 male low Y 68.2
## 8 8 male low Y 80.7
## 9 9 male low Y 75.3
## 10 10 male low Y 73.4
## # ℹ 58 more rows
There are 4 outliers, that are clearly seen when looking at the boxplot.
Normality
grouped_data <- headache %>%
group_by(gender, risk, treatment)
normality_test <- shapiro_test(grouped_data, pain_score)
table <- normality_test %>%
kbl() %>%
kable_material_dark()
table | gender | risk | treatment | variable | statistic | p |
|---|---|---|---|---|---|
| male | high | X | pain_score | 0.958 | 0.808 |
| male | high | Y | pain_score | 0.902 | 0.384 |
| male | high | Z | pain_score | 0.955 | 0.784 |
| male | low | X | pain_score | 0.982 | 0.962 |
| male | low | Y | pain_score | 0.920 | 0.507 |
| male | low | Z | pain_score | 0.924 | 0.535 |
| female | high | X | pain_score | 0.922 | 0.543 |
| female | high | Y | pain_score | 0.978 | 0.923 |
| female | high | Z | pain_score | 0.977 | 0.882 |
| female | low | X | pain_score | 0.933 | 0.600 |
| female | low | Y | pain_score | 0.927 | 0.555 |
| female | low | Z | pain_score | 0.958 | 0.801 |
colors <- c("male" = "blue", "female" = "pink")
ggqqplot(headache, x = "pain_score",
color = "gender",
shape = "gender",
fill = "gender",
title = "QQ Plot of Pain Scores by Gender and Treatment",
caption = "Data source: headache") +
scale_color_manual(values = colors) +
theme_dark() +
theme(plot.title = element_text(hjust = 0.5)) +
facet_grid(risk ~ gender + treatment, scales = "free")
We perform the shapiro test on all possible combinations and make a
table that summarizes the information. We see that the pvalue is greater
than 0.05 in all cases.Therefore, we assume normality. In addition, we
can see normality with the qq plot graphs.
Homogeneity of variance
levene_test_result <- headache %>%
levene_test(pain_score~gender*treatment*risk)
styled_table <- levene_test_result %>%
kable("html") %>%
kable_styling("striped", full_width = FALSE) %>%
add_header_above(c(" " = 2, "Levene's Test" = 2)) %>%
row_spec(0, bold = T, color = "white", background = "#1a1a1a") %>%
column_spec(1:4, bold = T, color = "black", background = "#add8e6")
styled_table| df1 | df2 | statistic | p |
|---|---|---|---|
| 11 | 56 | 0.896 | 0.55 |
##
## Call:
## lm(formula = pain_score ~ gender * treatment * risk, data = headache)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.445 -2.914 0.174 2.129 8.837
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 92.74 1.62 57.40
## genderfemale -11.77 2.40 -4.91
## treatmentY -10.40 2.29 -4.55
## treatmentZ -13.06 2.29 -5.71
## risklow -16.69 2.29 -7.30
## genderfemale:treatmentY 12.21 3.39 3.60
## genderfemale:treatmentZ 13.11 3.50 3.74
## genderfemale:risklow 9.88 3.31 2.98
## treatmentY:risklow 7.48 3.23 2.32
## treatmentZ:risklow 11.46 3.23 3.55
## genderfemale:treatmentY:risklow -15.09 4.68 -3.22
## genderfemale:treatmentZ:risklow -15.89 4.77 -3.33
## Pr(>|t|)
## (Intercept) < 0.0000000000000002 ***
## genderfemale 0.0000081928 ***
## treatmentY 0.0000292646 ***
## treatmentZ 0.0000004407 ***
## risklow 0.0000000011 ***
## genderfemale:treatmentY 0.00067 ***
## genderfemale:treatmentZ 0.00043 ***
## genderfemale:risklow 0.00422 **
## treatmentY:risklow 0.02424 *
## treatmentZ:risklow 0.00080 ***
## genderfemale:treatmentY:risklow 0.00212 **
## genderfemale:treatmentZ:risklow 0.00152 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.96 on 56 degrees of freedom
## Multiple R-squared: 0.769, Adjusted R-squared: 0.723
## F-statistic: 16.9 on 11 and 56 DF, p-value: 0.0000000000000476
We prove homogeneity of variances with Levene’s test. As pvalue is greater than 0.05, the variances are homogeneous, i.e, there is no difference between variances across groups. We can alse see it graphically with the linear regression graphs.
Anova
## Analysis of Variance Table
##
## Response: pain_score
## Df Sum Sq Mean Sq F value Pr(>F)
## gender 1 301 301 19.22 0.0000518047511780 ***
## risk 1 1897 1897 121.07 0.0000000000000013 ***
## treatment 2 352 176 11.24 0.0000784919250981 ***
## gender:risk 1 1 1 0.04 0.8516
## gender:treatment 2 82 41 2.62 0.0817 .
## risk:treatment 2 57 28 1.81 0.1727
## gender:risk:treatment 2 226 113 7.22 0.0016 **
## Residuals 56 877 16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We performed an anova test. With this table the most important thing we see is that there is interaction between the three variables (the pvalue is greater than 0.05). We study the rest of the relationships later descomposing the interaction effect.
Post-hoc tests
If there is a significant 3-way interaction effect, you can decompose it into:
- Simple two-way interaction: run two-way interaction at each level of third variable,
- Simple simple main effect: run one-way model at each level of second variable,
- Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.
If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.
Two-way interactions
modelgen <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(gender) %>%
anova_test(pain_score~ risk*treatment, error = modelgen)## # A tibble: 6 × 8
## gender Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male risk 1 56 61.8 1.31e-10 "*" 0.525
## 2 male treatment 2 56 12.6 3.10e- 5 "*" 0.31
## 3 male risk:treatment 2 56 6.49 3 e- 3 "*" 0.188
## 4 female risk 1 56 58.6 2.90e-10 "*" 0.511
## 5 female treatment 2 56 1.28 2.85e- 1 "" 0.044
## 6 female risk:treatment 2 56 2.54 8.8 e- 2 "" 0.083
modelrisk <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(risk) %>%
anova_test(pain_score~ gender*treatment, error = modelrisk)## # A tibble: 6 × 8
## risk Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 high gender 1 56 6.23 0.016 "*" 0.1
## 2 high treatment 2 56 9.46 0.000288 "*" 0.253
## 3 high gender:treatment 2 56 9.10 0.000378 "*" 0.245
## 4 low gender 1 56 8.22 0.006 "*" 0.128
## 5 low treatment 2 56 3.80 0.028 "*" 0.119
## 6 low gender:treatment 2 56 0.512 0.602 "" 0.018
modeltreat <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(treatment) %>%
anova_test(pain_score~ gender*risk, error = modeltreat)## # A tibble: 9 × 8
## treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 X gender 1 56 15.9 0.000194 "*" 0.221
## 2 X risk 1 56 52.5 0.00000000136 "*" 0.484
## 3 X gender:risk 1 56 8.90 0.004 "*" 0.137
## 4 Y gender 1 56 1.92 0.171 "" 0.033
## 5 Y risk 1 56 49.9 0.00000000268 "*" 0.471
## 6 Y gender:risk 1 56 2.48 0.121 "" 0.042
## 7 Z gender 1 56 1.38 0.245 "" 0.024
## 8 Z risk 1 56 21.5 0.0000216 "*" 0.277
## 9 Z gender:risk 1 56 3.08 0.085 "" 0.052
Now, we run two-way interaction at each level of third variable.
Table 1: Fix gender. It is observed that risk significantly influences both men and women (pvalue greater than 0.05). The treatment influences men, but not women. Furthermore, the risk:treatment interaction in men shows a statistically significant effect on pain_score.
Table 2: Fix risk. It is observed that gender significantly influences both high and low risk people, and the same with treatment (pvalue greater than 0.05). However, the gender:treatment interaction only shows a statistically significant effect on pain_score in males.
Table 3: Fix treatment. It is observed that level of risk significantly influences the 3 treatments. Gender only affects X treatment. Moreover, the gender:risk interaction only shows a statistically significant effect on X treatment.
Main effects
modelgen <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(gender,risk) %>%
anova_test(pain_score~treatment, error = model)## # A tibble: 4 × 9
## gender risk Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male high treatment 2 56 18.2 0.000000795 "*" 0.394
## 2 male low treatment 2 56 0.815 0.448 "" 0.028
## 3 female high treatment 2 56 0.329 0.721 "" 0.012
## 4 female low treatment 2 56 3.49 0.037 "*" 0.111
modelrisk <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(risk,treatment) %>%
anova_test(pain_score~ gender, error = model)## # A tibble: 6 × 9
## risk treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 high X gender 1 56 24.1 0.00000819 "*" 0.301
## 2 high Y gender 1 56 0.034 0.855 "" 0.000601
## 3 high Z gender 1 56 0.275 0.602 "" 0.005
## 4 low X gender 1 56 0.688 0.41 "" 0.012
## 5 low Y gender 1 56 4.37 0.041 "*" 0.072
## 6 low Z gender 1 56 4.19 0.045 "*" 0.07
modeltreat <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
group_by(treatment,gender) %>%
anova_test(pain_score~ risk, error = model)## # A tibble: 6 × 9
## gender treatment Effect DFn DFd F p `p<.05` ges
## * <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 male X risk 1 56 53.3 0.00000000109 * 0.488
## 2 female X risk 1 56 8.07 0.006 * 0.126
## 3 male Y risk 1 56 16.2 0.000171 * 0.225
## 4 female Y risk 1 56 36.2 0.000000143 * 0.393
## 5 male Z risk 1 56 5.23 0.026 * 0.085
## 6 female Z risk 1 56 19.4 0.0000491 * 0.257
Table 1. Fix gender and risk. In the group of men with a high level of risk and in women with low level of risk, there is a statistically significant difference in the pain score depending on the treatment. Since we have 3 types of medications, with the code below we see which ones specifically differ.
Table 2. Fix risk and treatment. For the low level of risk and Y and Z of treatment, and for high level of risk and X treatment, there is a statistically significant difference in the response variable depending on gender.
Table 3. Fix gender and treatment. The treatment variable has a significant impact on the pain_score for different levels of the risk variable (X, Y, Z) in both genders, male and female.
pwc1 <- headache %>%
group_by(gender,risk) %>%
emmeans_test(pain_score ~ treatment, p.adjust.method = "bonferroni")
head(pwc1,3)## # A tibble: 3 × 11
## gender risk term .y. group1 group2 df statistic p p.adj
## <fct> <fct> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 male high treatment pain_sco… X Y 56 4.55 2.93e-5 8.78e-5
## 2 male high treatment pain_sco… X Z 56 5.71 4.41e-7 1.32e-6
## 3 male high treatment pain_sco… Y Z 56 1.16 2.49e-1 7.48e-1
## # ℹ 1 more variable: p.adj.signif <chr>
pwc2 <- headache %>%
group_by(risk,treatment) %>%
emmeans_test(pain_score ~ gender, p.adjust.method = "bonferroni")
pwc2## # A tibble: 6 × 11
## risk treatment term .y. group1 group2 df statistic p p.adj
## * <fct> <fct> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 high X gender pain_sco… male female 56 4.91 8.19e-6 8.19e-6
## 2 high Y gender pain_sco… male female 56 -0.184 8.55e-1 8.55e-1
## 3 high Z gender pain_sco… male female 56 -0.524 6.02e-1 6.02e-1
## 4 low X gender pain_sco… male female 56 0.830 4.10e-1 4.10e-1
## 5 low Y gender pain_sco… male female 56 2.09 4.11e-2 4.11e-2
## 6 low Z gender pain_sco… male female 56 2.05 4.54e-2 4.54e-2
## # ℹ 1 more variable: p.adj.signif <chr>
pwc3 <- headache %>%
group_by(treatment,gender) %>%
emmeans_test(pain_score ~ risk, p.adjust.method = "bonferroni")
head(pwc1,3)## # A tibble: 3 × 11
## gender risk term .y. group1 group2 df statistic p p.adj
## <fct> <fct> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 male high treatment pain_sco… X Y 56 4.55 2.93e-5 8.78e-5
## 2 male high treatment pain_sco… X Z 56 5.71 4.41e-7 1.32e-6
## 3 male high treatment pain_sco… Y Z 56 1.16 2.49e-1 7.48e-1
## # ℹ 1 more variable: p.adj.signif <chr>
In these tables we can see, within each group, which ones differ significantly.
Pairwise comparisons
pairwise1 <- headache %>%
pairwise_t_test(
pain_score ~ treatment,
p.adjust.method = "bonferroni"
)
pairwise1## # A tibble: 3 × 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score X Y 23 23 0.0358 * 0.107 ns
## 2 pain_score X Z 23 22 0.0196 * 0.0589 ns
## 3 pain_score Y Z 23 22 0.786 ns 1 ns
pairwise2 <- headache %>%
pairwise_t_test(
pain_score ~ risk,
p.adjust.method = "bonferroni"
)
pairwise2## # A tibble: 1 × 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score high low 32 36 3.22e-12 **** 3.22e-12 ****
pairwise3 <- headache %>%
pairwise_t_test(
pain_score ~ gender,
p.adjust.method = "bonferroni"
)
pairwise3## # A tibble: 1 × 9
## .y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 pain_score male female 36 32 0.0199 * 0.0199 *