Statistical Inference

3-way Anova

3-Way ANOVA

The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.

We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.

We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.

Descriptive statistics

summary(headache)
##        id          gender     risk    treatment   pain_score   
##  Min.   : 1.0   male  :36   high:36   X:24      Min.   : 63.7  
##  1st Qu.:18.8   female:36   low :36   Y:24      1st Qu.: 72.9  
##  Median :36.5                         Z:24      Median : 77.9  
##  Mean   :36.5                                   Mean   : 77.6  
##  3rd Qu.:54.2                                   3rd Qu.: 81.5  
##  Max.   :72.0                                   Max.   :100.0
ggplot(headache, aes(x = treatment, y = pain_score, fill = risk)) +
  geom_boxplot() +
  facet_grid(. ~ gender) +
  labs(title = "Distribution of Pain Scores by Treatment, Risk, and Gender")

Assumptions

model <- lm(pain_score ~ treatment * risk * gender, data = headache)
# Outliers
headache %>%
  group_by(gender, risk, treatment) %>%
  identify_outliers(pain_score)
## # A tibble: 4 × 7
##   gender risk  treatment    id pain_score is.outlier is.extreme
##   <fct>  <fct> <fct>     <int>      <dbl> <lgl>      <lgl>     
## 1 female high  X            57       68.4 TRUE       TRUE      
## 2 female high  Y            62       73.1 TRUE       FALSE     
## 3 female high  Z            67       75.0 TRUE       FALSE     
## 4 female high  Z            71       87.1 TRUE       FALSE

We can observe 4 outliers, from which 1 is an extreme

#### Normality
shapiro_test <- shapiro.test(headache$pain_score)
shapiro_test
## 
##  Shapiro-Wilk normality test
## 
## data:  headache$pain_score
## W = 1, p-value = 0.08

From shapiro test we can assume that the data is normal

ggqqplot(headache, "pain_score") +
  facet_grid(gender~risk~treatment)

This assumption is supported by the qqplot, where in all expect one plot, points fall approximetely along reference line. The one plot that is an exception is the one where we identified an outlier.

### Homogeneity of variance
levene_test <- levene_test(model)
levene_test
## # A tibble: 1 × 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1    11    60     0.179 0.998

The p-value is > 0.05, which is not significant. This means that, there is not significant difference between variances across groups. Therefore, we can assume the homogeneity of variances in the different treatment groups

spread_location_plot <- ggplot(data = augment(model), aes(.fitted, .resid)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Spread-Location Plot") +
  theme_minimal()
spread_location_plot
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Anova

anova_model <- aov(model)
summary(anova_model)
##                       Df Sum Sq Mean Sq F value            Pr(>F)    
## treatment              2    283     142    7.32           0.00143 ** 
## risk                   1   1794    1794   92.70 0.000000000000088 ***
## gender                 1    313     313   16.20           0.00016 ***
## treatment:risk         2     28      14    0.71           0.49422    
## treatment:gender       2    129      65    3.34           0.04220 *  
## risk:gender            1      3       3    0.14           0.70849    
## treatment:risk:gender  2    287     143    7.41           0.00133 ** 
## Residuals             60   1161      19                              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a significant 3-way interaction between treatment, risk and gender, and significant 2-way interaction between treatment and risk. All other interactions are not significant.

Post-hoc tests

If there is a significant 3-way interaction effect, you can decompose it into:

  • Simple two-way interaction: run two-way interaction at each level of third variable,
  • Simple simple main effect: run one-way model at each level of second variable,
  • Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.

If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.

Two-way interactions

headache %>%
  group_by(gender) %>%
  anova_test(pain_score ~ risk * treatment)
## # A tibble: 6 × 8
##   gender Effect           DFn   DFd      F           p `p<.05`   ges
## * <fct>  <chr>          <dbl> <dbl>  <dbl>       <dbl> <chr>   <dbl>
## 1 male   risk               1    30 45.0   0.000000195 "*"     0.6  
## 2 male   treatment          2    30  9.15  0.000788    "*"     0.379
## 3 male   risk:treatment     2    30  4.72  0.016       "*"     0.24 
## 4 female risk               1    30 48.2   0.000000104 "*"     0.616
## 5 female treatment          2    30  0.542 0.587       ""      0.035
## 6 female risk:treatment     2    30  3.23  0.054       ""      0.177

We can conclude that the risk factor significantly influences pain scores for both genders. Among males, both treatment and the interaction between risk and treatment are significant contributors, while for females, treatment does not significantly affect pain scores, and there is a marginal effect of the interaction term

headache %>%
  group_by(risk) %>%
  anova_test(pain_score ~ gender * treatment)
## # A tibble: 6 × 8
##   risk  Effect             DFn   DFd     F       p `p<.05`   ges
## * <fct> <chr>            <dbl> <dbl> <dbl>   <dbl> <chr>   <dbl>
## 1 high  gender               1    30 8.45  0.007   "*"     0.22 
## 2 high  treatment            2    30 4.32  0.022   "*"     0.224
## 3 high  gender:treatment     2    30 9.01  0.00086 "*"     0.375
## 4 low   gender               1    30 7.79  0.009   "*"     0.206
## 5 low   treatment            2    30 3.6   0.04    "*"     0.194
## 6 low   gender:treatment     2    30 0.485 0.62    ""      0.031

The data indicates significant interactions between high risk and both gender and treatment, influencing the dependent variable. For low risk, gender and treatment individually show significance.

headache %>%
  group_by(treatment) %>%
  anova_test(pain_score ~ gender * risk)
## # A tibble: 9 × 8
##   treatment Effect        DFn   DFd      F         p `p<.05`   ges
## * <fct>     <chr>       <dbl> <dbl>  <dbl>     <dbl> <chr>   <dbl>
## 1 X         gender          1    20 18.0   0.0004    "*"     0.474
## 2 X         risk            1    20 33.1   0.0000124 "*"     0.624
## 3 X         gender:risk     1    20 10.4   0.004     "*"     0.342
## 4 Y         gender          1    20  2.47  0.132     ""      0.11 
## 5 Y         risk            1    20 33.9   0.0000107 "*"     0.629
## 6 Y         gender:risk     1    20  0.913 0.351     ""      0.044
## 7 Z         gender          1    20  1.04  0.32      ""      0.05 
## 8 Z         risk            1    20 25.6   0.0000593 "*"     0.562
## 9 Z         gender:risk     1    20  3.43  0.079     ""      0.147

We can observe significant interactions between treatment and both gender and risk for conditions X and Y, as indicated by the low p-values (p<0.05). Specifically, for X and Y treatments, gender, risk, and their interaction significantly influence the dependent variable. Treatment Z, however, shows no significant effects.

Main effects

headache %>%
  group_by(gender, risk) %>%
  anova_test(pain_score ~ treatment)
## # A tibble: 4 × 9
##   gender risk  Effect      DFn   DFd      F        p `p<.05`   ges
## * <fct>  <fct> <chr>     <dbl> <dbl>  <dbl>    <dbl> <chr>   <dbl>
## 1 male   high  treatment     2    15 12.7   0.000595 "*"     0.628
## 2 male   low   treatment     2    15  0.623 0.55     ""      0.077
## 3 female high  treatment     2    15  0.461 0.639    ""      0.058
## 4 female low   treatment     2    15  4.36  0.032    "*"     0.368

We can observe significant interactions between gender, risk, and treatment. Specifically, treatment significantly influences the dependent variable for males with high risk and females with low risk.

headache %>%
  group_by(gender, treatment) %>%
  anova_test(pain_score ~ risk)
## # A tibble: 6 × 9
##   gender treatment Effect   DFn   DFd     F         p `p<.05`   ges
## * <fct>  <fct>     <chr>  <dbl> <dbl> <dbl>     <dbl> <chr>   <dbl>
## 1 male   X         risk       1    10 40.7  0.0000803 "*"     0.803
## 2 male   Y         risk       1    10 10.6  0.009     "*"     0.516
## 3 male   Z         risk       1    10  4.07 0.071     ""      0.289
## 4 female X         risk       1    10  3.18 0.105     ""      0.241
## 5 female Y         risk       1    10 25.9  0.00047   "*"     0.722
## 6 female Z         risk       1    10 32.7  0.000194  "*"     0.766

We can observe significant influence of risk on its dependent variables for males with treatment Z, and females with treatment X

headache %>%
  group_by(treatment, risk) %>%
  anova_test(pain_score ~ gender)
## # A tibble: 6 × 9
##   risk  treatment Effect   DFn   DFd      F        p `p<.05`   ges
## * <fct> <fct>     <chr>  <dbl> <dbl>  <dbl>    <dbl> <chr>   <dbl>
## 1 high  X         gender     1    10 21.2   0.000971 "*"     0.68 
## 2 low   X         gender     1    10  0.757 0.405    ""      0.07 
## 3 high  Y         gender     1    10  0.176 0.684    ""      0.017
## 4 low   Y         gender     1    10  3.48  0.092    ""      0.258
## 5 high  Z         gender     1    10  0.341 0.572    ""      0.033
## 6 low   Z         gender     1    10  4.19  0.068    ""      0.295

We can observe significant influence of gender on its dependent variables for high risk with treatment Z and Y, and low risk with treatment Y and X #### Pairwise comparisons

pairwise_trt <- emmeans(model, pairwise ~ treatment)
## NOTE: Results may be misleading due to involvement in interactions
pairwise_trt_summary <- summary(pairwise_trt)
pairwise_trt_summary
## $emmeans
##  treatment emmean    SE df lower.CL upper.CL
##  X           80.5 0.898 60     78.7     82.2
##  Y           76.3 0.898 60     74.5     78.1
##  Z           76.2 0.898 60     74.4     78.0
## 
## Results are averaged over the levels of: risk, gender 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast estimate   SE df t.ratio p.value
##  X - Y        4.20 1.27 60   3.310  0.0040
##  X - Z        4.22 1.27 60   3.320  0.0040
##  Y - Z        0.02 1.27 60   0.010  1.0000
## 
## Results are averaged over the levels of: risk, gender 
## P value adjustment: tukey method for comparing a family of 3 estimates

There is a significant difference between treatments X - Y and X - Z, but there is no significant difference between Y - Z

pairwise_risk <- emmeans(model, pairwise ~ risk)
## NOTE: Results may be misleading due to involvement in interactions
pairwise_risk_summary <- summary(pairwise_risk)
pairwise_risk_summary
## $emmeans
##  risk emmean    SE df lower.CL upper.CL
##  high   82.6 0.733 60     81.2     84.1
##  low    72.7 0.733 60     71.2     74.1
## 
## Results are averaged over the levels of: treatment, gender 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast   estimate   SE df t.ratio p.value
##  high - low     9.98 1.04 60   9.630  <.0001
## 
## Results are averaged over the levels of: treatment, gender

There is a significant difference between risk levels of high - low

pairwise_gender <- emmeans(model, pairwise ~ gender)
## NOTE: Results may be misleading due to involvement in interactions
pairwise_gender_summary <- summary(pairwise_gender)
pairwise_gender_summary
## $emmeans
##  gender emmean    SE df lower.CL upper.CL
##  male     79.7 0.733 60     78.3     81.2
##  female   75.6 0.733 60     74.1     77.0
## 
## Results are averaged over the levels of: treatment, risk 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast      estimate   SE df t.ratio p.value
##  male - female     4.17 1.04 60   4.020  0.0002
## 
## Results are averaged over the levels of: treatment, risk

There is a significant difference between males and females