Statistical Inference

3-Way ANOVA

The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.

We’ll use the headache dataset (from the datarium package), which contains pain scores for migraine headache episodes in 72 participants, each given one of three treatments (X, Y or Z). The participants include 36 males and 36 females. Males and females were further subdivided by whether they were at low or high risk of migraine.

We want to understand how the three independent variables (type of treatment, risk of migraine and gender) interact to predict the pain score.
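
For reference, the model behind this analysis can be written as below. This is the standard three-way factorial model and not something stated in the original text; the symbols are introduced here for illustration only, with α for gender, β for risk and γ for treatment. The three-way ANOVA tests whether the term (αβγ) is zero.

```latex
y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k
         + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
         + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl},
\qquad \varepsilon_{ijkl} \sim \mathcal{N}(0, \sigma^2)
```

Here i indexes gender (2 levels), j risk (2 levels), k treatment (3 levels) and l the 6 participants in each cell.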

Descriptive statistics

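library(tidyverse)  # pipe (%>%), group_by(), filter()
library(rstatix)    # get_summary_stats(), anova_test(), emmeans_test(), ...
data("headache", package = "datarium")
headache %>%
  group_by(gender, risk, treatment) %>%
  get_summary_stats(pain_score, type = "mean_sd")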
## # A tibble: 12 × 7
##    gender risk  treatment variable       n  mean    sd
##    <fct>  <fct> <fct>     <fct>      <dbl> <dbl> <dbl>
##  1 male   high  X         pain_score     6  92.7  5.12
##  2 male   high  Y         pain_score     6  82.3  5.00
##  3 male   high  Z         pain_score     6  79.7  4.05
##  4 male   low   X         pain_score     6  76.1  3.86
##  5 male   low   Y         pain_score     6  73.1  4.76
##  6 male   low   Z         pain_score     6  74.5  4.89
##  7 female high  X         pain_score     6  78.9  5.32
##  8 female high  Y         pain_score     6  81.2  4.62
##  9 female high  Z         pain_score     6  81.0  3.98
## 10 female low   X         pain_score     6  74.2  3.69
## 11 female low   Y         pain_score     6  68.4  4.08
## 12 female low   Z         pain_score     6  69.8  2.72

Assumptions

Outliers

headache %>%
  group_by(gender, risk, treatment) %>%
  identify_outliers(pain_score)
## # A tibble: 4 × 7
##   gender risk  treatment    id pain_score is.outlier is.extreme
##   <fct>  <fct> <fct>     <int>      <dbl> <lgl>      <lgl>     
## 1 female high  X            57       68.4 TRUE       TRUE      
## 2 female high  Y            62       73.1 TRUE       FALSE     
## 3 female high  Z            67       75.0 TRUE       FALSE     
## 4 female high  Z            71       87.1 TRUE       FALSE
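
Four observations are flagged, all among females at high risk, and one of them (id 57) is extreme. As far as we know, identify_outliers() follows the boxplot convention: values more than 1.5 × IQR beyond the quartiles are outliers, and values more than 3 × IQR beyond them are extreme. The base-R sketch below spells out that rule under this assumption; flag_outliers() is a hypothetical helper and the scores are made-up numbers, not taken from rstatix or from the headache data.

```r
# Hypothetical helper: boxplot-rule outlier flags for a numeric vector
flag_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  q1 <- q[[1]]
  q3 <- q[[2]]
  iqr <- q3 - q1
  data.frame(
    value      = x,
    # beyond 1.5 * IQR from the quartiles
    is.outlier = x < q1 - 1.5 * iqr | x > q3 + 1.5 * iqr,
    # beyond 3 * IQR from the quartiles
    is.extreme = x < q1 - 3 * iqr | x > q3 + 3 * iqr
  )
}

# Made-up scores for illustration only
scores <- c(10, 11, 12, 13, 14, 15, 16, 23, 40)
res <- flag_outliers(scores)
res
```

With these numbers the quartiles are 12 and 16, so 23 is an outlier without being extreme, while 40 is both.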

Normality

headache %>%
  group_by(gender, risk, treatment) %>%
  shapiro_test(pain_score)
## # A tibble: 12 × 6
##    gender risk  treatment variable   statistic       p
##    <fct>  <fct> <fct>     <chr>          <dbl>   <dbl>
##  1 male   high  X         pain_score     0.958 0.808  
##  2 male   high  Y         pain_score     0.902 0.384  
##  3 male   high  Z         pain_score     0.955 0.784  
##  4 male   low   X         pain_score     0.982 0.962  
##  5 male   low   Y         pain_score     0.920 0.507  
##  6 male   low   Z         pain_score     0.924 0.535  
##  7 female high  X         pain_score     0.714 0.00869
##  8 female high  Y         pain_score     0.939 0.654  
##  9 female high  Z         pain_score     0.971 0.901  
## 10 female low   X         pain_score     0.933 0.600  
## 11 female low   Y         pain_score     0.927 0.555  
## 12 female low   Z         pain_score     0.958 0.801

The Shapiro-Wilk test gives no evidence against normality (p > 0.05) in any cell except females at high risk receiving treatment X (p = 0.009), the same cell that contains the extreme outlier (id 57).

Homogeneity of variance

headache %>%
  levene_test(pain_score ~ gender * risk * treatment)
## # A tibble: 1 × 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1    11    60     0.179 0.998
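
Levene's test is not significant (F(11, 60) = 0.179, p = 0.998), so we have no evidence that the variances differ across the 12 cells. To show what the test does, here is a minimal base-R sketch, assuming the median-centred variant (which we believe is the levene_test() default): it is a one-way ANOVA on the absolute deviations of each observation from its group median. levene_sketch() is a hypothetical helper and the data are made up for illustration.

```r
# Hypothetical helper: Levene's test as an ANOVA on absolute deviations
levene_sketch <- function(y, group) {
  group <- factor(group)
  # absolute deviation of each value from its own group median
  dev <- abs(y - ave(y, group, FUN = median))
  tab <- anova(lm(dev ~ group))
  c(df1 = tab$Df[1], df2 = tab$Df[2],
    statistic = tab$`F value`[1], p = tab$`Pr(>F)`[1])
}

# Made-up data: three groups with increasing spread
y     <- c(1, 2, 3,  2, 4, 6,  1, 5, 9)
group <- rep(c("a", "b", "c"), each = 3)
lev <- levene_sketch(y, group)
lev
```

For the headache data the grouping factor would be the 12-level combination of gender, risk and treatment, which gives the 11 and 60 degrees of freedom shown above.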

ANOVA

results <- headache %>% anova_test(pain_score ~ gender * risk * treatment)
results
## ANOVA Table (type II tests)
## 
##                  Effect DFn DFd      F                 p p<.05   ges
## 1                gender   1  60 16.196 0.000163000000000     * 0.213
## 2                  risk   1  60 92.699 0.000000000000088     * 0.607
## 3             treatment   2  60  7.318 0.001000000000000     * 0.196
## 4           gender:risk   1  60  0.141 0.708000000000000       0.002
## 5      gender:treatment   2  60  3.338 0.042000000000000     * 0.100
## 6        risk:treatment   2  60  0.713 0.494000000000000       0.023
## 7 gender:risk:treatment   2  60  7.406 0.001000000000000     * 0.198
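
The ges column is the generalized eta squared effect size. In a purely between-subjects design such as this one it should reduce to SS_effect / (SS_effect + SS_error), which can be recovered from F and the degrees of freedom as F·DFn / (F·DFn + DFd). The sketch below checks this against the table; the F values are copied from the output above, and the column name F_value is our own.

```r
# Values copied from the ANOVA table above
anova_tab <- data.frame(
  Effect  = c("gender", "risk", "treatment", "gender:risk",
              "gender:treatment", "risk:treatment", "gender:risk:treatment"),
  DFn     = c(1, 1, 2, 1, 2, 2, 2),
  DFd     = 60,
  F_value = c(16.196, 92.699, 7.318, 0.141, 3.338, 0.713, 7.406)
)

# SS_effect / SS_error = F * DFn / DFd, hence:
anova_tab$ges <- with(anova_tab, F_value * DFn / (F_value * DFn + DFd))
round(anova_tab$ges, 3)
```

The rounded values agree with the ges column, which suggests this is the formula being applied here.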

Post-hoc tests

If there is a significant three-way interaction effect, as there is here (gender:risk:treatment, F(2, 60) = 7.41, p = 0.001), you can decompose it into:

  • Simple two-way interactions: test the two-way interaction at each level of the third variable,
  • Simple simple main effects: fit a one-way model at each level of the second variable (within each level of the third),
  • Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.

If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.

Simple two-way interactions

# Full three-way model; its residual error is reused as the pooled error term below
model <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(gender) %>%
  anova_test(pain_score ~ risk*treatment, error = model)
## # A tibble: 6 × 8
##   gender Effect           DFn   DFd      F             p `p<.05`   ges
## * <fct>  <chr>          <dbl> <dbl>  <dbl>         <dbl> <chr>   <dbl>
## 1 male   risk               1    60 50.0   0.00000000187 "*"     0.455
## 2 male   treatment          2    60 10.2   0.000157      "*"     0.253
## 3 male   risk:treatment     2    60  5.25  0.008         "*"     0.149
## 4 female risk               1    60 42.8   0.000000015   "*"     0.416
## 5 female treatment          2    60  0.482 0.62          ""      0.016
## 6 female risk:treatment     2    60  2.87  0.065         ""      0.087
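
Because the risk:treatment interaction is tested twice, once per gender, a common convention is to judge each test against a Bonferroni-adjusted significance level. The original analysis does not state which threshold it uses, so the adjustment below is an assumption; the p-values are copied from the table above.

```r
# p-values of the simple two-way interaction risk:treatment, from the table above
p_values <- c(male = 0.008, female = 0.065)

# Assumption: Bonferroni correction over the two tests (one per gender)
alpha_adjusted <- 0.05 / length(p_values)

# TRUE where the simple two-way interaction is significant at the adjusted level
significant <- p_values < alpha_adjusted
significant
```

Under this convention the conclusion is the same as at the 0.05 level: the interaction is significant for males but not for females.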

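Simple simple main effects

For males, the simple two-way interaction between risk and treatment is statistically significant (F(2, 60) = 5.25, p = 0.008); for females it is not (F(2, 60) = 2.87, p = 0.065). We therefore test the effect of treatment within each combination of gender and risk. In the output below, treatment has a statistically significant effect only for males at high risk (F(2, 60) = 14.8, p < 0.0001); the other three combinations give no evidence of a treatment effect (p > 0.05).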

model <- lm(pain_score ~ gender*risk*treatment, data = headache)
headache %>%
  group_by(gender, risk) %>%
  anova_test(pain_score ~ treatment, error = model)
## # A tibble: 4 × 9
##   gender risk  Effect      DFn   DFd     F         p `p<.05`   ges
## * <fct>  <fct> <chr>     <dbl> <dbl> <dbl>     <dbl> <chr>   <dbl>
## 1 male   high  treatment     2    60 14.8  0.0000061 "*"     0.33 
## 2 male   low   treatment     2    60  0.66 0.521     ""      0.022
## 3 female high  treatment     2    60  0.52 0.597     ""      0.017
## 4 female low   treatment     2    60  2.83 0.067     ""      0.086
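
Passing error = model means every F statistic is referred to the residual degrees of freedom of the full model (DFd = 60) rather than to those of the six-observation-per-treatment subset. As a rough check, the p-values can be recomputed from the rounded F values in the table with base R's pf(); small discrepancies are expected because the F values shown are rounded.

```r
# Rounded F statistics copied from the table above
f_values <- c(male_high = 14.8, male_low = 0.66,
              female_high = 0.52, female_low = 2.83)

# Upper-tail probability of F with 2 numerator and 60 denominator df
p_from_f <- pf(f_values, df1 = 2, df2 = 60, lower.tail = FALSE)
signif(p_from_f, 3)
```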

Pairwise comparisons

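We compare the treatments pairwise within the only combination that showed a significant treatment effect, males at high risk. Among these participants, treatment X differs significantly from both Y and Z after Bonferroni adjustment (adjusted p < 0.001), whereas Y and Z do not differ (adjusted p = 0.897). The descriptive statistics show the direction: the mean pain score is 92.7 under X against 82.3 under Y and 79.7 under Z.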

pwc <- headache %>%
  group_by(gender, risk) %>%
  emmeans_test(pain_score ~ treatment, p.adjust.method = "bonferroni")

# Results for male at high risk
pwc %>% filter(gender == "male", risk == "high")
## # A tibble: 3 × 11
##   gender risk  term    .y.   group1 group2    df stati…¹       p   p.adj p.adj…²
##   <chr>  <chr> <chr>   <chr> <chr>  <chr>  <dbl>   <dbl>   <dbl>   <dbl> <chr>  
## 1 male   high  treatm… pain… X      Y         60    4.09 1.29e-4 3.86e-4 ***    
## 2 male   high  treatm… pain… X      Z         60    5.14 3.14e-6 9.42e-6 ****   
## 3 male   high  treatm… pain… Y      Z         60    1.05 2.99e-1 8.97e-1 ns     
## # … with abbreviated variable names ¹​statistic, ²​p.adj.signif
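
The p.adj column can be reproduced from the raw p-values: with three comparisons per group, the Bonferroni adjustment multiplies each p-value by 3 and caps the result at 1. A short base-R check, using the rounded p-values printed above:

```r
# Raw p-values for X-Y, X-Z and Y-Z, copied from the output above
p_raw <- c(X_vs_Y = 1.29e-4, X_vs_Z = 3.14e-6, Y_vs_Z = 2.99e-1)

# Bonferroni: multiply by the number of comparisons, cap at 1
p_adj <- p.adjust(p_raw, method = "bonferroni")
signif(p_adj, 3)
```

The results match the p.adj column up to rounding of the printed inputs.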