3-way Anova

3-Way ANOVA

The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.

We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.

We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.

Descriptive statistics

## # A tibble: 12 x 7
##    gender risk  treatment variable       n  mean    sd
##    <fct>  <fct> <fct>     <chr>      <dbl> <dbl> <dbl>
##  1 male   high  X         pain_score     6  92.7  5.12
##  2 male   high  Y         pain_score     6  82.3  5.00
##  3 male   high  Z         pain_score     6  79.7  4.05
##  4 male   low   X         pain_score     6  76.1  3.86
##  5 male   low   Y         pain_score     6  73.1  4.76
##  6 male   low   Z         pain_score     6  74.5  4.89
##  7 female high  X         pain_score     6  78.9  5.32
##  8 female high  Y         pain_score     6  81.2  4.62
##  9 female high  Z         pain_score     6  81.0  3.98
## 10 female low   X         pain_score     6  74.2  3.69
## 11 female low   Y         pain_score     6  68.4  4.08
## 12 female low   Z         pain_score     6  69.8  2.72

Assumptions

Outliers

## # A tibble: 4 x 7
##   gender risk  treatment    id pain_score is.outlier is.extreme
##   <fct>  <fct> <fct>     <int>      <dbl> <lgl>      <lgl>     
## 1 female high  X            57       68.4 TRUE       TRUE      
## 2 female high  Y            62       73.1 TRUE       FALSE     
## 3 female high  Z            67       75.0 TRUE       FALSE     
## 4 female high  Z            71       87.1 TRUE       FALSE

There is one extreme outlier, which is a female with id = 57, with high risk and treatment X

Normality

## # A tibble: 1 x 3
##   variable         statistic p.value
##   <chr>                <dbl>   <dbl>
## 1 residuals(model)     0.982   0.398
## # A tibble: 12 x 6
##    gender risk  treatment variable   statistic       p
##    <fct>  <fct> <fct>     <chr>          <dbl>   <dbl>
##  1 male   high  X         pain_score     0.958 0.808  
##  2 male   high  Y         pain_score     0.902 0.384  
##  3 male   high  Z         pain_score     0.955 0.784  
##  4 male   low   X         pain_score     0.982 0.962  
##  5 male   low   Y         pain_score     0.920 0.507  
##  6 male   low   Z         pain_score     0.924 0.535  
##  7 female high  X         pain_score     0.714 0.00869
##  8 female high  Y         pain_score     0.939 0.654  
##  9 female high  Z         pain_score     0.971 0.901  
## 10 female low   X         pain_score     0.933 0.600  
## 11 female low   Y         pain_score     0.927 0.555  
## 12 female low   Z         pain_score     0.958 0.801

Only one group is not normally distributed, which is group of females with high risk and treatment X.

As it can be observed on the plot, data in all groups is normally distributed, except for the group of females with high risk and treatment X (as was already identified by Shapiro test and an outlier was found in this group).

Homogeneity of variance

## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1    11    60     0.179 0.998

Obtained p-value is bigger than 0.05, hence test is not significant. So, we can assume homogenity of variance in different groups.

Anova

## Analysis of Variance Table
## 
## Response: pain_score
##                       Df Sum Sq Mean Sq F value            Pr(>F)    
## gender                 1    313     313   16.20           0.00016 ***
## risk                   1   1794    1794   92.70 0.000000000000088 ***
## treatment              2    283     142    7.32           0.00143 ** 
## gender:risk            1      3       3    0.14           0.70849    
## gender:treatment       2    129      65    3.34           0.04220 *  
## risk:treatment         2     28      14    0.71           0.49422    
## gender:risk:treatment  2    287     143    7.41           0.00133 ** 
## Residuals             60   1161      19                              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Coefficient covariances computed by hccm()
## ANOVA Table (type II tests)
## 
##                  Effect DFn DFd      F                 p p<.05   ges
## 1                gender   1  60 16.196 0.000163000000000     * 0.213
## 2                  risk   1  60 92.699 0.000000000000088     * 0.607
## 3             treatment   2  60  7.318 0.001000000000000     * 0.196
## 4           gender:risk   1  60  0.141 0.708000000000000       0.002
## 5      gender:treatment   2  60  3.338 0.042000000000000     * 0.100
## 6        risk:treatment   2  60  0.713 0.494000000000000       0.023
## 7 gender:risk:treatment   2  60  7.406 0.001000000000000     * 0.198

There is a statistically significant interaction between gender, risk and treatment, where F(2,60) = 7.406 and p = 0.001.

Post-hoc tests

If there is a significant 3-way interaction effect, you can decompose it into:

  • Simple two-way interaction: run two-way interaction at each level of third variable,
  • Simple simple main effect: run one-way model at each level of second variable,
  • Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.

If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.

Two-way interactions

## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## # A tibble: 6 x 8
##   gender Effect           DFn   DFd      F             p `p<.05`   ges
## * <fct>  <chr>          <dbl> <dbl>  <dbl>         <dbl> <chr>   <dbl>
## 1 male   risk               1    60 50.0   0.00000000187 "*"     0.455
## 2 male   treatment          2    60 10.2   0.000157      "*"     0.253
## 3 male   risk:treatment     2    60  5.25  0.008         "*"     0.149
## 4 female risk               1    60 42.8   0.000000015   "*"     0.416
## 5 female treatment          2    60  0.482 0.62          ""      0.016
## 6 female risk:treatment     2    60  2.87  0.065         ""      0.087

We can observe statistically significant to way interaction between risk and treatment for males, with F(2,60) = 5.252 and p = 0.008.

Main effects

## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## Coefficient covariances computed by hccm()
## # A tibble: 4 x 9
##   gender risk  Effect      DFn   DFd     F         p `p<.05`   ges
## * <fct>  <fct> <chr>     <dbl> <dbl> <dbl>     <dbl> <chr>   <dbl>
## 1 male   high  treatment     2    60 14.8  0.0000061 "*"     0.33 
## 2 male   low   treatment     2    60  0.66 0.521     ""      0.022
## 3 female high  treatment     2    60  0.52 0.597     ""      0.017
## 4 female low   treatment     2    60  2.83 0.067     ""      0.086

There is a statistically significant main effect of treatment for males with high risk, with F(2,60) = 14.77 and p = 0.0000061.

So, for this group the type of treatment has a significant impact on the pain score.

Pairwise comparisons

## # A tibble: 12 x 11
##    gender risk  term      .y.     group1 group2    df statistic        p   p.adj
##  * <chr>  <chr> <chr>     <chr>   <chr>  <chr>  <dbl>     <dbl>    <dbl>   <dbl>
##  1 female high  treatment pain_s~ X      Y         60   -0.910   3.67e-1 1   e+0
##  2 female high  treatment pain_s~ X      Z         60   -0.855   3.96e-1 1   e+0
##  3 female high  treatment pain_s~ Y      Z         60    0.0552  9.56e-1 1   e+0
##  4 female low   treatment pain_s~ X      Y         60    2.28    2.61e-2 7.82e-2
##  5 female low   treatment pain_s~ X      Z         60    1.72    9.00e-2 2.70e-1
##  6 female low   treatment pain_s~ Y      Z         60   -0.558   5.79e-1 1   e+0
##  7 male   high  treatment pain_s~ X      Y         60    4.09    1.29e-4 3.86e-4
##  8 male   high  treatment pain_s~ X      Z         60    5.14    3.14e-6 9.42e-6
##  9 male   high  treatment pain_s~ Y      Z         60    1.05    2.99e-1 8.97e-1
## 10 male   low   treatment pain_s~ X      Y         60    1.15    2.56e-1 7.68e-1
## 11 male   low   treatment pain_s~ X      Z         60    0.628   5.32e-1 1   e+0
## 12 male   low   treatment pain_s~ Y      Z         60   -0.519   6.06e-1 1   e+0
## # ... with 1 more variable: p.adj.signif <chr>
## # A tibble: 3 x 9
##   gender risk  treatment emmean    se    df conf.low conf.high method      
##   <fct>  <fct> <fct>      <dbl> <dbl> <dbl>    <dbl>     <dbl> <chr>       
## 1 male   high  X           92.7  1.80    60     89.1      96.3 Emmeans test
## 2 male   high  Y           82.3  1.80    60     78.7      85.9 Emmeans test
## 3 male   high  Z           79.7  1.80    60     76.1      83.3 Emmeans test

There is statistically significant difference between those pairs of treatment: - X-Y: p.adj = 0.00038591 - X-Z: p.adj = 0.00000942

## # A tibble: 3 x 9
##   .y.        group1 group2    n1    n2      p p.signif p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>  <dbl> <chr>    <dbl> <chr>       
## 1 pain_score X      Y         24    24 0.0514 ns       0.154 ns          
## 2 pain_score X      Z         24    24 0.0505 ns       0.152 ns          
## 3 pain_score Y      Z         24    24 0.994  ns       1     ns
## # A tibble: 1 x 9
##   .y.        group1 group2    n1    n2      p p.signif  p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>  <dbl> <chr>     <dbl> <chr>       
## 1 pain_score male   female    36    36 0.0172 *        0.0172 *
## # A tibble: 1 x 9
##   .y.        group1 group2    n1    n2        p p.signif    p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>    <dbl> <chr>       <dbl> <chr>       
## 1 pain_score high   low       36    36 1.24e-10 ****     1.24e-10 ****

#Summary

As it was stated before, the distribution is normal and we can assume homogenity of variances.

  • There is a statistically significant interaction between gender, risk and treatment, where F(2,60) = 7.406 and p = 0.001.

  • There is statistically significant two way interaction between risk and treatment for males, with F(2,60) = 5.252 and p = 0.008.

  • There is a statistically significant main effect of treatment for males with high risk, with F(2,60) = 14.77 and p = 0.0000061.

There was observed one extreme outlier, and it is a female with id = 57, with high risk and treatment X.