Statistical Inference

3-way Anova

3-Way ANOVA

The three-way ANOVA is an extension of the two-way ANOVA for assessing whether there is an interaction effect between three independent categorical variables on a continuous outcome variable.

We’ll use the headache dataset [datarium package], which contains the measures of migraine headache episode pain score in 72 participants treated with three different treatments. The participants include 36 males and 36 females. Males and females were further subdivided into whether they were at low or high risk of migraine.

We want to understand how each independent variable (type of treatments, risk of migraine and gender) interact to predict the pain score.

Descriptive statistics

We plot the data in order to check that your assumptions look okay.

Histogram:

ggplot(headache, aes(x = pain_score)) + 
  geom_histogram(bins = 10, fill = "blue", color = "black", alpha = 0.7) + 
  facet_grid(treatment ~ risk * gender) + 
  labs(title = "Pain Score distribution",
       x = "Pain Score ",
       y = "Frequency") +
  theme_classic()

Boxplot:

attach(headache)

  bxp <- ggboxplot(headache, y = "pain_score", x = "gender", color = "risk", palette = "jco", facet.by = c("treatment")) +
  geom_jitter(aes(color = risk)) +
  labs(title = "Dependency of pain score in relation to other factors",
       x = "Gender",
       y = "Pain Score") +
  theme_light()


bxp

At first glance we can see how there is a significant difference in the risk of men who have taken treatment X. Or also the difference in the risk of women who have taken treatment Y.

xtabs(~ treatment + risk + gender, data = headache)

## , , gender = male
## 
##          risk
## treatment high low
##         X    6   6
##         Y    6   6
##         Z    6   6
## 
## , , gender = female
## 
##          risk
## treatment high low
##         X    6   6
##         Y    6   6
##         Z    6   6

Compute the mean and the SD (standard deviation) of the score by groups:

headache %>%
  group_by(gender, risk, treatment) %>%
  get_summary_stats(pain_score, type = "mean_sd")

## # A tibble: 12 × 7
##    gender risk  treatment variable       n  mean    sd
##    <fct>  <fct> <fct>     <fct>      <dbl> <dbl> <dbl>
##  1 male   high  X         pain_score     6  92.7  5.12
##  2 male   high  Y         pain_score     6  82.3  5.00
##  3 male   high  Z         pain_score     6  79.7  4.05
##  4 male   low   X         pain_score     6  76.1  3.86
##  5 male   low   Y         pain_score     6  73.1  4.76
##  6 male   low   Z         pain_score     6  74.5  4.89
##  7 female high  X         pain_score     6  78.9  5.32
##  8 female high  Y         pain_score     6  81.2  4.62
##  9 female high  Z         pain_score     6  81.0  3.98
## 10 female low   X         pain_score     6  74.2  3.69
## 11 female low   Y         pain_score     6  68.4  4.08
## 12 female low   Z         pain_score     6  69.8  2.72

ASSUMPTIONS

Outliers

 remove_outliers <- function(data, group_vars, outlier_var) {
  # Identify outliers
  outliers <- data %>%
    group_by(across(all_of(group_vars))) %>%
    identify_outliers(!!sym(outlier_var))

  # Remove outliers from the original dataset
  data_clean <- data %>%
    anti_join(outliers, by = c(group_vars, outlier_var))

  return(data_clean)
}


#Remove outliers as they can distort the estimates and affect interpretation.´Now, let's work with that new data.
headache <- remove_outliers(headache, c("gender", "risk", "treatment"), "pain_score")
headache

## # A tibble: 68 × 5
##       id gender risk  treatment pain_score
##    <int> <fct>  <fct> <fct>          <dbl>
##  1     1 male   low   X               79.3
##  2     2 male   low   X               76.8
##  3     3 male   low   X               70.8
##  4     4 male   low   X               81.2
##  5     5 male   low   X               75.1
##  6     6 male   low   X               73.1
##  7     7 male   low   Y               68.2
##  8     8 male   low   Y               80.7
##  9     9 male   low   Y               75.3
## 10    10 male   low   Y               73.4
## # ℹ 58 more rows

There are 4 outliers, that are clearly seen when looking at the boxplot.

Normality

grouped_data <- headache %>%
  group_by(gender, risk, treatment)

normality_test <- shapiro_test(grouped_data, pain_score)

table <- normality_test %>%
  kbl() %>%
  kable_material_dark()
table

gender	risk	treatment	variable	statistic	p
male	high	X	pain_score	0.958	0.808
male	high	Y	pain_score	0.902	0.384
male	high	Z	pain_score	0.955	0.784
male	low	X	pain_score	0.982	0.962
male	low	Y	pain_score	0.920	0.507
male	low	Z	pain_score	0.924	0.535
female	high	X	pain_score	0.922	0.543
female	high	Y	pain_score	0.978	0.923
female	high	Z	pain_score	0.977	0.882
female	low	X	pain_score	0.933	0.600
female	low	Y	pain_score	0.927	0.555
female	low	Z	pain_score	0.958	0.801

colors <- c("male" = "blue", "female" = "pink")
ggqqplot(headache, x = "pain_score", 
         color = "gender", 
         shape = "gender",
         fill = "gender", 
         title = "QQ Plot of Pain Scores by Gender and Treatment",
         caption = "Data source: headache") +
  scale_color_manual(values = colors) + 
  theme_dark() +
  theme(plot.title = element_text(hjust = 0.5)) + 
  facet_grid(risk ~ gender + treatment, scales = "free")

We perform the shapiro test on all possible combinations and make a table that summarizes the information. We see that the pvalue is greater than 0.05 in all cases.Therefore, we assume normality. In addition, we can see normality with the qq plot graphs.

Homogeneity of variance

levene_test_result <- headache %>%
  levene_test(pain_score~gender*treatment*risk)

styled_table <- levene_test_result %>%
  kable("html") %>%
  kable_styling("striped", full_width = FALSE) %>%
  add_header_above(c(" " = 2, "Levene's Test" = 2)) %>%
  row_spec(0, bold = T, color = "white", background = "#1a1a1a") %>%
  column_spec(1:4, bold = T, color = "black", background = "#add8e6")
styled_table

		Levene’s Test
df1	df2	statistic	p
11	56	0.896	0.55

model <- lm(pain_score ~ gender * treatment * risk, data = headache)
summary(model)

## 
## Call:
## lm(formula = pain_score ~ gender * treatment * risk, data = headache)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.445 -2.914  0.174  2.129  8.837 
## 
## Coefficients:
##                                 Estimate Std. Error t value
## (Intercept)                        92.74       1.62   57.40
## genderfemale                      -11.77       2.40   -4.91
## treatmentY                        -10.40       2.29   -4.55
## treatmentZ                        -13.06       2.29   -5.71
## risklow                           -16.69       2.29   -7.30
## genderfemale:treatmentY            12.21       3.39    3.60
## genderfemale:treatmentZ            13.11       3.50    3.74
## genderfemale:risklow                9.88       3.31    2.98
## treatmentY:risklow                  7.48       3.23    2.32
## treatmentZ:risklow                 11.46       3.23    3.55
## genderfemale:treatmentY:risklow   -15.09       4.68   -3.22
## genderfemale:treatmentZ:risklow   -15.89       4.77   -3.33
##                                             Pr(>|t|)    
## (Intercept)                     < 0.0000000000000002 ***
## genderfemale                            0.0000081928 ***
## treatmentY                              0.0000292646 ***
## treatmentZ                              0.0000004407 ***
## risklow                                 0.0000000011 ***
## genderfemale:treatmentY                      0.00067 ***
## genderfemale:treatmentZ                      0.00043 ***
## genderfemale:risklow                         0.00422 ** 
## treatmentY:risklow                           0.02424 *  
## treatmentZ:risklow                           0.00080 ***
## genderfemale:treatmentY:risklow              0.00212 ** 
## genderfemale:treatmentZ:risklow              0.00152 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.96 on 56 degrees of freedom
## Multiple R-squared:  0.769,  Adjusted R-squared:  0.723 
## F-statistic: 16.9 on 11 and 56 DF,  p-value: 0.0000000000000476

par(mfrow = c(2, 2))
plot(model)

headache$predicted_pain_score <- predict(model, newdata = headache)

We prove homogeneity of variances with Levene’s test. As pvalue is greater than 0.05, the variances are homogeneous, i.e, there is no difference between variances across groups. We can alse see it graphically with the linear regression graphs.

Anova

results <- aov(data=headache,pain_score ~ gender*risk*treatment)
anova(results)

## Analysis of Variance Table
## 
## Response: pain_score
##                       Df Sum Sq Mean Sq F value             Pr(>F)    
## gender                 1    301     301   19.22 0.0000518047511780 ***
## risk                   1   1897    1897  121.07 0.0000000000000013 ***
## treatment              2    352     176   11.24 0.0000784919250981 ***
## gender:risk            1      1       1    0.04             0.8516    
## gender:treatment       2     82      41    2.62             0.0817 .  
## risk:treatment         2     57      28    1.81             0.1727    
## gender:risk:treatment  2    226     113    7.22             0.0016 ** 
## Residuals             56    877      16                               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We performed an anova test. With this table the most important thing we see is that there is interaction between the three variables (the pvalue is greater than 0.05). We study the rest of the relationships later descomposing the interaction effect.

Post-hoc tests

If there is a significant 3-way interaction effect, you can decompose it into:

Simple two-way interaction: run two-way interaction at each level of third variable,
Simple simple main effect: run one-way model at each level of second variable,
Simple simple pairwise comparisons: run pairwise or other post-hoc comparisons if necessary.

If you do not have a statistically significant three-way interaction, you need to determine whether you have any statistically significant two-way interaction from the ANOVA output. You can follow up a significant two-way interaction by simple main effects analyses and pairwise comparisons between groups if necessary.

Two-way interactions

modelgen <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(gender) %>%
  anova_test(pain_score~ risk*treatment, error = modelgen)

## # A tibble: 6 × 8
##   gender Effect           DFn   DFd     F        p `p<.05`   ges
## * <fct>  <chr>          <dbl> <dbl> <dbl>    <dbl> <chr>   <dbl>
## 1 male   risk               1    56 61.8  1.31e-10 "*"     0.525
## 2 male   treatment          2    56 12.6  3.10e- 5 "*"     0.31 
## 3 male   risk:treatment     2    56  6.49 3   e- 3 "*"     0.188
## 4 female risk               1    56 58.6  2.90e-10 "*"     0.511
## 5 female treatment          2    56  1.28 2.85e- 1 ""      0.044
## 6 female risk:treatment     2    56  2.54 8.8 e- 2 ""      0.083

modelrisk <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(risk) %>%
  anova_test(pain_score~ gender*treatment, error = modelrisk)

## # A tibble: 6 × 8
##   risk  Effect             DFn   DFd     F        p `p<.05`   ges
## * <fct> <chr>            <dbl> <dbl> <dbl>    <dbl> <chr>   <dbl>
## 1 high  gender               1    56 6.23  0.016    "*"     0.1  
## 2 high  treatment            2    56 9.46  0.000288 "*"     0.253
## 3 high  gender:treatment     2    56 9.10  0.000378 "*"     0.245
## 4 low   gender               1    56 8.22  0.006    "*"     0.128
## 5 low   treatment            2    56 3.80  0.028    "*"     0.119
## 6 low   gender:treatment     2    56 0.512 0.602    ""      0.018

modeltreat <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(treatment) %>%
  anova_test(pain_score~ gender*risk, error = modeltreat)

## # A tibble: 9 × 8
##   treatment Effect        DFn   DFd     F             p `p<.05`   ges
## * <fct>     <chr>       <dbl> <dbl> <dbl>         <dbl> <chr>   <dbl>
## 1 X         gender          1    56 15.9  0.000194      "*"     0.221
## 2 X         risk            1    56 52.5  0.00000000136 "*"     0.484
## 3 X         gender:risk     1    56  8.90 0.004         "*"     0.137
## 4 Y         gender          1    56  1.92 0.171         ""      0.033
## 5 Y         risk            1    56 49.9  0.00000000268 "*"     0.471
## 6 Y         gender:risk     1    56  2.48 0.121         ""      0.042
## 7 Z         gender          1    56  1.38 0.245         ""      0.024
## 8 Z         risk            1    56 21.5  0.0000216     "*"     0.277
## 9 Z         gender:risk     1    56  3.08 0.085         ""      0.052

Now, we run two-way interaction at each level of third variable.

Table 1: Fix gender. It is observed that risk significantly influences both men and women (pvalue greater than 0.05). The treatment influences men, but not women. Furthermore, the risk:treatment interaction in men shows a statistically significant effect on pain_score.

Table 2: Fix risk. It is observed that gender significantly influences both high and low risk people, and the same with treatment (pvalue greater than 0.05). However, the gender:treatment interaction only shows a statistically significant effect on pain_score in males.

Table 3: Fix treatment. It is observed that level of risk significantly influences the 3 treatments. Gender only affects X treatment. Moreover, the gender:risk interaction only shows a statistically significant effect on X treatment.

Main effects

modelgen <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(gender,risk) %>%
  anova_test(pain_score~treatment, error = model)

## # A tibble: 4 × 9
##   gender risk  Effect      DFn   DFd      F           p `p<.05`   ges
## * <fct>  <fct> <chr>     <dbl> <dbl>  <dbl>       <dbl> <chr>   <dbl>
## 1 male   high  treatment     2    56 18.2   0.000000795 "*"     0.394
## 2 male   low   treatment     2    56  0.815 0.448       ""      0.028
## 3 female high  treatment     2    56  0.329 0.721       ""      0.012
## 4 female low   treatment     2    56  3.49  0.037       "*"     0.111

modelrisk <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(risk,treatment) %>%
  anova_test(pain_score~ gender, error = model)

## # A tibble: 6 × 9
##   risk  treatment Effect   DFn   DFd      F          p `p<.05`      ges
## * <fct> <fct>     <chr>  <dbl> <dbl>  <dbl>      <dbl> <chr>      <dbl>
## 1 high  X         gender     1    56 24.1   0.00000819 "*"     0.301   
## 2 high  Y         gender     1    56  0.034 0.855      ""      0.000601
## 3 high  Z         gender     1    56  0.275 0.602      ""      0.005   
## 4 low   X         gender     1    56  0.688 0.41       ""      0.012   
## 5 low   Y         gender     1    56  4.37  0.041      "*"     0.072   
## 6 low   Z         gender     1    56  4.19  0.045      "*"     0.07

modeltreat <- lm(pain_score ~ gender * risk * treatment, data = headache)
headache %>%
  group_by(treatment,gender) %>%
  anova_test(pain_score~ risk, error = model)

## # A tibble: 6 × 9
##   gender treatment Effect   DFn   DFd     F             p `p<.05`   ges
## * <fct>  <fct>     <chr>  <dbl> <dbl> <dbl>         <dbl> <chr>   <dbl>
## 1 male   X         risk       1    56 53.3  0.00000000109 *       0.488
## 2 female X         risk       1    56  8.07 0.006         *       0.126
## 3 male   Y         risk       1    56 16.2  0.000171      *       0.225
## 4 female Y         risk       1    56 36.2  0.000000143   *       0.393
## 5 male   Z         risk       1    56  5.23 0.026         *       0.085
## 6 female Z         risk       1    56 19.4  0.0000491     *       0.257

Table 1. Fix gender and risk. In the group of men with a high level of risk and in women with low level of risk, there is a statistically significant difference in the pain score depending on the treatment. Since we have 3 types of medications, with the code below we see which ones specifically differ.

Table 2. Fix risk and treatment. For the low level of risk and Y and Z of treatment, and for high level of risk and X treatment, there is a statistically significant difference in the response variable depending on gender.

Table 3. Fix gender and treatment. The treatment variable has a significant impact on the pain_score for different levels of the risk variable (X, Y, Z) in both genders, male and female.

pwc1 <- headache %>% 
  group_by(gender,risk) %>%
  emmeans_test(pain_score ~ treatment, p.adjust.method = "bonferroni") 
head(pwc1,3)

## # A tibble: 3 × 11
##   gender risk  term      .y.       group1 group2    df statistic       p   p.adj
##   <fct>  <fct> <chr>     <chr>     <chr>  <chr>  <dbl>     <dbl>   <dbl>   <dbl>
## 1 male   high  treatment pain_sco… X      Y         56      4.55 2.93e-5 8.78e-5
## 2 male   high  treatment pain_sco… X      Z         56      5.71 4.41e-7 1.32e-6
## 3 male   high  treatment pain_sco… Y      Z         56      1.16 2.49e-1 7.48e-1
## # ℹ 1 more variable: p.adj.signif <chr>

pwc2 <- headache %>% 
  group_by(risk,treatment) %>%
  emmeans_test(pain_score ~ gender, p.adjust.method = "bonferroni") 
pwc2

## # A tibble: 6 × 11
##   risk  treatment term   .y.       group1 group2    df statistic       p   p.adj
## * <fct> <fct>     <chr>  <chr>     <chr>  <chr>  <dbl>     <dbl>   <dbl>   <dbl>
## 1 high  X         gender pain_sco… male   female    56     4.91  8.19e-6 8.19e-6
## 2 high  Y         gender pain_sco… male   female    56    -0.184 8.55e-1 8.55e-1
## 3 high  Z         gender pain_sco… male   female    56    -0.524 6.02e-1 6.02e-1
## 4 low   X         gender pain_sco… male   female    56     0.830 4.10e-1 4.10e-1
## 5 low   Y         gender pain_sco… male   female    56     2.09  4.11e-2 4.11e-2
## 6 low   Z         gender pain_sco… male   female    56     2.05  4.54e-2 4.54e-2
## # ℹ 1 more variable: p.adj.signif <chr>

pwc3 <- headache %>% 
  group_by(treatment,gender) %>%
  emmeans_test(pain_score ~ risk, p.adjust.method = "bonferroni") 
head(pwc1,3)

## # A tibble: 3 × 11
##   gender risk  term      .y.       group1 group2    df statistic       p   p.adj
##   <fct>  <fct> <chr>     <chr>     <chr>  <chr>  <dbl>     <dbl>   <dbl>   <dbl>
## 1 male   high  treatment pain_sco… X      Y         56      4.55 2.93e-5 8.78e-5
## 2 male   high  treatment pain_sco… X      Z         56      5.71 4.41e-7 1.32e-6
## 3 male   high  treatment pain_sco… Y      Z         56      1.16 2.49e-1 7.48e-1
## # ℹ 1 more variable: p.adj.signif <chr>

In these tables we can see, within each group, which ones differ significantly.

Pairwise comparisons

pairwise1 <- headache %>%
  pairwise_t_test(
    pain_score ~ treatment, 
    p.adjust.method = "bonferroni"
    )
pairwise1

## # A tibble: 3 × 9
##   .y.        group1 group2    n1    n2      p p.signif  p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>  <dbl> <chr>     <dbl> <chr>       
## 1 pain_score X      Y         23    23 0.0358 *        0.107  ns          
## 2 pain_score X      Z         23    22 0.0196 *        0.0589 ns          
## 3 pain_score Y      Z         23    22 0.786  ns       1      ns

pairwise2 <- headache %>%
  pairwise_t_test(
    pain_score ~ risk, 
    p.adjust.method = "bonferroni"
    )
pairwise2

## # A tibble: 1 × 9
##   .y.        group1 group2    n1    n2        p p.signif    p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>    <dbl> <chr>       <dbl> <chr>       
## 1 pain_score high   low       32    36 3.22e-12 ****     3.22e-12 ****

pairwise3 <- headache %>%
  pairwise_t_test(
    pain_score ~ gender, 
    p.adjust.method = "bonferroni"
    )
pairwise3

## # A tibble: 1 × 9
##   .y.        group1 group2    n1    n2      p p.signif  p.adj p.adj.signif
## * <chr>      <chr>  <chr>  <int> <int>  <dbl> <chr>     <dbl> <chr>       
## 1 pain_score male   female    36    32 0.0199 *        0.0199 *