Two-way ANOVA

Analysis of variance with one continuous dependent variable and two categorical (factor) independent variables

  1. Open a R script named two_way_anova.R
  2. Keep the datasets.xlsx and two_way_anova.R in the same folder
  3. Set working directory
  4. Load required library
  5. Read the dataset with a name: rabbit
library(readxl)
library(car)
## Loading required package: carData
library(agricolae)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ dplyr::recode() masks car::recode()
## ✖ purrr::some()   masks car::some()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sjstats)
library(ggpubr)
# read the dataset
rabbit <- read_excel("datasets.xlsx", sheet = "anova", range = "J14:O77")
str(rabbit)
## tibble [63 × 6] (S3: tbl_df/tbl/data.frame)
##  $ sr.        : num [1:63] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Treatment  : chr [1:63] "Control" "Control" "Control" "Probiotic" ...
##  $ Replication: chr [1:63] "R1" "R2" "R3" "R1" ...
##  $ Day        : num [1:63] 0 0 0 0 0 0 0 0 0 15 ...
##  $ Weight     : num [1:63] 10.5 11 11.8 12 11.5 10.7 10 12.5 11.3 12 ...
##  $ Mortality  : num [1:63] 1 1 1 0 1 0 2 0 1 1 ...
# Dependent variable: Weight
# Independent variable: Treatment, Replication, Day = as.factor()
rabbit$Replication = as.factor(rabbit$Replication)
rabbit$Treatment = as.factor(rabbit$Treatment)
rabbit$Day = as.factor(rabbit$Day)
str(rabbit)
## tibble [63 × 6] (S3: tbl_df/tbl/data.frame)
##  $ sr.        : num [1:63] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Treatment  : Factor w/ 3 levels "Antibiotic","Control",..: 2 2 2 3 3 3 1 1 1 2 ...
##  $ Replication: Factor w/ 3 levels "R1","R2","R3": 1 2 3 1 2 3 1 2 3 1 ...
##  $ Day        : Factor w/ 7 levels "0","15","30",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Weight     : num [1:63] 10.5 11 11.8 12 11.5 10.7 10 12.5 11.3 12 ...
##  $ Mortality  : num [1:63] 1 1 1 0 1 0 2 0 1 1 ...
# anova model: aov(dv ~ iv + iv + ..., data)
# 3 independent variables: 3 ways anova
mod = aov(Weight ~ Replication + Treatment + Day, data = rabbit)
anova(mod)
## Analysis of Variance Table
## 
## Response: Weight
##             Df Sum Sq Mean Sq  F value    Pr(>F)    
## Replication  2    7.9    3.95   0.7489    0.4779    
## Treatment    2  535.3  267.65  50.7622 5.963e-13 ***
## Day          6 7687.2 1281.21 242.9953 < 2.2e-16 ***
## Residuals   52  274.2    5.27                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In this additive model, replications do not have any effects on the weight of the rabbits. So, local control was alright. Now, we will remove this replication from the model. However, we will add interaction term in the model.

# Interaction: Treatment + Day + Treatment:Day
# or Interaction: Treatment*Day
mod_int = aov(Weight ~ Treatment + Day + Treatment:Day, data = rabbit)
summary(mod_int)
##               Df Sum Sq Mean Sq F value   Pr(>F)    
## Treatment      2    535   267.6  243.35  < 2e-16 ***
## Day            6   7687  1281.2 1164.90  < 2e-16 ***
## Treatment:Day 12    236    19.7   17.87 8.87e-13 ***
## Residuals     42     46     1.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We see that Treatment (control, probiotic and antibiotic) and day intervals along with their interactions have significant effects on the rabbit weights. Let us see the interaction plot.

# interaction.plot(x.factor = rabbit$Day, trace.factor = rabbit$Treatment, response = rabbit$Weight)
# with(data, ............), does not need $ signs

with(rabbit, interaction.plot(Day, Treatment, Weight))

This figure shows that the effect of treatments increased with the increase in the day of observation. In the earlier days, the effects of the treatments were lower while in the later days the effects were larger. If there was no interaction effect, the lines would be parallel (not converging or diverging).

Assumptions of ANOVA: Normality of dependent variable. It is not usually checked. Because, when observations per group less than 10, insufficient observations to check normality. So, we can check the normality of the residuals.

boxplot(Weight ~ Day*Treatment, rabbit)

ggqqplot(rabbit, x = 'Weight', facet.by = c('Treatment', 'Day'))

We will now check the residuals (observed value - predicted value).

plot(mod_int, which = 2)

Now, check the equality of variances across the groups/factors. \(H_0: Variances~are~equal~for~all~factor~levels\)

car::leveneTest(Weight ~ Day*Treatment, rabbit)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group 20  0.5276 0.9375
##       42

Equality of variance of residuals can also be checked by the first diagnostic plot.

plot(mod_int, which = 1)

Few other important statistics of ANOVA

CV of dependent variable = 46.31%

CV of model: quality of model fit = 4.15% (very low level of errors in the model)

\(\eta^2\) = variance explained by the factor

Post-hoc analysis to see which means are different

# Create a function to calculate the CV
cv = function (x) {sd(x)/mean(x)*100}
# Calculate cv of the dependent variable
cv(rabbit$Weight)
## [1] 46.30991
# Model cv
# Model standard error = Residuals standard error = Residual standard deviation = 1.048733
mod_int
## Call:
##    aov(formula = Weight ~ Treatment + Day + Treatment:Day, data = rabbit)
## 
## Terms:
##                 Treatment      Day Treatment:Day Residuals
## Sum of Squares    535.292 7687.232       235.877    46.193
## Deg. of Freedom         2        6            12        42
## 
## Residual standard error: 1.048733
## Estimated effects may be unbalanced
# model cv = 4.15%
1.048733/mean(rabbit$Weight)*100
## [1] 4.146751

Variance explained by different factors (\(\eta^2\)): Treatment explained 6.3% of the variance in the weight of the rabbit and Day explained most of the variance (90.4%).

anova_stats(mod_int)
## etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f
## ------------------------------------------------------------------------
## 0.063 |         0.921 |   0.063 |           0.885 |     0.063 |    3.404
## 0.904 |         0.994 |   0.903 |           0.991 |     0.903 |   12.900
## 0.028 |         0.836 |   0.026 |           0.763 |     0.026 |    2.260
##       |               |         |                 |           |         
## 
## etasq |          term |    sumsq | df |   meansq | statistic | p.value | power
## ------------------------------------------------------------------------------
## 0.063 |     Treatment |  535.292 |  2 |  267.646 |   243.350 |  < .001 |     1
## 0.904 |           Day | 7687.232 |  6 | 1281.205 |  1164.900 |  < .001 |     1
## 0.028 | Treatment:Day |  235.877 | 12 |   19.656 |    17.872 |  < .001 |     1
##       |     Residuals |   46.193 | 42 |    1.100 |           |         |

Post hoc analysis with Tukey HSD

posthoc = HSD.test(mod_int, trt = c('Treatment', 'Day'), console = TRUE)
## 
## Study: mod_int ~ c("Treatment", "Day")
## 
## HSD Test for Weight 
## 
## Mean Square Error:  1.099841 
## 
## Treatment:Day,  means
## 
##                 Weight       std r        se  Min  Max   Q25  Q50   Q75
## Antibiotic:0  11.26667 1.2503333 3 0.6054864 10.0 12.5 10.65 11.3 11.90
## Antibiotic:15 13.73333 0.6658328 3 0.6054864 13.0 14.3 13.45 13.9 14.10
## Antibiotic:30 18.96667 0.2516611 3 0.6054864 18.7 19.2 18.85 19.0 19.10
## Antibiotic:45 24.80000 0.4358899 3 0.6054864 24.3 25.1 24.65 25.0 25.05
## Antibiotic:60 30.90000 1.5716234 3 0.6054864 29.2 32.3 30.20 31.2 31.75
## Antibiotic:75 38.16667 1.6653328 3 0.6054864 36.3 39.5 37.50 38.7 39.10
## Antibiotic:90 47.56667 0.6027714 3 0.6054864 47.0 48.2 47.25 47.5 47.85
## Control:0     11.10000 0.6557439 3 0.6054864 10.5 11.8 10.75 11.0 11.40
## Control:15    12.80000 0.7549834 3 0.6054864 12.0 13.5 12.45 12.9 13.20
## Control:30    15.63333 0.4041452 3 0.6054864 15.2 16.0 15.45 15.7 15.85
## Control:45    20.13333 1.0503968 3 0.6054864 19.1 21.2 19.60 20.1 20.65
## Control:60    24.30000 0.8000000 3 0.6054864 23.5 25.1 23.90 24.3 24.70
## Control:75    28.86667 2.1548395 3 0.6054864 27.2 31.3 27.65 28.1 29.70
## Control:90    36.10000 0.9000000 3 0.6054864 35.2 37.0 35.65 36.1 36.55
## Probiotic:0   11.40000 0.6557439 3 0.6054864 10.7 12.0 11.10 11.5 11.75
## Probiotic:15  15.50000 1.1789826 3 0.6054864 14.2 16.5 15.00 15.8 16.15
## Probiotic:30  20.46667 0.8736895 3 0.6054864 19.5 21.2 20.10 20.7 20.95
## Probiotic:45  26.50000 0.6557439 3 0.6054864 25.9 27.2 26.15 26.4 26.80
## Probiotic:60  34.50000 1.1789826 3 0.6054864 33.5 35.8 33.85 34.2 35.00
## Probiotic:75  40.06667 1.3203535 3 0.6054864 38.9 41.5 39.35 39.8 40.65
## Probiotic:90  48.33333 0.8504901 3 0.6054864 47.5 49.2 47.90 48.3 48.75
## 
## Alpha: 0.05 ; DF Error: 42 
## Critical Value of Studentized Range: 5.382464 
## 
## Minimun Significant Difference: 3.259009 
## 
## Treatments with the same letter are not significantly different.
## 
##                 Weight groups
## Probiotic:90  48.33333      a
## Antibiotic:90 47.56667      a
## Probiotic:75  40.06667      b
## Antibiotic:75 38.16667     bc
## Control:90    36.10000     cd
## Probiotic:60  34.50000      d
## Antibiotic:60 30.90000      e
## Control:75    28.86667     ef
## Probiotic:45  26.50000     fg
## Antibiotic:45 24.80000      g
## Control:60    24.30000      g
## Probiotic:30  20.46667      h
## Control:45    20.13333      h
## Antibiotic:30 18.96667      h
## Control:30    15.63333      i
## Probiotic:15  15.50000      i
## Antibiotic:15 13.73333     ij
## Control:15    12.80000     ij
## Probiotic:0   11.40000      j
## Antibiotic:0  11.26667      j
## Control:0     11.10000      j

Visualization of group means with letters and 95% confidence intervals

group1 = posthoc$means[ , c(2:3)] %>% rownames_to_column('Combination')
group2 = posthoc$groups %>% rownames_to_column('Combination')
group = left_join(group1, group2, by = 'Combination')
plot_data = group %>% mutate(
  ll = Weight - 1.96*(std/sqrt(r)),
  ul = Weight + 1.96*(std/sqrt(r))
)
plot_data
##      Combination       std r   Weight groups        ll       ul
## 1   Antibiotic:0 1.2503333 3 11.26667      j  9.851781 12.68155
## 2  Antibiotic:15 0.6658328 3 13.73333     ij 12.979873 14.48679
## 3  Antibiotic:30 0.2516611 3 18.96667      h 18.681885 19.25145
## 4  Antibiotic:45 0.4358899 3 24.80000      g 24.306744 25.29326
## 5  Antibiotic:60 1.5716234 3 30.90000      e 29.121541 32.67846
## 6  Antibiotic:75 1.6653328 3 38.16667     bc 36.282165 40.05117
## 7  Antibiotic:90 0.6027714 3 47.56667      a 46.884567 48.24877
## 8      Control:0 0.6557439 3 11.10000      j 10.357956 11.84204
## 9     Control:15 0.7549834 3 12.80000     ij 11.945656 13.65434
## 10    Control:30 0.4041452 3 15.63333      i 15.176000 16.09067
## 11    Control:45 1.0503968 3 20.13333      h 18.944698 21.32197
## 12    Control:60 0.8000000 3 24.30000      g 23.394715 25.20529
## 13    Control:75 2.1548395 3 28.86667     ef 26.428236 31.30510
## 14    Control:90 0.9000000 3 36.10000     cd 35.081554 37.11845
## 15   Probiotic:0 0.6557439 3 11.40000      j 10.657956 12.14204
## 16  Probiotic:15 1.1789826 3 15.50000      i 14.165856 16.83414
## 17  Probiotic:30 0.8736895 3 20.46667      h 19.477994 21.45534
## 18  Probiotic:45 0.6557439 3 26.50000     fg 25.757956 27.24204
## 19  Probiotic:60 1.1789826 3 34.50000      d 33.165856 35.83414
## 20  Probiotic:75 1.3203535 3 40.06667      b 38.572546 41.56079
## 21  Probiotic:90 0.8504901 3 48.33333      a 47.370913 49.29575
ggplot(plot_data) +
  aes(x = Combination, y = Weight, fill = Combination) +
  geom_col() +
  geom_text(aes(label = round(Weight, 0), y = Weight - 5 )) +
  geom_text(aes(label = groups, y = Weight + 5), color = 'blue') +
  geom_errorbar(aes(ymin = ll, ymax = ul), width = 0.2, alpha = 0.7) +
  theme_test() +
  theme(legend.position = 'none') +
  labs(x = 'Combination (Treatment:Day)', y = 'Average weight (kg)') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.4, hjust = 1))

# ggsave('Anova.png', dpi = 600, height = 6, width = 8, units = 'in')