ANOVA (Analysis of Variance) with two categorical (factor) independent variables. It has multiple conditions:

Dependent variable continuous (can also be discrete) and normally distributed. Independent variables must be factor (categorical).
Equal variances for different groups or levels or categories of the independent variables.
Residuals are normally and independently distributed.
Begin your analysis with reading the data set.

library(readxl)
library(car)

## Loading required package: carData

library(agricolae)
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ dplyr::recode() masks car::recode()
## ✖ purrr::some()   masks car::some()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(sjstats)
library(ggpubr)

rabbit = readxl::read_excel('datasets.xlsx', sheet = 'anova', range = 'J14:O77')
str(rabbit)

## tibble [63 × 6] (S3: tbl_df/tbl/data.frame)
##  $ sr.        : num [1:63] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Treatment  : chr [1:63] "Control" "Control" "Control" "Probiotic" ...
##  $ Replication: chr [1:63] "R1" "R2" "R3" "R1" ...
##  $ Day        : num [1:63] 0 0 0 0 0 0 0 0 0 15 ...
##  $ Weight     : num [1:63] 10.5 11 11.8 12 11.5 10.7 10 12.5 11.3 12 ...
##  $ Mortality  : num [1:63] 1 1 1 0 1 0 2 0 1 1 ...

The data set has 6 columns or variables, we will use only Treatment, Replication, Day as independent variables and Weight as dependent variables. The independent variables have been read as chr (character) or num (numeric). We have to convert them to factor.

rabbit$Treatment = as.factor(rabbit$Treatment)
rabbit$Replication = as.factor(rabbit$Replication)
rabbit$Day = as.factor(rabbit$Day)
str(rabbit)

## tibble [63 × 6] (S3: tbl_df/tbl/data.frame)
##  $ sr.        : num [1:63] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Treatment  : Factor w/ 3 levels "Antibiotic","Control",..: 2 2 2 3 3 3 1 1 1 2 ...
##  $ Replication: Factor w/ 3 levels "R1","R2","R3": 1 2 3 1 2 3 1 2 3 1 ...
##  $ Day        : Factor w/ 7 levels "0","15","30",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Weight     : num [1:63] 10.5 11 11.8 12 11.5 10.7 10 12.5 11.3 12 ...
##  $ Mortality  : num [1:63] 1 1 1 0 1 0 2 0 1 1 ...

We will solve the anova model first, then check the assumptions. We will name the model as mod1. The formula of the model: dependent variable (y) ~ independent variable 1 (x1) + independent variable 2 (x2). This is called additive model. If we want to build an interactive model, we have to add an interaction term (x1:x2), which can be done in two ways: 1. y ~ x1*x2 2. y ~ x1 + x2 + x1:x2

First, we will see the interactive model. If the interaction term is not significant, we will drop this term from the final model. However, if any variables in the interaction term is not separately significant, but the interaction term is significant, we cannot drop the interaction term. Another consideration is the Repalication. This is done in the epxerimental design to improve the accuracy. It is actually not a factor. Still, we can see whether the variances in the dependent variable is due to poor local control by adding this replication factor in the model. Poor local control means, the researcher failed to maintain homogenous conditions for all the experimental units while chaning the levels of treatments.

mod1 = aov(Weight ~ Replication + Treatment*Day, data = rabbit)
mod1

## Call:
##    aov(formula = Weight ~ Replication + Treatment * Day, data = rabbit)
## 
## Terms:
##                 Replication Treatment      Day Treatment:Day Residuals
## Sum of Squares        7.897   535.292 7687.232       235.877    38.296
## Deg. of Freedom           2         2        6            12        40
## 
## Residual standard error: 0.9784706
## Estimated effects may be unbalanced

summary(mod1)

##               Df Sum Sq Mean Sq  F value   Pr(>F)    
## Replication    2      8     3.9    4.124   0.0235 *  
## Treatment      2    535   267.6  279.554  < 2e-16 ***
## Day            6   7687  1281.2 1338.207  < 2e-16 ***
## Treatment:Day 12    236    19.7   20.531 2.08e-13 ***
## Residuals     40     38     1.0                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The residual standard error (0.9785) or Root Mean Square Error (RMSE) indicates the standard deviation of the errors. If we compare this with the overall mean of the dependent variable \(\bar{y}\), we will get the idea of how much variance in the dependent variable is explained or decreased by the model.

mean_y = mean(rabbit$Weight)
paste('mean weight =', round(mean_y, 2))

## [1] "mean weight = 25.29"

sd_y = sd(rabbit$Weight)
paste('standard deviation of weight =', round(sd_y, 2))

## [1] "standard deviation of weight = 11.71"

cv_y = sd_y/mean_y*100
paste('cv of weight (%) =', round(cv_y, 2))

## [1] "cv of weight (%) = 46.31"

rmse = 0.9784706
cv_model = rmse/mean_y*100
paste('cv of model (%) =', round(cv_model, 2))

## [1] "cv of model (%) = 3.87"

We see that the dependent variable has 46.31% variance which has decreased to 3.87% by the model. Low model cv indicates good fit of the model. Our model has only 3.87% error compared to the mean weight of the rabbit. However, this will not indicate how much variation can be explained by the independent variables. For this, we need eta squared (\(\eta^2\)) value (similar to \(R^2\) in linear regression). Partial eta squared value (\(\eta p^2\)) can also be calculated considering only the independent variables of interest. Formulas are: \[ \eta^2 = \frac{SS_{effect}}{SS_{total}}, \eta p^2 = \frac{SS_{effect}}{SS_{effect} + SS_{error}} \]

We have two independe variables, manual calculations is encouraged here. We use anova_stats() from sjstats package.

anova_stats(mod1)

## etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f
## ------------------------------------------------------------------------
## 0.001 |         0.171 |   0.001 |           0.090 |     0.001 |    0.454
## 0.063 |         0.933 |   0.063 |           0.898 |     0.063 |    3.739
## 0.904 |         0.995 |   0.903 |           0.992 |     0.903 |   14.168
## 0.028 |         0.860 |   0.026 |           0.788 |     0.026 |    2.482
##       |               |         |                 |           |         
## 
## etasq |          term |    sumsq | df |   meansq | statistic | p.value | power
## ------------------------------------------------------------------------------
## 0.001 |   Replication |    7.897 |  2 |    3.949 |     4.124 |   0.024 | 0.730
## 0.063 |     Treatment |  535.292 |  2 |  267.646 |   279.554 |  < .001 | 1.000
## 0.904 |           Day | 7687.232 |  6 | 1281.205 |  1338.207 |  < .001 | 1.000
## 0.028 | Treatment:Day |  235.877 | 12 |   19.656 |    20.531 |  < .001 | 1.000
##       |     Residuals |   38.296 | 40 |    0.957 |           |         |

Now, we see the \(\eta p^2\) values for Replication (0.1%), Treatment (6.3%), Day (90.4%), and interaction between Treatment and Day (2.8%), a total of 0.1+6.3+90.4+2.8 = 99.6% variance in the dependent variable can be explained the independent by four terms. We also notice that although replication is significant, it has only 0.1%. Therefore, the experimental settings can be accepted as satisfactory.

Now, let us explore the model assumptions. First, as we have added the interaction term, let us explain the interaction and how it works.

with(rabbit, interaction.plot(Day, Treatment, Weight))

This interaction plot shows that during the initial days, the differences in effects of Antibiotic, Probiotic and Conrol were small, which increased in the later periods. The lines were diverging as the days increased. Therefore, the effects of treatments were dependent on the days, and hence, there were interaction effects of treatment and days on the weights of the rabbits.

boxplot(Weight ~ Treatment + Day, rabbit)

ggqqplot(data = rabbit, x = 'Weight', facet.by =  c('Treatment', 'Day'))

We see that there are only 3 observations in each group, which are not sufficient to check the normality. At least 10 observations should be used to test normality. For \(\ge30\) observations, we can apply the central limit theorem (CLT) to claim that the sampling distribution is normal. However, in our present example, we will check the residual distribution. If the residual distribution is independently (not correlated with any X’s) and normally distributed, we can say that the model has sufficiently extracted variannce from the dependend variable and the residuals do not contain any variance. Then we residuals will be considered as completely random.

leveneTest(Weight ~ Treatment*Day, rabbit)

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group 20  0.5276 0.9375
##       42

Levene test shows that the p = 0.9375. So, \(H_0\): The variances are equal, cannot be rejected. So, the ANOVA is a valid test for this. Now we will check the residuals for their normality and also the distributions of the variances around the fitted lines across the levels of the independent variables.

plot(mod1, which = 1)

Residuals are evenly spread around the horizonal zero line. In addition, the points have similar distances from the zero line across the x axis values that indicate the predicted values.

car::qqPlot(mod1)

## [1] 13 48

This plot shows that 95% confidence interval band is not far from the diagonal line. So, we can tell that the residuals are normally distributed and they are independent of the independent variables. These residuals do not contain an systematic unexplained variance that should have been explained by the factors. It also indicate that the 13th and 48th observations are outliers and influential that can change the slope the diagonal line.

Let us see the model summary again.

summary(mod1)

##               Df Sum Sq Mean Sq  F value   Pr(>F)    
## Replication    2      8     3.9    4.124   0.0235 *  
## Treatment      2    535   267.6  279.554  < 2e-16 ***
## Day            6   7687  1281.2 1338.207  < 2e-16 ***
## Treatment:Day 12    236    19.7   20.531 2.08e-13 ***
## Residuals     40     38     1.0                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now we are sure from the model that F-values are significant. Therefore, \(H_0:Mean_{control} = Mean_{probiotic} = Mean_{Antibiotic}\) is rejected. So, at least one mean is different. We need to find out which mean(s) is/are different. We can also extend the \(H_0\) by including the interaction term.

Note: Considering the context, we can remove the replication factor. However, if it is important to explain the nature of replication (biological or technical), we can keep it with proper interpretation. Homogeneity in biological replication is difficult to maintain, while significant variations caused by technical replication indicate measurement errors. If you remove the term, you need to check the residuals again.

Now, we will conduct post-hoc test to see which means are different. There are several post-hoc methods, including but not limited to Tukey HSD [HSD.test()], Duncan New Multiple Range Test or DMRD [duncan.test()], least significant difference test or lsd with Bonferroni adjustment [LSD.test(p.adj = ‘bonferroni’)]. Tukey is suitable for large sample \(\ge 30\) as it is less sensitive for small sample, where DMRT works better. the lsd cannot maintain family wise error rate (FWER) meaning if we reject a hypothesis based on 5% level of probability, in can increase as the levels of independent variables increase. Therefore, conclusions made considering 5% level can be 15% in the case of three levels or groups in an independent variable.

Post hoc analysis with visualization

posthoc = HSD.test(mod1, trt = c('Treatment', 'Day'), console = TRUE)

## 
## Study: mod1 ~ c("Treatment", "Day")
## 
## HSD Test for Weight 
## 
## Mean Square Error:  0.9574048 
## 
## Treatment:Day,  means
## 
##                 Weight       std r        se  Min  Max   Q25  Q50   Q75
## Antibiotic:0  11.26667 1.2503333 3 0.5649203 10.0 12.5 10.65 11.3 11.90
## Antibiotic:15 13.73333 0.6658328 3 0.5649203 13.0 14.3 13.45 13.9 14.10
## Antibiotic:30 18.96667 0.2516611 3 0.5649203 18.7 19.2 18.85 19.0 19.10
## Antibiotic:45 24.80000 0.4358899 3 0.5649203 24.3 25.1 24.65 25.0 25.05
## Antibiotic:60 30.90000 1.5716234 3 0.5649203 29.2 32.3 30.20 31.2 31.75
## Antibiotic:75 38.16667 1.6653328 3 0.5649203 36.3 39.5 37.50 38.7 39.10
## Antibiotic:90 47.56667 0.6027714 3 0.5649203 47.0 48.2 47.25 47.5 47.85
## Control:0     11.10000 0.6557439 3 0.5649203 10.5 11.8 10.75 11.0 11.40
## Control:15    12.80000 0.7549834 3 0.5649203 12.0 13.5 12.45 12.9 13.20
## Control:30    15.63333 0.4041452 3 0.5649203 15.2 16.0 15.45 15.7 15.85
## Control:45    20.13333 1.0503968 3 0.5649203 19.1 21.2 19.60 20.1 20.65
## Control:60    24.30000 0.8000000 3 0.5649203 23.5 25.1 23.90 24.3 24.70
## Control:75    28.86667 2.1548395 3 0.5649203 27.2 31.3 27.65 28.1 29.70
## Control:90    36.10000 0.9000000 3 0.5649203 35.2 37.0 35.65 36.1 36.55
## Probiotic:0   11.40000 0.6557439 3 0.5649203 10.7 12.0 11.10 11.5 11.75
## Probiotic:15  15.50000 1.1789826 3 0.5649203 14.2 16.5 15.00 15.8 16.15
## Probiotic:30  20.46667 0.8736895 3 0.5649203 19.5 21.2 20.10 20.7 20.95
## Probiotic:45  26.50000 0.6557439 3 0.5649203 25.9 27.2 26.15 26.4 26.80
## Probiotic:60  34.50000 1.1789826 3 0.5649203 33.5 35.8 33.85 34.2 35.00
## Probiotic:75  40.06667 1.3203535 3 0.5649203 38.9 41.5 39.35 39.8 40.65
## Probiotic:90  48.33333 0.8504901 3 0.5649203 47.5 49.2 47.90 48.3 48.75
## 
## Alpha: 0.05 ; DF Error: 40 
## Critical Value of Studentized Range: 5.399433 
## 
## Minimun Significant Difference: 3.050249 
## 
## Treatments with the same letter are not significantly different.
## 
##                 Weight groups
## Probiotic:90  48.33333      a
## Antibiotic:90 47.56667      a
## Probiotic:75  40.06667      b
## Antibiotic:75 38.16667     bc
## Control:90    36.10000     cd
## Probiotic:60  34.50000      d
## Antibiotic:60 30.90000      e
## Control:75    28.86667     ef
## Probiotic:45  26.50000     fg
## Antibiotic:45 24.80000      g
## Control:60    24.30000      g
## Probiotic:30  20.46667      h
## Control:45    20.13333      h
## Antibiotic:30 18.96667      h
## Control:30    15.63333      i
## Probiotic:15  15.50000      i
## Antibiotic:15 13.73333     ij
## Control:15    12.80000     ij
## Probiotic:0   11.40000      j
## Antibiotic:0  11.26667      j
## Control:0     11.10000      j

Creating the data set for visualization

group1 = posthoc$means[ , c(2:3)] %>% rownames_to_column('Combination')
group2 = posthoc$groups %>% rownames_to_column('Combination')
group = left_join(group1, group2, by = 'Combination')
plot_data = group %>% mutate(
  ll = Weight - 1.96*(std/sqrt(r)),
  ul = Weight + 1.96*(std/sqrt(r))
)
plot_data

##      Combination       std r   Weight groups        ll       ul
## 1   Antibiotic:0 1.2503333 3 11.26667      j  9.851781 12.68155
## 2  Antibiotic:15 0.6658328 3 13.73333     ij 12.979873 14.48679
## 3  Antibiotic:30 0.2516611 3 18.96667      h 18.681885 19.25145
## 4  Antibiotic:45 0.4358899 3 24.80000      g 24.306744 25.29326
## 5  Antibiotic:60 1.5716234 3 30.90000      e 29.121541 32.67846
## 6  Antibiotic:75 1.6653328 3 38.16667     bc 36.282165 40.05117
## 7  Antibiotic:90 0.6027714 3 47.56667      a 46.884567 48.24877
## 8      Control:0 0.6557439 3 11.10000      j 10.357956 11.84204
## 9     Control:15 0.7549834 3 12.80000     ij 11.945656 13.65434
## 10    Control:30 0.4041452 3 15.63333      i 15.176000 16.09067
## 11    Control:45 1.0503968 3 20.13333      h 18.944698 21.32197
## 12    Control:60 0.8000000 3 24.30000      g 23.394715 25.20529
## 13    Control:75 2.1548395 3 28.86667     ef 26.428236 31.30510
## 14    Control:90 0.9000000 3 36.10000     cd 35.081554 37.11845
## 15   Probiotic:0 0.6557439 3 11.40000      j 10.657956 12.14204
## 16  Probiotic:15 1.1789826 3 15.50000      i 14.165856 16.83414
## 17  Probiotic:30 0.8736895 3 20.46667      h 19.477994 21.45534
## 18  Probiotic:45 0.6557439 3 26.50000     fg 25.757956 27.24204
## 19  Probiotic:60 1.1789826 3 34.50000      d 33.165856 35.83414
## 20  Probiotic:75 1.3203535 3 40.06667      b 38.572546 41.56079
## 21  Probiotic:90 0.8504901 3 48.33333      a 47.370913 49.29575

ggplot(plot_data) +
  aes(x = Combination, y = Weight, fill = Combination) +
  geom_col() +
  geom_text(aes(label = round(Weight, 0), y = Weight - 5 )) +
  geom_text(aes(label = groups, y = Weight + 5), color = 'blue') +
  geom_errorbar(aes(ymin = ll, ymax = ul), width = 0.2, alpha = 0.7) +
  theme_test() +
  theme(legend.position = 'none') +
  labs(x = 'Combination (Treatment:Day)', y = 'Average weight (kg)') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.4, hjust = 1))

Groups with the same letter are not statistically different. The error bars shows 95% confidence interval of the group means. Now we can conclude on which groups (combination of treatment and day) have the higher average weights. If we compare the expenditures on antibiotic or probiotic and return from selling the rabbits for the increased weight, we can suggest which level is better for profit. Even the effects of antibiotic and probiotic are sometimes significantly different, we can consider health hazard to decide which levels are recommended. There are trade-offs to be considered for the final recommendations.

Two-way ANOVA

Prof Dr Md Kamrul Hasan

Post hoc analysis with visualization

Creating the data set for visualization