One-Way ANOVA

Analysis of variance (ANOVA) tests the equality of mean responses \(\mu_1, \mu_2, \ldots, \mu_m\) among the \(m\) levels (groups) of a categorical explanatory variable. ANOVA determines whether the variability among the sample means is too large to be from chance alone. \(H_0\) is that all means are equal, and \(H_a\) is that at least one mean differs from the others. The ANOVA F test statistic is the ratio of the between-group variability to the within-group variability. If the between-group variability dominates, the F statistic is large and the associated p-value is small, leading to rejection of \(H_0\).

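The test does not indicate which populations cause the rejection of \(H_0\). ANOVA returns reliable results if the following conditions are met: the measurements are independent, each population is approximately normally distributed, and the populations have equal variances.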

Consider one factor with \(m\) levels, where level \(i\) contributes \(n_i\) observations and \(n = \sum_{i=1}^m n_i\) is the total sample size. The sum of squared differences between each level mean and the overall mean, weighted by level size, is \(SS_B = \sum_{i=1}^m n_i (\bar{y}_{i.} - \bar{y})^2\). \(SS_B\) divided by its degrees of freedom is the between-sample variance, \(MS_B = \sigma_B^2 = \frac{SS_B}{m-1}\). The sum of squared differences between each value and its corresponding level mean is \(SS_W = \sum_{i=1}^m \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^m (n_i - 1) s_i^2\). \(SS_W\) divided by its degrees of freedom is the within-sample variance, \(MS_W = \sigma_W^2 = \frac{SS_W}{n-m}\). Under \(H_0\), the test statistic \(F = \frac{MS_B}{MS_W}\) has an F distribution with \(m-1\) numerator degrees of freedom and \(n-m\) denominator degrees of freedom, and both \(MS_B\) and \(MS_W\) estimate \(\sigma_\epsilon^2\), the variance common to all \(m\) populations. Under the alternative hypothesis, \(MS_B\) estimates \(\sigma_\epsilon^2 + \theta\), where \(\theta > 0\) grows with the differences among the population means, whereas \(MS_W\) still estimates \(\sigma_\epsilon^2\). The larger F is, the stronger the evidence against \(H_0\). Summarize the results in an analysis of variance (ANOVA) table.

To determine which groups differ from the others, conduct a post-hoc test. Options include the Sidak and Holm t tests, Fisher’s Least Significant Difference test, Tukey’s Honestly Significant Difference test, the Scheffé test, the Newman-Keuls test, Dunnett’s Multiple Comparison test, the Duncan Multiple Range test, and the Bonferroni procedure.
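Two of the options listed above, the Holm and Bonferroni adjustments, are available in base R through pairwise.t.test(). The sketch below is illustrative only: it uses the built-in PlantGrowth data set (an assumption made so the code runs on its own, not data from this document) and compares the two adjustments side by side.

```r
# PlantGrowth ships with base R: 30 plant weights in 3 groups (ctrl, trt1, trt2).
# It is a stand-in data set chosen for illustration.
holm <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                        p.adjust.method = "holm")
bonf <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                        p.adjust.method = "bonferroni")

# Each result holds a lower-triangular matrix of adjusted p-values,
# one entry per pair of groups.
print(holm$p.value)
print(bonf$p.value)
```

Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, so Holm rejects at least as often while still controlling the family-wise error rate.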

Source	SS	df	MS	F
Between	\(SS_B\)	\(m-1\)	\(\frac{SS_B}{m-1}\)	\(\frac{MS_B}{MS_W}\)
Within	\(SS_W\)	\(n-m\)	\(\frac{SS_W}{n - m}\)
Total	\(SS_T = SS_B + SS_W\)	\(n-1\)
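The table entries can be computed directly from the formulas for \(SS_B\), \(SS_W\), \(MS_B\), \(MS_W\), and \(F\). The following is a minimal sketch, assuming the built-in PlantGrowth data set as a stand-in (it is not part of this document's example); the variable names are illustrative.

```r
# Stand-in data: PlantGrowth ships with base R (30 weights, 3 groups).
y <- PlantGrowth$weight
g <- PlantGrowth$group

n      <- length(y)            # total sample size
m      <- nlevels(g)           # number of groups
n_i    <- tapply(y, g, length) # group sizes
ybar_i <- tapply(y, g, mean)   # group means
ybar   <- mean(y)              # overall mean

ss_b <- sum(n_i * (ybar_i - ybar)^2)  # between-group sum of squares
ss_w <- sum((y - ave(y, g))^2)        # within-group sum of squares; ave() returns each value's group mean

ms_b   <- ss_b / (m - 1)
ms_w   <- ss_w / (n - m)
f_stat <- ms_b / ms_w
p_value <- pf(f_stat, df1 = m - 1, df2 = n - m, lower.tail = FALSE)

anova_table <- data.frame(
  Source = c("Between", "Within", "Total"),
  SS     = c(ss_b, ss_w, ss_b + ss_w),
  df     = c(m - 1, n - m, n - 1),
  MS     = c(ms_b, ms_w, NA),
  F      = c(f_stat, NA, NA)
)
print(anova_table)
```

The F statistic and p-value computed this way should agree with summary(aov(weight ~ group, data = PlantGrowth)).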

Example

In a completely randomized design experiment, 20 young pigs are assigned at random among 4 experimental groups, and each group is fed a different diet. The response variable is the pig’s weight in kg after consuming the diet for 10 months. Are the mean pig weights the same for all 4 diets?

library(tidyr)  # for gather()

## Warning: package 'tidyr' was built under R version 3.4.4

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.4.4

pig_weight <- read.delim(file = "Data/pig_weight.txt", header = TRUE, sep = ",")
pig_gath <- gather(pig_weight, diet, weight)
pig_gath$diet <- factor(pig_gath$diet, levels = c("Feed.1", "Feed.2", "Feed.3", "Feed.4"))

ggplot(data = pig_gath, aes(x = diet, y = weight)) +
  geom_boxplot()

The measurements are independent because this is a completely randomized experiment. With large groups (say \(n_i \ge 30\)) the normality condition would matter less, but here there are only 5 pigs per group (\(n = 20\) in total), so we need to check for normality. The sample sizes are equal (5 in each of the 4 factor levels), so the equality of population variances is less critical, but we can check anyway.

First, check the normality condition. Start with the assumption that the distributions are normal, \(H_0: normal\), then reject the assumption only if sufficient evidence exists. In these normal Q-Q plots, look for substantial deviations from a straight line. These plots look good.

layout(rbind(c(1, 2), c(3, 4)))
qqnorm(pig_gath[pig_gath$diet == "Feed.1",]$weight)
qqline(pig_gath[pig_gath$diet == "Feed.1",]$weight)
qqnorm(pig_gath[pig_gath$diet == "Feed.2",]$weight)
qqline(pig_gath[pig_gath$diet == "Feed.2",]$weight)
qqnorm(pig_gath[pig_gath$diet == "Feed.3",]$weight)
qqline(pig_gath[pig_gath$diet == "Feed.3",]$weight)
qqnorm(pig_gath[pig_gath$diet == "Feed.4",]$weight)
qqline(pig_gath[pig_gath$diet == "Feed.4",]$weight)

There are statistical tests that provide a quantitative evaluation, but the sample sizes are too small for them to be useful.
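One such test is Shapiro-Wilk, available in base R as shapiro.test(). The sketch below is only an illustration under stated assumptions: it uses the built-in PlantGrowth data set as a stand-in, and it tests the pooled residuals of the fitted model rather than each small group, so that all observations contribute to one test.

```r
# Stand-in data: PlantGrowth ships with base R (30 weights, 3 groups).
fit <- aov(weight ~ group, data = PlantGrowth)

# Residuals are the deviations of each observation from its group mean.
res <- residuals(fit)

# H0: the residuals come from a normal distribution.
sw <- shapiro.test(res)
print(sw)
```

With few observations the test has little power, so a large p-value is weak support for normality and the Q-Q plots remain the primary check.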

Now check for equal variances with Bartlett’s test of homogeneity of variances. The p-value (0.9255) is far above .05, so do not reject \(H_0\) of equal variances.

bartlett.test(weight ~ diet, data = pig_gath)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  weight by diet
## Bartlett's K-squared = 0.46965, df = 3, p-value = 0.9255
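It can help to look at the sample variances that Bartlett’s test compares, and to pull the p-value out of the test object instead of reading it off the printout. A minimal sketch, again assuming the built-in PlantGrowth data set as a stand-in for illustration:

```r
# Stand-in data: PlantGrowth ships with base R (30 weights, 3 groups).
group_var <- tapply(PlantGrowth$weight, PlantGrowth$group, var)
print(group_var)  # one sample variance per group

bt <- bartlett.test(weight ~ group, data = PlantGrowth)

# The result is a list-like "htest" object; its pieces can be used in code.
bt_df <- unname(bt$parameter)  # degrees of freedom: number of groups minus 1
bt_p  <- bt$p.value

# Do not reject equal variances at the 5% level when the p-value exceeds 0.05.
reject_equal_var <- bt_p < 0.05
print(reject_equal_var)
```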

Now we are ready for the one-way ANOVA test. The null hypothesis is that all means are equal. The p-value is <.0001, so we reject \(H_0\).

summary(aov_pig <- aov(weight ~ diet, data = pig_gath))

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## diet         3   4703  1567.7   206.7 5.28e-13 ***
## Residuals   16    121     7.6                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
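The p-value in the table is the upper-tail area of the F distribution beyond the observed statistic. As a rough check, the sketch below plugs the rounded F value and the degrees of freedom printed above into pf(); because the inputs are rounded, the result should be close to, but not necessarily identical to, the printed Pr(>F).

```r
# Values copied from the ANOVA table above (F is rounded to one decimal).
f_value <- 206.7
df1 <- 3    # m - 1 = 4 diets - 1
df2 <- 16   # n - m = 20 pigs - 4 diets

# Upper-tail probability of observing an F this large under H0.
p_value <- pf(f_value, df1, df2, lower.tail = FALSE)
print(p_value)

# Critical value at the 5% level: reject H0 when F exceeds it.
f_crit <- qf(0.95, df1, df2)
print(f_crit)
print(f_value > f_crit)
```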

Perform a post-hoc test to see which of the groups differ. Here we use Tukey’s test. All six pairs differ from each other: every adjusted p-value is below .001, and none of the confidence intervals contains 0.

TukeyHSD(aov_pig)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = weight ~ diet, data = pig_gath)
## 
## $diet
##                 diff        lwr       upr     p adj
## Feed.2-Feed.1   8.56   3.576977 13.543023 0.0008075
## Feed.3-Feed.1  39.66  34.676977 44.643023 0.0000000
## Feed.4-Feed.1  25.70  20.716977 30.683023 0.0000000
## Feed.3-Feed.2  31.10  26.116977 36.083023 0.0000000
## Feed.4-Feed.2  17.14  12.156977 22.123023 0.0000002
## Feed.4-Feed.3 -13.96 -18.943023 -8.976977 0.0000030

plot(TukeyHSD(aov_pig))
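With equal group sizes, every Tukey interval has the same half-width: the studentized range quantile times \(\sqrt{MS_W / n_i}\). The sketch below reproduces that half-width by hand and compares it with TukeyHSD(). It assumes the built-in PlantGrowth data set as a stand-in (10 observations per group), not the pig data.

```r
# Stand-in data: PlantGrowth ships with base R (3 groups of 10).
fit <- aov(weight ~ group, data = PlantGrowth)

m     <- nlevels(PlantGrowth$group)              # number of groups
n_per <- as.integer(table(PlantGrowth$group))[1] # common group size
df_w  <- df.residual(fit)                        # n - m
ms_w  <- sum(residuals(fit)^2) / df_w            # within-group mean square

# Half-width of each 95% family-wise interval (balanced design).
half_width <- qtukey(0.95, nmeans = m, df = df_w) * sqrt(ms_w / n_per)

# Half-widths implied by TukeyHSD(): (upr - lwr) / 2 for each pair.
tk <- TukeyHSD(fit)$group
tk_half <- unname((tk[, "upr"] - tk[, "lwr"]) / 2)

print(half_width)
print(tk_half)
```

A pair of means differs significantly at the 5% family-wise level when the absolute difference exceeds this half-width, which is the same as the interval excluding 0.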

Conducting hypothesis tests for the means of a quantitative variable grouped by the levels of a single multinomial categorical variable using R.

Michael Foley

2019-03-21
