Statistical Analysis: One-Way ANOVA

Comparing Means Across Multiple Groups with Post-Hoc Tukey & CLD

Author

Abdullah Al Shamim

Published

February 18, 2026

Introduction to One-Way ANOVA

A one-way analysis of variance (ANOVA) tests whether the means of three or more independent groups, defined by a single categorical factor, differ by more than chance alone would explain.


1. Environment Setup & Data Preparation

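We will use the built-in ToothGrowth dataset, which records tooth length (len) in 60 guinea pigs given Vitamin C at three dose levels (0.5, 1, and 2 mg/day). Because dose is stored as a number, we must convert it to a factor so that aov() compares the group means instead of fitting a linear trend.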

Code
# 1. Load required packages
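library(tidyverse)     # dplyr pipelines and ggplot2
library(multcompView)  # multcompLetters4() for the CLD
library(emmeans)       # Optional: alternative post-hoc tools (not used below; TukeyHSD() is base R)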

# 2. Prepare Data
data(ToothGrowth)
ToothGrowth$dose <- as.factor(ToothGrowth$dose) # Convert dose to factor
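
A quick check, sketched here with base R only, shows why the conversion matters: left as a number, dose is modeled as one continuous predictor, while as a factor it defines three groups. The object name tg is a local copy introduced purely for illustration.

```r
# Illustrative check on a local copy (tg is a hypothetical name)
tg <- datasets::ToothGrowth

class(tg$dose)                 # dose is numeric before conversion
tg$dose <- as.factor(tg$dose)

levels(tg$dose)                # the three dose levels
table(tg$dose)                 # observations per group

# Numeric dose: aov() fits a single slope (1 df).
# Factor dose: aov() compares three group means (2 df).
df_numeric <- summary(aov(len ~ dose, data = datasets::ToothGrowth))[[1]]$Df[1]
df_factor  <- summary(aov(len ~ dose, data = tg))[[1]]$Df[1]
c(numeric = df_numeric, factor = df_factor)
```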

2. Performing One-Way ANOVA

The ANOVA F-test evaluates the null hypothesis ($H_0$) that all group means are equal, against the alternative that at least one group mean differs.
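
Written out for the three dose groups, the hypotheses and the test statistic take the standard form below. The symbols ($\mu_j$ for a group mean, $k$ for the number of groups, $N$ for the total sample size) are notation introduced here for illustration, not taken from the original text.

```latex
H_0:\ \mu_{0.5} = \mu_{1} = \mu_{2}
\qquad
H_A:\ \text{at least one } \mu_j \text{ differs}

F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
```

With $k = 3$ groups and $N = 60$ observations, the degrees of freedom are 2 and 57, matching the Df column of the ANOVA table.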

Code
# 3. ANOVA Analysis
anova_result <- aov(len ~ dose, data = ToothGrowth)
summary(anova_result)
            Df Sum Sq Mean Sq F value   Pr(>F)    
dose         2   2426    1213   67.42 9.53e-16 ***
Residuals   57   1026      18                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
# Alternative: Perform ANOVA using pipes
# ToothGrowth %>% 
#   aov(len ~ dose, data = .) %>% 
#   summary()
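
To see where the numbers in the ANOVA table come from, the F statistic can be rebuilt by hand from the between-group and within-group sums of squares. This is a minimal sketch using base R only; tg, ss_between, ss_within, f_manual, and p_manual are names introduced here for illustration.

```r
# Reproduce the ANOVA F statistic by hand (tg is an illustrative copy)
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

grand_mean  <- mean(tg$len)
group_means <- tapply(tg$len, tg$dose, mean)
group_n     <- tapply(tg$len, tg$dose, length)

# Between-group SS: group size times squared distance of each group mean from the grand mean
ss_between <- sum(group_n * (group_means - grand_mean)^2)

# Within-group SS: squared distance of each observation from its own group mean
ss_within <- sum((tg$len - group_means[as.character(tg$dose)])^2)

k <- nlevels(tg$dose)   # number of groups
N <- nrow(tg)           # total observations

f_manual <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_manual <- pf(f_manual, k - 1, N - k, lower.tail = FALSE)

# Compare with the value reported by aov()
f_aov <- summary(aov(len ~ dose, data = tg))[[1]]$`F value`[1]
all.equal(f_manual, f_aov)
```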

3. Post-Hoc Analysis (Tukey HSD) & CLD

If the ANOVA p-value is significant (< 0.05), at least one group mean differs, but the test does not say which. Tukey’s Honest Significant Difference (HSD) test compares every pair of groups while controlling the family-wise error rate. We then use a Compact Letter Display (CLD) to summarize the results: groups sharing a letter are not significantly different.

Code
# 4. Tukey Test & CLD (Compact Letter Display)
tukey_result <- TukeyHSD(anova_result)

# Generate CLD letters
cld_letters <- multcompLetters4(anova_result, tukey_result)
print(cld_letters)
$dose
  2   1 0.5 
"a" "b" "c" 

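The letters compress a table of pairwise comparisons, and it can help to look at that table directly. The sketch below uses base R only; tg, fit, tukey, significant, and means_desc are illustrative names. It also lists the group means in descending order, which appears to be the order in which the letters are printed above (2, 1, 0.5).

```r
# Inspect the pairwise comparisons behind the letters (illustrative names)
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

fit   <- aov(len ~ dose, data = tg)
tukey <- TukeyHSD(fit)

# One row per pair: difference in means, 95% CI bounds, adjusted p-value
tukey$dose

# Which pairs have an adjusted p-value below 0.05?
significant <- tukey$dose[, "p adj"] < 0.05
significant

# Group means from largest to smallest
group_means <- tapply(tg$len, tg$dose, mean)
means_desc  <- group_means[order(group_means, decreasing = TRUE)]
means_desc
```

Since every pair differs significantly, each dose receives its own letter.
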
4. Data Summarization

We calculate the mean and standard deviation for each group and append the CLD letters for visualization.

Code
# 5. Calculate Group Means and add CLD
summary_data <- ToothGrowth %>%
  group_by(dose) %>%
  summarise(
    mean_len = mean(len),
    sd_len = sd(len)
  ) %>%
  mutate(
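    # Letters come back ordered by descending mean (2, 1, 0.5),
    # so match them to each dose by name instead of by position
    cld = unname(cld_letters$dose$Letters[as.character(dose)])
  )

print(summary_data)
# A tibble: 3 × 4
  dose  mean_len sd_len cld  
  <fct>    <dbl>  <dbl> <chr>
1 0.5       10.6   4.50 c    
2 1         19.7   4.42 b    
3 2         26.1   3.77 a    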
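
The standard deviation describes the spread within each group. Where the precision of each mean matters more, for example when choosing what error bars should represent, the standard error is a common alternative. A minimal sketch, assuming dplyr is installed; tg, summary_se, n, and se_len are names introduced here for illustration.

```r
library(dplyr)

# Illustrative copy of the data with dose as a factor
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

summary_se <- tg %>%
  group_by(dose) %>%
  summarise(
    n        = n(),                # group size
    mean_len = mean(len),
    sd_len   = sd(len),
    se_len   = sd_len / sqrt(n),   # standard error of the mean
    .groups  = "drop"
  )

summary_se
```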

5. Professional Visualization

We create a publication-quality bar chart of the group means, with error bars showing ±1 standard deviation and the significance letters (CLD) placed above each bar.

Code
# 6. Create Bar Graph with CLD
summary_data %>% 
  ggplot(aes(dose, mean_len)) +
  geom_col(fill = "steelblue", 
           alpha = 0.7) +
  geom_errorbar(aes(ymin = mean_len - sd_len, 
                    ymax = mean_len + sd_len),
                width = 0.2, 
                color = "black") +
  geom_text(aes(label = cld, 
                y = mean_len + sd_len + 1.5), 
            size = 5, 
            fontface = "bold",
            color = "black") + 
  labs(x = "Vitamin C Dose (mg/day)", 
       y = "Tooth Length (mm)",
       title = "One-Way ANOVA: Impact of Dose on Tooth Growth",
       subtitle = "Groups with different letters are significantly different (p < 0.05)") +
  theme_test(base_size = 15) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5))

Code
# 7. Save the Plot
# ggsave("anova_cld_plot.png", width = 6, height = 5, dpi = 300)

Systematic Checklist (Cheat Sheet):

  • ANOVA Test: aov(numeric ~ factor, data = df)
  • Significance Check: summary(anova_object)
  • Post-Hoc Comparison: TukeyHSD(anova_object)
  • Compact Letter Display: multcompLetters4(anova_object, tukey_object)
  • Interpretation: Groups sharing the same letter are not statistically different.

Great Work! You have now moved from basic t-tests to multi-group ANOVA with professional significance mapping.