Time: ~30 minutes
Goal: Practice one-way ANOVA analysis from start to finish using real public health data
Learning Objectives:
Structure:
Submission: Upload your completed .Rmd file and the published HTML output to Brightspace by the end of class.
Why ANOVA? We have one continuous outcome (SBP) and one categorical predictor with THREE groups (BMI category). Using multiple t-tests would inflate our Type I error rate.
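To make the inflation concrete, here is a minimal sketch of the arithmetic. It assumes the three pairwise t-tests are independent, which is a simplification (the comparisons share groups), so the result is an approximation rather than the exact rate. The variable names are illustrative and not part of the lab code.

```r
# Family-wise Type I error for all pairwise t-tests among k groups,
# ASSUMING independent tests (a simplification for illustration only)
alpha <- 0.05                    # per-test significance level
k <- 3                           # number of BMI categories
n_tests <- choose(k, 2)          # number of pairwise comparisons
fwer <- 1 - (1 - alpha)^n_tests  # P(at least one false positive)

cat("Pairwise comparisons:", n_tests, "\n")
cat("Approximate family-wise error rate:", round(fwer, 4), "\n")
```

Under that independence assumption, three tests at α = 0.05 give roughly a 14% chance of at least one false positive, which is why a single omnibus F-test is preferred as the first step.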
# Load necessary libraries
library(tidyverse) # For data manipulation and visualization
library(knitr) # For nice tables
library(car) # For Levene's test
library(NHANES) # NHANES dataset
# Load the NHANES data
data(NHANES)
Create analysis dataset:
# Set seed for reproducibility
set.seed(553)
# Create BMI categories and prepare data
bp_bmi_data <- NHANES %>%
filter(Age >= 18 & Age <= 65) %>% # Adults 18-65
filter(!is.na(BPSysAve) & !is.na(BMI)) %>%
mutate(
bmi_category = case_when(
BMI < 25 ~ "Normal",
BMI >= 25 & BMI < 30 ~ "Overweight",
BMI >= 30 ~ "Obese",
TRUE ~ NA_character_
),
bmi_category = factor(bmi_category,
levels = c("Normal", "Overweight", "Obese"))
) %>%
filter(!is.na(bmi_category)) %>%
select(ID, Age, Gender, BPSysAve, BMI, bmi_category)
# Display first few rows
head(bp_bmi_data) %>%
kable(caption = "Blood Pressure and BMI Dataset (first 6 rows)")

| ID | Age | Gender | BPSysAve | BMI | bmi_category |
|---|---|---|---|---|---|
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51630 | 49 | female | 112 | 30.57 | Obese |
| 51647 | 45 | female | 118 | 27.24 | Overweight |
| 51647 | 45 | female | 118 | 27.24 | Overweight |
##
## Normal Overweight Obese
## 1939 1937 2150
Interpretation: We have 6026 adults with complete BP and BMI data across three BMI categories.
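The group counts shown above are the kind of output `table()` produces. As a self-contained sketch, the block below uses R's built-in `PlantGrowth` data (three groups) as a stand-in for `bp_bmi_data`, which requires the NHANES package. In the lab you would pass `bp_bmi_data$bmi_category` instead; the object names `group_counts` and `total_n` are illustrative.

```r
# Stand-in data: PlantGrowth ships with base R (columns: weight, group)
data(PlantGrowth)

# Count observations in each level of the grouping factor
group_counts <- table(PlantGrowth$group)
print(group_counts)

# Total sample size across all groups
total_n <- sum(group_counts)
cat("Total n:", total_n, "\n")
```

Checking the counts before fitting the model confirms that no category is empty and shows how balanced the design is.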
# Calculate summary statistics by BMI category
summary_stats <- bp_bmi_data %>%
group_by(bmi_category) %>%
summarise(
n = n(),
Mean = mean(BPSysAve),
SD = sd(BPSysAve),
Median = median(BPSysAve),
Min = min(BPSysAve),
Max = max(BPSysAve)
)
summary_stats %>%
kable(digits = 2,
caption = "Descriptive Statistics: Systolic BP by BMI Category")

| bmi_category | n | Mean | SD | Median | Min | Max |
|---|---|---|---|---|---|---|
| Normal | 1939 | 114.23 | 15.01 | 113 | 78 | 221 |
| Overweight | 1937 | 118.74 | 13.86 | 117 | 83 | 186 |
| Obese | 2150 | 121.62 | 15.27 | 120 | 82 | 226 |
Observation: The mean SBP appears to increase from Normal (114.2) to Overweight (118.7) to Obese (121.6).
# Create boxplots with individual points
ggplot(bp_bmi_data,
aes(x = bmi_category, y = BPSysAve, fill = bmi_category)) +
geom_boxplot(alpha = 0.7, outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Systolic Blood Pressure by BMI Category",
subtitle = "NHANES Data, Adults aged 18-65",
x = "BMI Category",
y = "Systolic Blood Pressure (mmHg)",
fill = "BMI Category"
) +
theme_minimal(base_size = 12) +
theme(legend.position = "none")
What the plot tells us:
Null Hypothesis (H₀): μ_Normal = μ_Overweight = μ_Obese (all three population means are equal)
Alternative Hypothesis (H₁): At least one population mean differs from the others
Significance level: α = 0.05
# Fit the one-way ANOVA model
anova_model <- aov(BPSysAve ~ bmi_category, data = bp_bmi_data)
# Display the ANOVA table
summary(anova_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## bmi_category 2 56212 28106 129.2 <2e-16 ***
## Residuals 6023 1309859 217
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpretation:
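As a check on the table above, the F value can be rebuilt from its sums of squares and degrees of freedom. This is a small sketch using the rounded values printed in the output, so the last digits may differ slightly from R's internal result; the variable names are illustrative.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_between <- 56212;   df_between <- 2
ss_within  <- 1309859; df_within  <- 6023

ms_between <- ss_between / df_between   # mean square between groups
ms_within  <- ss_within / df_within     # mean square within groups (residual)
f_stat <- ms_between / ms_within        # F = MS_between / MS_within

# The upper-tail area of the F distribution gives the p-value
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

cat("F =", round(f_stat, 2), "\n")
cat("p below 2e-16:", p_value < 2e-16, "\n")
```

The ratio comes out at about 129.2, matching the F value in the table: the variation between the BMI group means is far larger than the variation within groups would predict if the population means were equal.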
Why do we need this? The F-test tells us that groups differ, but not which groups differ. Tukey’s Honest Significant Difference controls the family-wise error rate for multiple pairwise comparisons.
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = BPSysAve ~ bmi_category, data = bp_bmi_data)
##
## $bmi_category
## diff lwr upr p adj
## Overweight-Normal 4.507724 3.397134 5.618314 0
## Obese-Normal 7.391744 6.309024 8.474464 0
## Obese-Overweight 2.884019 1.801006 3.967033 0
Interpretation:
| Comparison | Mean Diff | 95% CI | p-value | Significant? |
|---|---|---|---|---|
| Overweight - Normal | 4.51 | [3.40, 5.62] | 1.98e-13 | Yes |
| Obese - Normal | 7.39 | [6.31, 8.47] | < 0.001 | Yes |
| Obese - Overweight | 2.88 | [1.80, 3.97] | 1.38e-09 | Yes |
Conclusion: All three pairwise comparisons are statistically significant. Obese adults have higher SBP than overweight adults, who in turn have higher SBP than normal-weight adults.
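Pairwise comparisons like the ones above come from calling `TukeyHSD()` on a fitted `aov` object. The sketch below shows the general shape of that call on R's built-in `PlantGrowth` data, used here only as a stand-in so the block runs without the NHANES package. The names `pg_model`, `pg_tukey`, and `tukey_table` are illustrative; in the lab the model would be `anova_model`.

```r
# Stand-in data: PlantGrowth ships with base R (columns: weight, group)
data(PlantGrowth)

# Fit the one-way ANOVA, then run Tukey's HSD on the fitted model
pg_model <- aov(weight ~ group, data = PlantGrowth)
pg_tukey <- TukeyHSD(pg_model, conf.level = 0.95)

# The result is a list with one matrix per factor; columns are
# diff, lwr, upr, and "p adj"
tukey_table <- as.data.frame(pg_tukey$group)

# Flag comparisons whose adjusted p-value falls below alpha = 0.05
tukey_table$significant <- ifelse(pg_tukey$group[, "p adj"] < 0.05,
                                  "Yes", "No")
print(tukey_table)
```

A comparison is significant exactly when its 95% family-wise confidence interval excludes zero, so the `significant` column and the `lwr`/`upr` columns always agree.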
# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]
ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)
# Calculate eta-squared
eta_squared <- ss_treatment / ss_total
cat("Eta-squared (η²):", round(eta_squared, 4), "\n")
cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%\n")
## Eta-squared (η²): 0.0411
## Percentage of variance explained: 4.11 %
Interpretation: BMI category explains 4.11% of the variance in systolic BP.
While statistically significant, the practical effect is modest—BMI category alone doesn’t explain most of the variation in blood pressure.
ANOVA Assumptions:
Diagnostic Plot Interpretation:
# Levene's test for homogeneity of variance
levene_test <- leveneTest(BPSysAve ~ bmi_category, data = bp_bmi_data)
print(levene_test)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 2.7615 0.06328 .
## 6023
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Levene’s Test Interpretation:
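Overall Assessment: With roughly 2,000 observations in each group (1,939 to 2,150), ANOVA is robust to minor violations of normality, and Levene’s test (p = 0.063) gives no strong evidence of unequal variances. Our assumptions are reasonably satisfied.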
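The four diagnostic plots referred to in this step can be drawn by calling `plot()` on the fitted model. The sketch below uses R's built-in `PlantGrowth` data as a stand-in so that it runs on its own; `pg_model` and `resid_sd` are illustrative names, and in the lab you would plot `anova_model` instead.

```r
# Stand-in data and model (PlantGrowth ships with base R)
data(PlantGrowth)
pg_model <- aov(weight ~ group, data = PlantGrowth)

# Arrange the standard diagnostic plots in a 2 x 2 grid:
# Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage
old_par <- par(mfrow = c(2, 2))
plot(pg_model)
par(old_par)  # restore the previous graphics settings

# Numeric companion to the plots: residual spread within each group.
# Similar standard deviations are consistent with equal variances.
resid_sd <- tapply(residuals(pg_model), PlantGrowth$group, sd)
print(round(resid_sd, 3))
```

Comparing the per-group residual standard deviations alongside the Scale-Location plot gives a quick sense of whether a significant Levene's test reflects a large or a trivial difference in spread.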
Example Results Section:
We conducted a one-way ANOVA to examine whether mean systolic blood pressure (SBP) differs across BMI categories (Normal, Overweight, Obese) among 6,026 adults aged 18-65 from NHANES. Descriptive statistics showed mean SBP of 114.2 mmHg (SD = 15.0) for normal-weight, 118.7 mmHg (SD = 13.9) for overweight, and 121.6 mmHg (SD = 15.3) for obese individuals.
The ANOVA revealed a statistically significant difference in mean SBP across BMI categories, F(2, 6023) = 129.24, p < 0.001. Tukey’s HSD post-hoc tests indicated that all pairwise comparisons were significant (p < 0.05): obese adults had on average 7.4 mmHg higher SBP than normal-weight adults, and 2.9 mmHg higher than overweight adults.
The effect size (η² = 0.041) indicates that BMI category explains 4.1% of the variance in systolic blood pressure, representing a small practical effect. These findings support the well-established relationship between higher BMI and elevated blood pressure, though other factors account for most of the variation in SBP.
Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.
# Prepare the dataset
set.seed(553)
mental_health_data <- NHANES %>%
filter(Age >= 18) %>%
filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
mutate(
activity_level = case_when(
PhysActive == "No" ~ "None",
PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
TRUE ~ NA_character_
),
activity_level = factor(activity_level,
levels = c("None", "Moderate", "Vigorous"))
) %>%
filter(!is.na(activity_level)) %>%
select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)
# YOUR TURN: Display the first 6 rows and check sample sizes
# View first 6 rows
head(mental_health_data)
## # A tibble: 6 × 6
## ID Age Gender DaysMentHlthBad PhysActive activity_level
## <int> <int> <fct> <int> <fct> <fct>
## 1 51624 34 male 15 No None
## 2 51624 34 male 15 No None
## 3 51624 34 male 15 No None
## 4 51630 49 female 10 No None
## 5 51647 45 female 3 Yes Vigorous
## 6 51647 45 female 3 Yes Vigorous
##
## None Moderate Vigorous
## 3139 768 1850
YOUR TURN - Answer these questions:
# YOUR TURN: Calculate summary statistics by activity level
# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max
mental_health_summary <- mental_health_data %>%
group_by(activity_level) %>%
summarise(
N = n(),
Mean = mean(DaysMentHlthBad),
SD = sd(DaysMentHlthBad),
Median = median(DaysMentHlthBad),
Min = min(DaysMentHlthBad),
Max = max(DaysMentHlthBad),
.groups = "drop"
)
mental_health_summary
## # A tibble: 3 × 7
## activity_level N Mean SD Median Min Max
## <fct> <int> <dbl> <dbl> <dbl> <int> <int>
## 1 None 3139 5.08 9.01 0 0 30
## 2 Moderate 768 3.81 6.87 0 0 30
## 3 Vigorous 1850 3.54 7.17 0 0 30
YOUR TURN - Interpret:
# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately
ggplot(mental_health_data,
aes(x = activity_level, y = DaysMentHlthBad)) +
geom_boxplot(fill = "lightblue", alpha = 0.7) +
labs(
title = "Poor Mental Health Days by Physical Activity Level",
x = "Physical Activity Level",
y = "Days of Poor Mental Health (Past 30 Days)"
) +
theme_minimal()
YOUR TURN - Describe what you see:
YOUR TURN - Write the hypotheses:
Null Hypothesis (H₀): Mean number of poor mental health days is equal across all physical activity levels.
Alternative Hypothesis (H₁): At least one physical activity level has a different mean number of poor mental health days.
Significance level: α = 0.05
# YOUR TURN: Fit the ANOVA model
# Outcome: DaysMentHlthBad
# Predictor: activity_level
anova_model <- aov(DaysMentHlthBad ~ activity_level,
data = mental_health_data)
summary(anova_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## activity_level 2 3109 1554.6 23.17 9.52e-11 ***
## Residuals 5754 386089 67.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
YOUR TURN - Extract and interpret the results:
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
##
## $activity_level
## diff lwr upr p adj
## Moderate-None -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213 0.5504114 0.7159887
YOUR TURN - Complete the table:
| Comparison | Mean Difference | 95% CI Lower | 95% CI Upper | p-value | Significant? |
|---|---|---|---|---|---|
| Moderate - None | -1.2726 | -2.0457 | -0.4995 | 0.0003 | Yes |
| Vigorous - None | -1.5465 | -2.1093 | -0.9836 | < 0.0001 | Yes |
| Vigorous - Moderate | -0.2739 | -1.0982 | 0.5504 | 0.7160 | No |
Interpretation:
Which specific groups differ significantly?
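Ans: Both active groups reported significantly fewer poor mental health days than the None group (Moderate vs. None: p = 0.0003; Vigorous vs. None: p < 0.0001), whereas the Moderate and Vigorous groups did not differ significantly (p = 0.716). In other words, any activity is better than none, but moving from moderate to vigorous activity was not associated with a statistically significant further reduction in poor mental health days in this sample.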
# YOUR TURN: Calculate eta-squared
# Hint: Extract Sum Sq from the ANOVA summary
anova_table <- summary(anova_model)[[1]]
eta_sq <- anova_table["activity_level", "Sum Sq"] /
sum(anova_table[, "Sum Sq"])
eta_sq
## [1] 0.007988564
YOUR TURN - Interpret:
YOUR TURN - Evaluate each plot:
Residuals vs Fitted: Residuals are scattered fairly evenly with no obvious pattern, so this assumption is reasonably met.
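Q-Q Plot: Residuals depart from normality, as expected for an outcome this right-skewed (median 0 and maximum 30 in every group); with groups this large, the F-test is fairly robust to the departure.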
Scale-Location: Points mostly flat along horizontal line, slight spread at higher fitted values
Residuals vs Leverage: No single observation is highly influential, so ANOVA results are robust.
# YOUR TURN: Conduct Levene's test
library(car)
leveneTest(DaysMentHlthBad ~ activity_level,
data = mental_health_data)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 23.168 9.517e-11 ***
## 5754
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
YOUR TURN - Overall assessment:
Ans: The p-value is 9.5 × 10⁻¹¹, which is far below 0.05. Interpretation: The assumption of homogeneity of variance is violated; the variances are not equal across the activity groups.
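When Levene's test rejects equal variances, one common follow-up is to compare the standard F-test against methods that do not rely on that assumption, such as Welch's ANOVA or the rank-based Kruskal-Wallis test. The sketch below shows both calls on R's built-in `PlantGrowth` data, used only as a stand-in so the block is self-contained. In the lab the formula would be `DaysMentHlthBad ~ activity_level` with `data = mental_health_data`; the result names are illustrative.

```r
# Stand-in data: PlantGrowth ships with base R (columns: weight, group)
data(PlantGrowth)

# Welch's ANOVA: compares group means without assuming equal variances
welch_result <- oneway.test(weight ~ group, data = PlantGrowth,
                            var.equal = FALSE)
print(welch_result)

# Kruskal-Wallis test: rank-based, so it does not assume normality
kw_result <- kruskal.test(weight ~ group, data = PlantGrowth)
print(kw_result)
```

If these tests lead to the same conclusion as the standard ANOVA, the violation is unlikely to be driving the result. If they disagree, the more robust test is usually the safer one to report.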
YOUR TURN - Write a complete 2-3 paragraph results section:
Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance
Your Results Section:
Ans: A one-way ANOVA was conducted to evaluate the relationship between physical activity level (None, Moderate, Vigorous) and the number of poor mental health days reported by adults in the NHANES dataset (N = 5,757). Descriptive statistics showed that individuals with no physical activity reported the highest average number of poor mental health days (M = 5.08, SD = 9.01), followed by those with moderate activity (M = 3.81, SD = 6.87) and vigorous activity (M = 3.54, SD = 7.17). Although the descriptive data suggest better mental health with increased activity, diagnostic checks, specifically Levene’s test (F(2, 5754) = 23.2, p < .001) and the Q-Q plot, indicated violations of the homogeneity of variance and normality assumptions, likely because of the high frequency of “0” days in the dataset.

Despite these distributional challenges, the ANOVA was statistically significant, F(2, 5754) = 23.2, p < .001. Post-hoc comparisons using the Tukey HSD test indicated that both the Moderate group (diff = -1.27, p < .001) and the Vigorous group (diff = -1.55, p < .001) reported significantly fewer poor mental health days than the None group. There was no statistically significant difference between the Moderate and Vigorous groups (p = .716), suggesting that the primary benefit is associated with engaging in any consistent activity rather than with the intensity of that activity.

The effect size (η² = 0.008) indicates that activity level explains approximately 0.8% of the variance in poor mental health days. While this effect is small by traditional standards, the results hold notable public health significance: even a small reduction in poor mental health days across a large population can translate to substantial improvements in quality of life and economic productivity. These findings support public health initiatives that encourage sedentary individuals to adopt at least moderate physical activity to help manage mental well-being.
1. How does the effect size help you understand the practical vs. statistical significance?
Ans: Effect size quantifies the magnitude of a difference or relationship, independent of sample size. While a p-value tells you whether an effect is statistically significant, it doesn’t indicate whether the difference is large enough to matter in practice.
2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?
Ans: Checking assumptions ensures that your conclusions are valid and reliable, and if assumptions are violated, alternative methods (e.g., Welch ANOVA or nonparametric tests) may be necessary.
3. In public health practice, when might you choose to use ANOVA?
Ans: ANOVA helps determine whether group differences are statistically significant, guiding decisions about where to focus interventions, allocate resources, or further investigate health disparities.
4. What was the most challenging part of this lab activity?
Ans: The written descriptions in Steps 8 and 9.
Before submitting, verify you have:
To submit: Upload both your .Rmd file and the HTML output to Brightspace.
Lab completed on: February 05, 2026
Total Points: 15
| Category | Criteria | Points | Notes |
|---|---|---|---|
| Code Execution | All code chunks run without errors | 4 | Deduct 1 pt per major error; deduct 0.5 pt per minor warning |
| Completion | All “YOUR TURN” sections attempted | 4 | Part B Steps 1-9 completed; all fill-in-the-blank items answered; Tukey table filled in |
| Interpretation | Correct statistical interpretation | 4 | Hypotheses correctly stated (1 pt); ANOVA results interpreted (1 pt); post-hoc results interpreted (1 pt); assumptions evaluated (1 pt) |
| Results Section | Professional, complete write-up | 3 | Includes descriptive stats (1 pt); reports F-test & post-hoc (1 pt); effect size & significance (1 pt) |
Code Execution (4 points):
Completion (4 points):
Interpretation (4 points):
Results Section (3 points):