Time: ~30 minutes
Goal: Practice one-way ANOVA analysis from start to finish using real public health data
Learning Objectives:
Structure:
Submission: Upload your completed .Rmd file and published to Brightspace by the end of class.
Why ANOVA? We have one continuous outcome (SBP) and one categorical predictor with THREE groups (BMI category). Using multiple t-tests would inflate our Type I error rate.
# Load necessary libraries
library(tidyverse) # For data manipulation and visualization
library(knitr) # For nice tables
library(car) # For Levene's test
library(NHANES) # NHANES dataset
# Load the NHANES data
data(NHANES)Create analysis dataset:
# Set seed for reproducibility
set.seed(553)
# Create BMI categories and prepare data
bp_bmi_data <- NHANES %>%
filter(Age >= 18 & Age <= 65) %>% # Adults 18-65
filter(!is.na(BPSysAve) & !is.na(BMI)) %>%
mutate(
bmi_category = case_when(
BMI < 25 ~ "Normal",
BMI >= 25 & BMI < 30 ~ "Overweight",
BMI >= 30 ~ "Obese",
TRUE ~ NA_character_
),
bmi_category = factor(bmi_category,
levels = c("Normal", "Overweight", "Obese"))
) %>%
filter(!is.na(bmi_category)) %>%
select(ID, Age, Gender, BPSysAve, BMI, bmi_category)
# Display first few rows
head(bp_bmi_data) %>%
kable(caption = "Blood Pressure and BMI Dataset (first 6 rows)")| ID | Age | Gender | BPSysAve | BMI | bmi_category |
|---|---|---|---|---|---|
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51624 | 34 | male | 113 | 32.22 | Obese |
| 51630 | 49 | female | 112 | 30.57 | Obese |
| 51647 | 45 | female | 118 | 27.24 | Overweight |
| 51647 | 45 | female | 118 | 27.24 | Overweight |
##
## Normal Overweight Obese
## 1939 1937 2150
Interpretation: We have 6026 adults with complete BP and BMI data across three BMI categories.
# Calculate summary statistics by BMI category
summary_stats <- bp_bmi_data %>%
group_by(bmi_category) %>%
summarise(
n = n(),
Mean = mean(BPSysAve),
SD = sd(BPSysAve),
Median = median(BPSysAve),
Min = min(BPSysAve),
Max = max(BPSysAve)
)
summary_stats %>%
kable(digits = 2,
caption = "Descriptive Statistics: Systolic BP by BMI Category")| bmi_category | n | Mean | SD | Median | Min | Max |
|---|---|---|---|---|---|---|
| Normal | 1939 | 114.23 | 15.01 | 113 | 78 | 221 |
| Overweight | 1937 | 118.74 | 13.86 | 117 | 83 | 186 |
| Obese | 2150 | 121.62 | 15.27 | 120 | 82 | 226 |
Observation: The mean SBP appears to increase from Normal (114.2) to Overweight (118.7) to Obese (121.6).
# Create boxplots with individual points
ggplot(bp_bmi_data,
aes(x = bmi_category, y = BPSysAve, fill = bmi_category)) +
geom_boxplot(alpha = 0.7, outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Systolic Blood Pressure by BMI Category",
subtitle = "NHANES Data, Adults aged 18-65",
x = "BMI Category",
y = "Systolic Blood Pressure (mmHg)",
fill = "BMI Category"
) +
theme_minimal(base_size = 12) +
theme(legend.position = "none")What the plot tells us:
Null Hypothesis (H₀): μ_Normal = μ_Overweight =
μ_Obese
(All three population means are equal)
Alternative Hypothesis (H₁): At least one population mean differs from the others
Significance level: α = 0.05
# Fit the one-way ANOVA model
anova_model <- aov(BPSysAve ~ bmi_category, data = bp_bmi_data)
# Display the ANOVA table
summary(anova_model)## Df Sum Sq Mean Sq F value Pr(>F)
## bmi_category 2 56212 28106 129.2 <2e-16 ***
## Residuals 6023 1309859 217
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpretation:
Why do we need this? The F-test tells us that groups differ, but not which groups differ. Tukey’s Honest Significant Difference controls the family-wise error rate for multiple pairwise comparisons.
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = BPSysAve ~ bmi_category, data = bp_bmi_data)
##
## $bmi_category
## diff lwr upr p adj
## Overweight-Normal 4.507724 3.397134 5.618314 0
## Obese-Normal 7.391744 6.309024 8.474464 0
## Obese-Overweight 2.884019 1.801006 3.967033 0
Interpretation:
| Comparison | Mean Diff | 95% CI | p-value | Significant? |
|---|---|---|---|---|
| Overweight - Normal | 4.51 | [3.4, 5.62] | 1.98e-13 | Yes |
| Obese - Normal | 7.39 | [6.31, 8.47] | < 0.001 | Yes |
| Obese - Overweight | 2.88 | [1.8, 3.97] | 1.38e-09 | Yes |
Conclusion: All three pairwise comparisons are statistically significant. Obese adults have higher SBP than overweight adults, who in turn have higher SBP than normal-weight adults.
# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]
ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)
# Calculate eta-squared
eta_squared <- ss_treatment / ss_total
cat("Eta-squared (η²):", round(eta_squared, 4), "\n")## Eta-squared (η²): 0.0411
## Percentage of variance explained: 4.11 %
Interpretation: BMI category explains 4.11% of the variance in systolic BP.
While statistically significant, the practical effect is modest—BMI category alone doesn’t explain most of the variation in blood pressure.
ANOVA Assumptions:
Diagnostic Plot Interpretation:
# Levene's test for homogeneity of variance
levene_test <- leveneTest(BPSysAve ~ bmi_category, data = bp_bmi_data)
print(levene_test)## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 2.7615 0.06328 .
## 6023
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Levene’s Test Interpretation:
Overall Assessment: With n > 2000, ANOVA is robust to minor violations. Our assumptions are reasonably satisfied.
Example Results Section:
We conducted a one-way ANOVA to examine whether mean systolic blood pressure (SBP) differs across BMI categories (Normal, Overweight, Obese) among 6,026 adults aged 18-65 from NHANES. Descriptive statistics showed mean SBP of 114.2 mmHg (SD = 15) for normal weight, 118.7 mmHg (SD = 13.9) for overweight, and 121.6 mmHg (SD = 15.3) for obese individuals.
The ANOVA revealed a statistically significant difference in mean SBP across BMI categories, F(2, 6023) = 129.24, p < 0.001. Tukey’s HSD post-hoc tests indicated that all pairwise comparisons were significant (p < 0.05): obese adults had on average 7.4 mmHg higher SBP than normal-weight adults, and 2.9 mmHg higher than overweight adults.
The effect size (η² = 0.041) indicates that BMI category explains 4.1% of the variance in systolic blood pressure, representing a small practical effect. These findings support the well-established relationship between higher BMI and elevated blood pressure, though other factors account for most of the variation in SBP.
Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.
# Prepare the dataset
set.seed(553)
mental_health_data <- NHANES %>%
filter(Age >= 18) %>%
filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
mutate(
activity_level = case_when(
PhysActive == "No" ~ "None",
PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
TRUE ~ NA_character_
),
activity_level = factor(activity_level,
levels = c("None", "Moderate", "Vigorous"))
) %>%
filter(!is.na(activity_level)) %>%
select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)
# YOUR TURN: Display the first 6 rows and check sample sizes
# Display first few rows
head(mental_health_data) %>%
kable(caption = "Mental Health Days and Physical Activity (first 6 rows)")| ID | Age | Gender | DaysMentHlthBad | PhysActive | activity_level |
|---|---|---|---|---|---|
| 51624 | 34 | male | 15 | No | None |
| 51624 | 34 | male | 15 | No | None |
| 51624 | 34 | male | 15 | No | None |
| 51630 | 49 | female | 10 | No | None |
| 51647 | 45 | female | 3 | Yes | Vigorous |
| 51647 | 45 | female | 3 | Yes | Vigorous |
##
## None Moderate Vigorous
## 3139 768 1850
YOUR TURN - Answer these questions:
# YOUR TURN: Calculate summary statistics by activity level
# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max
# Calculate summary statistics by Activity Level
summary_stats_pb <- mental_health_data %>%
group_by(activity_level) %>%
summarise(
n = n(),
Mean = mean(DaysMentHlthBad),
SD = sd(DaysMentHlthBad),
Median = median(DaysMentHlthBad),
Min = min(DaysMentHlthBad),
Max = max(DaysMentHlthBad)
)
summary_stats_pb %>%
kable(digits = 2,
caption = "Descriptive Statistics: Poor Mental Health Days by Activity Level")| activity_level | n | Mean | SD | Median | Min | Max |
|---|---|---|---|---|---|---|
| None | 3139 | 5.08 | 9.01 | 0 | 0 | 30 |
| Moderate | 768 | 3.81 | 6.87 | 0 | 0 | 30 |
| Vigorous | 1850 | 3.54 | 7.17 | 0 | 0 | 30 |
YOUR TURN - Interpret: - Which group has the highest mean number of bad mental health days?
–The group that had no or none physical activity had the highest mean number of bad mental health days with a mean of 5.08 days.
–The group that had vigorous phyisical activity had the lowest mean number of bad mental health days, with a mean of 3.54 days.
# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately
# Create boxplots with individual points
ggplot(mental_health_data,
aes(x = activity_level, y = DaysMentHlthBad, fill = activity_level)) +
geom_boxplot(alpha = 0.7, outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
scale_fill_brewer(palette = "Set1") +
labs(
title = "Poor Mental Health Days by Activity Level",
subtitle = "NHANES Data, Adults aged 18-65",
x = "Activity Level",
y = "Poor Mental Health Days (Days)",
fill = "Activity Level"
) +
theme_minimal(base_size = 12) +
theme(legend.position = "none")YOUR TURN - Describe what you see:
Do the groups appear to differ? – Yes, the groups appear to be different. The vigorous group is smaller compared to the None and Moderate group.
Are the variances similar across groups? –The varainces don’t seem to be similar across groups.The None group seem to have a noticeable grouping at around 15 days and 30 days unlike Moderate and Vigorous. —
YOUR TURN - Write the hypotheses:
Null Hypothesis (H₀): μ_None = μ_Moderate = μ_Vigorous (All three population means are equal)
Alternative Hypothesis (H₁): At least one population mean differs from the others
Significance level: α = 0.05
# YOUR TURN: Fit the ANOVA model
# Outcome: DaysMentHlthBad
# Predictor: activity_level
# Fit the one-way ANOVA model
anova_model_pb <- aov(DaysMentHlthBad ~ activity_level, data = mental_health_data)
# Display the ANOVA table
summary(anova_model_pb)## Df Sum Sq Mean Sq F value Pr(>F)
## activity_level 2 3109 1554.6 23.17 9.52e-11 ***
## Residuals 5754 386089 67.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
YOUR TURN - Extract and interpret the results:
# YOUR TURN: Conduct Tukey HSD test
# Only if your ANOVA p-value < 0.05
# Conduct Tukey HSD test
tukey_results_pb <- TukeyHSD(anova_model_pb)
print(tukey_results_pb)## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
##
## $activity_level
## diff lwr upr p adj
## Moderate-None -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213 0.5504114 0.7159887
YOUR TURN - Complete the table:
Interpretation:
| Comparison | Mean Diff | 95% CI | p-value | Significant? |
|---|---|---|---|---|
| Moderate - None | -1.27 | [3.4, -0.5] | 0.000339 | Yes |
| Vigorous - None | -1.55 | [-2.11, -0.98] | < 0.001 | Yes |
| Vigorous - Moderate | -0.27 | [-1.1, 0.55] | 0.716 | No |
Conclusion: Two of the three pairwise comparisons are statistically significant. Individuals with no or none physical activity have more poor mental health days compared to individuals with moderate or vigorous physical activity. There are no statistical difference between those who with vigorous or moderate physical activity when it comes to poor mental health days.
# YOUR TURN: Calculate eta-squared
# Hint: Extract Sum Sq from the ANOVA summary
# Extract sum of squares from ANOVA table
anova_summary_pb <- summary(anova_model_pb)[[1]]
ss_treatment_pb <- anova_summary_pb$`Sum Sq`[1]
ss_total_pb <- sum(anova_summary_pb$`Sum Sq`)
# Calculate eta-squared
eta_squared_pb <- ss_treatment_pb / ss_total_pb
cat("Eta-squared (η²):", round(eta_squared_pb, 4), "\n")## Eta-squared (η²): 0.008
## Percentage of variance explained: 0.8 %
YOUR TURN - Interpret:
η² = 0.008
Percentage of variance explained: –0.8%
Effect size classification (small/medium/large): –Large effect size
What does this mean practically? – While statistically significant, the practical effect is none-physical activity category alone can explain most of the variation in poor mentak health days.
# YOUR TURN: Create diagnostic plots
# Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model_pb)YOUR TURN - Evaluate each plot:
Diagnostic Plot Interpretation:
# YOUR TURN: Conduct Levene's test
levene_test_pb <- leveneTest(DaysMentHlthBad ~ activity_level, data = mental_health_data)
print(levene_test_pb)## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 23.168 9.517e-11 ***
## 5754
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
YOUR TURN - Overall assessment:
Are assumptions reasonably met? – yes, I believe that the assumptions are reasonably met.
Do any violations threaten your conclusions?
p-value: 9.52e-11
If p < 0.05, we would reject equal variances
Here: Variances differ significantly, but with large n, ANOVA is robust
Overall Assessment: –With n > 2000, ANOVA is robust to minor violations. Our assumptions are reasonably satisfied. —
YOUR TURN - Write a complete 2-3 paragraph results section:
Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance
Your Results Section:
We conducted a one-way ANOVA to examine whether poor mental health days differs across physical activity levels (None, Moderate, Vigorous) among 5,757 adults aged 18-65 from NHANES. Descriptive statistics showed poor mental health days of 5.1 Days (SD = 9) for None, 3.8 Days (SD = 6.9) for Moderate, and 3.5 Days (SD = 7.2) for Vigorous individuals.
The ANOVA revealed a statistically significant difference in poor mental health between no physical activity and both moderate and vigorous physical activity, F(2, 5754) = 23.17, p < 0.001. Tukey’s HSD post-hoc tests indicated that all pairwise comparisons were significant (p < 0.05): No physical activity individuals had on average -1.5 more poor mental health days than moderate physical activity individuals, and -0.3 more poor mental health than physical activity individuals.
The effect size (η² = 0.008) indicates that physical activity level explains 0.8% of the variance in poor mental health days, representing a small practical effect. These findings support the well-established relationship between physical activity and poor mental health days, though other factors account for some of the variation in poor mental health days.
1. How does the effect size help you understand the practical vs. statistical significance?
The effect size helps in understand whether or not the difference that are seen can truly be attributed to the relationship between the variables or if there is another factor that accounts for some or most of the difference that are present. There could be significant statistical difference between groups but the effect size could be small which would indicate that something else is occurring.
2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?
It is important to check ANOVA assumptions because if they are not met they are violated and you have a small sample size then any difference between groups can not be attributed to the predictor variable.
3. In public health practice, when might you choose to use ANOVA?
ANOVA is an acceptable test in public health practice when you have 3 or more group means that you want to test if there are any difference between them. The health outcome must be a continuous variable while the predictor variable must be categorical.
4. What was the most challenging part of this lab activity?
The most challenging part of this lab was working through the an error in the data visualization chuck. For whatever reason the chuck was giving an error when nothing was wrong so troubleshooting that was challenging to deal with while working through the activity.
Before submitting, verify you have:
To submit: Upload both your .Rmd file and the HTML output to Brightspace.
Lab completed on: February 05, 2026
Total Points: 15
| Category | Criteria | Points | Notes |
|---|---|---|---|
| Code Execution | All code chunks run without errors | 4 | - Deduct 1 pt per major error - Deduct 0.5 pt per minor warning |
| Completion | All “YOUR TURN” sections attempted | 4 | - Part B Steps 1-9 completed - All fill-in-the-blank answered - Tukey table filled in |
| Interpretation | Correct statistical interpretation | 4 | - Hypotheses correctly stated (1 pt) - ANOVA results interpreted (1 pt) - Post-hoc results interpreted (1 pt) - Assumptions evaluated (1 pt) |
| Results Section | Professional, complete write-up | 3 | - Includes descriptive stats (1 pt) - Reports F-test & post-hoc (1 pt) - Effect size & significance (1 pt) |
Code Execution (4 points):
Completion (4 points):
Interpretation (4 points):
Results Section (3 points):