Time: ~30 minutes
Goal: Practice one-way ANOVA analysis from start to finish using real public health data
Learning Objectives:
Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.
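# Load the packages used in this analysis
library(NHANES)   # NHANES dataset
library(dplyr)    # filter(), mutate(), case_when(), %>%
library(knitr)    # kable()
library(ggplot2)  # boxplots
library(car)      # leveneTest()
# Prepare the dataset
set.seed(553)
mental_health_data <- NHANES %>%
  # Keep adults with non-missing outcome and activity status
  filter(Age >= 18) %>%
  filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
  mutate(
    # Group by activity status and number of active days
    activity_level = case_when(
      PhysActive == "No" ~ "None",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
      TRUE ~ NA_character_
    ),
    activity_level = factor(activity_level,
                            levels = c("None", "Moderate", "Vigorous"))
  ) %>%
  filter(!is.na(activity_level)) %>%
  select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)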
# YOUR TURN: Display the first 6 rows and check sample sizes
# Display first 6 rows
head(mental_health_data) %>%
  kable(caption = "Physical Activity and Mental Health (first 6 rows)")
| ID | Age | Gender | DaysMentHlthBad | PhysActive | activity_level |
|---|---|---|---|---|---|
| 51624 | 34 | male | 15 | No | None |
| 51624 | 34 | male | 15 | No | None |
| 51624 | 34 | male | 15 | No | None |
| 51630 | 49 | female | 10 | No | None |
| 51647 | 45 | female | 3 | Yes | Vigorous |
| 51647 | 45 | female | 3 | Yes | Vigorous |
# Check sample sizes by activity level
table(mental_health_data$activity_level)
##
## None Moderate Vigorous
## 3139 768 1850
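The group sizes above are quite unbalanced, which is easier to see as percentages. A minimal sketch, assuming the three counts are copied by hand from the output above (the names `group_n`, `total_n`, and `group_pct` are illustrative, not part of the lab):

```r
# Group sizes as reported in the sample-size output above
group_n <- c(None = 3139, Moderate = 768, Vigorous = 1850)

# Total sample size and each group's share of the sample (in percent)
total_n <- sum(group_n)
group_pct <- round(100 * prop.table(group_n), 1)

print(total_n)
print(group_pct)
```

With the full dataset loaded, `prop.table(table(mental_health_data$activity_level))` would give the same shares directly.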
YOUR TURN - Answer these questions:
# YOUR TURN: Calculate summary statistics by activity level
# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max
# Calculate summary statistics by activity level
summary_stats <- mental_health_data %>%
  group_by(activity_level) %>%
  summarise(
    n = n(),
    Mean = mean(DaysMentHlthBad),
    SD = sd(DaysMentHlthBad),
    Median = median(DaysMentHlthBad),
    Min = min(DaysMentHlthBad),
    Max = max(DaysMentHlthBad)
  )
summary_stats %>%
  kable(digits = 2,
        caption = "Descriptive Statistics: Days with Bad Mental Health by Physical Activity Category")
| activity_level | n | Mean | SD | Median | Min | Max |
|---|---|---|---|---|---|---|
| None | 3139 | 5.08 | 9.01 | 0 | 0 | 30 |
| Moderate | 768 | 3.81 | 6.87 | 0 | 0 | 30 |
| Vigorous | 1850 | 3.54 | 7.17 | 0 | 0 | 30 |
YOUR TURN - Interpret:
# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately
# Create boxplots with individual points
ggplot(mental_health_data,
       aes(x = activity_level, y = DaysMentHlthBad, fill = activity_level)) +
  # Hide boxplot outliers because the jittered points already show them
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Days with Bad Mental Health by Physical Activity Category",
    subtitle = "NHANES Data, Adults aged 18 and older",
    x = "Physical Activity Level",
    y = "Days with Bad Mental Health",
    fill = "Physical Activity Level"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")
YOUR TURN - Describe what you see:
YOUR TURN - Write the hypotheses:
Null Hypothesis (H₀): μ_None = μ_Moderate = μ_Vigorous
(All three population means are equal)
Alternative Hypothesis (H₁): At least one population mean differs from the others
Significance level: α = 0.05
# YOUR TURN: Fit the ANOVA model
# Outcome: DaysMentHlthBad
# Predictor: activity_level
# Fit the one-way ANOVA model
anova_model <- aov(DaysMentHlthBad ~ activity_level, data = mental_health_data)
# Display the ANOVA table
summary(anova_model)
##                  Df Sum Sq Mean Sq F value   Pr(>F)
## activity_level    2   3109  1554.6   23.17 9.52e-11 ***
## Residuals      5754 386089    67.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
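To see where the F value and p-value in the table come from, they can be rebuilt from the sums of squares and degrees of freedom. This is a sketch that assumes the rounded values printed above, so the results should land close to, but not exactly on, the reported figures; the variable names are illustrative.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_between <- 3109     # Sum Sq for activity_level
ss_within  <- 386089   # Sum Sq for Residuals
df_between <- 2        # number of groups minus 1
df_within  <- 5754     # total n minus number of groups

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F is the ratio of the two mean squares; the p-value is the upper tail
f_stat  <- ms_between / ms_within
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

cat("F =", round(f_stat, 2), "\n")
cat("p =", signif(p_value, 3), "\n")
```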
YOUR TURN - Extract and interpret the results:
# YOUR TURN: Conduct Tukey HSD test
# Only if your ANOVA p-value < 0.05
# Conduct Tukey HSD test
tukey_results <- TukeyHSD(anova_model)
print(tukey_results)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
##
## $activity_level
## diff lwr upr p adj
## Moderate-None -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213 0.5504114 0.7159887
YOUR TURN - Complete the table:
| Comparison | Mean Difference | 95% CI Lower | 95% CI Upper | p-value | Significant? |
|---|---|---|---|---|---|
| Moderate - None | -1.27 | -2.05 | -0.50 | 0.0003 | Yes |
| Vigorous - None | -1.55 | -2.11 | -0.98 | < 0.0001 | Yes |
| Vigorous - Moderate | -0.27 | -1.10 | 0.55 | 0.7160 | No |
Interpretation:
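Which specific groups differ significantly? The None group differs significantly from both the Moderate and Vigorous groups: the confidence intervals for Moderate - None and Vigorous - None exclude zero, so we reject the null hypothesis of no difference for those pairs. The interval for Vigorous - Moderate includes zero, so we cannot conclude that those two groups differ. This pattern is consistent with the benefit of physical activity leveling off at moderate levels, although a non-significant difference does not by itself demonstrate a plateau.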
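The "does the interval include zero?" rule can be applied mechanically. The sketch below assumes the Tukey estimates are typed in from the printed output (the data frame `tukey_tbl` and its column names are illustrative) and flags each comparison whose 95% interval lies entirely on one side of zero.

```r
# Tukey HSD estimates copied from the printed output
tukey_tbl <- data.frame(
  comparison = c("Moderate-None", "Vigorous-None", "Vigorous-Moderate"),
  diff = c(-1.2725867, -1.5464873, -0.2739006),
  lwr  = c(-2.045657, -2.109345, -1.098213),
  upr  = c(-0.4995169, -0.9836298, 0.5504114)
)

# A comparison is significant at the family-wise 5% level
# when its 95% interval excludes zero
tukey_tbl$significant <- tukey_tbl$lwr > 0 | tukey_tbl$upr < 0

print(tukey_tbl)
```

With the fitted model available, the same columns could be taken from `as.data.frame(tukey_results$activity_level)` instead of being typed in.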
# YOUR TURN: Calculate eta-squared
# Hint: Extract Sum Sq from the ANOVA summary
# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]
ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)
# Calculate eta-squared
eta_squared <- ss_treatment / ss_total
cat("Eta-squared (η²):", round(eta_squared, 4), "\n")
## Eta-squared (η²): 0.008
cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%\n")
## Percentage of variance explained: 0.8 %
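As a check on the code, the same quantity can be worked by hand from the sums of squares printed in the ANOVA table (values are rounded as reported there, so the result is approximate):

```latex
\eta^2
  = \frac{SS_{\text{between}}}{SS_{\text{total}}}
  = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}}
  = \frac{3109}{3109 + 386089}
  = \frac{3109}{389198}
  \approx 0.008
```

That is, about 0.8% of the variance in bad mental health days is associated with activity level.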
YOUR TURN - Interpret:
# YOUR TURN: Create diagnostic plots
# Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model)
YOUR TURN - Evaluate each plot:
Residuals vs Fitted: Points do not show random scatter around zero, suggesting there may be outliers or evidence of heteroscedasticity
Q-Q Plot: Points do not follow the diagonal line reasonably well; therefore, the normality assumption may not be reasonable
Scale-Location: The red line is rising, suggesting that the equal variance assumption may not be reasonable
Residuals vs Leverage: There are no points beyond Cook’s distance lines so there are no highly influential outliers
# YOUR TURN: Conduct Levene's test
# Levene's test for homogeneity of variance
levene_test <- leveneTest(DaysMentHlthBad ~ activity_level, data = mental_health_data)
print(levene_test)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 23.168 9.517e-11 ***
## 5754
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
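When Levene's test rejects equal variances, one common follow-up is to rerun the comparison with a test that does not rely on that assumption. The sketch below is not part of the lab workflow: it uses simulated stand-in data (the data frame `sim_data` and column `days_bad` are hypothetical) to show base R's `oneway.test()` with `var.equal = FALSE` (Welch's ANOVA) and `kruskal.test()` (a rank-based alternative).

```r
# Simulated stand-in data (hypothetical), NOT the NHANES sample:
# skewed counts in three groups, capped at 30 days
set.seed(1)
sim_data <- data.frame(
  activity_level = factor(rep(c("None", "Moderate", "Vigorous"),
                              times = c(300, 100, 200)),
                          levels = c("None", "Moderate", "Vigorous")),
  days_bad = pmin(30, c(rnbinom(300, size = 0.3, mu = 5),
                        rnbinom(100, size = 0.3, mu = 4),
                        rnbinom(200, size = 0.3, mu = 3.5)))
)

# Welch's one-way ANOVA: does not assume equal group variances
welch_result <- oneway.test(days_bad ~ activity_level,
                            data = sim_data, var.equal = FALSE)

# Kruskal-Wallis rank-sum test: does not assume normal residuals
kw_result <- kruskal.test(days_bad ~ activity_level, data = sim_data)

print(welch_result)
print(kw_result)
```

To apply this to the lab data, the formula would presumably become `DaysMentHlthBad ~ activity_level` with `data = mental_health_data`.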
YOUR TURN - Overall assessment:
YOUR TURN - Write a complete 2-3 paragraph results section:
Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance
Your Results Section:
This analysis used adult NHANES participants (aged 18 and older), grouped by physical activity level, to compare the mean number of days with bad mental health across groups. The total sample of 5,757 comprised 3,139 (54.5%) individuals reporting no physical activity, 768 (13.3%) with moderate activity (fewer than 3 days per week), and 1,850 (32.1%) with vigorous activity (3 or more days per week). The no-activity group had a mean of 5.08 bad mental health days (SD = 9.01), compared with 3.81 (SD = 6.87) in the moderate group and 3.54 (SD = 7.17) in the vigorous group.
A one-way ANOVA showed a statistically significant difference in mean bad mental health days across activity levels, F(2, 5754) = 23.17, p < 0.001 (p = 9.52e-11). The F-statistic means the between-group mean square is about 23 times the within-group mean square, a ratio that would be extremely unlikely to occur by chance if all three groups truly had the same population mean.
Post-hoc Tukey HSD tests revealed that individuals with vigorous physical activity had a significantly lower mean number of days with bad mental health compared to those with no physical activity (mean difference = -1.55, 95% CI [-2.11, -0.98], p < 0.001). Similarly, moderate activity was associated with a significantly lower mean number of days with bad mental health compared to no activity (mean difference = -1.27, 95% CI [-2.05, -0.50], p < 0.001). The difference between the vigorous and moderate activity groups was not statistically significant (mean difference = -0.27, 95% CI [-1.10, 0.55], p = 0.716), which is consistent with, but does not by itself establish, a plateau in benefit at moderate activity levels.
While statistically significant, the effect size was small (η² = 0.008), indicating that physical activity level explains only about 0.8% of the variance in bad mental health days. Other unmeasured factors such as mental health conditions, genetics, cardiovascular disease, and other overall health metrics likely play larger roles in bad mental health days.
1. How does the effect size help you understand the practical vs. statistical significance?
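Effect size quantifies how strongly a variable is associated with the outcome of interest, whereas a p-value only indicates whether an association is distinguishable from chance. In this lab, the large sample (n = 5,757) produced p < 0.001 even though activity level explained less than 1% of the variance (η² = 0.008). This context is critical when discussing statistical significance, especially when communicating the broader public health importance of a finding.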
2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?
Checking assumptions matters because the F-test's p-value is only trustworthy when its assumptions (independent observations, approximately normal residuals, and equal variances across groups) roughly hold. When they are violated, p-values and confidence intervals can be misleading. In this analysis, the diagnostic plots and Levene's test both pointed to non-normal residuals and unequal variances, so researchers should describe these limitations and interpret the results more cautiously. The violations could also motivate data transformations, weighting, or a different test.
3. In public health practice, when might you choose to use ANOVA?
When comparing the mean of an outcome of interest across three or more groups. An example could be comparing maternal mortality by race/ethnicity.
4. What was the most challenging part of this lab activity?
Interpreting post-hoc testing results.