Step 1: Setup and Data Preparation

# Load necessary libraries
library(tidyverse)   # For data manipulation and visualization

## Warning: package 'tidyverse' was built under R version 4.4.3

## Warning: package 'ggplot2' was built under R version 4.4.3

## Warning: package 'tibble' was built under R version 4.4.3

## Warning: package 'tidyr' was built under R version 4.4.3

## Warning: package 'readr' was built under R version 4.4.3

## Warning: package 'purrr' was built under R version 4.4.3

## Warning: package 'dplyr' was built under R version 4.4.3

## Warning: package 'stringr' was built under R version 4.4.3

## Warning: package 'forcats' was built under R version 4.4.3

## Warning: package 'lubridate' was built under R version 4.4.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(knitr)       # For nice tables

## Warning: package 'knitr' was built under R version 4.4.3

library(car)         # For Levene's test

## Warning: package 'car' was built under R version 4.4.3

## Loading required package: carData

## Warning: package 'carData' was built under R version 4.4.3

## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some

library(NHANES)      # NHANES dataset

## Warning: package 'NHANES' was built under R version 4.4.3

# Load the NHANES data
data(NHANES)

PART B: YOUR TURN - INDEPENDENT PRACTICE

Practice Problem: Physical Activity and Depression

Research Question: Is there a difference in the number of days with poor mental health across three physical activity levels (None, Moderate, Vigorous)?

Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.

Step 1: Data Preparation

# Prepare the dataset
set.seed(553)

mental_health_data <- NHANES %>%
  filter(Age >= 18) %>%
  filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
  mutate(
    activity_level = case_when(
      PhysActive == "No" ~ "None",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
      TRUE ~ NA_character_
    ),
    activity_level = factor(activity_level, 
                           levels = c("None", "Moderate", "Vigorous"))
  ) %>%
  filter(!is.na(activity_level)) %>%
  select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)

# YOUR TURN: Display the first 6 rows and check sample sizes
head(mental_health_data)%>%
  kable(caption="Physical Activity Level (first 6 rows)")

Physical Activity Level (first 6 rows)
ID	Age	Gender	DaysMentHlthBad	PhysActive	activity_level
51624	34	male	15	No	None
51624	34	male	15	No	None
51624	34	male	15	No	None
51630	49	female	10	No	None
51647	45	female	3	Yes	Vigorous
51647	45	female	3	Yes	Vigorous

#
table(mental_health_data$activity_level)

## 
##     None Moderate Vigorous 
##     3139      768     1850

YOUR TURN - Answer these questions:

How many people are in each physical activity group?
- None: 3139
- Moderate: 768
- Vigorous: 1850

Step 2: Descriptive Statistics

# YOUR TURN: Calculate summary statistics by activity level
summary_stats_activity <- mental_health_data %>%
  group_by(activity_level) %>%
  summarise (
    n=n(),
    Mean = mean(DaysMentHlthBad),
    SD = sd(DaysMentHlthBad),
    median = median(DaysMentHlthBad),
    Min = min(DaysMentHlthBad),
    Max = max(DaysMentHlthBad)
  )
summary_stats_activity%>%
  kable(digits = 2,
       caption= "Descriptive Statistics: Number of Bad Mental Health Days by Physical Activity" )

Descriptive Statistics: Number of Bad Mental Health Days by Physical Activity
activity_level	n	Mean	SD	Max
None	3139	5.08	9.01	30
Moderate	768	3.81	6.87	30
Vigorous	1850	3.54	7.17	30

# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max

YOUR TURN - Interpret:

Which group has the highest mean number of bad mental health days?] No activity level has the highest mean number of bad mental health days.
Which group has the lowest? The lowest mean number of bad mental health days is the vigorous activity level group.

Step 3: Visualization

# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
ggplot(mental_health_data,
  aes(x = activity_level, y = DaysMentHlthBad, fill = activity_level)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Number of Bad Mental Health Days by Activity Level",
    x = "Activity Level",
    y = "Bad Mental Health Days",
    fill = "Activity Level"
  )

# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately

YOUR TURN - Describe what you see:

Do the groups appear to differ? Yes the groups appear to differ. Vigorous activity level had the lowest number of bad mental health days, and no activity level had the highest.
Are the variances similar across groups? The variances differ across groups, where the plot for no activity level has the highest variance and vigorous has the lowest.

Step 4: Set Up Hypotheses

YOUR TURN - Write the hypotheses:

Null Hypothesis (H₀): μ_None = μ_Moderate = μ_Vigorous

Alternative Hypothesis (H₁): At least one population mean differs from the others

Significance level: α = 0.05

Step 5: Fit the ANOVA Model

# YOUR TURN: Fit the ANOVA model
anova_model <- aov(DaysMentHlthBad ~ activity_level, data=mental_health_data)
summary(anova_model)

##                  Df Sum Sq Mean Sq F value   Pr(>F)    
## activity_level    2   3109  1554.6   23.17 9.52e-11 ***
## Residuals      5754 386089    67.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Outcome: DaysMentHlthBad
# Predictor: activity_level

YOUR TURN - Extract and interpret the results:

F-statistic: 23.17
Degrees of freedom: 2
p-value: 9.52e-11
Decision (reject or fail to reject H₀): reject the null hypothesis
Statistical conclusion in words: There is statistically significant evidence showing that the number of bad mental health days differ across at least two activity level categories.

Step 6: Post-Hoc Tests

# YOUR TURN: Conduct Tukey HSD test
tukey_results <- TukeyHSD(anova_model)
print(tukey_results)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
## 
## $activity_level
##                         diff       lwr        upr     p adj
## Moderate-None     -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None     -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213  0.5504114 0.7159887

# Only if your ANOVA p-value < 0.05
plot(tukey_results, las = 1)

YOUR TURN - Complete the table:

Comparison	Mean Difference	95% CI Lower	95% CI Upper	p-value	Significant?
Moderate - None	-1.2725867	-2.045657	-0.4995169	0.0003386	Yes
Vigorous - None	-1.5464873	-2.109345	-0.9836298	0.0000000	Yes
Vigorous - Moderate	-0.2739006	-1.098213	0.5504114	0.7159887	No

Interpretation:

Which specific groups differ significantly? Both the Moderate-None and Virgorous-None are statistically significant. —

Step 7: Calculate Effect Size

# YOUR TURN: Calculate eta-squared
anova_summary <- summary(anova_model)[[1]]

ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)

# Calculate eta-squared
eta_squared <- ss_treatment / ss_total

cat("Eta-squared (η²):", round(eta_squared, 4), "\n")

## Eta-squared (η²): 0.008

cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%")

## Percentage of variance explained: 0.8 %

# Hint: Extract Sum Sq from the ANOVA summary

YOUR TURN - Interpret:

η² = 0.008
Percentage of variance explained: Physical activity level explains 0.8% of variance in the number of bad mental health days.
Effect size classification (small/medium/large): Small
What does this mean practically? Activity level has a detectable relationship to mental health, but only accounts for a small portion of differences across individuals.

Step 8: Check Assumptions

# YOUR TURN: Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model)

par(mfrow = c(1, 1))

YOUR TURN - Evaluate each plot:

Residuals vs Fitted: The relationship between activity level and mental health is not seen clearly.
Q-Q Plot: There are strong deviations from the diagonal line.
Scale-Location: The red line is not stable, showing non-constant variance.
Residuals vs Leverage: There are influential points beyond Cook’s distance line.

# YOUR TURN: Conduct Levene's test
levene_test <- leveneTest(DaysMentHlthBad ~ activity_level, data = mental_health_data)
print(levene_test)

## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)    
## group    2  23.168 9.517e-11 ***
##       5754                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

YOUR TURN - Overall assessment:

Are assumptions reasonably met No the assumptions are not reasonably met.
Do any violations threaten your conclusions? There are violations to normality and homogeneity of variance which could threaten the conclusions. —

Step 9: Write Up Results

YOUR TURN - Write a complete 2-3 paragraph results section:

Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance

Your Results Section:

A one-way ANOVA was conducted to examine whether the mean number of bad mental health days differed across physical activity levels (None, Moderate, and Vigorous) from the NHANES dataset. Descriptive statistics indicated a clear pattern of individuals reporting no physical activity had the highest mean number of bad mental health days, followed by moderate activity, and individuals engaging in vigorous physical activity levels had the lowest number of bad mental health days.

ANOVA revealed a statistically significant difference in mean bad mental health days across activity levels, with the F-test result being 23.17. The post-hoc test showed both moderate vs none and vigorous vs none comparisons were statistically significant. This indicated that individuals with any level of physical activity reported fewer bad mental health days than individual with no activity. The difference between vigorous and moderate was not statistically significant.

The effect size (η² = 0.008) indicated that physical activity level explains 0.8% of the variance in bad mental health days, representing a small practical effect. Although the statistical association is clear, physical activity only accounts for a small portion of individual difference in mental health.

Reflection Questions

1. How does the effect size help you understand the practical vs. statistical significance? The effect size allows your to understand the meaning of the difference. A small p value will tell you groups differ and are statistically significant. The effect size will indicate whether the difference is important, or how big of an effect the difference has.

2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?

Checking the assumptions will ensure the F-test and p-value are trusted. ANOVA can give misleading results when normality and homogeneity of variance are violated. Violations weaken the confidence in the conclusion.

3. In public health practice, when might you choose to use ANOVA?

In public health practice, ANOVA can be used to compare the continuous outcome of three or more groups. You can evaluate the dietary intake across income groups, or blood pressure across BMI categories as shown in the example.

4. What was the most challenging part of this lab activity? The most challenging part of the lab was to full understanding what the data showed. Understanding what the different tests show and how they effect the final results is important and will take practice.

Lab 2 ANOVA

Sarah Andres

2026-02-02