# Load necessary libraries
library(tidyverse)   # For data manipulation and visualization
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'tibble' was built under R version 4.5.2
## Warning: package 'tidyr' was built under R version 4.5.2
## Warning: package 'readr' was built under R version 4.5.2
## Warning: package 'purrr' was built under R version 4.5.2
## Warning: package 'dplyr' was built under R version 4.5.2
## Warning: package 'stringr' was built under R version 4.5.2
## Warning: package 'forcats' was built under R version 4.5.2
## Warning: package 'lubridate' was built under R version 4.5.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.0     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)       # For nice tables
## Warning: package 'knitr' was built under R version 4.5.2
library(car)         # For Levene's test
## Warning: package 'car' was built under R version 4.5.2
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.5.2
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some
library(NHANES)      # NHANES dataset
## Warning: package 'NHANES' was built under R version 4.5.2
# Load the NHANES data
data(NHANES)

Practice Problem: Physical Activity and Depression

Research Question: Is there a difference in the number of days with poor mental health across three physical activity levels (None, Moderate, Vigorous)?

Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.


Step 1: Data Preparation

# Prepare the dataset
set.seed(553)

mental_health_data <- NHANES %>%
  filter(Age >= 18) %>%
  filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
  mutate(
    activity_level = case_when(
      PhysActive == "No" ~ "None",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
      TRUE ~ NA_character_
    ),
    activity_level = factor(activity_level, 
                           levels = c("None", "Moderate", "Vigorous"))
  ) %>%
  filter(!is.na(activity_level)) %>%
  select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)

# YOUR TURN: Display the first 6 rows and check sample sizes
head(mental_health_data, n = 6)
## # A tibble: 6 × 6
##      ID   Age Gender DaysMentHlthBad PhysActive activity_level
##   <int> <int> <fct>            <int> <fct>      <fct>         
## 1 51624    34 male                15 No         None          
## 2 51624    34 male                15 No         None          
## 3 51624    34 male                15 No         None          
## 4 51630    49 female              10 No         None          
## 5 51647    45 female               3 Yes        Vigorous      
## 6 51647    45 female               3 Yes        Vigorous
table(mental_health_data$activity_level)
## 
##     None Moderate Vigorous 
##     3139      768     1850

YOUR TURN - Answer these questions:

  • How many people are in each physical activity group?
    • None: 3139
    • Moderate: 768
    • Vigorous: 1850

Step 2: Descriptive Statistics

# YOUR TURN: Calculate summary statistics by activity level
# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max

summary_stats <- mental_health_data %>%
  group_by(activity_level) %>%
  summarise(
    n = n(),
    Mean = mean(DaysMentHlthBad),
    SD = sd(DaysMentHlthBad),
    Median = median(DaysMentHlthBad),
    Min = min(DaysMentHlthBad),
    Max = max(DaysMentHlthBad)
  )

summary_stats %>% 
  kable(digits = 2, 
        caption = "Descriptive Statistics: Bad Mental Health Days by Physical Activity Level")
Descriptive Statistics: Bad Mental Health Days by Physical Activity Level
activity_level n Mean SD Median Min Max
None 3139 5.08 9.01 0 0 30
Moderate 768 3.81 6.87 0 0 30
Vigorous 1850 3.54 7.17 0 0 30

YOUR TURN - Interpret:

  • Which group has the highest mean number of bad mental health days? The “None” physical activity level group has the highest mean number of bad mental health days.
  • Which group has the lowest? The “Vigorous” physical activity level group has the lowest mean number of bad mental health days. —

Step 3: Visualization

# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately
ggplot(mental_health_data,
  aes(x = activity_level, y = DaysMentHlthBad, fill = activity_level)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Bad Mental Health Days by Physical Activity Level",
    subtitle = "NHANES Data, Adults aged 18 and older",
    x = "Physical Activity Level",
    y = "Number of Bad Mental Health Days",
    fill = "Physical Activity Level"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

YOUR TURN - Describe what you see:

  • Do the groups appear to differ? The medians of all of the physical activity level groups seem to be very similar. The boxes all overlap with each other, but the “vigorous” group may have a slightly lower median.
  • Are the variances similar across groups? The variances are similar in the “None” and “Moderate” physical activity groups, but the “vigorous” physical activity group seems to have a much smaller variance because of the shorter box height. —

Step 4: Set Up Hypotheses

YOUR TURN - Write the hypotheses:

Null Hypothesis (H₀): μ_None = μ_Moderate = μ_Vigorous

Alternative Hypothesis (H₁): At least one population mean differs from the others

Significance level: α = 0.05


Step 5: Fit the ANOVA Model

# YOUR TURN: Fit the ANOVA model
# Outcome: DaysMentHlthBad
# Predictor: activity_level
# Fit the one-way ANOVA model
anova_model <- aov(DaysMentHlthBad ~ activity_level, data = mental_health_data)

# Display the ANOVA table
summary(anova_model)
##                  Df Sum Sq Mean Sq F value   Pr(>F)    
## activity_level    2   3109  1554.6   23.17 9.52e-11 ***
## Residuals      5754 386089    67.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

YOUR TURN - Extract and interpret the results:

  • F-statistic: 23.17
  • Degrees of freedom: 2
  • p-value: 9.52e-11
  • Decision (reject or fail to reject H₀): Since p < 0.5, we reject H₀
  • Statistical conclusion in words: There is statistically significant evidence that mean bad mental health days differs across at least two physical activity categories.

Step 6: Post-Hoc Tests

# YOUR TURN: Conduct Tukey HSD test
# Only if your ANOVA p-value < 0.05
# Conduct Tukey HSD test
tukey_results <- TukeyHSD(anova_model)
print(tukey_results)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
## 
## $activity_level
##                         diff       lwr        upr     p adj
## Moderate-None     -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None     -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213  0.5504114 0.7159887
# Visualize the confidence intervals
plot(tukey_results, las = 1)

YOUR TURN - Complete the table:

Comparison Mean Difference 95% CI Lower 95% CI Upper p-value Significant?
Moderate - None -1.2725867 -2.045657 -0.4995169 0.0003386 Yes
Vigorous - None -1.5464873 -2.109345 -0.9836298 0.0000000 Yes
Vigorous - Moderate -0.2739006 -1.098213 0.5504114 0.7159887 No

Interpretation:

Which specific groups differ significantly? The Moderate-None groups and Vigorous-None groups differ significantly from one another. —

Step 7: Calculate Effect Size

# YOUR TURN: Calculate eta-squared
# Hint: Extract Sum Sq from the ANOVA summary
# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]

ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)

# Calculate eta-squared
eta_squared <- ss_treatment / ss_total

cat("Eta-squared (η²):", round(eta_squared, 4), "\n")
## Eta-squared (η²): 0.008
cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%")
## Percentage of variance explained: 0.8 %

YOUR TURN - Interpret:

  • η² = 0.008
  • Percentage of variance explained: 0.8%
  • Effect size classification (small/medium/large): Small
  • What does this mean practically?: This means that only 0.8% of the variance in the number bad mental health days can be explained by physical activity level. While statistically significant, physical activity alone doesn’t explain most of the variation in the number of bad mental health days.

Step 8: Check Assumptions

# YOUR TURN: Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model)

par(mfrow = c(1, 1))

YOUR TURN - Evaluate each plot:

  1. Residuals vs Fitted: The points don’t necessarily show a random pattern, and they are not equally distributed above and below 0. This means that the independence assumption may not hold, and that the association between the predictor and outcome may not be linear.

  2. Q-Q Plot: The points only follow the line about half way, meaning there are departures from normalcy.

  3. Scale-Location: The horizontal red line is approximately horizontal, meaning there is homogeneity of variance.

  4. Residuals vs Leverage: There are points beyond Cook’s distance line, meaning there are highly influential outliers.

# YOUR TURN: Conduct Levene's test
# Levene's test for homogeneity of variance
levene_test <- leveneTest(DaysMentHlthBad ~ activity_level, data = mental_health_data)
print(levene_test)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)    
## group    2  23.168 9.517e-11 ***
##       5754                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

YOUR TURN - Overall assessment:

  • Are assumptions reasonably met? Not all of the assumptions are reasonably met. The independence and normality assumptions are not reasonably met.
  • Do any violations threaten your conclusions? The independence and normality assumption violations could threaten the conclusions, because they don’t seem to be minor, and while the sample size is fairly large, the group sample sizes are not very equal. —

Step 9: Write Up Results

YOUR TURN - Write a complete 2-3 paragraph results section:

Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance

Your Results Section:

We conducted a one-way ANOVA to examine whether mean number of bad mental health days differs across physical activity level categories (None(3139), Moderate(768), Vigorous(185)) among 5,757 adults aged 18+ from NHANES. Descriptive statistics showed mean number of bad mental health days of 5.08 (SD = 9.01) for None, 3.81 (SD = 6.87) for Moderate, and 3.54 (SD = 7.17) for Vigorous physical activity level individuals.

The ANOVA revealed a statistically significant difference in mean number of bad mental health days across physical activity level categories, F(2, 5757) = 23.17, p < 0.001. Tukey’s HSD post-hoc tests indicated that the Moderate-None and Vigorous-None pairwise comparisons were significant (p < 0.05): those with Moderate physical activity had an average of 1.27 less bad mental health days than those with none (95% CI [-2.045657,-0.4995169]), and those with vigorous physical activity had an average of 1.55 less bad mental health days than those with none (95% CI [-2.109345,-0.9836298]). The Vigorous-Moderate pairwise comparison was not statistically significant (p = 0.760)(95% CI [-1.098213,0.5504114]), suggesting that the benefit of moderate activity is similar to the benefit of vigorous activity when it comes to decreasing the number of bad mental health days.

The effect size (η² = 0.008) indicates that physical activity level explains 0.8% of the variance in number of bad mental health days, representing a small practical effect. These findings support the relationship between higher physical activity level and a lower number of bad mental health days, though other factors account for most of the variation in the number of bad mental health days. This is important for public health, because the conclusion is suggesting that physical activity can improve mental health, which can be communicated to the public. The tests for normality and independence, were however, violated, and because of a large difference in the sample size in each group, there is a possibility that these conclusions are threatened by these violations.


Reflection Questions

1. How does the effect size help you understand the practical vs. statistical significance?

The effect size helps to show that even if results are statistically significant, it doesn’t mean that they are also practically significant. Statistically significance only means so much if the practical effects are not large. Even though the results were statistically significant, the effect size was very small, meaning it may not play that large of a role in explaining results in a real-life situation.

2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?

It is important to check ANOVA assumptions, because if they are violated, it may threaten the conclusion received of the ANOVA test. The statistical test will only give meaningful conclusions if the assumptions are met, or have only very minor deviations.

3. In public health practice, when might you choose to use ANOVA?

You would choose to use ANOVA in public health when you are comparing the means of a continuous variable across 3 or more categories. The data has to follow the assumptions of normalcy, independence, and homogeneity of variances in order for the conclusion to not be threatened. In addition, one-way ANOVA can be used to decrease the chance of a type I error that may occur when you are doing multiple comparisons. This is important for public health, because a lot of the data collected can be made categorical, and many continuous variables can then be compared across these categories. Some examples may be comparing mean heart rate across different activity levels or comparing the number of blood cells across different age categories.

4. What was the most challenging part of this lab activity?

The most challenging part of this lab was understanding the different diagnostic plots and interpreting them.