Lab Overview

Time: ~30 minutes

Goal: Practice one-way ANOVA analysis from start to finish using real public health data

Learning Objectives:

Understand when and why to use ANOVA instead of multiple t-tests
Set up hypotheses for ANOVA
Conduct and interpret the F-test
Perform post-hoc tests when appropriate
Check ANOVA assumptions
Calculate and interpret effect size (η²)

Structure:

Part A: Guided Example (follow along)
Part B: Your Turn (independent practice)

Submission: Upload your completed .Rmd file and the published HTML output to Brightspace by the end of class.

PART A: GUIDED EXAMPLE

Example: Blood Pressure and BMI Categories

Research Question: Is there a difference in mean systolic blood pressure (SBP) across three BMI categories (Normal weight, Overweight, Obese)?

Why ANOVA? We have one continuous outcome (SBP) and one categorical predictor with THREE groups (BMI category). Using multiple t-tests would inflate our Type I error rate.
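To make the inflation concrete, here is a minimal sketch of the arithmetic. It assumes the pairwise tests are independent, which is a simplification (pairwise t-tests on the same three groups share data), so treat the result as an approximation rather than an exact rate.

```r
# Family-wise error rate for all pairwise t-tests among k groups,
# assuming (for simplicity) that the tests are independent.
alpha   <- 0.05                     # per-test significance level
k       <- 3                        # number of BMI categories
n_tests <- choose(k, 2)             # number of pairwise comparisons
fwer    <- 1 - (1 - alpha)^n_tests  # P(at least one false positive)

cat("Pairwise comparisons:", n_tests, "\n")
cat("Approximate family-wise error rate:", round(fwer, 3), "\n")
```

With three groups there are three pairwise comparisons, so the chance of at least one false positive is roughly 14% rather than the nominal 5%.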

Step 1: Setup and Data Preparation

# Load necessary libraries
library(tidyverse)   # For data manipulation and visualization
library(knitr)       # For nice tables
library(car)         # For Levene's test
library(NHANES)      # NHANES dataset

# Load the NHANES data
data(NHANES)

Create analysis dataset:

# Set seed for reproducibility
set.seed(553)

# Create BMI categories and prepare data
bp_bmi_data <- NHANES %>%
  filter(Age >= 18 & Age <= 65) %>%  # Adults 18-65
  filter(!is.na(BPSysAve) & !is.na(BMI)) %>%
  mutate(
    bmi_category = case_when(
      BMI < 25 ~ "Normal",
      BMI >= 25 & BMI < 30 ~ "Overweight",
      BMI >= 30 ~ "Obese",
      TRUE ~ NA_character_
    ),
    bmi_category = factor(bmi_category, 
                         levels = c("Normal", "Overweight", "Obese"))
  ) %>%
  filter(!is.na(bmi_category)) %>%
  select(ID, Age, Gender, BPSysAve, BMI, bmi_category)

# Display first few rows
head(bp_bmi_data) %>% 
  kable(caption = "Blood Pressure and BMI Dataset (first 6 rows)")

Blood Pressure and BMI Dataset (first 6 rows)
ID	Age	Gender	BPSysAve	BMI	bmi_category
51624	34	male	113	32.22	Obese
51624	34	male	113	32.22	Obese
51624	34	male	113	32.22	Obese
51630	49	female	112	30.57	Obese
51647	45	female	118	27.24	Overweight
51647	45	female	118	27.24	Overweight

# Check sample sizes
table(bp_bmi_data$bmi_category)

## 
##     Normal Overweight      Obese 
##       1939       1937       2150

Interpretation: We have 6026 adults with complete BP and BMI data across three BMI categories.
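The BMI cut-points used in Step 1 can be checked on a few boundary values. The sketch below uses base R's cut() instead of case_when(); the BMI values are hypothetical and chosen only to sit on and around the thresholds. Note that, as in the lab code, everything below 25 (including underweight) is labeled "Normal".

```r
# Illustrative BMI values (hypothetical), on and around the cut-points
bmi_values <- c(22.0, 24.9, 25.0, 29.9, 30.0, 41.5)

# right = FALSE makes each interval closed on the left: [25, 30) is Overweight
bmi_group <- cut(
  bmi_values,
  breaks = c(-Inf, 25, 30, Inf),
  labels = c("Normal", "Overweight", "Obese"),
  right  = FALSE
)

data.frame(BMI = bmi_values, bmi_category = bmi_group)
```

A BMI of exactly 25 is classified as Overweight and exactly 30 as Obese, matching the `>=` conditions in the case_when() rules.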

Step 2: Descriptive Statistics

# Calculate summary statistics by BMI category
summary_stats <- bp_bmi_data %>%
  group_by(bmi_category) %>%
  summarise(
    n = n(),
    Mean = mean(BPSysAve),
    SD = sd(BPSysAve),
    Median = median(BPSysAve),
    Min = min(BPSysAve),
    Max = max(BPSysAve)
  )

summary_stats %>% 
  kable(digits = 2, 
        caption = "Descriptive Statistics: Systolic BP by BMI Category")

Descriptive Statistics: Systolic BP by BMI Category
bmi_category	n	Mean	SD	Median	Min	Max
Normal	1939	114.23	15.01	113	78	221
Overweight	1937	118.74	13.86	117	83	186
Obese	2150	121.62	15.27	120	82	226

Observation: The mean SBP appears to increase from Normal (114.2) to Overweight (118.7) to Obese (121.6).

But is this difference statistically significant?

Step 3: Visualize the Data

# Create boxplots with individual points
ggplot(bp_bmi_data, 
  aes(x = bmi_category, y = BPSysAve, fill = bmi_category)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Systolic Blood Pressure by BMI Category",
    subtitle = "NHANES Data, Adults aged 18-65",
    x = "BMI Category",
    y = "Systolic Blood Pressure (mmHg)",
    fill = "BMI Category"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

What the plot tells us:

There appears to be a trend: higher BMI categories have higher median SBP
The boxes overlap, but the obese group appears shifted upward
Variability (box heights) looks similar across groups

Step 4: Set Up Hypotheses

Null Hypothesis (H₀): μ_Normal = μ_Overweight = μ_Obese
(All three population means are equal)

Alternative Hypothesis (H₁): At least one population mean differs from the others

Significance level: α = 0.05

Step 5: Fit the ANOVA Model

# Fit the one-way ANOVA model
anova_model <- aov(BPSysAve ~ bmi_category, data = bp_bmi_data)

# Display the ANOVA table
summary(anova_model)

##                Df  Sum Sq Mean Sq F value Pr(>F)    
## bmi_category    2   56212   28106   129.2 <2e-16 ***
## Residuals    6023 1309859     217                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation:

F-statistic: 129.24
Degrees of freedom: df₁ = 2 (k − 1, with k = 3 groups), df₂ = 6023 (N − k, with N = 6026)
p-value: < 2e-16 (very small)
Decision: Since p < 0.05, we reject H₀
Conclusion: There is statistically significant evidence that mean systolic BP differs across at least two BMI categories.
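As a check on where the F-statistic comes from, the sketch below rebuilds it from the sums of squares and degrees of freedom printed in the ANOVA table. The inputs are the rounded values from that table, so the result may differ from R's internal value in the later decimal places.

```r
# Values copied from the ANOVA table above
ss_between <- 56212      # Sum Sq for bmi_category
ss_within  <- 1309859    # Sum Sq for Residuals
df_between <- 2          # k - 1
df_within  <- 6023       # N - k

# Mean squares and their ratio
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within
f_stat     <- ms_between / ms_within

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

cat("F =", round(f_stat, 2), "\n")
cat("p < 2e-16:", p_value < 2e-16, "\n")
```

The F-statistic is the ratio of between-group to within-group mean squares; a large ratio means the group means vary more than the within-group noise would suggest.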

Step 6: Post-Hoc Tests (Tukey HSD)

Why do we need this? The F-test tells us that groups differ, but not which groups differ. Tukey’s Honest Significant Difference controls the family-wise error rate for multiple pairwise comparisons.

# Conduct Tukey HSD test
tukey_results <- TukeyHSD(anova_model)
print(tukey_results)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = BPSysAve ~ bmi_category, data = bp_bmi_data)
## 
## $bmi_category
##                       diff      lwr      upr p adj
## Overweight-Normal 4.507724 3.397134 5.618314     0
## Obese-Normal      7.391744 6.309024 8.474464     0
## Obese-Overweight  2.884019 1.801006 3.967033     0

# Visualize the confidence intervals
plot(tukey_results, las = 0)

Interpretation:

Comparison	Mean Diff	95% CI	p-value	Significant?
Overweight - Normal	4.51	[3.4, 5.62]	3.82e-12	Yes
Obese - Normal	7.39	[6.31, 8.47]	< 0.001	Yes
Obese - Overweight	2.88	[1.8, 3.97]	1.38e-09	Yes

Conclusion: All three pairwise comparisons are statistically significant. Obese adults have higher SBP than overweight adults, who in turn have higher SBP than normal-weight adults.
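For readers curious how these intervals are formed, the following sketch rebuilds the Obese - Normal interval by hand using the Tukey-Kramer margin. It relies only on numbers printed earlier (residual sum of squares, residual df, group sizes, and the mean difference), and should land close to the interval reported by TukeyHSD().

```r
# Values copied from the outputs above
ms_within <- 1309859 / 6023   # residual mean square
df_within <- 6023
n_normal  <- 1939
n_obese   <- 2150
diff_obese_normal <- 7.391744

# Tukey-Kramer margin: studentized-range quantile / sqrt(2) times the SE of the difference
q_crit  <- qtukey(0.95, nmeans = 3, df = df_within)
se_diff <- sqrt(ms_within * (1 / n_normal + 1 / n_obese))
margin  <- q_crit / sqrt(2) * se_diff

round(c(lwr = diff_obese_normal - margin, upr = diff_obese_normal + margin), 2)
```

The critical value comes from the studentized range distribution for three means rather than from the t distribution, which is how the procedure keeps the family-wise confidence level at 95%.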

Step 7: Calculate Effect Size

# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]

ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)

# Calculate eta-squared
eta_squared <- ss_treatment / ss_total

cat("Eta-squared (η²):", round(eta_squared, 4), "\n")

## Eta-squared (η²): 0.0411

cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%")

## Percentage of variance explained: 4.11 %

Interpretation: BMI category explains 4.11% of the variance in systolic BP.

Effect size guidelines: Small (0.01), Medium (0.06), Large (0.14)
Our effect: Small

While statistically significant, the practical effect is modest—BMI category alone doesn’t explain most of the variation in blood pressure.
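When only the reported F-statistic and its degrees of freedom are available (for example, from a published paper), η² for a one-way ANOVA can be recovered algebraically. The sketch below uses the rounded values reported in Step 5, so it is an approximation of the value computed from the sums of squares.

```r
# Values as reported in the ANOVA table
f_stat     <- 129.24
df_between <- 2
df_within  <- 6023

# Algebraically identical to SS_between / SS_total in a one-way ANOVA
eta_sq_from_f <- (f_stat * df_between) / (f_stat * df_between + df_within)
round(eta_sq_from_f, 4)
```

This follows from F = (SS_between / df_between) / (SS_within / df_within), which gives SS_between / SS_within = F × df_between / df_within.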

Step 8: Check Assumptions

ANOVA Assumptions:

Independence: Observations are independent (assumed based on study design)
Normality: Residuals are approximately normally distributed
Homogeneity of variance: Equal variances across groups

# Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model)

par(mfrow = c(1, 1))

Diagnostic Plot Interpretation:

Residuals vs Fitted: Points show random scatter around zero with no clear pattern → Good!
Q-Q Plot: Points follow the diagonal line reasonably well → Normality assumption is reasonable
Scale-Location: Red line is relatively flat → Equal variance assumption is reasonable
Residuals vs Leverage: No points beyond Cook’s distance lines → No highly influential outliers

# Levene's test for homogeneity of variance
levene_test <- leveneTest(BPSysAve ~ bmi_category, data = bp_bmi_data)
print(levene_test)

## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value  Pr(>F)  
## group    2  2.7615 0.06328 .
##       6023                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Levene’s Test Interpretation:

p-value: 0.0633
If p < 0.05, we would reject equal variances
Here: p = 0.063 > 0.05, so we fail to reject equal variances and the assumption is reasonable

Overall Assessment: With roughly 2,000 observations per group, ANOVA is robust to minor violations. Our assumptions are reasonably satisfied.

Step 9: Report Results

Example Results Section:

We conducted a one-way ANOVA to examine whether mean systolic blood pressure (SBP) differs across BMI categories (Normal, Overweight, Obese) among 6,026 adults aged 18-65 from NHANES. Descriptive statistics showed mean SBP of 114.2 mmHg (SD = 15.0) for normal weight, 118.7 mmHg (SD = 13.9) for overweight, and 121.6 mmHg (SD = 15.3) for obese individuals.

The ANOVA revealed a statistically significant difference in mean SBP across BMI categories, F(2, 6023) = 129.24, p < 0.001. Tukey’s HSD post-hoc tests indicated that all pairwise comparisons were significant (p < 0.05): obese adults had on average 7.4 mmHg higher SBP than normal-weight adults, and 2.9 mmHg higher than overweight adults.

The effect size (η² = 0.041) indicates that BMI category explains 4.1% of the variance in systolic blood pressure, representing a small practical effect. These findings support the well-established relationship between higher BMI and elevated blood pressure, though other factors account for most of the variation in SBP.

PART B: YOUR TURN - INDEPENDENT PRACTICE

Practice Problem: Physical Activity and Depression

Research Question: Is there a difference in the number of days with poor mental health across three physical activity levels (None, Moderate, Vigorous)?

Your Task: Complete the same 9-step analysis workflow you just practiced, but now on a different outcome and predictor.

Step 1: Data Preparation

# Load necessary libraries
library(tidyverse)   # For data manipulation and visualization
library(knitr)       # For nice tables
library(car)         # For Levene's test
library(NHANES)      # NHANES dataset

# Load the NHANES data
data(NHANES)
# Prepare the dataset
set.seed(553)

mental_health_data <- NHANES %>%
  filter(Age >= 18) %>%
  filter(!is.na(DaysMentHlthBad) & !is.na(PhysActive)) %>%
  mutate(
    activity_level = case_when(
      PhysActive == "No" ~ "None",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays < 3 ~ "Moderate",
      PhysActive == "Yes" & !is.na(PhysActiveDays) & PhysActiveDays >= 3 ~ "Vigorous",
      TRUE ~ NA_character_
    ),
    activity_level = factor(activity_level, 
                           levels = c("None", "Moderate", "Vigorous"))
  ) %>%
  filter(!is.na(activity_level)) %>%
  select(ID, Age, Gender, DaysMentHlthBad, PhysActive, activity_level)

# YOUR TURN: Display the first 6 rows and check sample sizes
head(mental_health_data) %>% 
  kable(caption = "Activity Level and Mental Health (first 6 rows)")

Activity Level and Mental Health (first 6 rows)
ID	Age	Gender	DaysMentHlthBad	PhysActive	activity_level
51624	34	male	15	No	None
51624	34	male	15	No	None
51624	34	male	15	No	None
51630	49	female	10	No	None
51647	45	female	3	Yes	Vigorous
51647	45	female	3	Yes	Vigorous

# Check sample sizes
table(mental_health_data$activity_level)

## 
##     None Moderate Vigorous 
##     3139      768     1850

YOUR TURN - Answer these questions:

How many people are in each physical activity group?
- None: 3139
- Moderate: 768
- Vigorous: 1850

Step 2: Descriptive Statistics

# YOUR TURN: Calculate summary statistics by activity level
# Hint: Follow the same structure as the guided example
# Variables to summarize: n, Mean, SD, Median, Min, Max
# Calculate summary statistics by activity level
summary_stats <- mental_health_data %>%
  group_by(activity_level) %>%
  summarise(
    n = n(),
    Mean = mean(DaysMentHlthBad),
    SD = sd(DaysMentHlthBad),
    Median = median(DaysMentHlthBad),
    Min = min(DaysMentHlthBad),
    Max = max(DaysMentHlthBad)
  )

summary_stats %>% 
  kable(digits = 2, 
        caption = "Descriptive Statistics: Mental Health Levels by Physical Activity")

Descriptive Statistics: Mental Health Levels by Physical Activity
activity_level	n	Mean	SD	Max
None	3139	5.08	9.01	30
Moderate	768	3.81	6.87	30
Vigorous	1850	3.54	7.17	30

YOUR TURN - Interpret:

Which group has the highest mean number of bad mental health days? None
Which group has the lowest? Vigorous

Step 3: Visualization

# YOUR TURN: Create boxplots comparing DaysMentHlthBad across activity levels
# Hint: Use the same ggplot code structure as the example
# Change variable names and labels appropriately

# Create boxplots with individual points
ggplot(mental_health_data, 
  aes(x = activity_level, y = DaysMentHlthBad, fill = activity_level)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, height = 0.2, alpha = 0.08, size = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Mental Health Levels by Physical Activity",
    subtitle = "NHANES Data, Adults aged 18 and older",
    x = "Physical Activity Levels",
    y = "Number of Days with Poor Mental Health",
    fill = "Activity Level"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

YOUR TURN - Describe what you see:

Do the groups appear to differ? The None group appears to have the most poor mental health days and the Vigorous group the fewest, with the Moderate group in between. The boxes also overlap.
Are the variances similar across groups? The standard deviations are broadly similar for the Moderate (6.87) and Vigorous (7.17) groups, but the None group is more variable (SD = 9.01), so variability is not fully comparable across activity levels.

Step 4: Set Up Hypotheses

YOUR TURN - Write the hypotheses:

Null Hypothesis (H₀): μ_None = μ_Moderate = μ_Vigorous

Alternative Hypothesis (H₁): At least one population mean differs from the others

Significance level: α = 0.05

Step 5: Fit the ANOVA Model

# YOUR TURN: Fit the ANOVA model
# Outcome: DaysMentHlthBad
# Predictor: activity_level
# Fit the one-way ANOVA model
anova_model <- aov(DaysMentHlthBad ~ activity_level, data = mental_health_data)

# Display the ANOVA table
summary(anova_model)

##                  Df Sum Sq Mean Sq F value   Pr(>F)    
## activity_level    2   3109  1554.6   23.17 9.52e-11 ***
## Residuals      5754 386089    67.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

YOUR TURN - Extract and interpret the results:

F-statistic: 23.17
Degrees of freedom: 2 (between groups), 5754 (residuals)
p-value: 9.52e-11
Decision (reject or fail to reject H₀): Reject H₀, since p < 0.05
Statistical conclusion in words: There is statistically significant evidence that the mean number of poor mental health days differs across at least two physical activity categories.

Step 6: Post-Hoc Tests

# YOUR TURN: Conduct Tukey HSD test
# Only if your ANOVA p-value < 0.05
# Conduct Tukey HSD test
tukey_results <- TukeyHSD(anova_model)
print(tukey_results)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = DaysMentHlthBad ~ activity_level, data = mental_health_data)
## 
## $activity_level
##                         diff       lwr        upr     p adj
## Moderate-None     -1.2725867 -2.045657 -0.4995169 0.0003386
## Vigorous-None     -1.5464873 -2.109345 -0.9836298 0.0000000
## Vigorous-Moderate -0.2739006 -1.098213  0.5504114 0.7159887

# Visualize the confidence intervals
plot(tukey_results, las = 1)

YOUR TURN - Complete the table:

Comparison	Mean Difference	95% CI Lower	95% CI Upper	p-value	Significant?
Moderate - None	-1.27	-2.05	-0.50	0.0003	Yes
Vigorous - None	-1.55	-2.11	-0.98	< 0.0001	Yes
Vigorous - Moderate	-0.27	-1.10	0.55	0.716	No

Interpretation:

Which specific groups differ significantly? Both the Moderate and Vigorous groups differ significantly from the None group (Moderate - None: p = 0.0003; Vigorous - None: p < 0.0001). The Moderate and Vigorous groups do not differ significantly from each other (p = 0.716).

Step 7: Calculate Effect Size

# YOUR TURN: Calculate eta-squared
# Hint: Extract Sum Sq from the ANOVA summary
# Extract sum of squares from ANOVA table
anova_summary <- summary(anova_model)[[1]]

ss_treatment <- anova_summary$`Sum Sq`[1]
ss_total <- sum(anova_summary$`Sum Sq`)

# Calculate eta-squared
eta_squared <- ss_treatment / ss_total

cat("Eta-squared (η²):", round(eta_squared, 4), "\n")

## Eta-squared (η²): 0.008

cat("Percentage of variance explained:", round(eta_squared * 100, 2), "%")

## Percentage of variance explained: 0.8 %

YOUR TURN - Interpret:

η² = 0.008
Percentage of variance explained: 0.8 %
Effect size classification (small/medium/large): Small. At 0.008, η² falls just below the conventional 0.01 benchmark for a small effect (benchmarks: small = 0.01, medium = 0.06, large = 0.14).
What does this mean practically? Although the result is statistically significant, the practical effect is very small: activity level explains only 0.8% of the variability in poor mental health days, so most of the variation is due to other factors.

Step 8: Check Assumptions

# YOUR TURN: Create diagnostic plots
# Create diagnostic plots
par(mfrow = c(2, 2))
plot(anova_model)

par(mfrow = c(1, 1))

YOUR TURN - Evaluate each plot:

Residuals vs Fitted: The residuals are centered around 0 and there’s no strong systematic pattern, showing no violation of equal variance.
Q-Q Plot: There is a strong S-shape and deviation from the line, so the residuals aren’t normal and the normality assumption is violated.
Scale-Location: The red line is relatively flat with a slight upward trend, suggesting that the equal variance assumption is reasonably met.
Residuals vs Leverage: There are no extreme leverage points, so there are no highly influential outliers.

# YOUR TURN: Conduct Levene's test
# Levene's test for homogeneity of variance
levene_test <- leveneTest(DaysMentHlthBad ~ activity_level, data = mental_health_data)
print(levene_test)

## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)    
## group    2  23.168 9.517e-11 ***
##       5754                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
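The Levene F value and p-value above match the ANOVA table in Step 5 (23.17 and 9.52e-11). One possible explanation, not verified against the NHANES data here, is that every group median is 0. The median-centered Levene test is a one-way ANOVA on the absolute deviations from each group's median, so for a non-negative outcome with all medians at 0 those deviations equal the outcome itself. The sketch below demonstrates this on hypothetical toy data using base R only.

```r
# Toy data (hypothetical): non-negative counts where every group median is 0
toy <- data.frame(
  days  = c(0, 0, 0, 0, 0, 0, 1, 5, 10, 30,
            0, 0, 0, 0, 0, 0, 0, 2, 3, 4,
            0, 0, 0, 0, 0, 0, 0, 0, 1, 2),
  group = factor(rep(c("None", "Moderate", "Vigorous"), each = 10),
                 levels = c("None", "Moderate", "Vigorous"))
)

# Median-centered Levene test by hand: ANOVA on |y - group median|
toy$abs_dev <- abs(toy$days - ave(toy$days, toy$group, FUN = median))

f_levene <- anova(lm(abs_dev ~ group, data = toy))[["F value"]][1]
f_anova  <- anova(lm(days ~ group, data = toy))[["F value"]][1]

tapply(toy$days, toy$group, median)   # all group medians are 0
all.equal(f_levene, f_anova)          # so the two F statistics coincide
```

If this is what happened in the lab data, the Levene result is still a valid test of spread, but it carries the same information as the ANOVA F-test rather than an independent check.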

YOUR TURN - Overall assessment:

Are assumptions reasonably met?
Do any violations threaten your conclusions?

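Levene’s test indicates unequal variances (F = 23.17, p < 0.001), and the Q-Q plot shows that the residuals are not normally distributed. The large sample size makes the F-test fairly robust to non-normality, but the group sizes are unequal (3,139, 768, and 1,850), so the unequal variances are not a trivial violation. The assumptions are only partly satisfied, and the results should be interpreted with some caution.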
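Because Levene's test rejects equal variances here, one possible sensitivity check (not part of the lab's required workflow) is Welch's one-way test, which does not assume equal variances. The sketch below uses base R's oneway.test() on the built-in PlantGrowth dataset purely as a stand-in with three groups; for the lab data, the formula would be DaysMentHlthBad ~ activity_level with data = mental_health_data.

```r
# PlantGrowth is a small built-in dataset with three groups, used only as a stand-in
classic <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = TRUE)
welch   <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = FALSE)

# Compare F statistics, denominator df, and p-values side by side
data.frame(
  test     = c("Classic ANOVA", "Welch"),
  F        = unname(c(classic$statistic, welch$statistic)),
  denom_df = c(classic$parameter[["denom df"]], welch$parameter[["denom df"]]),
  p_value  = c(classic$p.value, welch$p.value)
)
```

If the classic and Welch results lead to the same decision, the conclusion is less likely to be an artifact of unequal variances.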

Step 9: Write Up Results

YOUR TURN - Write a complete 2-3 paragraph results section:

Include: 1. Sample description and descriptive statistics 2. F-test results 3. Post-hoc comparisons (if applicable) 4. Effect size interpretation 5. Public health significance

Your Results Section:

We conducted a one-way ANOVA to determine whether the mean number of poor mental health days differs across physical activity levels (None, Moderate, Vigorous), using NHANES data from 5,757 adults aged 18 and older. Descriptive statistics showed a mean of 5.08 days (SD = 9.01) for individuals reporting no physical activity, 3.81 days (SD = 6.87) for those reporting moderate activity, and 3.54 days (SD = 7.17) for those reporting vigorous activity. Overall, individuals who engaged in more physical activity reported fewer poor mental health days on average.

The ANOVA revealed a statistically significant difference in mean poor mental health days across activity levels, F(2, 5754) = 23.17, p < 0.001. Tukey’s HSD post-hoc tests indicated that both the moderate and vigorous activity groups reported significantly fewer poor mental health days than the no-activity group (mean differences of -1.27 days, 95% CI [-2.05, -0.50], and -1.55 days, 95% CI [-2.11, -0.98], respectively; p < 0.001 for both). There was no significant difference between the moderate and vigorous activity groups (p = 0.716).

The effect size (η² = 0.008) indicates that physical activity level explains approximately 0.8% of the variance in poor mental health days, a very small practical effect. Although the result is statistically significant, the association is weak, suggesting that many other factors contribute to variation in poor mental health days. From a public health perspective, even a small effect can matter at the population level: increasing physical activity might still reduce the overall burden of poor mental health in communities.

Reflection Questions

1. How does the effect size help you understand the practical vs. statistical significance?

Effect size helps distinguish practical from statistical significance: a result can be statistically significant (a small p-value) yet have little practical importance if the effect size is small, especially in large samples. The effect size shows how large the effect is in real-world terms.

2. Why is it important to check ANOVA assumptions? What might happen if they’re violated?

It is important to check ANOVA assumptions because the p-value from the F-test is only trustworthy when they hold. If they are violated, the Type I error rate can be inflated and the conclusions can be wrong.

3. In public health practice, when might you choose to use ANOVA?

In public health practice, ANOVA is appropriate when researchers want to compare the means of a continuous outcome across three or more groups. For example, in this lab we compared the mean number of poor mental health days across three physical activity levels. Using a single ANOVA also avoids the inflated Type I error rate that comes from running multiple t-tests.

4. What was the most challenging part of this lab activity?

The most challenging part of this lab activity was fixing syntax errors. For example, RStudio did not recognize the token ‘Sq’, even though I was using the code from the guided example. Other than that, the lab was straightforward to follow, since it mostly required changing variable names to match the data being used.

Submission Checklist

Before submitting, verify you have:

Completed all code chunks in Part B
Filled in all interpretation questions
Completed the Tukey HSD comparison table
Written a complete results section
Answered all reflection questions
Rendered your .Rmd file to HTML without errors
Checked that all outputs display correctly

To submit: Upload both your .Rmd file and the HTML output to Brightspace.

Lab completed on: February 05, 2026

GRADING RUBRIC (For TA Use)

Total Points: 15

Category	Criteria	Points	Notes
Code Execution	All code chunks run without errors	4	- Deduct 1 pt per major error - Deduct 0.5 pt per minor warning
Completion	All “YOUR TURN” sections attempted	4	- Part B Steps 1-9 completed - All fill-in-the-blank answered - Tukey table filled in
Interpretation	Correct statistical interpretation	4	- Hypotheses correctly stated (1 pt) - ANOVA results interpreted (1 pt) - Post-hoc results interpreted (1 pt) - Assumptions evaluated (1 pt)
Results Section	Professional, complete write-up	3	- Includes descriptive stats (1 pt) - Reports F-test & post-hoc (1 pt) - Effect size & significance (1 pt)

Detailed Grading Guidelines

Code Execution (4 points):

4 pts: All code runs perfectly, produces correct output
3 pts: Minor issues (1-2 small errors or warnings)
2 pts: Several errors but demonstrates understanding
1 pt: Major errors, incomplete code
0 pts: Code does not run at all

Completion (4 points):

4 pts: All sections attempted thoughtfully
3 pts: 1-2 sections incomplete or minimal effort
2 pts: Several sections missing
1 pt: Only partial completion
0 pts: Little to no work completed

Interpretation (4 points):

4 pts: All interpretations correct and well-explained
3 pts: Minor errors in interpretation
2 pts: Several interpretation errors
1 pt: Significant misunderstanding of concepts
0 pts: No interpretation provided

Results Section (3 points):

3 pts: Publication-quality, complete results section
2 pts: Good but missing some elements
1 pt: Incomplete or poorly written
0 pts: No results section written

Common Deductions

-0.5 pts: Missing sample sizes in write-up
-0.5 pts: Not reporting confidence intervals
-1 pt: Incorrect hypothesis statements
-1 pt: Misinterpreting p-values
-1 pt: Not checking assumptions
-0.5 pts: Poor formatting (no tables, unclear output)

In-Class Lab Activity: Analysis of Variance (ANOVA)

EPI 553: Principles of Statistical Inference II

Muntasir Masum

2026-02-05
