---
title: "Multi-Cycle Hemoglobin Analysis by Sex (1999–2018)"
author: "Dr. Simon Aseno, MPH"
output:
  html_document:
    toc: true
    toc_depth: 2
    number_sections: true
    theme: flatly
    highlight: tango
---
## 1. Overview
This report analyzes trends in **hemoglobin (g/dL)** by sex in the United States, using data from the [National Health and Nutrition Examination Survey (NHANES)](https://wwwn.cdc.gov/nchs/nhanes/) for survey cycles from 1999 to 2018.
We use a cleaned dataset that merges the complete blood count (CBC) and demographics (DEMO) files, keeping only complete cases for hemoglobin (`LBXHGB`) and sex (`RIAGENDR`).
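
The merge-and-filter step described here can be sketched as follows. This is a minimal illustration, not the script that produced the cleaned file: the toy values are made up, the join key `SEQN` (the NHANES respondent sequence number) is an assumption because the report does not name the key, and the `Cycle` label is hypothetical. Base R is used so the sketch runs without extra packages.

``` r
# Toy stand-ins for one cycle's CBC and DEMO files (values are made up).
cbc <- data.frame(
  SEQN   = c(1, 2, 3, 4),
  LBXHGB = c(13.2, NA, 15.1, 14.0)
)
demo <- data.frame(
  SEQN     = c(1, 2, 3, 5),
  RIAGENDR = c(2, 1, 1, 2)
)

# Inner join on the respondent ID (assumed key): keeps only
# participants that appear in both files.
merged <- merge(cbc, demo, by = "SEQN")

# Keep complete cases for hemoglobin and sex only.
cleaned <- merged[complete.cases(merged[, c("LBXHGB", "RIAGENDR")]), ]

# Hypothetical cycle label, added before stacking cycles together.
cleaned$Cycle <- "1999_2000"

print(cleaned)
```

Repeating this per cycle and stacking the results with `rbind()` would give one long table with a `Cycle` column, which is the shape the later summaries expect.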
---
## 2. Dataset and License
**Dataset**: [Download CSV](https://example.com/hgb_by_sex_cleaned_1999_2018.csv)
**README**: [Download Instructions](https://example.com/README_HGB_Analysis.txt)
**License**: [MIT License](https://opensource.org/licenses/MIT)
---
## 3. Data Preparation
``` r
# Packages used for wrangling and plotting
library(dplyr)
library(stringr)
library(ggplot2)

# Load the cleaned dataset (saved locally as .rds)
hgb_data <- readRDS("hgb_by_sex_cleaned_1999_2018.rds")

# Label sex once so that both summaries use the same coding,
# and turn the cycle years (e.g. "1999_2000") into a display label
hgb_labeled <- hgb_data %>%
  mutate(
    Sex = case_when(
      RIAGENDR == 1 ~ "Male",
      RIAGENDR == 2 ~ "Female",
      TRUE ~ NA_character_
    ),
    Year_Label = str_extract(Cycle, "\\d{4}_\\d{4}"),
    Year_Label = str_replace(Year_Label, "_", "–")
  )

# Mean, SD, and sample size per cycle and sex
hgb_trend <- hgb_labeled %>%
  group_by(Cycle, Year_Label, Sex) %>%
  summarise(
    mean_hgb = mean(LBXHGB, na.rm = TRUE),
    sd_hgb   = sd(LBXHGB, na.rm = TRUE),
    n        = n(),
    .groups  = "drop"
  )
```

``` r
# Pooled summary across all cycles
summary_stats <- hgb_labeled %>%
  group_by(Sex) %>%
  summarise(
    Mean_HGB = round(mean(LBXHGB, na.rm = TRUE), 2),
    SD_HGB   = round(sd(LBXHGB, na.rm = TRUE), 2),
    Min_HGB  = round(min(LBXHGB, na.rm = TRUE), 2),
    Max_HGB  = round(max(LBXHGB, na.rm = TRUE), 2),
    N        = n()
  )

knitr::kable(summary_stats, caption = "Summary of Hemoglobin (g/dL) by Sex (1999–2018)")
```

| Sex | Mean_HGB | SD_HGB | Min_HGB | Max_HGB | N |
|---|---|---|---|---|---|
| Female | 13.17 | 1.19 | 5.8 | 19.7 | 42072 |
| Male | 14.47 | 1.51 | 6.3 | 19.9 | 40854 |

``` r
# Mean hemoglobin per cycle, one line per sex, with ±1 SD error bars
ggplot(hgb_trend, aes(x = Year_Label, y = mean_hgb, group = Sex, color = Sex)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  geom_errorbar(
    aes(ymin = mean_hgb - sd_hgb, ymax = mean_hgb + sd_hgb),
    width = 0.15, alpha = 0.4
  ) +
  scale_color_manual(values = c("Male" = "#0072B2", "Female" = "#D55E00")) +
  labs(
    title = "Hemoglobin Trends by Sex (1999–2018)",
    subtitle = "Mean Hemoglobin (g/dL) by NHANES cycle",
    x = "Survey Cycle (Years)",
    y = "Mean Hemoglobin (g/dL)",
    color = "Sex"
  ) +
  theme_classic(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "top"
  )
```

Mean Hemoglobin (g/dL) with ±1 SD across NHANES Cycles
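
The error bars in the figure show ±1 SD, which describes the spread of individual values rather than the precision of each mean. As a rough sketch of that distinction, the block below derives standard errors and normal-approximation 95% intervals from the pooled values in the summary table. It treats the data as a simple random sample and ignores survey weights, which is an assumption of this sketch and not something the report states; the column names `SE_HGB`, `CI_low`, and `CI_high` are introduced here for illustration.

``` r
# Pooled summary values copied from the summary table in this report.
pooled <- data.frame(
  Sex      = c("Female", "Male"),
  Mean_HGB = c(13.17, 14.47),
  SD_HGB   = c(1.19, 1.51),
  N        = c(42072, 40854)
)

# Standard error of the mean: SD / sqrt(N).
pooled$SE_HGB <- pooled$SD_HGB / sqrt(pooled$N)

# Normal-approximation 95% interval for each mean
# (unweighted; simple random sampling is assumed).
pooled$CI_low  <- pooled$Mean_HGB - 1.96 * pooled$SE_HGB
pooled$CI_high <- pooled$Mean_HGB + 1.96 * pooled$SE_HGB

print(pooled, digits = 5)
```

With samples this large, both standard errors come out below 0.01 g/dL, so the intervals around the means are far narrower than the ±1 SD bars. The same formula could be applied per cycle to `hgb_trend` as `sd_hgb / sqrt(n)`.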
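The analysis shows a consistent sex-based difference in hemoglobin levels across NHANES cycles from 1999 to 2018: pooled across cycles, mean hemoglobin is 14.47 g/dL in males and 13.17 g/dL in females, a difference of about 1.3 g/dL.
This gap aligns with known biological differences between the sexes, and variation across cycles may reflect population-level health or nutritional changes.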
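
One way to check how consistent the gap is from cycle to cycle is to compute the male-minus-female difference in mean hemoglobin for each cycle. The sketch below uses a toy table in the same shape as `hgb_trend`; the values are illustrative only and are not results from this analysis, and the name `hgb_trend_toy` is hypothetical.

``` r
# Toy per-cycle means in the same shape as hgb_trend (illustrative values only).
hgb_trend_toy <- data.frame(
  Cycle    = rep(c("1999_2000", "2001_2002"), each = 2),
  Sex      = rep(c("Female", "Male"), times = 2),
  mean_hgb = c(13.2, 14.5, 13.1, 14.4)
)

# Cross-tabulate into a cycle-by-sex matrix of means.
means <- tapply(hgb_trend_toy$mean_hgb,
                list(hgb_trend_toy$Cycle, hgb_trend_toy$Sex),
                mean)

# Male minus female difference for each cycle (g/dL).
diff_m_f <- means[, "Male"] - means[, "Female"]

print(round(diff_m_f, 2))
```

Applied to the real `hgb_trend`, a roughly constant `diff_m_f` across cycles would support the description of the difference as consistent.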
**Citation**: Aseno, S. (2025). *Hemoglobin Trends by Sex in the U.S. (1999–2018)*. Source: CDC NHANES.
**License**: [MIT License](https://opensource.org/licenses/MIT)