March 11, 2026

(A) Introduction

  • We will work with a pre-processed subset called NHANESsample, available in the HDSinRdata package (use install.packages("HDSinRdata") to install it). The subset contains data from NHANES cycles 1999–2018 for adults aged 20 years or older. The sample includes individuals with non-missing blood lead level, blood pressure, and demographic information.

(A) Dataset Description

#install.packages("HDSinRdata")
library(HDSinRdata)
head(NHANESsample)
##   ID AGE    SEX               RACE  EDUCATION INCOME      SMOKE YEAR LEAD
## 1  2  77   Male Non-Hispanic White MoreThanHS   5.00 NeverSmoke 1999  5.0
## 2  5  49   Male Non-Hispanic White MoreThanHS   5.00  QuitSmoke 1999  1.6
## 3 12  37   Male Non-Hispanic White MoreThanHS   4.93 NeverSmoke 1999  2.4
## 4 13  70   Male   Mexican American LessThanHS   1.07  QuitSmoke 1999  1.6
## 5 14  81   Male Non-Hispanic White LessThanHS   2.67 StillSmoke 1999  5.5
## 6 15  38 Female Non-Hispanic White MoreThanHS   4.52 StillSmoke 1999  1.5
##     BMI_CAT LEAD_QUANTILE HYP ALC DBP1 DBP2 DBP3 DBP4 SBP1 SBP2 SBP3 SBP4
## 1   BMI<=25            Q4   0 Yes   58   56   56   NA  106   98   98   NA
## 2 25<BMI<30            Q3   1 Yes   82   84   82   NA  122  122  122   NA
## 3   BMI>=30            Q4   1 Yes  108   98  100   NA  182  172  176   NA
## 4 25<BMI<30            Q3   1 Yes   78   62   70   NA  140  130  130   NA
## 5 25<BMI<30            Q4   1 Yes   56   NA   58   64  142   NA  134  138
## 6 25<BMI<30            Q3   0 Yes   68   68   70   NA  106  112  106   NA

(A) Research Question:

  • What is the mean systolic blood pressure among adults 60+?

(A) Main Variables and Parameters:

seniors <- subset(NHANESsample, AGE >= 60)
head(seniors)
##    ID AGE    SEX               RACE  EDUCATION INCOME      SMOKE YEAR LEAD
## 1   2  77   Male Non-Hispanic White MoreThanHS   5.00 NeverSmoke 1999  5.0
## 4  13  70   Male   Mexican American LessThanHS   1.07  QuitSmoke 1999  1.6
## 5  14  81   Male Non-Hispanic White LessThanHS   2.67 StillSmoke 1999  5.5
## 10 29  62   Male Non-Hispanic White         HS   1.07  QuitSmoke 1999  1.9
## 13 55  61 Female     Other Hispanic MoreThanHS   3.33 StillSmoke 1999  2.2
## 15 59  70   Male   Mexican American LessThanHS   0.97  QuitSmoke 1999  5.4
##      BMI_CAT LEAD_QUANTILE HYP ALC DBP1 DBP2 DBP3 DBP4 SBP1 SBP2 SBP3 SBP4
## 1    BMI<=25            Q4   0 Yes   58   56   56   NA  106   98   98   NA
## 4  25<BMI<30            Q3   1 Yes   78   62   70   NA  140  130  130   NA
## 5  25<BMI<30            Q4   1 Yes   56   NA   58   64  142   NA  134  138
## 10   BMI>=30            Q3   1  No   70   76   66   NA  124  122  126   NA
## 13   BMI<=25            Q3   0 Yes   70   60   74   NA  106  110  116   NA
## 15 25<BMI<30            Q4   0 Yes   70   70   72   NA  118  112  114   NA

(B) Descriptive Analysis

mean_sbp   <- mean(seniors$SBP1, na.rm = TRUE)
sd_sbp     <- sd(seniors$SBP1, na.rm = TRUE)
median_sbp <- median(seniors$SBP1, na.rm = TRUE)
n_size     <- nrow(seniors)  # counts all rows, including any with a missing SBP1
cat("Mean:", mean_sbp, "\nSD:", sd_sbp, "\nMedian:", median_sbp, "\nN:", n_size)
## Mean: 136.8336 
## SD: 21.18628 
## Median: 134 
## N: 10192
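
  • Note that nrow() counts every row, including respondents whose SBP1 is NA, while mean(), sd(), and median() with na.rm = TRUE use only the non-missing readings. A minimal sketch of counting the readings actually used, on a small made-up vector (sbp_toy is hypothetical, not NHANES data):

```r
# Hypothetical toy readings (mmHg); NA marks a missing measurement.
sbp_toy <- c(106, 122, NA, 140, 142, NA, 118)

n_rows    <- length(sbp_toy)        # analogous to nrow(seniors)
n_nonmiss <- sum(!is.na(sbp_toy))   # readings that enter mean(), sd(), t.test()

cat("Rows:", n_rows, "\nNon-missing:", n_nonmiss, "\n")
# For the real data one could use: sum(!is.na(seniors$SBP1))
```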

(B) Descriptive Analysis

library(ggplot2)
ggplot(seniors, aes(x = SBP1)) +
  geom_histogram(binwidth = 5, na.rm = TRUE, fill = "steelblue", color = "white") +
  geom_vline(xintercept = 120, linetype = 2, color = "red") +
  labs(title = "SBP Distribution (60+)", subtitle = "Red line = 120 mmHg", x = "SBP (mmHg)", y = "Count")
  • The numerical summary shows that the mean systolic blood pressure for adults 60 and older is approximately 136.8 mmHg, well above the clinical target of 120 mmHg. The median of 134 mmHg means that at least half of the sample has a reading above that target.

(B) Descriptive Analysis

library(ggplot2)
ggplot(seniors, aes(y = SBP1)) +
  geom_boxplot(na.rm = TRUE, fill = "lightgray", outlier.color = "red") +
  labs(title = "Boxplot of Systolic Blood Pressure (Adults 60+)",
       y = "Systolic BP (mmHg)")
  • The histogram is slightly right-skewed, with a tail of individuals who have very high blood pressure readings. The boxplot confirms this, flagging multiple outliers above 200 mmHg. Despite the slight skew, the sample of 10,192 adults is large enough to proceed with normal-based inference by the Central Limit Theorem (rows with a missing SBP1 are dropped from the calculations, so the effective n may be somewhat smaller).

(C) Check for T-Test

  • Independence: We assume that individual respondents are independent of one another. NHANES uses a complex, multistage sampling design, so this is a simplifying assumption.
  • Normality: Although the histogram from Part B shows some right skew, the sample of adults 60+ is very large (10,192 respondents before dropping missing SBP1 readings). By the Central Limit Theorem (CLT), the sampling distribution of the mean is approximately normal for a sample this large, even when the population distribution is skewed.
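
  • The CLT argument can be illustrated with a small simulation. This is a sketch under stated assumptions: the right-skewed "population" is a gamma distribution chosen only for illustration (it is not fitted to NHANES), and the seed, sample sizes, and helper names are arbitrary.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Assumed right-skewed "population": gamma with shape 2 (not fitted to NHANES).
draw_sample_mean <- function(n) mean(rgamma(n, shape = 2, scale = 10))

# Sampling distribution of the mean for a small and a large sample size.
means_small <- replicate(2000, draw_sample_mean(5))
means_large <- replicate(2000, draw_sample_mean(500))

# Simple moment-based skewness; values near 0 indicate symmetry.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

cat("Skewness of means, n = 5:  ", skewness(means_small), "\n")
cat("Skewness of means, n = 500:", skewness(means_large), "\n")
```

  • The skewness of the simulated sample means should shrink toward 0 as n grows, which is the behavior the normality condition relies on.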

(C) Check for T-Test

results <- t.test(seniors$SBP1, mu = 120, conf.level = 0.95)
print(results$estimate) # point estimate
## mean of x 
##  136.8336
print(results$conf.int) # confidence interval
## [1] 136.4068 137.2603
## attr(,"conf.level")
## [1] 0.95
  • We are 95% confident that the true population mean systolic blood pressure for U.S. adults aged 60 and older is between 136.41 and 137.26 mmHg. The entire interval lies above the healthy benchmark of 120 mmHg.
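
  • The interval reported by t.test() is the sample mean plus or minus a t critical value times the standard error. A minimal sketch of that arithmetic on made-up readings (sbp_toy is hypothetical, not NHANES data), checked against t.test():

```r
# Hypothetical toy readings (mmHg), not NHANES data.
sbp_toy <- c(106, 122, 182, 140, 142, 106, 124, 118, NA, 134)

x    <- sbp_toy[!is.na(sbp_toy)]   # t.test() also drops missing values
n    <- length(x)
se   <- sd(x) / sqrt(n)            # standard error of the mean
crit <- qt(0.975, df = n - 1)      # two-sided 95% critical value

ci_manual <- mean(x) + c(-1, 1) * crit * se
ci_ttest  <- t.test(sbp_toy, conf.level = 0.95)$conf.int

print(ci_manual)
print(as.numeric(ci_ttest))
```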

(C) Check for T-Test

  • Null Hypothesis (H₀): μ = 120 (the population mean SBP is 120 mmHg)
  • Alternative Hypothesis (Hₐ): μ > 120 (the population mean SBP is greater than 120 mmHg)
print(results$statistic) # test statistic
##        t 
## 77.32088
print(results$p.value) # p-value; prints 0 because it underflows (t.test() reports p < 2.2e-16). The default alternative is two-sided.
## [1] 0
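
  • t.test() defaults to a two-sided alternative, whereas Hₐ above is one-sided. A minimal sketch of the one-sided version using alternative = "greater", on made-up readings (sbp_toy is hypothetical, not NHANES data), with the statistic and upper-tail p-value also computed by hand:

```r
# Hypothetical toy readings (mmHg), not NHANES data.
sbp_toy <- c(126, 122, 182, 140, 142, 116, 124, 118, 134, 150)

one_sided <- t.test(sbp_toy, mu = 120, alternative = "greater")

# Test statistic by hand: (sample mean - null value) / standard error.
t_manual <- (mean(sbp_toy) - 120) / (sd(sbp_toy) / sqrt(length(sbp_toy)))
# Upper-tail p-value from the t distribution with n - 1 degrees of freedom.
p_manual <- pt(t_manual, df = length(sbp_toy) - 1, lower.tail = FALSE)

cat("t =", t_manual, " p =", format.pval(p_manual), "\n")
```

  • With alternative = "greater" the returned confidence interval is one-sided (its upper limit is Inf), so the two-sided interval is best taken from a separate two-sided call. For a p-value too small to represent, format.pval() reports it as below machine precision instead of printing 0.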

(C) Conclusion

  • Since the p-value is extremely small (effectively zero and far below α = 0.05), we reject the null hypothesis. There is strong evidence that the mean systolic blood pressure for adults 60+ exceeds the 120 mmHg threshold, which suggests that elevated blood pressure is common in this age group.