Data using 2017-2018 using the National Health and Nutrition Examination Survey (NHANES)

The National Health and Nutrition Examination Survey (NHANES), conducted by the National Center for Health Statistics (NCHS), is a cross-sectional, nationally representative survey of the US civilian noninstitutionalized population. The survey uses a complex, stratified, multistage probability cluster-sampling design (8). Participants are interviewed at their homes to obtain information on health history and health behaviors. Subsequently, they undergo a physical examination at a mobile examination center (MEC).

Among adults aged 20 years and older, the overall age-adjusted prevalence of diagnosed type II diabetes was 9.7% in 2013-2016. The majority of diabetes increased with age from a plurality of 1.8% among aged 20-39 years, 11.1% among 40-59 years old, and 21.0% among those aged 60 and older. ( Mendola ND, Chen T-C, Gu Q, Eberhardt MS, Saydah S. Prevalence of total, diagnosed, and undiagnosed diabetes among adults: United States, 2013–2016. NCHS Data Brief, no 319. Hyattsville, MD: National Center for Health Statistics. 2018).

First load packages and read the demographic data, keeping the variables necessary for the analysis.

pacman::p_load(gt, haven, patchwork, srvyr, survey, tidyverse,rstatix)

P_DEMO <- read_xpt("https://wwwn.cdc.gov/Nchs/Nhanes/2017-2018/P_DEMO.XPT") %>%
  select(SEQN, RIDAGEYR, SDMVPSU, SDMVSTRA, WTMECPRP)

Read the blood pressure data remove missing values and create mean systolic and diastolic.

P_BPXO <- read_xpt("https://wwwn.cdc.gov/Nchs/Nhanes/2017-2018/P_BPXO.XPT") %>%
  mutate(
    Systolic = rowMeans(cbind(BPXOSY1, BPXOSY2, BPXOSY3), na.rm = TRUE),
    Diastolic = rowMeans(cbind(BPXODI1, BPXODI2, BPXODI3), na.rm = TRUE)
  ) %>%
  select(SEQN, Systolic, Diastolic)

Read the diabetes questionnaire data and create diabetes indicator.

P_DIQ <- read_xpt("https://wwwn.cdc.gov/Nchs/Nhanes/2017-2018/P_DIQ.XPT") %>%
  mutate(
    Diabetes = case_when(DIQ010 == 1 ~ "Yes",
      DIQ010 %in% 2:3 ~ "No")
  ) %>%
  select(SEQN, Diabetes)

Join the files.

One <- left_join(P_DEMO, P_BPXO, by = "SEQN") %>%
  left_join(., P_DIQ, by = "SEQN")

It needs to be clarified how well GGPLOT graphs can use sample weights that account for and incorporate sampling design information; that is, the sample describes the US population. Is GGPLOT does give a different plot with a weight aesthetic?

g1 <- ggplot(One %>% filter(RIDAGEYR >= 18, !is.na(Systolic), !is.na(Diabetes), WTMECPRP > 0),
  aes(x = Systolic, y = Diabetes)) +
  geom_boxplot(fill = "#dee8ff") +
  coord_cartesian(xlim = c(80, 200)) +
  scale_x_continuous(breaks = seq(80, 200, 10)) +
  labs(title = "Unweighted",
    x = "Systolic Blood Pressure") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5),
    text = element_text(size = 14))

g2 <- ggplot(One %>% filter(RIDAGEYR >= 18, !is.na(Systolic), !is.na(Diabetes), WTMECPRP > 0),
  aes(x = Systolic, y = Diabetes, weight = WTMECPRP)) +
  geom_boxplot(fill = "#dee8ff") +
  coord_cartesian(xlim = c(80, 200)) +
  scale_x_continuous(breaks = seq(80, 200, 10)) +
  labs(title = "Weighted data represent the noninstitutionalized
US population",
    x = "Systolic Blood Pressure") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5),
    text = element_text(size = 14))

g1 / g2

### A t test for difference between mean systolic BP by diabetes status.

test <- One %>% t_test(Systolic ~ Diabetes)
options(scipen=999)
print(test)

## # A tibble: 1 × 10
##   .y.      group1 group2    n1    n2 statistic    df        p    p.adj p.adj.s…¹
## * <chr>    <chr>  <chr>  <int> <int>     <dbl> <dbl>    <dbl>    <dbl> <chr>    
## 1 Systolic No     Yes     9144  1203     -19.0 1463. 4.68e-72 4.68e-72 ****     
## # … with abbreviated variable name ¹p.adj.signif

g3 <- ggplot(One %>% filter(RIDAGEYR >= 18, !is.na(Diastolic), !is.na(Diabetes), WTMECPRP > 0),
  aes(x = Systolic, y = Diabetes)) +
  geom_boxplot(fill = "#dee8ff") +
  coord_cartesian(xlim = c(40, 200)) +
  scale_x_continuous(breaks = seq(20, 200, 10)) +
  labs(title = "Unweighted",
    x = "Diastolic Blood Pressure") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5),
    text = element_text(size = 14))

g4 <- ggplot(One %>% filter(RIDAGEYR >= 18, !is.na(Diastolic), !is.na(Diabetes), WTMECPRP > 0),
  aes(x = Systolic, y = Diabetes, weight = WTMECPRP)) +
  geom_boxplot(fill = "#dee8ff") +
  coord_cartesian(xlim = c(40, 200)) +
  scale_x_continuous(breaks = seq(40, 200, 10)) +
  labs(title = "Weighted data represent the noninstitutionalized
US population",
    x = "Diastolic Blood Pressure") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5),
    text = element_text(size = 14))

g3 / g4

### A t test for difference between mean diastolic BP by diabetes status.

test <- One %>% t_test(Diastolic ~ Diabetes)
options(scipen=999)
print(test)

## # A tibble: 1 × 10
##   .y.       group1 group2    n1    n2 statistic    df          p   p.adj p.adj…¹
## * <chr>     <chr>  <chr>  <int> <int>     <dbl> <dbl>      <dbl>   <dbl> <chr>  
## 1 Diastolic No     Yes     9144  1203     -5.96 1555.    3.13e-9 3.13e-9 ****   
## # … with abbreviated variable name ¹p.adj.signif

Conclusion

There is a statistical difference between non-diabetics and diabetics in systolic and diastolic BP. Individuals diagnosed as people with diabetes need to control their BP better. Also, there appears to be 2-3 mmHg difference between weighted and non-weighted graphics using GGPLOT.

Data using 2017-2018 using the National Health and Nutrition Examination Survey (NHANES)

Ami

2022-11-16

First load packages and read the demographic data, keeping the variables necessary for the analysis.

Read the blood pressure data remove missing values and create mean systolic and diastolic.

Read the diabetes questionnaire data and create diabetes indicator.

Join the files.

It needs to be clarified how well GGPLOT graphs can use sample weights that account for and incorporate sampling design information; that is, the sample describes the US population. Is GGPLOT does give a different plot with a weight aesthetic?

Conclusion

There is a statistical difference between non-diabetics and diabetics in systolic and diastolic BP. Individuals diagnosed as people with diabetes need to control their BP better. Also, there appears to be 2-3 mmHg difference between weighted and non-weighted graphics using GGPLOT.

Note:

I thank Jeffery Hughes, NCHS/CDC, for his programming help.

Data using 2017-2018 using the National Health and Nutrition Examination Survey (NHANES)

Ami

2022-11-16

High BP is a risk factor for several complications related to type two diabetes including, kidney failure and diabetic retinopathy (Diabetic retinopathy - Symptoms and causes - Mayo Clinic)

First load packages and read the demographic data, keeping the variables necessary for the analysis.

Read the blood pressure data remove missing values and create mean systolic and diastolic.

Read the diabetes questionnaire data and create diabetes indicator.

Join the files.

It needs to be clarified how well GGPLOT graphs can use sample weights that account for and incorporate sampling design information; that is, the sample describes the US population. Is GGPLOT does give a different plot with a weight aesthetic?

Conclusion

There is a statistical difference between non-diabetics and diabetics in systolic and diastolic BP. Individuals diagnosed as people with diabetes need to control their BP better. Also, there appears to be 2-3 mmHg difference between weighted and non-weighted graphics using GGPLOT.

Note:

I thank Jeffery Hughes, NCHS/CDC, for his programming help.