<!DOCTYPE html>
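# Default chunk option: do not echo code in the knitted output
knitr::opts_chunk$set(echo = FALSE)
library(dplyr)
library(ggplot2)
# NOTE: machine-specific path; point this at the folder holding the .RData file
setwd("~/Desktop/BRFSS_project")
load("4tiY2fqCQa-YmNn6gnGvzQ_1e7320c30a6f4b27894a54e2de50a805_brfss2013.RData")
# Fixed seed so the 50,000-row random subsample is reproducible
set.seed(123)
brfss <- sample_n(brfss2013, 50000)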
The Behavioral Risk Factor Surveillance System (BRFSS) is a large cross-sectional health survey conducted annually by the Centers for Disease Control and Prevention (CDC) in the United States. Data are collected through telephone interviews with non-institutionalized adults aged 18 years or older. Because the data are observational (respondents were not randomly assigned to any condition) and self-reported, the results can show association but not causation. Because respondents are selected by random sampling, the results can be generalized to the non-institutionalized adult U.S. population; note that this analysis works with an unweighted random subsample of 50,000 respondents.
Is sleep duration associated with the number of poor mental health days in the past 30 days, and does the association hold across employment groups?
# Keep the three variables of interest; drop missing and out-of-range values
rq1 <- brfss %>%
  transmute(
    sleep = sleptim1,
    menthlth = menthlth,
    employ = employ1
  ) %>%
  filter(!is.na(sleep), !is.na(menthlth), !is.na(employ),
         sleep >= 0, sleep <= 24,
         menthlth >= 0, menthlth <= 30)
# Sample size and overall means after filtering
rq1 %>% summarise(
  n = n(),
  mean_sleep = mean(sleep),
  mean_menthlth = mean(menthlth)
)
##       n mean_sleep mean_menthlth
## 1 48160   7.050208      3.351184
ggplot(rq1, aes(x = sleep, y = menthlth)) +
  geom_point(alpha = 0.2) +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ employ) +
  labs(x = "Hours of sleep", y = "Poor mental health days")
## `geom_smooth()` using formula = 'y ~ x'
Findings: Less sleep is associated with more poor mental health days across employment categories.
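The visual impression from the fitted lines could be backed by a number per panel. The following is a minimal sketch, assuming a data frame with the same column names as `rq1` (`sleep`, `menthlth`, `employ`). The helper `slope_by_group()` and the toy data are hypothetical and are not part of the original analysis; the slope is computed as cov(x, y) / var(x), which is the ordinary least-squares slope for a single predictor.

```r
library(dplyr)

# Hypothetical helper (not in the original report): least-squares slope of
# poor mental health days on hours of sleep, within each employment group.
slope_by_group <- function(df) {
  df %>%
    group_by(employ) %>%
    summarise(
      n = n(),
      # OLS slope for one predictor: cov(x, y) / var(x)
      slope = cov(sleep, menthlth) / var(sleep),
      .groups = "drop"
    )
}

# Toy data for illustration only (not BRFSS values).
toy_sleep <- data.frame(
  employ   = c("A", "A", "A", "B", "B", "B"),
  sleep    = c(5, 7, 9, 4, 6, 8),
  menthlth = c(10, 6, 2, 3, 3, 3)
)

res <- slope_by_group(toy_sleep)
print(res)
```

Applied to the real data, `slope_by_group(rq1)` would return one slope per employment category; a negative slope means fewer poor mental health days per additional hour of sleep under a linear fit.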
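Is BMI different between smokers and non-smokers, and does this difference vary by sex? Here "smoker" means a respondent who reports having smoked at least 100 cigarettes in their lifetime (`smoke100`), not necessarily a current smoker.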
# X_bmi5 stores BMI with two implied decimal places, so divide by 100
rq2 <- brfss %>%
  transmute(
    bmi = as.numeric(X_bmi5) / 100,
    smoker = smoke100,
    sex = sex
  ) %>%
  filter(!is.na(bmi), bmi >= 10, bmi <= 80,
         !is.na(smoker), !is.na(sex))
# Mean BMI for each sex-by-smoking combination
rq2 %>%
  group_by(sex, smoker) %>%
  summarise(mean_bmi = mean(bmi), .groups = "drop")
## # A tibble: 4 × 3
##   sex    smoker mean_bmi
##   <fct>  <fct>     <dbl>
## 1 Male   Yes        28.1
## 2 Male   No         28.1
## 3 Female Yes        27.7
## 4 Female No         27.5
ggplot(rq2, aes(x = smoker, y = bmi)) +
  geom_boxplot() +
  facet_wrap(~ sex) +
  theme_minimal()
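Findings: Mean BMI is essentially the same for smokers and non-smokers among men (28.1 in both groups) and only slightly higher for smokers among women (27.7 vs. 27.5). Any smoking-related difference in BMI is therefore small and appears mainly among women.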
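Whether the smoker versus non-smoker difference varies by sex can be read off more directly as a within-sex gap in mean BMI. The following is a minimal sketch, assuming a data frame with the same columns as `rq2` (`bmi`, `smoker` coded "Yes"/"No", `sex`). The helper `bmi_gap_by_sex()` and the toy data are hypothetical and only illustrate the calculation.

```r
library(dplyr)

# Hypothetical helper (not in the original report): smoker-minus-non-smoker
# difference in mean BMI, computed separately for each sex.
bmi_gap_by_sex <- function(df) {
  df %>%
    group_by(sex) %>%
    summarise(
      gap = mean(bmi[smoker == "Yes"]) - mean(bmi[smoker == "No"]),
      .groups = "drop"
    )
}

# Toy data for illustration only (not BRFSS values).
toy_bmi <- data.frame(
  sex    = c("Male", "Male", "Male", "Male",
             "Female", "Female", "Female", "Female"),
  smoker = c("Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No"),
  bmi    = c(28, 30, 29, 29, 27, 29, 25, 27)
)

gaps <- bmi_gap_by_sex(toy_bmi)
print(gaps)
```

If the gaps were similar for both sexes, the smoking difference would not depend on sex; clearly different gaps would suggest an interaction, though a formal test would be needed to confirm it.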
Is physical activity associated with general health across age groups?
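# Keep exercise status, self-rated health, and age group; drop missing values
rq3 <- brfss %>%
  transmute(
    exercise = exerany2,
    genhlth = genhlth,
    agegrp = X_ageg5yr
  ) %>%
  filter(!is.na(exercise), !is.na(genhlth), !is.na(agegrp))
# Within each age group and exercise status, the share of respondents
# reporting each level of general health (shares sum to 1 per panel)
tab3 <- rq3 %>%
  count(agegrp, exercise, genhlth) %>%
  group_by(agegrp, exercise) %>%
  mutate(p = n / sum(n)) %>%
  ungroup()
ggplot(tab3, aes(x = genhlth, y = p)) +
  geom_col() +
  facet_grid(agegrp ~ exercise) +
  labs(x = "General health", y = "Proportion of respondents") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))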
Findings: Respondents who report exercising generally rate their health better than those who do not, and this pattern appears in every age group.
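The faceted bar chart can be condensed into one number per panel, for example the share of respondents rating their health in the top two categories. The following is a minimal sketch, assuming a data frame with the same columns as `rq3` and assuming the top two `genhlth` levels are labelled "Excellent" and "Very good". The helper `share_good_health()`, the toy data, and the age-group label in it are hypothetical.

```r
library(dplyr)

# Hypothetical helper (not in the original report): share of respondents
# rating their health "Excellent" or "Very good", by age group and exercise.
share_good_health <- function(df) {
  df %>%
    group_by(agegrp, exercise) %>%
    summarise(
      n = n(),
      # Assumed level labels for the two best health categories
      p_good = mean(genhlth %in% c("Excellent", "Very good")),
      .groups = "drop"
    )
}

# Toy data for illustration only (not BRFSS values); the age label is a placeholder.
toy_health <- data.frame(
  agegrp   = rep("Age 18 to 24", 8),
  exercise = rep(c("Yes", "No"), each = 4),
  genhlth  = c("Excellent", "Very good", "Good", "Excellent",
               "Good", "Fair", "Very good", "Poor")
)

shares <- share_good_health(toy_health)
print(shares)
```

Comparing `p_good` between the "Yes" and "No" rows within each age group gives a compact check of whether the exercise gap holds across ages.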