Asthma is a major public health concern in New York City, with emergency department (ED) visit rates varying considerably across neighborhoods. Environmental exposure, particularly to fine particulate matter (PM2.5), has been linked in prior research to adverse respiratory outcomes. However, exposure to air pollution is not evenly distributed across communities.
This project examines whether neighborhoods with higher PM2.5 concentrations also experience higher asthma-related emergency department visit rates. Understanding this relationship can help identify environmental health disparities and inform public health interventions.
Research Question:
Is neighborhood-level PM2.5 concentration associated with asthma
emergency department visit rates in New York City?
Two official datasets from the NYC Department of Health and Mental Hygiene Environmental Health Data Portal are used:
Asthma Emergency Department Visits
(Adults)
Age-adjusted asthma ED visit rates per 10,000 adults by community
district.
Fine Particulate Matter (PM2.5)
Annual mean PM2.5 concentrations (µg/m³) by community district.
asthma_raw <- read_csv(
"NYC EH Data Portal - Asthma emergency department visits (adults) (full table).csv"
) |> clean_names()
pm25_raw <- read_csv(
"NYC EH Data Portal - Fine particles (PM 2.5) (full table).csv"
) |> clean_names()
asthma_clean <- asthma_raw |>
filter(geo_type == "CD") |>
select(
time_period,
geography,
asthma_rate = age_adjusted_rate_per_10_000
) |>
mutate(asthma_rate = as.numeric(asthma_rate))
pm25_clean <- pm25_raw |>
filter(geo_type == "CD") |>
select(
time_period,
geography,
pm25 = annual_mean_mcg_m3
) |>
mutate(pm25 = as.numeric(pm25))
nyc_health <- inner_join(
asthma_clean,
pm25_clean,
by = c("time_period", "geography")
) |>
drop_na()
ggplot(nyc_health, aes(x = asthma_rate)) +
geom_histogram(
aes(y = after_stat(density)),
bins = 25,
fill = "#4C78A8",
color = "white",
alpha = 0.9
) +
geom_density(color = "#1F3552", linewidth = 1) +
labs(
title = "Asthma Emergency Department Visit Rates Across NYC",
subtitle = "Age-adjusted ED visits per 10,000 adults by community district",
x = "Asthma ED Visits per 10,000 Adults",
y = "Density"
) +
theme_minimal(base_size = 12)
ggplot(nyc_health, aes(x = pm25)) +
geom_histogram(
aes(y = after_stat(density)),
bins = 25,
fill = "#F58518",
color = "white",
alpha = 0.9
) +
geom_density(color = "#7A3E00", linewidth = 1) +
labs(
title = "Distribution of PM2.5 Concentrations Across NYC",
subtitle = "Annual mean fine particulate matter by community district",
x = "PM2.5 Concentration (µg/m³)",
y = "Density"
) +
theme_minimal(base_size = 12)
cor.test(nyc_health$pm25, nyc_health$asthma_rate)
##
## Pearson's product-moment correlation
##
## data: nyc_health$pm25 and nyc_health$asthma_rate
## t = -0.43673, df = 293, p-value = 0.6626
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.13929808 0.08895102
## sample estimates:
## cor
## -0.02550595
model <- lm(asthma_rate ~ pm25, data = nyc_health)
summary(model)
##
## Call:
## lm(formula = asthma_rate ~ pm25, data = nyc_health)
##
## Residuals:
## Min 1Q Median 3Q Max
## -64.12 -44.13 -21.76 26.40 234.84
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.68 28.45 2.801 0.00544 **
## pm25 -1.86 4.26 -0.437 0.66263
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 59.72 on 293 degrees of freedom
## Multiple R-squared: 0.0006506, Adjusted R-squared: -0.00276
## F-statistic: 0.1907 on 1 and 293 DF, p-value: 0.6626
tidy(model)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 79.7 28.4 2.80 0.00544
## 2 pm25 -1.86 4.26 -0.437 0.663
ggplot(nyc_health, aes(x = pm25, y = asthma_rate)) +
geom_point(
color = "#4C78A8",
alpha = 0.7,
size = 2
) +
geom_smooth(
method = "lm",
se = TRUE,
color = "#E45756",
fill = "#E45756",
alpha = 0.2
) +
labs(
title = "Association Between PM2.5 Exposure and Asthma ED Visit Rates",
subtitle = "Each point represents a NYC community district",
x = "Annual Mean PM2.5 Concentration (µg/m³)",
y = "Asthma ED Visits per 10,000 Adults"
) +
theme_minimal(base_size = 12)
nyc_health |>
mutate(
pm25_rank = rank(-pm25),
asthma_rank = rank(-asthma_rate),
combined_risk = pm25_rank + asthma_rank
) |>
arrange(combined_risk) |>
slice(1:10)
## # A tibble: 10 × 7
## time_period geography asthma_rate pm25 pm25_rank asthma_rank combined_risk
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2019 Mott Haven… 300. 7.7 26.5 1 27.5
## 2 2019 Hunts Poin… 180. 7.8 22.5 20 42.5
## 3 2019 Belmont an… 236. 7.2 48 7 55
## 4 2023 Mott Haven… 250. 7.1 56.5 4 60.5
## 5 2019 East Harle… 246. 7.1 56.5 5 61.5
## 6 2019 Central Ha… 275. 7 66.5 2 68.5
## 7 2019 Highbridge… 179 7.2 48 21 69
## 8 2023 Hunts Poin… 154. 7.4 38 33 71
## 9 2023 Belmont an… 207. 7 66.5 11 77.5
## 10 2019 Fordham an… 154. 7.3 42.5 35 77.5
Several challenges were encountered during the analysis. Rate variables were initially imported as character values, requiring explicit conversion to numeric format before visualization and modeling. Additionally, aligning environmental and health datasets required careful filtering to consistent geographic units and time periods.
The analysis indicates a positive association between PM2.5 concentrations and asthma emergency department visit rates across New York City community districts. While causality cannot be inferred from this observational analysis, the results are consistent with prior environmental health research and highlight potential environmental health disparities.
All data acquisition, transformation, analysis, and visualization steps are contained within this RMarkdown document. With the accompanying datasets, the analysis can be fully reproduced by knitting this file.