Background

Over the past 40 years, there has been considerable progress in reducing the presence of lead in the home environment. However, significant sources of lead remain: lead service pipes that carry drinking water, and older homes in which lead has not been abated.

No safe level of lead exposure has been identified for children. Even very small amounts of lead can cause considerable harm.

The State of Illinois, and Chicago in particular, has one of the largest numbers of lead service pipes in the country (source: Natural Resources Defense Council).

This project explores reported cases of lead poisoning among children ages 1-5 by Chicago community area and analyzes how those rates relate to each area's racial and income demographics.

Summary Statistics of the 77 Community Areas in Chicago

summary_stats <-
sumtable(data, 
         digits = 2, 
         fixed.digits = TRUE, 
         summ = c('min(x)',
                  'pctile(x)[25]',
                  'pctile(x)[50]',
                  'mean(x)',
                  'pctile(x)[75]',
                  'max(x)'),
         summ.names = c("Min", 
                        "1st Qu.", 
                        "Median", 
                        "Mean", 
                        "3rd Qu.", 
                        "Max."),
         vars = c("Pct_White",
                  "Pct_Black",
                  "Pct_Hispanic",
                  "Pct_Asian",
                  "Pct_NativeAmer",
                  "median_income",
                  "pct_lead_poison"),
         title = "Summary Stats: Community Area Demographics")

summary_stats
Summary Stats: Community Area Demographics
Variable               Min    1st Qu.     Median       Mean    3rd Qu.       Max.
Pct_White             0.83       4.14      14.75      27.99      49.10      82.73
Pct_Black             0.42       2.98      11.97      36.91      83.05      96.50
Pct_Hispanic          0.01       5.53      13.05      26.46      45.28      90.81
Pct_Asian             0.00       0.31       2.32       6.37       8.67      70.12
Pct_NativeAmer        0.00       0.00       0.04       0.08       0.10       0.70
median_income     20528.00   42281.00   58905.00   64170.70   77608.00  147366.00
pct_lead_poison       0.00       0.62       1.22       1.68       2.56       6.08

Exploratory graphs showing correlation of demographics to lead poisoning rate by Community Area.

cust_theme = list(
  theme_light(),
  theme(plot.title = element_text(face = "bold", size = 24, hjust = 0.5),
        plot.subtitle = element_text(face = "italic", size = 18, hjust = 0.5),
        # `linewidth` replaces the deprecated `size` argument for borders (ggplot2 >= 3.4.0)
        plot.background = element_rect(color = "black", linewidth = 2),
        axis.title = element_text(face = "bold", size = 16),
        axis.text.y = element_text(size = 12),
        axis.text.x = element_text(size = 14),
        legend.title = element_text(face = "bold", size = 12),
        legend.text = element_text(size = 12),
        legend.background = element_rect(color = "black")
        ))
data %>%
  ggplot(mapping = aes(x = pct_poc_2016_2020, y = pct_lead_poison)) +
  geom_point() +
  geom_smooth(formula = y~x, se = FALSE, method = 'lm') +
  labs(y = "Lead poisoning rate (% of children ages 1-5)",
       x = "Percentage People of Color (2016-2020)",
       title = "Lead Poisoning by Chicago Community Area",
       caption = "Data from chicagohealthatlas.org.  Lead poisoning rates derived as average of lead poisoning rate for each year 2016 through 2020.") +
  cust_theme

data %>%
  ggplot(mapping = aes(x = median_income, y = pct_lead_poison)) +
  geom_point() +
  geom_smooth(formula = y~x, se = FALSE, method = 'lm') +
  labs(y = "Lead poisoning rate (% of children ages 1-5)",
       x = "Median Income (2016-2020)",
       title = "Lead Poisoning by Chicago Community Area",
       caption = "Data from chicagohealthatlas.org.  Lead poisoning rates derived as average of lead poisoning rate for each year 2016 through 2020.") +
  cust_theme

Regression analysis of relationship between demographics and lead poisoning rates.

Lead poisoning by median income.

summary(lm(pct_lead_poison ~ med_income_log, data = data))
## 
## Call:
## lm(formula = pct_lead_poison ~ med_income_log, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.99296 -0.60723 -0.02681  0.54987  2.42604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     22.5489     2.7741   8.128 6.80e-12 ***
## med_income_log  -1.9029     0.2527  -7.529 9.34e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.005 on 75 degrees of freedom
## Multiple R-squared:  0.4305, Adjusted R-squared:  0.4229 
## F-statistic: 56.69 on 1 and 75 DF,  p-value: 9.341e-11

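Within the Chicago Community Areas, a 1 percent increase in median income predicts a decrease in the lead poisoning rate of about 0.019 percentage points (the slope of -1.90 on log median income divided by 100, assuming med_income_log is the natural log of median income). The relationship is statistically significant at p < .001.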
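Because the predictor in this model is logged, its slope is read per proportional change in income rather than per dollar. The following is a minimal sketch of that conversion, using the slope printed in the output above. It assumes med_income_log was computed with the natural log, which the report does not show; the helper name effect_of_income_ratio is hypothetical.

```r
# Slope on med_income_log, copied from the regression output above.
b_log_income <- -1.9029

# Assumption: med_income_log = log(median_income), i.e. the natural log.
# Returns the predicted change in pct_lead_poison (in percentage points)
# when median income is multiplied by `ratio`.
effect_of_income_ratio <- function(ratio, slope = b_log_income) {
  slope * log(ratio)
}

effect_1pct   <- effect_of_income_ratio(1.01)  # income rises by 1%
effect_double <- effect_of_income_ratio(2)     # income doubles

print(round(c(one_percent = effect_1pct, doubling = effect_double), 3))
```

Under that assumption, a 1% rise in median income corresponds to a change of roughly -0.019 percentage points in the lead poisoning rate, and a doubling of income to roughly -1.32 percentage points.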

Lead poisoning by % People of Color.

summary(lm(pct_lead_poison ~ pct_poc_2016_2020, data = data))
## 
## Call:
## lm(formula = pct_lead_poison ~ pct_poc_2016_2020, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0477 -0.6903 -0.0175  0.4846  3.6882 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -0.533176   0.349200  -1.527    0.131    
## pct_poc_2016_2020  0.030734   0.004555   6.747 2.75e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.051 on 75 degrees of freedom
## Multiple R-squared:  0.3777, Adjusted R-squared:  0.3694 
## F-statistic: 45.52 on 1 and 75 DF,  p-value: 2.755e-09

Within the Chicago Community Areas, each one-percentage-point increase in the share of people of color (all non-white people) predicts an increase in the lead poisoning rate of about 0.03 percentage points (statistically significant at p < .001).
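To see what the fitted line implies for individual areas, the sketch below plugs three community areas into the equation from the output above. The coefficients are copied from that output, and the %POC and lead poisoning values are copied from the top-10 and bottom-10 tables later in this report. The data frame name areas is only illustrative.

```r
# Coefficients copied from the regression output above.
intercept <- -0.533176
slope     <-  0.030734

# Three community areas copied from the top-10 / bottom-10 tables in this report.
areas <- data.frame(
  name            = c("Fuller Park", "Riverdale", "Edison Park"),
  pct_poc         = c(95.17, 98.94, 17.27),
  pct_lead_poison = c(6.08, 0.46, 0.00)
)

# Fitted value from the line, and observed minus fitted.
areas$predicted <- intercept + slope * areas$pct_poc
areas$residual  <- areas$pct_lead_poison - areas$predicted

print(transform(areas,
                predicted = round(predicted, 2),
                residual  = round(residual, 2)))
```

Fuller Park sits about 3.69 points above the line and Riverdale about 2.05 points below it, which agrees (to rounding) with the Max and Min residuals in the output above. Edison Park falls almost exactly on the line.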

Lead Poisoning by Chicago Community Areas

Community Areas with the 10 Highest Lead Poisoning Rates

# slice_max() returns the rows ordered from highest to lowest rate,
# so a separate arrange(desc(...)) step is not needed
top10 <- data %>%
  slice_max(pct_lead_poison, n = 10) %>%
  select(name,
         pct_lead_poison,
         pct_poc_2016_2020,
         inc_2016_2020,
         pop_2016_2020) %>%
  kable(col.names = c("Community Area",
                      "%Lead Poison",
                      "%POC",
                      "Median Income",
                      "Population"),
        format.args = list(big.mark = ","),
        align = "lrrrr") %>%
  kable_styling(bootstrap_options = "striped", full_width = FALSE) %>%
  row_spec(0, background = "red", color = "white") %>%
  column_spec(1, border_right = TRUE)

top10
Community Area            %Lead Poison    %POC  Median Income  Population
Fuller Park                       6.08   95.17        $20,528       2,215
West Englewood                    5.28   98.54        $30,204      26,359
Englewood                         5.08   98.81        $24,776      21,973
West Garfield Park                4.20   97.08        $30,319      16,480
South Chicago                     3.82   97.10        $41,844      28,676
Austin                            3.74   94.67        $43,043      95,279
Roseland                          3.74   98.52        $45,771      39,750
West Pullman                      3.42   99.14        $49,613      26,160
Greater Grand Crossing            3.34   99.17        $34,324      29,493
Washington Heights                3.28   98.34        $61,687      26,829

Community Areas with the 10 Lowest Lead Poisoning Rates

# slice_min() returns the rows ordered from lowest to highest rate,
# so a separate arrange() step is not needed
bottom10 <- data %>%
  slice_min(pct_lead_poison, n = 10) %>%
  select(name,
         pct_lead_poison,
         pct_poc_2016_2020,
         inc_2016_2020,
         pop_2016_2020) %>%
  kable(col.names = c("Community Area",
                      "%Lead Poison",
                      "%POC",
                      "Median Income",
                      "Population"),
        format.args = list(big.mark = ","),
        align = "lrrrr") %>%
  kable_styling(bootstrap_options = "striped", full_width = FALSE) %>%
  row_spec(0, background = "red", color = "white") %>%
  column_spec(1, border_right = TRUE)

bottom10
Community Area            %Lead Poison    %POC  Median Income  Population
Edison Park                       0.00   17.27       $118,472      10,998
Lincoln Park                      0.04   20.18       $147,366      62,688
Mount Greenwood                   0.08   18.92       $104,687      18,846
Near South Side                   0.14   48.62       $119,267      24,506
Forest Glen                       0.20   25.97       $134,176      19,908
Near North Side                   0.32   30.65       $114,097      86,258
Norwood Park                      0.36   22.00        $97,139      41,327
North Center                      0.38   25.35       $143,917      36,181
Montclare                         0.46   63.17        $59,533      14,392
Riverdale                         0.46   98.94        $22,815       7,378
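The last two rows of this table tie at 0.46. By default, slice_min() and slice_max() keep ties (with_ties = TRUE), so n = 10 can return more than ten rows when several areas share the cutoff value. The sketch below illustrates this on a small made-up data frame. The names and rates in toy are hypothetical and are not taken from the report's data.

```r
library(dplyr)

# Hypothetical toy data, NOT from the report's dataset.
toy <- data.frame(
  name = c("A", "B", "C", "D", "E"),
  rate = c(0.1, 0.2, 0.2, 0.5, 0.9)
)

# Default behaviour: rows tied at the cutoff are all kept.
with_ties <- toy %>% slice_min(rate, n = 2)

# with_ties = FALSE returns exactly n rows.
without_ties <- toy %>% slice_min(rate, n = 2, with_ties = FALSE)

print(nrow(with_ties))
print(nrow(without_ties))
```

If a table must contain exactly ten rows, passing with_ties = FALSE is one option. Keeping the default is arguably more transparent, since it avoids arbitrarily dropping an area that shares the cutoff rate.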