This analysis examined the correlation between childhood lead poisoning and eight factors:

Comparison maps

In the maps below, you can compare areas with high rates of childhood lead poisoning against each factor. Toggle the factors on and off by clicking the layers icon.

Scatterplots

We compared lead poisoning rates with each factor one by one. Here’s what we found:

Census tracts with a higher share of Black residents have a higher likelihood of childhood lead poisoning:

## 
## Call:
## lm(formula = PERC_POISONED ~ black_pct, data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.1412 -3.3884 -0.9074  3.2466 14.1934 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.3413     0.5611   5.955 1.17e-08 ***
## black_pct     9.2120     1.0165   9.063  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.965 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.2943, Adjusted R-squared:  0.2907 
## F-statistic: 82.14 on 1 and 197 DF,  p-value: < 2.2e-16
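The printout above has the shape that `summary()` produces for an `lm` fit. As a minimal sketch of that step, the code below fits the same kind of single-factor model on simulated data. The data frame `joined_sim`, its values, and the object names are hypothetical stand-ins for illustration, not the article's `joined` data, so the numbers it prints will not match the output above.

```r
# Simulated stand-in for the article's `joined` data frame (NOT the real data).
set.seed(42)
n <- 200
joined_sim <- data.frame(black_pct = runif(n, min = 0, max = 1))

# Hypothetical relationship plus noise, chosen only so the example has a signal.
joined_sim$PERC_POISONED <- 3 + 9 * joined_sim$black_pct + rnorm(n, sd = 5)

# Fit a simple linear regression of the poisoning rate on a single factor.
fit_one <- lm(PERC_POISONED ~ black_pct, data = joined_sim)
fit_summary <- summary(fit_one)

# Pull out the pieces the article reads off each printout.
slope <- coef(fit_one)[["black_pct"]]
slope_p_value <- coef(fit_summary)["black_pct", "Pr(>|t|)"]
adj_r_squared <- fit_summary$adj.r.squared

print(fit_summary)
```

Swapping `black_pct` for another column name in the formula would give the corresponding single-factor model for that factor.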

Census tracts with higher rates of renters show a similar correlation:

## 
## Call:
## lm(formula = PERC_POISONED ~ renter_occupied_pct, data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.249  -3.543  -1.281   1.495  16.383 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -0.07102    1.27144  -0.056    0.956    
## renter_occupied_pct 12.31999    2.02530   6.083 6.03e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.423 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.1581, Adjusted R-squared:  0.1539 
## F-statistic:    37 on 1 and 197 DF,  p-value: 6.029e-09

As do census tracts with higher rates of children on Medicaid:

## 
## Call:
## lm(formula = PERC_POISONED ~ under_19_medicaid_pct, data = joined)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.193 -3.990 -1.643  3.446 19.468 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             1.5312     0.8734   1.753   0.0811 .  
## under_19_medicaid_pct  11.4613     1.5699   7.300 6.91e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.243 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.2129, Adjusted R-squared:  0.2089 
## F-statistic:  53.3 on 1 and 197 DF,  p-value: 6.908e-12

Census tracts with a higher median household income show the opposite pattern, with lower levels of lead poisoning:

## 
## Call:
## lm(formula = PERC_POISONED ~ median_household_incomeE, data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.576  -3.909  -1.493   2.998  16.365 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               1.307e+01  9.349e-01  13.986  < 2e-16 ***
## median_household_incomeE -1.286e-04  1.905e-05  -6.751 1.61e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.326 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.1879, Adjusted R-squared:  0.1838 
## F-statistic: 45.57 on 1 and 197 DF,  p-value: 1.607e-10

So do census tracts with newer residential housing:

## 
## Call:
## lm(formula = PERC_POISONED ~ avg_year_built, data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.086  -3.317  -0.816   2.474  13.795 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    250.31574   26.56952   9.421   <2e-16 ***
## avg_year_built  -0.12581    0.01376  -9.146   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.948 on 196 degrees of freedom
##   (13 observations deleted due to missingness)
## Multiple R-squared:  0.2991, Adjusted R-squared:  0.2955 
## F-statistic: 83.65 on 1 and 196 DF,  p-value: < 2.2e-16

Census tracts with a higher share of Hispanic residents show only a weak, slightly negative correlation, with an R-squared of about 0.05:

## 
## Call:
## lm(formula = PERC_POISONED ~ hisp_pct, data = joined)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.153 -4.817 -1.260  3.467 16.328 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8.3519     0.5254  15.896  < 2e-16 ***
## hisp_pct     -5.5112     1.7326  -3.181  0.00171 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.764 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.04885,    Adjusted R-squared:  0.04402 
## F-statistic: 10.12 on 1 and 197 DF,  p-value: 0.001706

But the strongest correlation appears to be with the census tract’s violation rate, which I calculated as the number of individual DNS violations per rental unit:

## 
## Call:
## lm(formula = PERC_POISONED ~ violation_rate, data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.9516 -2.3384 -0.4347  2.0552  8.6141 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      1.6107     0.3794   4.246 3.36e-05 ***
## violation_rate   5.2439     0.2677  19.588  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.442 on 197 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.6607, Adjusted R-squared:  0.659 
## F-statistic: 383.7 on 1 and 197 DF,  p-value: < 2.2e-16
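One way to line up the seven single-factor models is to convert each R-squared into a correlation coefficient. With a single predictor, Pearson's r is the square root of R-squared, carrying the sign of the slope. The sketch below uses only the R-squared values and slope signs from the printouts in this article; the data frame and column names are illustrative. The `avg_year_built` model dropped one more tract than the others, so the comparison is approximate.

```r
# R-squared and slope sign for each single-factor model, copied from the
# summaries printed in this article.
models <- data.frame(
  predictor = c("black_pct", "renter_occupied_pct", "under_19_medicaid_pct",
                "median_household_incomeE", "avg_year_built", "hisp_pct",
                "violation_rate"),
  r_squared = c(0.2943, 0.1581, 0.2129, 0.1879, 0.2991, 0.04885, 0.6607),
  slope_sign = c(1, 1, 1, -1, -1, -1, 1)
)

# With one predictor, Pearson's r is the signed square root of R-squared.
models$pearson_r <- models$slope_sign * sqrt(models$r_squared)

# Rank the factors from strongest to weakest linear association.
ranked <- models[order(-abs(models$pearson_r)), ]
print(ranked[, c("predictor", "r_squared", "pearson_r")], row.names = FALSE)
```

Ranked this way, `violation_rate` comes first and `hisp_pct` last, which matches the ordering described in the text.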

The output above shows that, under a simple linear regression model, there is a statistically significant correlation between violation rate and childhood lead poisoning. Specifically, an increase of one violation per rental unit is associated with an increase of about 5.2 percentage points in the childhood lead poisoning rate, and the model has an adjusted R-squared of 0.659.
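To make the slope concrete, the fitted line from the violation-rate model can be evaluated by hand. The sketch below plugs the intercept and slope from that summary into the line's equation. The violation rates are hypothetical values chosen for illustration, not observed tracts.

```r
# Coefficients copied from the violation_rate model summary above.
intercept <- 1.6107
slope <- 5.2439

# Hypothetical violation rates (violations per rental unit), for illustration.
violation_rate <- c(0, 0.5, 1, 2)

# Fitted line: predicted childhood lead poisoning rate, in percent.
predicted_rate <- intercept + slope * violation_rate

print(data.frame(violation_rate, predicted_rate))
```

Each additional violation per rental unit raises the predicted rate by the slope, about 5.2 percentage points. These are model predictions, not measurements.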

fit <- lm(PERC_POISONED ~ black_pct + avg_year_built + violation_rate, data = joined)
summary(fit)
## 
## Call:
## lm(formula = PERC_POISONED ~ black_pct + avg_year_built + violation_rate, 
##     data = joined)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.2349 -1.6568 -0.3176  1.0227 11.4312 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    151.95748   19.70229   7.713 6.29e-13 ***
## black_pct        4.82653    0.80299   6.011 9.00e-09 ***
## avg_year_built  -0.07780    0.01016  -7.656 8.83e-13 ***
## violation_rate   3.25985    0.34030   9.579  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.982 on 194 degrees of freedom
##   (13 observations deleted due to missingness)
## Multiple R-squared:  0.7481, Adjusted R-squared:  0.7442 
## F-statistic: 192.1 on 3 and 194 DF,  p-value: < 2.2e-16

Above is a multiple linear regression model that includes the violation rate along with the share of Black residents and the average year residential housing was built. It fits better than the violation-rate model alone, with an adjusted R-squared of 0.744. Under this model, an increase of one violation per rental unit is associated with a 3.3 percentage point increase in the childhood lead poisoning rate when the share of Black residents and the average year built are held constant.
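"Holding the other factors constant" can be illustrated by evaluating the fitted equation for two tracts that differ only in violation rate. The sketch below uses the coefficients from the multiple regression summary. The two tracts are hypothetical, the helper `predict_rate` is an illustrative name, and `black_pct` is assumed to be a proportion between 0 and 1, which the size of its coefficient suggests but the article does not state.

```r
# Coefficients copied from the multiple regression summary above.
coefs <- c(intercept = 151.95748, black_pct = 4.82653,
           avg_year_built = -0.07780, violation_rate = 3.25985)

# Predicted childhood lead poisoning rate (percent) for one tract.
# Assumes black_pct is a proportion between 0 and 1.
predict_rate <- function(black_pct, avg_year_built, violation_rate) {
  coefs[["intercept"]] +
    coefs[["black_pct"]] * black_pct +
    coefs[["avg_year_built"]] * avg_year_built +
    coefs[["violation_rate"]] * violation_rate
}

# Two hypothetical tracts that differ only in violation rate.
low  <- predict_rate(black_pct = 0.5, avg_year_built = 1940, violation_rate = 1)
high <- predict_rate(black_pct = 0.5, avg_year_built = 1940, violation_rate = 2)

print(c(low = low, high = high, difference = high - low))
```

The difference between the two predictions equals the `violation_rate` coefficient, whatever values are chosen for the other two factors.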

Conclusion

In the city of Milwaukee, neighborhoods with a higher rate of rental code violations tend to also have higher childhood lead poisoning rates.

Predominantly Black neighborhoods and older neighborhoods are also more likely to be affected.