12/4/2018

Introduction

Data Frames Used

Variables of Interest

The data are broken down further by:

  • Race
  • Rate of Unemployment

Analysis

Place of Interest

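# Death percentage over time, one line per state, West Virginia highlighted
cdcDF %>%
  ggplot(aes(x = Year, y = deathPerc, group = State)) +
  geom_line() +
  gghighlight::gghighlight(State == "West Virginia") +
  labs(y = "Death Percentage")

# Map of the 2016 death percentage by state
# (ditch_the_axes is not defined in this excerpt; it is presumably a theme
# object created earlier in the source document)
deathPercByStateData %>%
  filter(Year == 2016) %>%
  ggplot(aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = deathPerc)) +
  coord_fixed(1.3) +
  ditch_the_axes +
  theme(plot.title = element_text(hjust = 0.5)) +
  labs(title = "Ratio of Deaths per State Population - 2016",
       fill = "Death Percentage")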

Country vs State Comparison

raceData %>% 
  filter(State %in% c("United States", "West Virginia")) %>% 
  knitr::kable()
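
| State | White | Black | Hispanic | Asian | American Indian/Alaska Native | Native Hawaiian/Other Pacific Islander | Two Or More Races | Total | Year |
|:--------------|------:|------:|---------:|------:|------:|------:|------:|------:|-----:|
| United States | 0.61 | 0.12 | 0.18 | 0.01 | 0.05 | 0 | 0.03 | 1 | 2016 |
| West Virginia | 0.92 | 0.03 | 0.01 | 0.00 | 0.01 | NA | 0.02 | 1 | 2016 |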

Top Five States - Race Breakdown

cdcDF2016 %>% 
  arrange(desc(deathPerc)) %>% 
  head(5) %>% 
  mutate(deathPerc = round(deathPerc, digits = 5)) %>% 
  # rate2016 is included so the selection matches the columns in the table below
  select(State, deathPerc, rate2016, White, Black, Hispanic, Asian,
         `American Indian/Alaska Native`,
         `Two Or More Races`) %>% 
  knitr::kable()

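| State | deathPerc | rate2016 | White | Black | Hispanic | Asian | American Indian/Alaska Native | Two Or More Races |
|:--------------|----------:|---------:|------:|------:|---------:|------:|------:|------:|
| West Virginia | 0.05314 | 6.1 | 0.92 | 0.03 | 0.01 | 0 | 0.01 | 0.02 |
| Ohio | 0.03912 | 5.0 | 0.80 | 0.12 | 0.04 | 0 | 0.02 | 0.03 |
| New Hampshire | 0.03761 | 2.9 | 0.91 | 0.01 | 0.04 | NA | 0.03 | 0.02 |
| Pennsylvania | 0.03712 | 5.4 | 0.77 | 0.10 | 0.07 | 0 | 0.03 | 0.02 |
| Maryland | 0.03489 | 4.4 | 0.51 | 0.29 | 0.10 | 0 | 0.06 | 0.03 |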

Top Five - Unemployment Rate

cdcDF2016 %>% 
  arrange(desc(rate2016)) %>% 
  head(5) %>% 
  select(State, rate2016) %>% 
  knitr::kable()

| State | rate2016 |
|:--------------|---------:|
| Alaska | 6.9 |
| New Mexico | 6.7 |
| West Virginia | 6.1 |
| Louisiana | 6.0 |
| Alabama | 5.9 |

Numerical Modelling

modCDC <- lm(deathPerc ~ White + rate2016, data = cdcDF2016)
summary(modCDC)
confint(modCDC)

## 
## Call:
## lm(formula = deathPerc ~ White + rate2016, data = cdcDF2016)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0111508 -0.0074817 -0.0005783  0.0042798  0.0219710 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -0.007121   0.009882  -0.721  0.47471   
## White        0.017568   0.008243   2.131  0.03832 * 
## rate2016     0.003627   0.001317   2.754  0.00835 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.008492 on 47 degrees of freedom
## Multiple R-squared:  0.1609, Adjusted R-squared:  0.1251 
## F-statistic: 4.505 on 2 and 47 DF,  p-value: 0.01622

##                     2.5 %      97.5 %
## (Intercept) -0.0270011093 0.012758684
## White        0.0009861185 0.034150026
## rate2016     0.0009774032 0.006276686
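
As a rough check on how to read this output, the sketch below plugs the printed coefficients into the model equation for West Virginia and rebuilds the 95% interval for rate2016 by hand. It assumes the rounded values shown in the tables above (White = 0.92, rate2016 = 6.1, deathPerc = 0.05314), so the results are only approximate; the variable names are illustrative and not part of the original analysis.

```r
# Coefficients copied from the summary(modCDC) output above
coefs <- c(intercept = -0.007121, White = 0.017568, rate2016 = 0.003627)

# West Virginia's predictor values, taken from the rounded top-five table
wv <- c(White = 0.92, rate2016 = 6.1)

# Fitted value: intercept + b_White * White + b_rate * rate2016
fitted_wv <- coefs[["intercept"]] +
  coefs[["White"]] * wv[["White"]] +
  coefs[["rate2016"]] * wv[["rate2016"]]

# Residual: observed deathPerc minus the fitted value
resid_wv <- 0.05314 - fitted_wv

# 95% confidence interval for rate2016: estimate +/- t quantile * std. error,
# using the 47 residual degrees of freedom reported in the summary
se_rate <- 0.001317
t_crit  <- qt(0.975, df = 47)
ci_rate <- c(lower = coefs[["rate2016"]] - t_crit * se_rate,
             upper = coefs[["rate2016"]] + t_crit * se_rate)

print(fitted_wv)
print(resid_wv)
print(ci_rate)
```

Under these assumptions the fitted value for West Virginia comes out near 0.031, leaving a residual of about 0.022. That is close to the largest residual in the summary, which suggests West Virginia is the state the model underestimates most.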

Regression Tree

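# Fit a tree for deathPerc using the same two predictors as the linear model
tr <- rpart::rpart(deathPerc ~ White + rate2016, data = cdcDF2016)
# Convert to a party object for a more readable plot
plot(partykit::as.party(tr))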
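
Because deathPerc is a numeric response, rpart's default behaviour is to fit a regression tree (method "anova") rather than a classifier. The sketch below illustrates this on a synthetic stand-in for cdcDF2016; the data frame toyDF and all of its values are made up for illustration and are not the real data.

```r
library(rpart)

set.seed(1)

# Synthetic stand-in for cdcDF2016 (hypothetical values, 50 rows like the 50 states)
toyDF <- data.frame(
  White    = runif(50, min = 0.3, max = 0.95),
  rate2016 = runif(50, min = 2.5, max = 7)
)
toyDF$deathPerc <- 0.01 + 0.004 * toyDF$rate2016 + rnorm(50, sd = 0.005)

# Same formula as in the analysis; no method argument is supplied,
# so rpart chooses one based on the type of the response
toyTree <- rpart(deathPerc ~ White + rate2016, data = toyDF)

# The chosen method is stored on the fitted object
print(toyTree$method)

# Text view of the splits and the mean deathPerc in each node
print(toyTree)
```

With a numeric response, the method reported should be "anova", and each leaf predicts a mean deathPerc rather than a class label.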

Conclusion

Final Thoughts

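The linear model suggests a slight to moderate positive association between drug-related deaths (deathPerc) and both the rate of unemployment (rate2016) and the proportion of white people within the state population (White). Both coefficients are significant at the 5% level, but together the two predictors explain only about 16% of the variation in deathPerc (multiple R-squared = 0.1609).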

References

  1. “Multiple Cause of Death Data on CDC WONDER.” Centers for Disease Control and Prevention, Centers for Disease Control and Prevention, wonder.cdc.gov/mcd.html.

  2. “Population Distribution by Race/Ethnicity.” The Henry J. Kaiser Family Foundation, The Henry J. Kaiser Family Foundation, 29 Nov. 2018, www.kff.org/other/state-indicator/distribution-by-raceethnicity/?currentTimeframe=1&sortModel=%7B%22colId%22%3A%22Location%22%2C%22sort%22%3A%22desc%22%7D.

  3. “Unemployment Rates for States, Annual Averages.” U.S. Bureau of Labor Statistics, U.S. Bureau of Labor Statistics, 27 Feb. 2018, www.bls.gov/lau/lastrk16.htm.