COVID-19 surged in many US states over the summer of 2020, and various explanations have been offered for why this might have occurred. One theory is that super-spreader events might be inducing a large uptick in cases. Bars, where people congregate in tightly packed spaces with little ventilation and have to talk loudly, might be prime candidates for super-spreading of the novel coronavirus.

In addition, incarcerated individuals are at high risk for COVID-19 (see The Marshall Project). Overcrowding and a lack of PPE might make prisons and jails especially dangerous.

In this problem set, we will examine whether bars and/or incarceration are associated with COVID-19.
All the datasets we'll be using are at the county level (unit of observation = county).

The datasets:
  1. COVID-19 data, accessed via Social Explorer. Source: NY Times
  2. Quarterly Census of Employment and Wages (QCEW). Source: U.S. Bureau of Labor Statistics (BLS)
  3. Incarceration Trends Dataset. Source: Vera Institute of Justice

Questions
1. We first need to merge data from three different sources. Beginning with the COVID-19 data, merge it with the QCEW data and then with the incarceration data.
2. Take a look at the first 10 observations of your newly merged dataset, which contains data on COVID-19, bars, and incarceration. Why do you think we're missing data for some of these counties?
3. Graph and interpret the distribution of bars and drinking establishments. Should we be looking at the distribution of bars and drinking establishments as a rate or as a count? Why?
4. Graph and interpret the distribution of incarceration. Should we be looking at the distribution of incarcerations as a rate or as a count? Why?
5. Is there any evidence in the data that bars are contributing to higher COVID-19 cases?
6. Is there any evidence that higher levels of incarceration are contributing to higher COVID-19 cases?
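
The mechanics behind Questions 1, 3, and 4 can be sketched on toy data. The example below is only an illustration under stated assumptions, not the intended solution: it assumes each dataset can be keyed by a county FIPS code, and every field name (`cases`, `population`, `bars`, `jail_pop`) and every number is hypothetical rather than taken from the real files. It uses plain Python dictionaries; in practice a data-frame library or a statistics package would do the same join.

```python
# Toy county-level records keyed by a 5-digit FIPS code.
# All field names and values are made up for illustration.
covid = {
    "01001": {"cases": 1200, "population": 55000},
    "01003": {"cases": 4100, "population": 220000},
    "01005": {"cases": 600, "population": 25000},
}
qcew = {  # hypothetical count of bars and drinking establishments
    "01001": {"bars": 4},
    "01003": {"bars": 31},
}
vera = {  # hypothetical jail population
    "01001": {"jail_pop": 150},
    "01005": {"jail_pop": 90},
}


def left_join(left, right, fields):
    """Keep every county in `left`; set `fields` to None when `right` has no match."""
    merged = {}
    for fips, row in left.items():
        match = right.get(fips, {})
        merged[fips] = {**row, **{f: match.get(f) for f in fields}}
    return merged


def per_100k(count, population):
    """Convert a count to a rate per 100,000 residents; pass missing values through."""
    if count is None or not population:
        return None
    return count / population * 100_000


# Start from the COVID-19 data, then attach QCEW, then incarceration.
merged = left_join(left_join(covid, qcew, ["bars"]), vera, ["jail_pop"])

# Add per-capita versions of each count so counties of different sizes are comparable.
for row in merged.values():
    row["case_rate"] = per_100k(row["cases"], row["population"])
    row["bar_rate"] = per_100k(row["bars"], row["population"])
    row["jail_rate"] = per_100k(row["jail_pop"], row["population"])

for fips, row in merged.items():
    print(fips, row)
```

Because the join starts from the COVID-19 records, every county in that file is kept, and a county absent from a later source simply carries missing values for that source's fields. Inspecting which counties have missing values, and comparing counts against the per-100,000 rates, is a reasonable starting point for Questions 2 through 4.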