Source: NY Times
What explains the differences in covid-19 cases throughout the United States? Mask usage most likely plays an important role in mitigating the spread of covid-19. We’ll explore whether mask usage corresponds to differences in covid-19 cases.
All the datasets we’ll be using are at the county level (unit of observation = county).
Hint 1: Make sure that the variable you use to link all the datasets (the matching variable) has the same name in every dataset (upper and lower case matter).
Hint 2: This variable also needs to be the same class across all the datasets. I’m including code that will help.
# the fips variable must also be the same class across all datasets
# mask <- mask %>%
#   mutate(fips = as.numeric(fips))
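To see both hints working together, here is a minimal sketch on toy data. The data frames, the `COUNTYFP` and `cases_avg` column names, and all values are made up for illustration; your real files may name and store the key differently, so check them first with `names()` and `class()`.

```r
library(dplyr)

# Toy stand-ins for the real files (hypothetical names and values).
mask <- data.frame(
  COUNTYFP = c("01001", "01003", "06037"),  # key stored as character, different name
  ALWAYS   = c(0.44, 0.32, 0.79),
  stringsAsFactors = FALSE
)
cases <- data.frame(
  fips      = c(1001, 1003, 6037),          # key stored as numeric
  cases_avg = c(12.1, 40.3, 1500.0)
)

# Hint 1: same name. Hint 2: same class.
mask <- mask %>%
  rename(fips = COUNTYFP) %>%
  mutate(fips = as.numeric(fips))

# Now the join can match rows on the shared key.
merged <- inner_join(cases, mask, by = "fips")
```

Converting to numeric drops leading zeros ("01001" becomes 1001), which is fine as long as every dataset is converted the same way.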
Hint: You’ll have to understand the mask data. See the NY Times description of the mask data.
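One possible way to turn the mask data into a single measure is sketched below. It assumes the file has one column per survey answer holding the share of respondents who gave that answer; the column names `NEVER` through `ALWAYS` and all values are assumptions for illustration, so confirm them against the description before relying on this.

```r
library(dplyr)

# Made-up shares for two hypothetical counties.
mask <- data.frame(
  fips       = c(1001, 1003),
  NEVER      = c(0.05, 0.08),
  RARELY     = c(0.07, 0.06),
  SOMETIMES  = c(0.13, 0.12),
  FREQUENTLY = c(0.30, 0.24),
  ALWAYS     = c(0.45, 0.50)
)

mask <- mask %>%
  mutate(
    # Sanity check: if the columns are shares, each row should sum to about 1.
    share_total = NEVER + RARELY + SOMETIMES + FREQUENTLY + ALWAYS,
    # One candidate summary: share answering "frequently" or "always".
    mask_high = FREQUENTLY + ALWAYS
  )
```

Which answers count as "high" mask usage is your decision; whatever you choose, state it in your write-up.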
Hint 1: Use the 7-day or 30-day moving average of new cases - think about why this may be a better measure for this question than total cumulative cases.
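If your case data only has cumulative counts, a moving average of new cases can be built by hand. This is a sketch on made-up numbers for one hypothetical county; the column names `date` and `cases` are assumptions. Note that `stats::filter()` is called with its prefix because dplyr has its own `filter()`.

```r
library(dplyr)

# Made-up cumulative counts for one hypothetical county over 10 days.
county <- data.frame(
  fips  = 1001,
  date  = as.Date("2020-07-01") + 0:9,
  cases = c(100, 104, 109, 109, 115, 123, 130, 134, 141, 150)
)

county <- county %>%
  group_by(fips) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(
    # Daily new cases = today's cumulative count minus yesterday's.
    new_cases = cases - lag(cases),
    # Trailing 7-day mean; NA until seven days of new cases are available.
    new_cases_avg7 = as.numeric(
      stats::filter(new_cases, rep(1 / 7, 7), sides = 1)
    )
  ) %>%
  ungroup()
```

For a 30-day average, replace both 7s in `rep(1 / 7, 7)` with 30. Each county needs at least as many rows as the window length.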
Hint 2: Think about comparing the distribution of covid cases between counties with high mask usage vs. counties with low mask usage.
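A minimal sketch of one way to set up that comparison follows. The merged data frame, the `mask_share` and `cases_avg` columns, the values, and the choice to split at the median are all assumptions for illustration, not the required approach.

```r
library(dplyr)

# Made-up county-level values: one row per county.
merged <- data.frame(
  fips       = c(1001, 1003, 6037, 6059, 48201, 48113),
  mask_share = c(0.30, 0.35, 0.80, 0.75, 0.60, 0.40),
  cases_avg  = c(32, 28, 12, 15, 20, 25)
)

# Label each county as high or low mask usage, splitting at the median share.
merged <- merged %>%
  mutate(mask_group = if_else(mask_share >= median(mask_share), "high", "low"))

# Summarise the distribution of cases within each group.
by_group <- merged %>%
  group_by(mask_group) %>%
  summarise(
    n            = n(),
    median_cases = median(cases_avg),
    min_cases    = min(cases_avg),
    max_cases    = max(cases_avg)
  )

# Side-by-side boxplots show the two distributions at a glance.
boxplot(cases_avg ~ mask_group, data = merged)
```

A different cutoff (for example, top and bottom quartiles) may give a sharper contrast; try more than one and see whether the pattern holds.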