The relevant article is found here.
In it, Reilly says he built a regression model with the following covariates: population size, population density, median income, median age, diversity (measured as the percentage of minorities in a population), and the state's Covid-19 response strategy (0 = lockdown, 1 = social distancing).
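As a rough sketch, and assuming an ordinary linear regression with additive terms (the description above does not spell out the functional form or name the response, so $y_i$ below stands for whichever state-level outcome is modeled for state $i$), the model can be written as:

$$
y_i = \beta_0 + \beta_1\,\text{Population}_i + \beta_2\,\text{Density}_i + \beta_3\,\text{Income}_i + \beta_4\,\text{Age}_i + \beta_5\,\text{Diversity}_i + \beta_6\,\text{Strategy}_i + \varepsilon_i
$$

Under the 0/1 coding of the strategy variable, $\beta_6$ is the expected difference in the outcome between a social-distancing state and a lockdown state that have the same values of the other covariates.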
# Import the Stata data file
library(haven)
df <- read_dta("./COVID.dta")
attach(df)  # make the columns available by bare name
head(df)    # preview the first six rows
## # A tibble: 6 x 10
##   StateName Population Density   Age Income   POC Strategy Cases Deaths Governor
##   <chr>          <dbl>   <dbl> <dbl>  <dbl> <dbl>    <dbl> <dbl>  <dbl>    <dbl>
## 1 Alabama         4.90    93.5  39.9   48.1  31.5        0  4241    123        0
## 2 Alaska          1.30    1.30    34     73  33.3        0   293      9        0
## 3 Arizona         7.30      57  37.4   56.6    27        0  3692    142        0
## 4 Arkansas           3    56.4  37.9   45.9    23        1  1599     34        0
## 5 Californ…       39.5    254.  36.3   71.2  27.9        0 26838    864        1
## 6 Colorado        5.80      52  36.6   69.1  18.7        0  8280    357        1
ANOVA strategy: each state forms a block, and within each block the independent variables (which can be viewed as treatments) produce the response variable, deaths.
lm1 <- lm(Deaths ~ Population + Density + Age + Income + POC + as.factor(Strategy) + Cases, data = df)
anova(lm1)
## Analysis of Variance Table
##
## Response: Deaths
##                     Df    Sum Sq   Mean Sq   F value    Pr(>F)
## Population           1  17581158  17581158  799.0221 < 2.2e-16 ***
## Density              1   9454264   9454264  429.6740 < 2.2e-16 ***
## Age                  1     27814     27814    1.2641  0.267269
## Income               1      4320      4320    0.1963  0.659985
## POC                  1    269539    269539   12.2499  0.001116 **
## as.factor(Strategy)  1     81564     81564    3.7069  0.060977 .
## Cases                1 110944053 110944053 5042.1453 < 2.2e-16 ***
## Residuals           42    924140     22003
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
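The table above comes from anova() on a single lm fit, which in R reports sequential (Type I) sums of squares: each row is adjusted only for the terms listed before it in the formula, so the Strategy row is adjusted for Population through POC but not for Cases. As a minimal sketch of that order dependence, the block below refits a deliberately tiny two-predictor model with the predictors in both orders. It uses only the six rows printed by head(df), which is an illustrative subset and not the full data, and the names d, fit_pop_first, and fit_cases_first are made up for this example.

```r
# Six rows copied from the head(df) output (illustrative subset only).
d <- data.frame(
  Population = c(4.9, 1.3, 7.3, 3, 39.5, 5.8),
  Cases      = c(4241, 293, 3692, 1599, 26838, 8280),
  Deaths     = c(123, 9, 142, 34, 864, 357)
)

# Same two predictors, entered in opposite orders.
fit_pop_first   <- lm(Deaths ~ Population + Cases, data = d)
fit_cases_first <- lm(Deaths ~ Cases + Population, data = d)

# anova() on one lm fit gives sequential (Type I) sums of squares:
# each row is adjusted only for the terms listed above it.
tab_pop_first   <- anova(fit_pop_first)
tab_cases_first <- anova(fit_cases_first)

# Sum of squares credited to Population when it enters first vs. last.
ss_pop_first <- tab_pop_first["Population", "Sum Sq"]
ss_pop_last  <- tab_cases_first["Population", "Sum Sq"]

print(tab_pop_first)
print(tab_cases_first)

# drop1() tests each term given all the others, so its result
# does not depend on the order of terms in the formula.
print(drop1(fit_pop_first, test = "F"))
```

The residual row is the same in both tables; only the split of the explained sum of squares between the two predictors changes. For the full model, drop1(lm1, test = "F") would give the corresponding order-independent tests.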