Coronavirus Data
Coronavirus data were pulled on March 18 from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) coronavirus repository, via the R coronavirus package by GitHub user RamiKrispin.
United States coronavirus data include the daily number of cases from February 24, 2020, through March 25, 2020. The rows below cover the last ten days of that range.
| Country | Date | New cases | New deaths | DayNumber | Cumulative cases | Cumulative deaths | Log_Confirmed_Total | DeathRate |
|---|---|---|---|---|---|---|---|---|
| US | 2020-03-16 | 1133 | 22 | 22 | 4617 | 85 | 8.437500 | 0.0184102 |
| US | 2020-03-17 | 1789 | 23 | 23 | 6406 | 108 | 8.764990 | 0.0168592 |
| US | 2020-03-18 | 1362 | 10 | 24 | 7768 | 118 | 8.957768 | 0.0151905 |
| US | 2020-03-19 | 5894 | 82 | 25 | 13662 | 200 | 9.522374 | 0.0146391 |
| US | 2020-03-20 | 5423 | 44 | 26 | 19085 | 244 | 9.856658 | 0.0127849 |
| US | 2020-03-21 | 6389 | 63 | 27 | 25474 | 307 | 10.145414 | 0.0120515 |
| US | 2020-03-22 | 7787 | 110 | 28 | 33261 | 417 | 10.412141 | 0.0125372 |
| US | 2020-03-23 | 10571 | 140 | 29 | 43832 | 557 | 10.688119 | 0.0127076 |
| US | 2020-03-24 | 9893 | 149 | 30 | 53725 | 706 | 10.891634 | 0.0131410 |
| US | 2020-03-25 | 12038 | 236 | 31 | 65763 | 942 | 11.093813 | 0.0143242 |
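The derived columns in the table can be reproduced from the daily counts alone. The sketch below is a minimal illustration, not the original data-preparation code: the column names new_cases, new_deaths, Confirmed_Total and Death_Total are assumptions (only date, DayNumber, Log_Confirmed_Total and DeathRate appear in the code in this report), and the starting totals of 3484 cases and 63 deaths are back-calculated from the March 16 row.

```r
# Daily counts for March 16-25, 2020, copied from the table above.
us <- data.frame(
  date       = seq(as.Date("2020-03-16"), as.Date("2020-03-25"), by = "day"),
  new_cases  = c(1133, 1789, 1362, 5894, 5423, 6389, 7787, 10571, 9893, 12038),
  new_deaths = c(22, 23, 10, 82, 44, 63, 110, 140, 149, 236)
)

# Totals before March 16, back-calculated from the first row (4617 - 1133, 85 - 22).
prior_cases  <- 3484
prior_deaths <- 63

# Day 1 is February 24, 2020, so March 16 is day 22.
us$DayNumber           <- as.integer(us$date - as.Date("2020-02-24")) + 1L
us$Confirmed_Total     <- prior_cases + cumsum(us$new_cases)
us$Death_Total         <- prior_deaths + cumsum(us$new_deaths)
us$Log_Confirmed_Total <- log(us$Confirmed_Total)   # natural log
us$DeathRate           <- us$Death_Total / us$Confirmed_Total

print(us)
```

The death rate here is cumulative deaths divided by cumulative confirmed cases, so it depends on how many cases have been detected as well as on how many people have died.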
Number of Coronavirus Cases in the US

Cumulative Death Rate by Date
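library(dplyr)    # provides the %>% pipe
library(ggplot2)

# Area chart of the cumulative death rate (deaths / confirmed cases) by date
UScoronavirus %>%
  ggplot() +
  geom_area(aes(x = date, y = DeathRate), fill = "salmon") +
  theme_minimal() +
  labs(title = "United States: Cumulative Death Rate by Date")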

Number of Coronavirus Cases in the US, by Date
The line in the graph shows exponential growth. Under linear growth the count rises by a roughly constant amount each day; under exponential growth it rises by a roughly constant factor, so the daily increase gets larger as the total gets larger.
We can log-transform the y-values so that equal steps on the y-axis represent equal multiplicative changes rather than equal additive ones. With the natural log used here, each one-unit step corresponds to multiplying the case count by e (about 2.72). Re-plotting the line on this scale shows the exponential increase over time as an approximately straight line.
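To see why the log transform straightens the curve, the following sketch uses the cumulative case counts from the table. On the raw scale the day-to-day increases grow from under 2,000 to over 12,000, while on the natural-log scale they stay between roughly 0.19 and 0.57. Exponentiating a log-scale step gives that day's growth factor. The variable names are illustrative.

```r
# Cumulative confirmed US cases, March 16-25, 2020 (from the table above).
confirmed <- c(4617, 6406, 7768, 13662, 19085, 25474, 33261, 43832, 53725, 65763)

log_confirmed <- log(confirmed)        # natural log, as in Log_Confirmed_Total

raw_steps     <- diff(confirmed)       # absolute daily increase
log_steps     <- diff(log_confirmed)   # daily increase on the log scale
growth_factor <- exp(log_steps)        # same as today's total / yesterday's total

print(data.frame(raw_steps,
                 log_steps     = round(log_steps, 3),
                 growth_factor = round(growth_factor, 3)))
```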

Predicting Growth: Linear Regression with Exponential Data
After applying the log transformation to the y-values, the relationship between confirmed cases and date looks approximately linear. If we fit a least-squares regression line through these points, we can estimate the rate of change and the number of cases to expect on future dates, assuming the trend continues.
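- For every day that passes, the log of the cumulative case count increases by about 0.2747 (the DayNumber coefficient in the summary below), which corresponds to multiplying cases by exp(0.2747), or about 1.32, per day.
- To estimate the number of cases on a specific date, plug its day number into the fitted line and exponentiate: cases = exp(2.4985 + 0.2747 × DayNumber).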
# Fit a least-squares line to log(cumulative confirmed cases) as a function of day number
exp_model <- lm(UScoronavirus$Log_Confirmed_Total ~ UScoronavirus$DayNumber)
summary(exp_model)
##
## Call:
## lm(formula = UScoronavirus$Log_Confirmed_Total ~ UScoronavirus$DayNumber)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.34803 -0.15553 -0.05247  0.15960  0.81030
##
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)             2.498482   0.100151   24.95   <2e-16 ***
## UScoronavirus$DayNumber 0.274738   0.005464   50.28   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2721 on 29 degrees of freedom
## Multiple R-squared: 0.9887, Adjusted R-squared: 0.9883
## F-statistic: 2529 on 1 and 29 DF, p-value: < 2.2e-16
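As a worked example of reading the model, the sketch below plugs the coefficients from the summary above into the fitted equation. The helper predict_cases and the variable names are illustrative and not part of the original analysis. Extrapolating beyond the observed days assumes the same growth rate continues, which real epidemics rarely sustain.

```r
# Coefficients copied from the summary(exp_model) output above.
intercept <- 2.498482
slope     <- 0.274738

# Back-transform the log-scale prediction to a case count.
# DayNumber 1 corresponds to February 24, 2020.
predict_cases <- function(day_number) {
  exp(intercept + slope * day_number)
}

daily_growth  <- exp(slope) - 1    # proportional increase per day (about 0.32)
doubling_time <- log(2) / slope    # days for cases to double (about 2.5)

round(predict_cases(31))           # fitted value for March 25 (observed: 65763)
round(predict_cases(38))           # extrapolation one week later, April 1
```

Because the model above was fitted with `$` inside the formula, predict() cannot accept new day numbers through its newdata argument. Computing from the coefficients sidesteps that; refitting as lm(Log_Confirmed_Total ~ DayNumber, data = UScoronavirus) would be another option.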
Graph of Fitted vs. Observed Values
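One way to draw such a comparison is sketched below with base R graphics, using only the ten rows shown in the table and the coefficients reported in the summary. The data frame fit_check and the helper plot_fitted_vs_observed are hypothetical names introduced for illustration; the original plotting code is not shown in this report.

```r
# Observed values for March 16-25, 2020, copied from the table above.
fit_check <- data.frame(
  DayNumber = 22:31,
  Log_Confirmed_Total = c(8.437500, 8.764990, 8.957768, 9.522374, 9.856658,
                          10.145414, 10.412141, 10.688119, 10.891634, 11.093813)
)

# Fitted values from the coefficients reported by summary(exp_model).
fit_check$fitted   <- 2.498482 + 0.274738 * fit_check$DayNumber
fit_check$residual <- fit_check$Log_Confirmed_Total - fit_check$fitted

# Hypothetical helper: observed points with the fitted line overlaid.
plot_fitted_vs_observed <- function(d) {
  plot(d$DayNumber, d$Log_Confirmed_Total,
       xlab = "Day number", ylab = "log(confirmed cases)",
       main = "Fitted vs. Observed Values")
  lines(d$DayNumber, d$fitted, col = "salmon", lwd = 2)
}

print(round(fit_check, 3))
```

Calling plot_fitted_vs_observed(fit_check) in an R session draws the observed points with the regression line on top; the residual column shows how far each day sits from the line on the log scale.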

Exponential Growth Rate by State
This section compares exponential growth rates across US states as of March 24th.
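The per-state data and the method used for this comparison are not shown here. As a rough sketch of one possible approach, the same log-linear regression can be fitted separately for each state and the slopes compared. The data below are synthetic, with made-up state labels and known growth rates, purely to show the mechanics.

```r
# Synthetic example data: two made-up regions with known daily growth rates.
days <- 1:14
states <- rbind(
  data.frame(state = "State A", DayNumber = days, confirmed = 10 * exp(0.30 * days)),
  data.frame(state = "State B", DayNumber = days, confirmed = 25 * exp(0.20 * days))
)

# Fit log(confirmed) ~ DayNumber separately for each state and keep the slope.
growth_rates <- sapply(split(states, states$state), function(d) {
  unname(coef(lm(log(confirmed) ~ DayNumber, data = d))["DayNumber"])
})

print(round(growth_rates, 3))
```

A larger slope means faster growth on the log scale; exp(slope) - 1 converts it to a proportional daily increase.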
Percent of Cases Resulting in Death by Region

Map of Confirmed Cases
