Data 605 - Discussion Week 12

Using R, build a regression model for data that interests you. Conduct residual analysis. Was the linear model appropriate? Why or why not?

processors <- read.csv("full_data.csv")
head(processors)

##         date    location new_cases new_deaths total_cases total_deaths
## 1 2019-12-31 Afghanistan         0          0           0            0
## 2 2020-01-01 Afghanistan         0          0           0            0
## 3 2020-01-02 Afghanistan         0          0           0            0
## 4 2020-01-03 Afghanistan         0          0           0            0
## 5 2020-01-04 Afghanistan         0          0           0            0
## 6 2020-01-05 Afghanistan         0          0           0            0

plot(processors$new_cases, processors$new_deaths)

lm1 <- lm(processors$new_deaths ~ processors$new_cases)

plot(processors$new_cases, processors$new_deaths)
abline(lm1)

summary(lm1)

## 
## Call:
## lm(formula = processors$new_deaths ~ processors$new_cases)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -402.45    0.05    0.09    0.09  510.77 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          -0.0941680  0.2169339  -0.434    0.664    
## processors$new_cases  0.0433618  0.0002312 187.585   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.09 on 6269 degrees of freedom
## Multiple R-squared:  0.8488, Adjusted R-squared:  0.8488 
## F-statistic: 3.519e+04 on 1 and 6269 DF,  p-value: < 2.2e-16
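
To make the summary easier to read, here is a small sketch that reuses the numbers printed above. The slope of 0.0433618 means roughly 4.3 additional new deaths per 100 additional new cases, and the intercept is not distinguishable from zero (p = 0.664). The figure of 1000 new cases is a hypothetical input chosen only for illustration, and the variable names are not part of the original analysis.

```r
# Values copied by hand from the summary(lm1) output above
intercept <- -0.0941680
slope     <- 0.0433618
t_slope   <- 187.585
f_stat    <- 3.519e+04
df_resid  <- 6269

# Predicted new deaths for a hypothetical day with 1000 new cases
pred_1000 <- intercept + slope * 1000

# With a single predictor, F equals t^2 and R^2 = F / (F + residual df),
# so both should agree with the printed output up to rounding
f_from_t  <- t_slope^2
r2_from_f <- f_stat / (f_stat + df_resid)

print(c(pred_1000 = pred_1000, f_from_t = f_from_t, r2_from_f = r2_from_f))
```

A high R-squared here only says that the line captures most of the variance in new_deaths; it does not by itself show that the linear model's assumptions hold, which is what the residual plots are for.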

plot(fitted(lm1),resid(lm1))

qqnorm(resid(lm1))
qqline(resid(lm1))
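
The two plots above can be backed up with numeric checks. The residual quantiles in the summary already hint at trouble: the middle half of the residuals lies between 0.05 and 0.09, while the extremes reach -402.45 and 510.77, which suggests heavy tails and non-constant variance. The sketch below is an assumption-laden illustration, not the analysis from this post: it runs on simulated data (`sim`, with a made-up slope and noise that grows with new_cases) so that it is self-contained, and it computes a Breusch-Pagan-style statistic by hand in base R together with a Shapiro-Wilk test.

```r
set.seed(605)

# Simulated stand-in for the real data (hypothetical values, not full_data.csv):
# noise grows with new_cases, mimicking a funnel-shaped residual plot
n <- 500
sim <- data.frame(new_cases = rexp(n, rate = 1 / 200))
sim$new_deaths <- 0.04 * sim$new_cases +
  rnorm(n, sd = 0.02 * sim$new_cases + 0.1)

fit <- lm(new_deaths ~ new_cases, data = sim)

# Constant-variance check: regress squared residuals on the fitted values.
# n * R^2 of this auxiliary regression is the studentized (Koenker)
# Breusch-Pagan statistic, chi-squared with 1 df for a single predictor.
aux     <- lm(resid(fit)^2 ~ fitted(fit))
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

# Normality check on the residuals
sw <- shapiro.test(resid(fit))

print(c(bp_stat = bp_stat, bp_p = bp_p, shapiro_p = sw$p.value))
```

To apply the same idea to lm1, replace `fit` with `lm1` and `n` with `length(resid(lm1))`. Note that `shapiro.test` accepts at most 5000 observations, so with 6271 residuals it would have to be run on a random subsample. Small p-values from either check would indicate that the linear model's constant-variance or normality assumption is not met.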

Monu Chacko

4/19/2020
