Dan Wigodsky

Assignment 11

April 22, 2018

Simple Linear Regression
……………………………………..

Does a car's stopping distance depend on its speed, and can speed be used to predict stopping distance? We fit a simple linear regression to find out whether there is a relationship.

# Fit stopping distance (dist) as a linear function of speed
model.for.stopping.distance <- lm(cars$dist ~ cars$speed)
summary(model.for.stopping.distance)
## 
## Call:
## lm(formula = cars$dist ~ cars$speed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## cars$speed    3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

Our model is statistically significant at the .001 significance level. The standard error of the slope (0.4155) is roughly one-ninth of the slope estimate (3.9324), which is where the t value of 9.464 comes from. To validate the model, we need to check that the relationship is adequately linear and that the residual variance is stable. There appears to be a quadratic relationship, but does that make the linear model statistically inappropriate? To find out, we examine the residuals.

Our residuals are reasonably normal. The quartiles in the summary are roughly symmetric around a median near zero (1Q = -9.525, median = -2.272, 3Q = 9.215), and the plots below show the same. At 1.49 × 10^-12, the p-value is far below the .001 significance level. The R^2 is .6511 and the adjusted R^2 (.6438) is close to it, so the model explains about 65% of the variance in stopping distance.
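
The code behind these residual checks is not shown in the report. A minimal sketch of how they might be reproduced in base R follows. It assumes the model is refit exactly as above; the choice of a residuals-versus-fitted plot and a normal Q-Q plot is an assumption, not necessarily what the author plotted.

```r
# Refit the model from the report (cars ships with R's datasets package)
model.for.stopping.distance <- lm(cars$dist ~ cars$speed)
res <- resid(model.for.stopping.distance)

# With an intercept in the model, OLS residuals average to zero
# up to floating-point error
print("The mean of the residuals is ")
print(mean(res))

# Residuals vs fitted values: look for curvature or a funnel shape
plot(fitted(model.for.stopping.distance), res,
     xlab = "Fitted stopping distance", ylab = "Residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points close to the line suggest roughly normal residuals
qqnorm(res)
qqline(res)
```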

## [1] "The mean of the residuals is "
## [1] 8.65974e-17
## 
##  studentized Breusch-Pagan test
## 
## data:  model.for.stopping.distance
## BP = 3.2149, df = 1, p-value = 0.07297
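
The output above resembles what `bptest()` from the lmtest package prints, though the report does not show the call. As a hedged illustration of what the studentized (Koenker) version of the test computes, the sketch below rebuilds the statistic in base R: regress the squared residuals on the predictor and multiply the auxiliary R-squared by the sample size. The names `sq.res`, `aux`, `bp`, and `p.value` are introduced here for illustration only.

```r
# Refit the model so this block runs on its own
model.for.stopping.distance <- lm(cars$dist ~ cars$speed)
n <- nrow(cars)

# Auxiliary regression: squared residuals on the original predictor
sq.res <- resid(model.for.stopping.distance)^2
aux <- lm(sq.res ~ cars$speed)

# Studentized statistic: sample size times the auxiliary R-squared
bp <- n * summary(aux)$r.squared

# One predictor in the auxiliary regression, so 1 degree of freedom
p.value <- pchisq(bp, df = 1, lower.tail = FALSE)

print(c(BP = bp, p.value = p.value))
```

A p-value above .05 means the test does not reject constant residual variance at that level.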

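Our residuals are nearly normal, their mean is effectively zero (8.66e-17), and the studentized Breusch-Pagan test does not reject constant variance at the .05 level (p = 0.07297), so we accept our model: stopping distance = -17.5791 + 3.9324 * speed. According to the cars data set, there is a relationship between speed and stopping distance, with stopping distance as the dependent variable: each additional mile per hour of speed is associated with about 3.93 more feet of stopping distance.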
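
To show how the accepted model could be used for prediction, here is a minimal sketch. It refits the same regression with the `dist ~ speed, data = cars` form, because `predict()` can only match `newdata` columns to plain variable names, not to `cars$speed`. The speeds of 10 and 20 are hypothetical inputs chosen for illustration.

```r
# Same regression as above, written so that predict() can use newdata
model.for.stopping.distance <- lm(dist ~ speed, data = cars)

# Hypothetical speeds at which to predict stopping distance
new.speeds <- data.frame(speed = c(10, 20))

# Point predictions with 95% prediction intervals
pred <- predict(model.for.stopping.distance,
                newdata = new.speeds,
                interval = "prediction")
print(pred)

# The fitted value at speed 20 matches the equation by hand:
# -17.5791 + 3.9324 * 20
```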