First we load the ‘cars’ dataset.
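
As a minimal sketch of the loading step: `cars` ships with base R's `datasets` package, so nothing needs to be downloaded. The calls below only load it and inspect its structure.

```r
# `cars` is part of base R's datasets package.
data(cars)

str(cars)    # 50 observations of 2 numeric variables: speed (mph) and dist (ft)
head(cars)   # first few rows
```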

Then, we take a look at our data via histograms of stopping distance and speed.

hist(cars$dist)

hist(cars$speed)

Below, we build our model and look at its summary.

linear.model <- lm(cars$dist~cars$speed)
summary(linear.model)
## 
## Call:
## lm(formula = cars$dist ~ cars$speed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## cars$speed    3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12
plot(x = cars$speed, y = cars$dist)

intercept <- coef(linear.model)[1]
slope <- coef(linear.model)[2]
slope
## cars$speed 
##   3.932409
intercept
## (Intercept) 
##   -17.57909

The coefficients define a simple linear model fitted to our data, with stopping distance in feet and speed in mph. In particular: \[ \widehat{\text{stopping distance}} = -17.6 + 3.93 \times \text{speed} \]
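
To make the fitted equation concrete, here is a hedged sketch that predicts the stopping distance for a hypothetical car travelling at 20 mph. It refits the same model using the `data =` form (an assumption made here so that `predict()` can accept new speeds); the names `fit`, `new_speed`, `by_hand` and `by_model` are illustrative and not part of the original analysis.

```r
# Same regression as above, written with `data =` so predict() works on new data.
fit <- lm(dist ~ speed, data = cars)

new_speed <- data.frame(speed = 20)   # hypothetical car at 20 mph

# Plug the speed into intercept + slope * speed by hand...
by_hand <- unname(coef(fit)[1] + coef(fit)[2] * 20)

# ...and let predict() do the same calculation.
by_model <- unname(predict(fit, newdata = new_speed))

by_hand    # about 61.1 ft
by_model   # same value
```

Writing the formula as `dist ~ speed` with `data = cars` gives the same coefficients as `cars$dist ~ cars$speed`, but the coefficient is then labelled `speed` rather than `cars$speed`.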

Next we conduct a residual analysis.

residual <- residuals(linear.model)
residual <- as.data.frame(residual)
hist(residual$residual)

plot(fitted(linear.model), resid(linear.model))
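
As a quick check on what is being plotted, the sketch below recomputes the residuals as observed minus fitted values and confirms that they average to zero, as expected for a least-squares fit with an intercept. The names `fit` and `manual` are illustrative.

```r
fit <- lm(dist ~ speed, data = cars)

# A residual is the observed stopping distance minus the fitted one.
manual <- cars$dist - fitted(fit)

all.equal(unname(manual), unname(residuals(fit)))   # TRUE
mean(residuals(fit))                                # effectively zero
```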

Next, we draw a normal Q-Q plot of the residuals.

qqnorm(resid(linear.model))
qqline(resid(linear.model))

Then we use a Shapiro-Wilk normality test to see whether the residuals come from a normally distributed population.

shapiro.test(linear.model$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  linear.model$residuals
## W = 0.94509, p-value = 0.02152

Because the p-value (0.0215, or about 2.2%) is below the conventional 0.05 significance level, we reject the null hypothesis that the residuals come from a normal distribution.
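
The decision can also be read off programmatically. This is a sketch that assumes a significance level of 0.05, which the analysis above does not state explicitly; `fit`, `sw` and `alpha` are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)
sw  <- shapiro.test(residuals(fit))

alpha <- 0.05          # assumed significance level

sw$p.value             # about 0.0215
sw$p.value < alpha     # TRUE: reject normality of the residuals at this level
```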