Using the cars dataset in R, build a linear model for stopping distance as a function of speed and replicate the analysis in chapter 3 of your textbook (visualization, quality evaluation of the model, and residual analysis).

library(tidyverse)
library(gridExtra)
library(dplyr)
library(ggpmisc)
library(tidymodels)
library(DataExplorer)
data <- cars
summary(data)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
glimpse(data)
## Rows: 50
## Columns: 2
## $ speed <dbl> 4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13,...
## $ dist  <dbl> 2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26,...
DataExplorer::plot_histogram(cars, theme_config = theme_bw())

The dataset contains 2 numeric variables, speed and dist (stopping distance), and 50 cases.
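Since the task asks for a visualization, a scatter plot of stopping distance against speed with the least-squares line overlaid is a natural first look at the relationship. The sketch below uses base R graphics rather than ggplot2 so that it runs without extra packages. The axis labels assume the units documented for `cars` (mph and ft), and the object names `fit` and `r` are illustrative only.

```r
# Scatter plot of stopping distance against speed (built-in cars data)
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     main = "cars: stopping distance vs. speed")

# Overlay the least-squares line from a simple linear fit
fit <- lm(dist ~ speed, data = cars)
abline(fit, col = "blue", lwd = 2)

# Pearson correlation as a one-number summary of the linear association
r <- cor(cars$speed, cars$dist)
r
```

In a simple linear regression the squared correlation equals the multiple R-squared, so `r^2` can be compared directly with the value that `summary()` reports for the fitted model.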


model <- lm(formula = dist ~ speed,
            data = data)
summary(model)
## 
## Call:
## lm(formula = dist ~ speed, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

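The fitted line is dist = -17.58 + 3.93 × speed, so each additional unit of speed adds about 3.93 units of stopping distance. The model has an adjusted R-squared of 0.6438102, meaning speed explains roughly 64% of the variance in stopping distance, and the overall F-test gives a p-value of 1.49e-12 (\(\approx\) 0), so the relationship is statistically significant.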
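The quality figures quoted above can also be pulled out of the summary object programmatically instead of being read off the printed output. This is a minimal sketch that refits the same model on `cars`; the names `s`, `adj_r2`, `p_value`, and `ci` are illustrative and not part of the original analysis.

```r
# Refit the model so this snippet is self-contained
model <- lm(dist ~ speed, data = cars)
s <- summary(model)

# Adjusted R-squared as stored in the summary object
adj_r2 <- s$adj.r.squared

# The overall p-value is not stored directly; recompute it from the
# F statistic (named vector with elements value, numdf, dendf)
f <- s$fstatistic
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# 95% confidence intervals for the intercept and the slope
ci <- confint(model, level = 0.95)

adj_r2
p_value
ci
```

With a single predictor, the F-test p-value is the same as the p-value of the t-test on the `speed` coefficient, which is why both appear as 1.49e-12 in the printed summary.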


plot(model)

In the residuals vs. fitted plot, the points show roughly random scatter around the zero residual line, which suggests that a linear form is reasonable. In the normal Q-Q plot, the points lie close to the reference line, so the residuals appear to be nearly normal. Based on the model's R-squared, its p-value, and the approximate normality of the residuals, we conclude that this model is adequate for predicting stopping distance from speed.
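The two diagnostic plots discussed above can be reproduced by hand from the residuals and fitted values, which makes it clearer what `plot(model)` is drawing. The sketch below also shows how the fitted model could be used for prediction; the speeds 10 and 20 are hypothetical inputs chosen for illustration, and the object names are not from the original analysis.

```r
# Refit the model so this snippet is self-contained
model <- lm(dist ~ speed, data = cars)
res <- residuals(model)
fit_vals <- fitted(model)

# Draw the two diagnostic plots side by side
op <- par(mfrow = c(1, 2))
plot(fit_vals, res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)   # zero residual reference line
qqnorm(res)              # normal Q-Q plot of the residuals
qqline(res)              # reference line through the quartiles
par(op)                  # restore the previous plot layout

# Predict stopping distance at two hypothetical speeds,
# with 95% prediction intervals for a single new observation
new_speeds <- data.frame(speed = c(10, 20))
pred <- predict(model, newdata = new_speeds, interval = "prediction")
pred
```

The prediction intervals are wide relative to the fitted values because the residual standard error is 15.38, so individual predictions carry considerable uncertainty even though the slope is estimated precisely.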