Speed and Stopping Distances of Cars
Description
The data give the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s.
Usage
cars
Format
A data frame with 50 observations on 2 variables.
[,1] speed numeric Speed (mph)
[,2] dist numeric Stopping distance (ft)
Importing the dataset
dataset_cars <- cars
summary(dataset_cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
library(ggplot2)
# Scatter plot of stopping distance against speed
ggplot(dataset_cars, aes(x = speed, y = dist)) +
  geom_point(color = 'red') +
  ggtitle('speed and stopping distance of cars') +
  xlab('speed (mph)') +
  ylab('stopping distance (ft)')
From the initial view, there appears to be a positive linear relationship between the speed of the car and the stopping distance. Let us check further how closely these two are related and look at the statistics of the relationship.
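One quick way to quantify how closely the two variables are related is the Pearson correlation coefficient. The sketch below is only an illustration: it reads the built-in cars data directly, and the variable name r is chosen here, not taken from the original analysis.

```r
# Pearson correlation between speed and stopping distance
# (uses the built-in `cars` data frame from the datasets package)
r <- cor(cars$speed, cars$dist)
r    # a value close to +1 indicates a strong positive linear relationship

# For a simple linear regression with one predictor,
# the squared correlation equals the model's R-squared
r^2
```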
Next, we define a linear model of stopping distance on speed using the lm function, and then evaluate the model.
regressor <- lm(formula = dist ~ speed,
data = dataset_cars)
regressor
##
## Call:
## lm(formula = dist ~ speed, data = dataset_cars)
##
## Coefficients:
## (Intercept) speed
## -17.579 3.932
From the above output, we see two important values:
Intercept - the predicted stopping distance when the speed is 0. Its value is -17.579. This is not a physically meaningful scenario (the predicted distance is negative), however the intercept is needed to determine the line.
Slope - the change in the dependent variable for every unit change in the independent variable. Its value is 3.932, which means that for every 1 mph increase or decrease in speed, the predicted stopping distance increases or decreases by 3.932 ft.
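The meaning of the slope can be checked with predict. This is a minimal sketch, assuming the same model is refit on the built-in cars data; the name fit and the speeds 10 and 11 mph are illustrative choices, not part of the original analysis.

```r
# Refit the same model on the built-in `cars` data so the example is self-contained
fit <- lm(dist ~ speed, data = cars)

# Predicted stopping distances (ft) at 10 and 11 mph (illustrative speeds)
pred <- predict(fit, newdata = data.frame(speed = c(10, 11)))
pred

# The difference between predictions 1 mph apart equals the slope coefficient
slope_check <- unname(pred[2] - pred[1])
all.equal(slope_check, unname(coef(fit)["speed"]))
```

Because the model is a straight line, the same difference is obtained for any two speeds that are 1 mph apart.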
summary(regressor)
##
## Call:
## lm(formula = dist ~ speed, data = dataset_cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.069 -9.525 -2.272 9.215 43.201
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791 6.7584 -2.601 0.0123 *
## speed 3.9324 0.4155 9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
As the Multiple R-squared value above shows, 65.11% of the variation in stopping distance is explained by the speed of the car.
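To see where the R-squared figure comes from, it can be recomputed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The following is a sketch under the assumption that the model is refit on the built-in cars data; the names fit, ss_res, ss_tot and r_squared are introduced here for illustration.

```r
# Refit the model on the built-in `cars` data
fit <- lm(dist ~ speed, data = cars)

# Residual sum of squares: variation left unexplained by the model
ss_res <- sum(resid(fit)^2)

# Total sum of squares: variation of dist around its mean
ss_tot <- sum((cars$dist - mean(cars$dist))^2)

# R-squared is the share of total variation explained by the model
r_squared <- 1 - ss_res / ss_tot
r_squared

# Should agree with the value reported by summary()
all.equal(r_squared, summary(fit)$r.squared)
```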
Plotting the residuals:
plot(fitted(regressor),resid(regressor))
qqnorm(resid(regressor))
qqline(resid(regressor))
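The residual plots can be made a little easier to read, and complemented with a numeric check. The sketch below is one possible approach, not the original author's method: it adds a reference line at zero to the residuals-versus-fitted plot and runs a Shapiro-Wilk normality test on the residuals. The names fit and res are introduced here for illustration.

```r
# Refit the model on the built-in `cars` data
fit <- lm(dist ~ speed, data = cars)
res <- resid(fit)

# Residuals vs. fitted values, with a dashed reference line at zero;
# points scattered evenly around the line suggest a reasonable linear fit
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Shapiro-Wilk test of normality on the residuals;
# a small p-value would suggest a departure from normality
shapiro.test(res)
```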