attach(cars)
str(cars)
## 'data.frame': 50 obs. of 2 variables:
## $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
## $ dist : num 2 10 4 22 16 10 18 26 34 17 ...
dim(cars)
## [1] 50 2
head(cars)
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10
The built-in cars data set contains 50 observations of two variables: speed (in mph) and stopping distance (dist, in ft).
fit <- lm(dist ~ speed, data = cars)
plot(dist ~ speed, data = cars)
abline(fit)
fit
##
## Call:
## lm(formula = dist ~ speed, data = cars)
##
## Coefficients:
## (Intercept)        speed
##     -17.579        3.932
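Based on the model, the y-intercept is -17.579 and the slope is 3.932, so each additional 1 mph of speed is associated with an estimated 3.932 ft increase in stopping distance. The fitted line is:
stopping distance = -17.579 + 3.932 * speed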
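The fitted equation can be used to estimate stopping distance at a given speed. The sketch below is illustrative only: the speed of 15 mph and the names new_speed, pred, and by_hand are assumptions chosen for the example, not part of the original analysis. It compares predict() with the same calculation done by hand from the coefficients.

```r
# Refit the same model so this snippet runs on its own
fit <- lm(dist ~ speed, data = cars)

# Hypothetical speed (mph) chosen only for illustration
new_speed <- data.frame(speed = 15)

# Prediction from the fitted model
pred <- predict(fit, newdata = new_speed)

# The same prediction computed by hand: intercept + slope * speed
b <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["speed"]] * new_speed$speed

pred
by_hand
```

The two values should agree. Because the intercept is negative, the line implies a negative stopping distance at speed 0, so predictions are best kept within the observed speed range of 4 to 25 mph.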
summary(fit)
##
## Call:
## lm(formula = dist ~ speed, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -29.069  -9.525  -2.272   9.215  43.201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
With a p-value of 1.49e-12, far below the usual 0.05 threshold, we reject the null hypothesis that the slope is zero: speed is a statistically significant predictor of stopping distance. The reported multiple R-squared of 0.6511 means that the model explains about 65.11 percent of the variation in stopping distance.
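To make the R-squared figure concrete, it can be recomputed as one minus the ratio of the residual sum of squares to the total sum of squares. This is a minimal sketch; the names rss, tss, and r2_by_hand are introduced here for illustration, and the confidence interval at the end is an extra check that the original write-up does not report.

```r
# Refit the same model so this snippet runs on its own
fit <- lm(dist ~ speed, data = cars)

rss <- sum(resid(fit)^2)                      # residual sum of squares
tss <- sum((cars$dist - mean(cars$dist))^2)   # total sum of squares
r2_by_hand <- 1 - rss / tss

r2_by_hand               # computed by hand
summary(fit)$r.squared   # value reported by summary()

# 95% confidence intervals for the intercept and slope
confint(fit, level = 0.95)
```

A slope interval that excludes zero tells the same story as the small p-value.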
In the residuals-versus-fitted plot, the residuals appear scattered around the horizontal zero line without a distinct pattern.
plot(fitted(fit), resid(fit))
abline(h = 0, lty = 2)
The normal Q-Q plot is not too concerning: most points lie near the reference line, which suggests the residuals are approximately normal.
qqnorm(resid(fit))
qqline(resid(fit))
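The same two checks are also available through the built-in plot method for lm objects. The sketch below is one possible way to view them side by side; the cutoff of 2 for standardized residuals is a common rule of thumb assumed here, not something stated in the original analysis, and std_res and old_par are names introduced for the example.

```r
# Refit the same model so this snippet runs on its own
fit <- lm(dist ~ speed, data = cars)

# Show two diagnostic plots side by side, then restore the old layout
old_par <- par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs fitted, with a smoothed trend line
plot(fit, which = 2)  # normal Q-Q plot of standardized residuals
par(old_par)

# Standardized residuals; |value| > 2 is an assumed rule-of-thumb flag
std_res <- rstandard(fit)
which(abs(std_res) > 2)
```

Any observations listed by the last line are worth a closer look, but a flagged point is not by itself evidence that the model is wrong.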