Introduction

This analysis examines the relationship between horsepower and miles per gallon (mpg) in the Auto dataset from the ISLR package. It fits a simple linear regression of mpg on horsepower and uses a residual vs fitted plot to assess whether a straight line adequately describes the relationship.

Load Data

# Run this once in your console if ISLR is not yet installed:
# install.packages("ISLR")
library(ISLR)
data(Auto)
head(Auto)
##   mpg cylinders displacement horsepower weight acceleration year origin
## 1  18         8          307        130   3504         12.0   70      1
## 2  15         8          350        165   3693         11.5   70      1
## 3  18         8          318        150   3436         11.0   70      1
## 4  16         8          304        150   3433         12.0   70      1
## 5  17         8          302        140   3449         10.5   70      1
## 6  15         8          429        198   4341         10.0   70      1
##                        name
## 1 chevrolet chevelle malibu
## 2         buick skylark 320
## 3        plymouth satellite
## 4             amc rebel sst
## 5               ford torino
## 6          ford galaxie 500

Linear Regression Model

model <- lm(mpg ~ horsepower, data = Auto)
summary(model)
## 
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.5710  -3.2592  -0.3435   2.7630  16.9240 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 39.935861   0.717499   55.66   <2e-16 ***
## horsepower  -0.157845   0.006446  -24.49   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.906 on 390 degrees of freedom
## Multiple R-squared:  0.6059, Adjusted R-squared:  0.6049 
## F-statistic: 599.7 on 1 and 390 DF,  p-value: < 2.2e-16

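The fitted model estimates that each additional unit of horsepower is associated with an average decrease of about 0.158 mpg. The slope is highly significant (p < 2e-16), and horsepower alone explains about 60.6% of the variance in mpg (R-squared = 0.6059), with a residual standard error of 4.906 mpg on 390 degrees of freedom.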
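To put the coefficient estimates in context, the sketch below computes 95% confidence intervals for the coefficients and a prediction at a single horsepower value. The value of 100 horsepower and the object names `ci`, `new_car`, and `pred` are illustrative assumptions, not part of the original analysis.

```r
library(ISLR)
data(Auto)

# Refit the same model so this block runs on its own
model <- lm(mpg ~ horsepower, data = Auto)

# 95% confidence intervals for the intercept and the slope
ci <- confint(model, level = 0.95)
print(ci)

# Predicted mpg for a hypothetical car with 100 horsepower (assumed value),
# together with a 95% prediction interval
new_car <- data.frame(horsepower = 100)
pred <- predict(model, newdata = new_car,
                interval = "prediction", level = 0.95)
print(pred)
```

If the interval for the slope excludes zero, that agrees with the small p-value reported in the summary output.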

Residual vs Fitted Plot

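# Plot residuals against fitted values; fitted() and resid() are the
# standard accessor functions for an lm object
plot(fitted(model), resid(model),
     xlab = "Fitted Values",
     ylab = "Residuals",
     main = "Residuals vs Fitted Plot")
# Reference line at zero: residuals should scatter evenly around it
abline(h = 0, col = "red")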

Interpretation

The residual vs fitted plot helps determine whether the linear regression model adequately captures the relationship between horsepower and mpg. If the residuals are randomly scattered around the horizontal line at zero, the relationship can be considered approximately linear. However, if a systematic pattern such as a U-shape or inverted U-shape appears, this indicates that the linear model fails to capture a nonlinear relationship between the variables.
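A systematic pattern can be easier to judge with a smoothed trend drawn through the residuals. The following is a minimal sketch that overlays a `lowess()` curve on the same plot. The smoother, the colors, and the names `fit_vals`, `res_vals`, and `smooth` are choices made for illustration here, not part of the original analysis.

```r
library(ISLR)
data(Auto)

# Refit the same model so this block runs on its own
model <- lm(mpg ~ horsepower, data = Auto)

fit_vals <- fitted(model)
res_vals <- resid(model)

plot(fit_vals, res_vals,
     xlab = "Fitted Values",
     ylab = "Residuals",
     main = "Residuals vs Fitted Plot")
abline(h = 0, col = "red")

# lowess() returns a locally weighted smooth as a list with x and y,
# ordered by x; a roughly flat curve near zero is consistent with linearity
smooth <- lowess(fit_vals, res_vals)
lines(smooth, col = "blue", lwd = 2)
```

Base R's `plot(model, which = 1)` draws a similar residuals vs fitted diagnostic with a smoother already added.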

In this case, the residual plot shows a curved pattern, suggesting that the relationship between horsepower and mpg is nonlinear. Therefore, a nonlinear model such as polynomial regression may provide a better fit for the data.
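One way to follow up on this suggestion is to fit a quadratic model and compare it with the linear one. The sketch below assumes a second-degree polynomial in horsepower. The degree and the names `linear_fit`, `quad_fit`, `comparison`, and `r2` are illustrative assumptions rather than choices made in the analysis above.

```r
library(ISLR)
data(Auto)

# Straight-line model, as in the analysis above
linear_fit <- lm(mpg ~ horsepower, data = Auto)

# Quadratic model: poly() builds orthogonal polynomial terms of degree 2
quad_fit <- lm(mpg ~ poly(horsepower, 2), data = Auto)

# F-test for whether the quadratic term improves on the straight line
comparison <- anova(linear_fit, quad_fit)
print(comparison)

# R-squared of both fits, side by side
r2 <- c(linear = summary(linear_fit)$r.squared,
        quadratic = summary(quad_fit)$r.squared)
print(r2)

# Residual plot for the quadratic fit; less curvature than before would
# support the nonlinear model
plot(fitted(quad_fit), resid(quad_fit),
     xlab = "Fitted Values",
     ylab = "Residuals",
     main = "Residuals vs Fitted Plot (Quadratic Model)")
abline(h = 0, col = "red")
```

A small p-value in the `anova()` table would indicate that the quadratic term explains variation the straight line misses.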

Conclusion

The analysis suggests that a simple linear regression model does not fully capture the relationship between horsepower and mpg. The curved residual pattern indicates nonlinearity, so a model with a nonlinear term, such as polynomial regression, may provide a better fit.