2025-11-07

Simple Linear Regression

How is miles per gallon affected by engine horsepower?


To find out, I model fuel efficiency \(Y\) as a linear function of horsepower \(X\):

\(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \dots, n\)

  • \(Y_i\): mpg (miles per gallon) of car \(i\)
  • \(X_i\): horsepower of car \(i\)
  • \(\beta_0\): intercept
  • \(\beta_1\): slope
  • \(\varepsilon_i\): random error
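
For reference, the least-squares estimates of \(\beta_0\) and \(\beta_1\) have closed forms: the slope is the sample covariance of \(X\) and \(Y\) divided by the sample variance of \(X\), and the intercept makes the line pass through the point of means. The sketch below computes them by hand on mtcars; the variable names `b0` and `b1` are illustrative and not from the original slides.

```r
# Closed-form least-squares estimates for mpg ~ hp (illustrative sketch)
x <- mtcars$hp
y <- mtcars$mpg

b1 <- cov(x, y) / var(x)        # slope: sample covariance over sample variance
b0 <- mean(y) - b1 * mean(x)    # intercept: line passes through (mean(x), mean(y))

c(intercept = b0, slope = b1)
```

These values should agree with the coefficients that `lm()` reports for the same model.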

MPG vs Horsepower

Comparing horsepower to miles per gallon using the built-in mtcars dataset (first 10 rows shown).
##                    mpg  hp
## Mazda RX4         21.0 110
## Mazda RX4 Wag     21.0 110
## Datsun 710        22.8  93
## Hornet 4 Drive    21.4 110
## Hornet Sportabout 18.7 175
## Valiant           18.1 105
## Duster 360        14.3 245
## Merc 240D         24.4  62
## Merc 230          22.8  95
## Merc 280          19.2 123
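
The code that produced this listing is not shown. One way to get the same view, assuming the two columns are taken directly from mtcars (the name `first_rows` is illustrative), would be:

```r
# Keep only the two variables of interest and show the first 10 cars
first_rows <- head(mtcars[, c("mpg", "hp")], 10)
print(first_rows)
```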

Plotting Initial Data
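
The plot itself did not survive in this text version. A minimal base-R sketch of such a scatter plot is given here; the axis labels, title, and plotting symbol are my own choices, not necessarily those of the original figure.

```r
# Scatter plot of fuel efficiency against horsepower (base graphics)
plot(mpg ~ hp, data = mtcars,
     xlab = "Horsepower (hp)",
     ylab = "Miles per gallon (mpg)",
     main = "MPG vs Horsepower",
     pch = 19)
```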

Regression

Fitting the model with R's lm() function gives a very low p-value for the hp slope (1.79e-07), so the relationship is statistically significant.


# Assumes mpg and hp are already in scope, e.g. after attach(mtcars)
mod <- lm(mpg ~ hp)
summary(mod)
## 
## Call:
## lm(formula = mpg ~ hp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.7121 -2.1122 -0.8854  1.5819  8.2360 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 30.09886    1.63392  18.421  < 2e-16 ***
## hp          -0.06823    0.01012  -6.742 1.79e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.863 on 30 degrees of freedom
## Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892 
## F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07
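
The key numbers in the summary can also be extracted programmatically rather than read off the printout. The following is a sketch that refits the model with an explicit `data = mtcars` argument so it runs on its own; the object names (`fit`, `coef_table`, `slope_p`, `r2`, `ci`) are illustrative.

```r
# Refit with an explicit data argument so the example is self-contained
fit <- lm(mpg ~ hp, data = mtcars)

coef_table <- coef(summary(fit))          # estimates, std. errors, t values, p-values
slope_p    <- coef_table["hp", "Pr(>|t|)"] # p-value for the hp slope
r2         <- summary(fit)$r.squared       # share of mpg variance explained by hp
ci         <- confint(fit, level = 0.95)   # 95% confidence intervals for both coefficients

print(coef_table)
print(c(slope_p = slope_p, r_squared = r2))
print(ci)
```

Read this way, the slope of about -0.068 means each additional horsepower is associated with roughly 0.068 fewer miles per gallon on average, and an R-squared of about 0.60 means horsepower alone accounts for roughly 60% of the variation in mpg in this sample.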

Adding Regression Line
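
The figure for this slide is also missing here. One way to overlay the fitted line on the scatter plot with base graphics is sketched below; the colour and line width are arbitrary choices.

```r
# Fit the model, redraw the scatter plot, then overlay the fitted line
fit <- lm(mpg ~ hp, data = mtcars)

plot(mpg ~ hp, data = mtcars,
     xlab = "Horsepower (hp)",
     ylab = "Miles per gallon (mpg)",
     pch = 19)
abline(fit, col = "red", lwd = 2)  # abline() reads intercept and slope from the lm object
```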

Distribution of Fuel Efficiency in Observations
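
The original figure is not available in this text version. Assuming it was a histogram of mpg across the 32 cars, a minimal sketch would be:

```r
# Histogram of fuel efficiency across all cars in mtcars
h <- hist(mtcars$mpg,
          xlab = "Miles per gallon (mpg)",
          main = "Distribution of Fuel Efficiency",
          col = "grey80")
```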

Conclusion

Modeling miles per gallon \(Y\) as a linear function of horsepower \(X\), I found that \(\hat{\beta}_1 \approx -0.068 < 0\): higher horsepower is associated with lower fuel efficiency, by about 0.068 mpg per additional horsepower on average.

The fitted line summarizes the overall trend and provides a simple prediction rule for mpg given hp.
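
As a sketch of that prediction rule in use, `predict()` can be applied to new horsepower values. The values 100 and 200 are arbitrary examples, and `new_cars` and `pred` are illustrative names.

```r
# Predict mpg for two hypothetical cars from the fitted line
fit <- lm(mpg ~ hp, data = mtcars)

new_cars <- data.frame(hp = c(100, 200))
pred <- predict(fit, newdata = new_cars)
print(round(pred, 1))  # roughly 23.3 and 16.5 mpg
```

These are point predictions from the fitted line only. With a residual standard error of about 3.9 mpg, individual cars can differ noticeably from them.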

Overall, this simple linear regression example shows how we can use statistics to quantify and visualize the trade-off between engine power and fuel efficiency.