What this is about

  • This small project explores how a car’s weight affects its fuel efficiency.
  • The idea is simple: heavier cars usually burn more fuel, so their miles per gallon (MPG) should drop as weight increases.
  • In the next few slides, I’ll show this relationship, fit a straight-line model, and check how well the line captures the pattern.

The data (mtcars)

  • I’m using the built-in mtcars dataset that comes with R.
  • It includes information about different car models, like their weight, horsepower, and fuel efficiency.
  • For this analysis, I’ll focus mainly on weight (wt) and miles per gallon (mpg), and later, I’ll bring in horsepower (hp) for a quick 3D look. The weights are measured in thousands of pounds, and mpg shows how far a car can go on one gallon of fuel.
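
To make these variables concrete, here is a quick look at the three columns used in this deck. This is only a minimal sketch; the object name `cars3` is my own illustrative choice, not something defined elsewhere in the slides.

```r
# Load the built-in dataset (ships with R's datasets package)
data(mtcars)

# Keep only the three columns used in this analysis
cars3 <- mtcars[, c("mpg", "wt", "hp")]

# First few rows, then basic summaries of each column
head(cars3)
summary(cars3)

# wt is stored in units of 1000 lbs, so multiply to see pounds
range(cars3$wt * 1000)
```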

A 3D first look (Plotly)
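
A 3D scatter of weight, horsepower, and MPG can be sketched roughly as follows. This assumes the plotly package is installed, and the axis assignment (weight and horsepower on the floor, MPG on the vertical axis) is my guess at a sensible layout rather than a record of the original figure.

```r
# Assumption: plotly is installed; if not, p simply stays NULL
data(mtcars)

p <- NULL
if (requireNamespace("plotly", quietly = TRUE)) {
  # 3D scatter: one marker per car
  p <- plotly::plot_ly(
    data = mtcars,
    x = ~wt, y = ~hp, z = ~mpg,
    type = "scatter3d", mode = "markers"
  )
  # Label the three axes of the 3D scene
  p <- plotly::layout(
    p,
    scene = list(
      xaxis = list(title = "Weight (1000 lbs)"),
      yaxis = list(title = "Horsepower"),
      zaxis = list(title = "MPG")
    )
  )
}
# Typing p at the console renders the interactive plot
```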

MPG vs Weight (ggplot)

The model (LaTeX)

To describe how weight affects MPG, I’m using a simple linear regression model:

\[ \text{MPG}_i = \beta_0 + \beta_1 \times \text{WT}_i + \varepsilon_i \]

  • Here, \(\beta_0\) is the intercept (the predicted MPG at zero weight, which is mostly a mathematical anchor for the line), and \(\beta_1\) shows how much MPG changes when weight increases by 1000 lbs.
  • We assume the random errors \(\varepsilon_i\) are independent, have a mean of zero, and have roughly constant spread.

Estimation & testing (LaTeX)

The slope \(\beta_1\) is estimated using the least squares method:

\[ \hat{\beta}_1 = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2} \]

  • To check if the relationship is real, I test \(H_0:\beta_1=0\) using a t-test.
  • I also look at the p-value and confidence interval to see if the slope is significantly different from zero.
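
The formula and the test above can be checked by hand in a few lines. This is a sketch under the usual simple-regression assumptions; the intercept formula \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\) and the variable names (`b0`, `b1`, `se_b1`, and so on) are additions of mine for illustration.

```r
data(mtcars)
x <- mtcars$wt
y <- mtcars$mpg
n <- length(x)

# Least squares slope and intercept from the closed-form formulas
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Residual standard error with n - 2 degrees of freedom
res <- y - (b0 + b1 * x)
s <- sqrt(sum(res^2) / (n - 2))

# Standard error of the slope, t statistic for H0: beta1 = 0, two-sided p-value
se_b1 <- s / sqrt(sum((x - mean(x))^2))
t_stat <- b1 / se_b1
p_val <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# 95% confidence interval for the slope
ci <- b1 + c(-1, 1) * qt(0.975, df = n - 2) * se_b1

# Cross-check the hand calculation against lm()
fit <- lm(mpg ~ wt, data = mtcars)
all.equal(unname(coef(fit)["wt"]), b1)
all.equal(unname(confint(fit)["wt", ]), ci)
```

If the interval `ci` excludes zero, that agrees with rejecting \(H_0\) at the 5% level.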

What the fitted line says

The fitted regression line comes out to:

\[ \widehat{\text{MPG}} = 37.29 - 5.34 \times \text{WT} \]

  • This means that for every extra 1000 pounds of weight, the car’s fuel efficiency drops by about 5.34 miles per gallon.
  • The model fits quite well: it explains roughly 75.3% of the variation in MPG (\(R^2 \approx 0.753\)), and the slope’s p-value (\(1.29 \times 10^{-10}\)) shows that the relationship is statistically significant.
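
The numbers quoted above can be pulled straight out of the fitted model. Here is a minimal sketch; the 3000 lb car is a hypothetical example of mine, used only to show how the line turns a weight into a predicted MPG (roughly 21 MPG).

```r
data(mtcars)
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)

# Intercept and slope, rounded as on the slide
round(coef(fit), 2)

# Share of the variation in MPG explained by weight
round(s$r.squared, 3)

# p-value for the slope
s$coefficients["wt", "Pr(>|t|)"]

# Predicted MPG for a hypothetical 3000 lb car (wt = 3)
pred_3000 <- predict(fit, newdata = data.frame(wt = 3))
pred_3000
```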

Model Diagnostics: Residuals
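
A basic residual check can be sketched as follows. I'm using base R graphics here to keep the example dependency-free, so it is not necessarily how the original slide was drawn; `diag_df` is an illustrative name.

```r
data(mtcars)
fit <- lm(mpg ~ wt, data = mtcars)

# Collect fitted values and residuals in one data frame
diag_df <- data.frame(fitted = fitted(fit), resid = resid(fit))

# Residuals vs fitted: look for curvature or a funnel shape
plot(diag_df$fitted, diag_df$resid,
     xlab = "Fitted MPG", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points near the line suggest roughly normal errors
qqnorm(diag_df$resid)
qqline(diag_df$resid)
```

A random scatter around the dashed zero line supports the constant-spread assumption; a clear curve would suggest that a straight line is too simple.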

R Code Used to Create the Plot

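library(ggplot2)
data(mtcars)

# Fit the simple linear regression of MPG on weight
fit <- lm(mpg ~ wt, data = mtcars)

# Scatter plot with the least squares line and its confidence band
# (geom_smooth refits the same mpg ~ wt model internally)
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  labs(x = "Weight (1000 lbs)", y = "MPG")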

Conclusion

  • Overall, the pattern is clear: heavier cars get lower mileage.
  • The regression line captures this relationship well, and the slope is statistically significant, showing a strong negative link between weight and fuel efficiency.
  • This simple example highlights how linear regression can turn a real-world idea into measurable evidence, a skill that applies across many areas like health, sports, and finance.