10/19/2025

Cars Data

My topic is simple linear regression, which I will demonstrate using R's built-in mtcars dataset. The variables involved and the first few rows of the data are shown below.

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
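
As a minimal sketch, the snippet above can be reproduced in base R. No extra packages are assumed; `mtcars` ships with the `datasets` package that R loads by default.

```r
# Load the built-in dataset and show its first six rows
data(mtcars)
head(mtcars)

# Overall size of the data: number of cars (rows) and variables (columns)
dim(mtcars)
```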

Simple Linear Regression Model

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

  • \(y\): dependent (response) variable
  • \(x\): independent (predictor) variable
  • \(\beta_0\): intercept
  • \(\beta_1\): slope
  • \(\varepsilon\): random error term
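
To connect the symbols to R, here is a small sketch that estimates \(\beta_0\) and \(\beta_1\) with `lm()`. Using `mpg` as \(y\) and `wt` as \(x\) is an illustrative choice, and the object name `fit_wt` is hypothetical.

```r
data(mtcars)

# Fit y = beta_0 + beta_1 * x + error with y = mpg and x = wt
fit_wt <- lm(mpg ~ wt, data = mtcars)

# Extract the two estimated coefficients
b <- coef(fit_wt)
b[["(Intercept)"]]  # estimate of beta_0 (intercept)
b[["wt"]]           # estimate of beta_1 (slope)
```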

Minimizing the Error: The Least Squares Method

To find the best-fit line, we use the method of least squares. The idea is to choose the intercept and slope that minimize the sum of squared differences between the actual values and the values predicted by the regression line.
These differences are called residuals.

\[ \text{Residual}_i = y_i - \hat{y}_i \]

  • \(y_i\): actual observed value
  • \(\hat{y}_i\): predicted value from the regression line for that \(x_i\)
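
For simple linear regression, minimizing the sum of squared residuals has a well-known closed-form solution:

\[ \hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

The sketch below computes these estimates by hand and checks them against `lm()`. The choice of `wt` as \(x\) and `mpg` as \(y\), and all variable names, are assumptions made for illustration.

```r
data(mtcars)
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least squares estimates of slope and intercept
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Predicted values, residuals, and the sum of squared residuals being minimized
y_hat <- b0 + b1 * x
res   <- y - y_hat
sse   <- sum(res^2)

# Compare the hand-computed results with lm()
fit <- lm(mpg ~ wt, data = mtcars)
all.equal(unname(coef(fit)), c(b0, b1))
all.equal(unname(residuals(fit)), res)
```

Both comparisons are expected to agree up to floating-point tolerance, since `lm()` solves the same least squares problem.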

Cars Data Plotly

This plot shows the linear relationship between a car's weight and MPG.
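
The code behind this plot is not shown, so the following is only a sketch of one way to build it, assuming the `plotly` package is installed. The object names and axis titles are hypothetical.

```r
library(plotly)

data(mtcars)

# Fit the regression of mpg on wt and attach fitted values,
# sorted by weight so the line is drawn from left to right
fit_wt <- lm(mpg ~ wt, data = mtcars)
cars_sorted <- mtcars[order(mtcars$wt), ]
cars_sorted$fitted_mpg <- predict(fit_wt, newdata = cars_sorted)

# Scatter of observed values with the fitted line overlaid;
# the trace mode is set explicitly rather than left for plotly to infer
p_wt <- plot_ly(cars_sorted, x = ~wt, y = ~mpg,
                type = "scatter", mode = "markers", name = "Observed") %>%
  add_lines(y = ~fitted_mpg, name = "Fitted line") %>%
  layout(xaxis = list(title = "Weight (1000 lbs)"),
         yaxis = list(title = "Miles per Gallon (MPG)"))
p_wt
```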


Cars Data GGplot 1

This plot shows the linear relationship between a car's horsepower (hp) and MPG.


Cars Data GGplot 2

This plot shows the linear relationship between a car's horsepower and displacement.
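
The code for this plot is not included in the slides. A minimal sketch with `ggplot2` might look like the following; putting displacement on the x-axis and horsepower on the y-axis is an assumption, as are the object name and labels.

```r
library(ggplot2)

data(mtcars)

# Assumed orientation: displacement (disp) as predictor, horsepower (hp) as response
p_hp_disp <- ggplot(mtcars, aes(x = disp, y = hp)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Displacement (cu. in.)",
    y = "Horsepower (hp)"
  )
p_hp_disp
```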


Cars Data R Code

This slide shows the R code used to create the MPG vs Horsepower regression plot.

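library(ggplot2)

data(mtcars)

# Fit the simple linear regression of mpg on hp.
# geom_smooth() below refits the same model internally, so var1 is
# not needed for the plot itself; it is kept for inspecting the fit.
var1 <- lm(mpg ~ hp, data = mtcars)

# Scatter plot of the data with the least squares line overlaid
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(color = "green", size = 3, alpha = 0.9) +
  geom_smooth(method = "lm", formula = y ~ x, color = "purple", se = FALSE) +
  labs(
    title = "Simple Linear Regression: MPG vs Horsepower",
    x = "Horsepower (hp)",
    y = "Miles per Gallon (MPG)"
  )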
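
The code above stores the fitted model but does not inspect it. As a sketch, the estimated intercept and slope can be read off and used for prediction as follows. The name `fit_hp` and the 150 hp input are hypothetical values chosen for illustration.

```r
data(mtcars)

# Same model as in the plot: mpg regressed on hp
fit_hp <- lm(mpg ~ hp, data = mtcars)

# Coefficient table, R-squared, and residual standard error
summary(fit_hp)

# Estimated intercept (beta_0) and slope (beta_1)
coef(fit_hp)

# Predicted MPG for a hypothetical car with 150 hp
pred_150 <- predict(fit_hp, newdata = data.frame(hp = 150))
pred_150
```

The slope estimate is read as the expected change in MPG for each additional unit of horsepower.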