2025-11-09

Overview

  • Topic: Simple Linear Regression
  • Goal: Model the relationship between two numeric variables.
  • Example: Predict miles per gallon (mpg) using car weight (wt).
  • Dataset: Built-in mtcars dataset (no uploads needed).
  • Why this topic?
    • Easy to understand
    • Easy to visualize
    • Common in real-world analysis

What is Simple Linear Regression?

We model a numeric outcome \(Y\) using one predictor \(X\):

\[ Y = \beta_0 + \beta_1 X + \varepsilon \]

Where:

  • \(\beta_0\): intercept
  • \(\beta_1\): slope
  • \(\varepsilon\): random error term

Estimated line:

\[ \hat{Y} = b_0 + b_1 X \]
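The estimates \(b_0\) and \(b_1\) are usually obtained by ordinary least squares, which has the closed form \(b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(b_0 = \bar{y} - b_1 \bar{x}\). A minimal sketch, assuming the mtcars variables used later in this deck (the names `x`, `y`, `b0`, and `b1` are illustrative only):

```r
# Closed-form least-squares estimates for simple linear regression,
# computed directly from the built-in mtcars data.
x <- mtcars$wt   # predictor: weight in 1000 lbs
y <- mtcars$mpg  # outcome: miles per gallon

# Slope: sum of cross-deviations divided by sum of squared x-deviations.
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: chosen so the line passes through (mean(x), mean(y)).
b0 <- mean(y) - b1 * mean(x)

c(b0 = b0, b1 = b1)
```

These hand-computed values should agree with what lm() reports for the same data.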

Our Example: mpg vs weight

We use the built-in mtcars dataset:

  • \(Y = \text{mpg}\): miles per gallon
  • \(X = \text{wt}\): weight in 1000 lbs

Intuition:
Heavier cars usually have lower fuel efficiency.

##                    mpg    wt
## Mazda RX4         21.0 2.620
## Mazda RX4 Wag     21.0 2.875
## Datsun 710        22.8 2.320
## Hornet 4 Drive    21.4 3.215
## Hornet Sportabout 18.7 3.440
## Valiant           18.1 3.460
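A preview like the one above can be produced with something along these lines (a sketch; the object name `preview` is an assumption, not taken from the original slides):

```r
# First six rows of the two columns used in this example.
# mtcars ships with base R, so nothing needs to be loaded.
preview <- head(mtcars[, c("mpg", "wt")])
preview
```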

ggplot Plot 1: Scatterplot of MPG vs Weight

Each point represents one car. The points trend clearly downward: as weight increases, mpg tends to decrease.
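One way such a scatterplot might be drawn is sketched below. It assumes the ggplot2 package is installed; the title, axis labels, and the object name `p_scatter` are illustrative choices, not necessarily those of the original slide.

```r
library(ggplot2)

# Scatterplot of mpg against weight, one point per car.
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "MPG vs Weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )

p_scatter
```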

ggplot Plot 2: Regression Line

We fit the model:

\[ \widehat{\text{mpg}} = b_0 + b_1 \cdot \text{wt} \]

The line shows the predicted mean mpg for a given weight, and the shaded region is the confidence interval for that mean (not a prediction interval for individual cars).
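A plot of this kind could be built roughly as follows. This is a sketch that assumes ggplot2 and a 95% confidence level (the default for `geom_smooth()`); the labels and the name `p_fit` are hypothetical.

```r
library(ggplot2)

# Scatterplot with a fitted least-squares line.
# se = TRUE draws the shaded confidence band around the line.
p_fit <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95) +
  labs(
    title = "MPG vs Weight with Regression Line",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )

p_fit
```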

Plotly Plot: Interactive MPG vs Weight

Hover over each point to see:

  • Car name
  • Weight
  • MPG

This interactive plot allows deeper exploration of the data.
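An interactive version might look something like the sketch below. It assumes the plotly package is installed; the helper column `car`, the data frame name `cars_df`, and the hover text layout are illustrative assumptions rather than the original slide's code.

```r
library(plotly)

# Copy the data and keep the car names (row names) as a regular column
# so they can be shown in the hover text.
cars_df <- mtcars
cars_df$car <- rownames(mtcars)

p_int <- plot_ly(
  data = cars_df,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  text = ~car,
  # Hover shows car name, weight, and mpg; <extra></extra> hides the trace box.
  hovertemplate = paste0(
    "%{text}<br>",
    "Weight: %{x} (1000 lbs)<br>",
    "MPG: %{y}",
    "<extra></extra>"
  )
)

p_int <- layout(
  p_int,
  title = "Interactive MPG vs Weight",
  xaxis = list(title = "Weight (1000 lbs)"),
  yaxis = list(title = "Miles per gallon")
)

p_int
```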

R Code: Fitting the Linear Model

We use R’s built-in lm() function to fit the regression model and display its summary.

model <- lm(mpg ~ wt, data = mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
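Beyond reading the printed summary, the fitted object can be queried directly. The sketch below pulls out the coefficients, their 95% confidence intervals, and a prediction for a hypothetical car weighing 3000 lbs (`new_car` is an invented example, not a row of mtcars).

```r
# Refit the model so this block is self-contained.
model <- lm(mpg ~ wt, data = mtcars)

# Point estimates b0 (intercept) and b1 (slope).
coef(model)

# 95% confidence intervals for both coefficients.
confint(model, level = 0.95)

# Predicted mean mpg, with a confidence interval, at wt = 3 (3000 lbs).
new_car <- data.frame(wt = 3)
pred <- predict(model, newdata = new_car, interval = "confidence")
pred
```

With the estimates shown above, the fitted value at wt = 3 works out to about 37.29 − 5.34 × 3 ≈ 21.25 mpg.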

Interpreting the Results

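  • The slope \(b_1 \approx -5.34\) is negative → each additional 1000 lbs of weight is associated with about 5.3 fewer mpg on average
  • The p-value for the slope (\(1.29 \times 10^{-10}\)) is very small → strong evidence that the true slope is not zero
  • \(R^2 \approx 0.75\) → weight explains about 75% of the variation in mpg
  • Conclusion: Weight is a statistically significant predictor of fuel efficiency in this dataset.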
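To make the meaning of \(R^2\) concrete, it can be recomputed from the residuals as \(1 - \text{SSE}/\text{SST}\). In simple linear regression it also equals the squared correlation between the two variables. A small sketch (the variable names are illustrative):

```r
# Refit the model so this block is self-contained.
model <- lm(mpg ~ wt, data = mtcars)

# SSE: variation left unexplained by the fitted line.
sse <- sum(residuals(model)^2)

# SST: total variation of mpg around its mean.
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# R-squared as the proportion of variation explained.
r2_manual <- 1 - sse / sst

# With one predictor, R-squared equals the squared correlation.
r2_cor <- cor(mtcars$wt, mtcars$mpg)^2

c(manual = r2_manual, from_cor = r2_cor, from_summary = summary(model)$r.squared)
```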

Summary

  • Simple Linear Regression models a linear relationship between two variables
  • We analyzed mpg vs weight using the mtcars dataset
  • Included:
    • 2 ggplot plots
    • 1 Plotly plot
    • 2 slides with math
    • 1 slide with R code