2025-11-09

What is Simple Linear Regression?

  • Goal: model a linear relationship between a predictor \(X\) and a response \(Y\).
  • Model: \(Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\).
  • \(\beta_0\): intercept, \(\beta_1\): slope, \(\epsilon_i\): random error.
  • Example: Predicting car MPG from vehicle weight.
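
To make the model equation concrete, here is a minimal simulation sketch. The intercept, slope, sample size, and noise level are made-up illustration values, and drawing the errors from a normal distribution is an assumption the slides do not state.

```r
# Simulate from Y_i = beta_0 + beta_1 * X_i + epsilon_i (all values hypothetical)
set.seed(1)
n      <- 200
beta_0 <- 37                              # hypothetical intercept
beta_1 <- -5                              # hypothetical slope
x      <- runif(n, min = 1.5, max = 5.5)  # predictor values
eps    <- rnorm(n, mean = 0, sd = 3)      # random error (normality is an assumption)
y      <- beta_0 + beta_1 * x + eps

sim_fit <- lm(y ~ x)
coef(sim_fit)   # estimates should land near beta_0 and beta_1
```

Because the data are generated from a known line, the fitted coefficients can be compared directly with the values that produced them.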

The Data: mtcars

  • Response \(Y\): mpg (miles per gallon)
  • Predictor \(X\): wt (weight in 1000 lbs)
  • Clear negative relationship: heavier cars tend to get fewer miles per gallon.

head(mtcars[, c("mpg", "wt")])
##                    mpg    wt
## Mazda RX4         21.0 2.620
## Mazda RX4 Wag     21.0 2.875
## Datsun 710        22.8 2.320
## Hornet 4 Drive    21.4 3.215
## Hornet Sportabout 18.7 3.440
## Valiant           18.1 3.460
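
A quick numeric check of the direction and strength of the relationship, sketched with the sample correlation (the variable name `r` is chosen here for illustration):

```r
# Sample (Pearson) correlation between weight and fuel economy
r <- cor(mtcars$wt, mtcars$mpg)
r   # about -0.87: strongly negative
```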

Math Behind the Line (1/2)

Least squares chooses \(\hat{\beta}_0,\hat{\beta}_1\) to minimize \[ SSE=\sum_{i=1}^n (y_i-\hat{\beta}_0-\hat{\beta}_1 x_i)^2. \]
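
The criterion above can be sketched as an R function and minimized numerically. The helper name `sse_fun`, the trial coefficients, and the starting values are all arbitrary choices for illustration; `optim()` stands in here for the exact solution that `lm()` computes.

```r
# SSE as a function of candidate coefficients b = c(b0, b1)
sse_fun <- function(b, x, y) sum((y - b[1] - b[2] * x)^2)

x <- mtcars$wt
y <- mtcars$mpg

sse_fun(c(30, -3), x, y)   # SSE for an arbitrary guess

# Numerical minimization from an arbitrary starting point
opt <- optim(c(0, 0), sse_fun, x = x, y = y, method = "BFGS")
opt$par                              # numerical minimizer
coef(lm(mpg ~ wt, data = mtcars))    # least-squares fit, for comparison
```

The two sets of coefficients should agree closely, which shows that `lm()` returns the pair that minimizes SSE.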

Math Behind the Line (2/2)

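The minimizers have a closed form: \[ \hat{\beta}_1= \frac{\sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^n (x_i-\bar{x})^2}, \qquad \hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}. \]

Goodness of fit is summarized by \[ R^2=1-\frac{SSE}{SST}, \qquad SST=\sum_{i=1}^n (y_i-\bar{y})^2, \] with \(SSE\) evaluated at the least-squares estimates.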
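
As a worked check, the formulas can be evaluated by hand on mtcars. This is a sketch with variable names (`b0`, `b1`, `sse`, `sst`, `r2`) chosen for illustration; the results should reproduce the `lm()` summary in this deck.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least-squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# R-squared from SSE (at the estimates) and SST
sse <- sum((y - b0 - b1 * x)^2)
sst <- sum((y - mean(y))^2)
r2  <- 1 - sse / sst

round(c(b0 = b0, b1 = b1, r2 = r2), 4)   # 37.2851, -5.3445, 0.7528
```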

Fit the Model in R

model <- lm(mpg ~ wt, data = mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
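
Reading the output: the slope says each additional 1000 lbs of weight is associated with about 5.34 fewer miles per gallon, and weight accounts for about 75% of the variation in mpg. The sketch below pulls those pieces out of the fitted object; the 3,000 lb car in the prediction step is a hypothetical input.

```r
model <- lm(mpg ~ wt, data = mtcars)

coef(model)                 # intercept and slope
confint(model)              # 95% confidence intervals (the default level)
summary(model)$r.squared    # proportion of variance explained

# Predicted mean mpg for a hypothetical 3,000 lb car (wt = 3)
pred <- predict(model, newdata = data.frame(wt = 3))
pred                        # about 21.25
```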

MPG vs Weight with Fitted Regression Line (info)

  • Visualize the relationship and overlay the fitted line.
  • Ribbon = confidence interval for the mean response at each weight (not a prediction interval for individual cars).
  • Expect a visible downward trend (negative slope).

MPG vs Weight with Fitted Regression Line
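
One way to draw such a plot, sketched with base R graphics. The deck's original plotting code is not shown, so the choice of base graphics, the colors, and the grid size are assumptions.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Grid of weights spanning the observed range
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 100))
ci   <- predict(model, newdata = grid, interval = "confidence")  # columns: fit, lwr, upr

plot(mpg ~ wt, data = mtcars, pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "MPG vs Weight with Fitted Regression Line")
# Shaded ribbon: confidence interval for the mean response
polygon(c(grid$wt, rev(grid$wt)), c(ci[, "lwr"], rev(ci[, "upr"])),
        col = adjustcolor("steelblue", alpha.f = 0.3), border = NA)
# Fitted regression line
lines(grid$wt, ci[, "fit"], col = "steelblue", lwd = 2)
```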

Residuals vs Fitted Values (info)

  • Diagnostic plot for checking linearity and constant variance.
  • We want residuals scattered randomly around 0: no curve (nonlinearity) and no funnel shape (non-constant variance).

Residuals vs Fitted Values
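
A minimal base R sketch of this diagnostic. The smoothing curve via `lowess()` is an addition for illustration, and `plot(model, which = 1)` gives R's built-in version of the same plot.

```r
model <- lm(mpg ~ wt, data = mtcars)

plot(fitted(model), resid(model), pch = 19,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted Values")
abline(h = 0, lty = 2)                                   # reference line at zero
lines(lowess(fitted(model), resid(model)), col = "red")  # smooth trend to reveal curvature
```

If the red curve bends noticeably away from the dashed line, a straight-line fit may be missing some curvature.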

Interactive: MPG vs Weight with Fitted Line (info)

  • Hover to see exact values for each car.
  • Line shows the fitted simple linear regression (SLR) prediction.

Interactive: MPG vs Weight with Fitted Line

3D View: MPG vs Weight vs Horsepower (info)

  • A peek at multiple regression (fitting a plane instead of a line).

3D View: MPG vs Weight vs Horsepower
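
A sketch of the two-predictor fit and its plane, using base R's `persp()`. The deck's original 3D plotting code is not shown, so the plotting function, grid size, and viewing angles are assumptions.

```r
# Multiple regression: mpg modeled by weight and horsepower
model2 <- lm(mpg ~ wt + hp, data = mtcars)
coef(model2)   # intercept plus one slope per predictor

# Evaluate the fitted plane on a grid of (wt, hp) values
wt_seq <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)
hp_seq <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 20)
z <- outer(wt_seq, hp_seq,
           function(w, h) predict(model2, newdata = data.frame(wt = w, hp = h)))

persp(wt_seq, hp_seq, z, theta = 40, phi = 20, ticktype = "detailed",
      xlab = "Weight (1000 lbs)", ylab = "Horsepower", zlab = "Predicted mpg")
```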