2025-09-17

Introduction

Linear regression is a statistical method to model the relationship between a dependent variable (y) and one or more independent variables (x).

We will explore both simple and multiple linear regression using the mtcars dataset.

Dataset Preview

From the mtcars dataset we select three variables; the first six rows are shown below:

  • mpg: Miles per gallon (response)
  • wt: Car weight in 1000 lbs (predictor)
  • hp: Horsepower (predictor)
##                    mpg    wt  hp
## Mazda RX4         21.0 2.620 110
## Mazda RX4 Wag     21.0 2.875 110
## Datsun 710        22.8 2.320  93
## Hornet 4 Drive    21.4 3.215 110
## Hornet Sportabout 18.7 3.440 175
## Valiant           18.1 3.460 105
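
The chunk that produced this preview is not shown in the rendered deck. A minimal sketch that yields the same six rows is given below; it uses base R subsetting rather than dplyr to stay dependency-free, and the name `dat` is an assumption introduced here for illustration.

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded
# `dat` is a hypothetical name, not taken from the original deck
dat <- mtcars[, c("mpg", "wt", "hp")]

# First six rows: response (mpg) and the two predictors (wt, hp)
head(dat)
```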

Simple Linear Regression (Math)

Model a response \(y\) using predictor \(x\):

\[ y = \beta_0 + \beta_1 x + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2) \]

OLS estimates minimize the sum of squared residuals:

\[ (\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2 \]
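
To make the minimization concrete, the sketch below fits mpg on wt with `lm()` and checks the result against the closed-form solution for a single predictor (slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)). The object names `fit_simple`, `b0`, and `b1` are assumptions introduced for this example.

```r
# Simple linear regression of mpg on wt using the built-in mtcars data
fit_simple <- lm(mpg ~ wt, data = mtcars)

# Closed-form OLS solution for one predictor
x <- mtcars$wt
y <- mtcars$mpg
b1 <- cov(x, y) / var(x)        # slope
b0 <- mean(y) - b1 * mean(x)    # intercept

# Both routes should agree up to floating-point error
all.equal(unname(coef(fit_simple)), c(b0, b1))
```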

Simple Regression Plot
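
The figure for this slide did not survive in the text version. One possible way to draw a comparable plot with base graphics is sketched below; the axis labels, colors, and the name `fit_simple` are assumptions, and the original deck may have used a different plotting library.

```r
fit_simple <- lm(mpg ~ wt, data = mtcars)

# Scatter plot of the raw data
plot(mpg ~ wt, data = mtcars,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "mpg vs. wt with OLS fit")

# Overlay the fitted regression line
abline(fit_simple, col = "red", lwd = 2)
```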

Residuals Plot
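
The residual figure is also missing from the text version. A hedged base-graphics sketch of a residuals-versus-fitted plot for the simple model follows; the name `fit_simple` and the plot styling are assumptions. Curvature in such a plot would suggest non-linearity, and a funnel shape would suggest non-constant variance.

```r
fit_simple <- lm(mpg ~ wt, data = mtcars)

# Residuals against fitted values: look for curvature or a funnel shape
plot(fitted(fit_simple), resid(fit_simple),
     xlab = "Fitted mpg", ylab = "Residual",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)  # reference line at zero
```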

Multiple Regression (Math)

With design matrix \(X\) (a column of ones for the intercept plus one column per predictor):

\[ y = X \beta + \varepsilon \]

Estimator:

\[ \hat{\beta} = (X^\top X)^{-1} X^\top y \]

Standard error of coefficient \(j\):

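\[ SE(\hat{\beta}_j) = \sqrt{ \hat{\sigma}^2 \Big[ (X^\top X)^{-1} \Big]_{jj} } \]

where \(\hat{\sigma}^2\) is the residual variance estimate: the residual sum of squares divided by \(n - p\), with \(p\) the number of columns of \(X\).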
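
As a check on the matrix formulas, the sketch below computes the coefficient estimates and their standard errors by hand for mpg on wt and hp, then compares them with what `lm()` reports. All object names (`X`, `XtX`, `beta_hat`, `sigma2_hat`, `se`, `fit_multi`) are assumptions introduced here.

```r
# Design matrix: intercept column plus wt and hp
X <- model.matrix(~ wt + hp, data = mtcars)
y <- mtcars$mpg

# beta_hat = (X'X)^{-1} X'y, solved without forming the inverse explicitly
XtX <- crossprod(X)
beta_hat <- solve(XtX, crossprod(X, y))

# Residual variance estimate with n - p degrees of freedom
res <- y - X %*% beta_hat
sigma2_hat <- sum(res^2) / (nrow(X) - ncol(X))

# Standard errors: square root of the diagonal of sigma2_hat * (X'X)^{-1}
se <- sqrt(sigma2_hat * diag(solve(XtX)))

# Compare with lm(); both comparisons should return TRUE
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)
all.equal(as.vector(beta_hat), unname(coef(fit_multi)))
all.equal(unname(se), unname(summary(fit_multi)$coefficients[, "Std. Error"]))
```

In practice `lm()` uses a QR decomposition rather than the normal equations, which is numerically more stable; the explicit version here is only meant to mirror the formulas.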

Multiple Regression 3D Plot
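
The interactive figure is not available in the text version. The sketch below evaluates the fitted model on a grid of wt and hp values and draws the resulting surface with base R's `persp()`, swapped in here for Plotly so that the example needs no extra packages. The grid size, viewing angles, and object names are assumptions.

```r
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)

# Grid over the observed range of both predictors
wt_seq <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 25)
hp_seq <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 25)
grid <- expand.grid(wt = wt_seq, hp = hp_seq)

# Predicted mpg on the grid; expand.grid varies wt fastest,
# so rows of z index wt and columns index hp
z <- matrix(predict(fit_multi, newdata = grid),
            nrow = length(wt_seq), ncol = length(hp_seq))

# Static 3D view of the fitted plane
persp(wt_seq, hp_seq, z,
      xlab = "wt", ylab = "hp", zlab = "Predicted mpg",
      theta = 40, phi = 20, ticktype = "detailed")
```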

Regression Coefficients (Code Slide)

## # A tibble: 3 × 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)  37.2      1.60        23.3  2.57e-20
## 2 wt           -3.88     0.633       -6.13 1.12e- 6
## 3 hp           -0.0318   0.00903     -3.52 1.45e- 3
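
The code behind this table is not shown in the rendered output. The column names (term, estimate, std.error, statistic, p.value) match what `broom::tidy()` returns for an `lm` fit, so the sketch below assumes that is how it was produced; the base-R table is computed first so the example also works without broom. The names `fit_multi` and `coef_table` are assumptions.

```r
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)

# Base-R coefficient table: estimate, standard error, t statistic, p-value
coef_table <- summary(fit_multi)$coefficients
round(coef_table, 4)

# If the broom package is installed, tidy() returns the same numbers as a tibble
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(fit_multi))
}
```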

Conclusion

  • Simple regression: mpg decreases as weight increases.
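  • Multiple regression: holding the other predictor fixed, each additional 1000 lbs of weight is associated with about 3.88 fewer mpg, and each additional horsepower with about 0.032 fewer mpg; both effects are statistically significant.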
  • Residual plots help check model assumptions such as linearity and constant error variance.
  • The Plotly 3D surface shows the fitted regression plane over wt and hp; the model is additive and contains no interaction term.
  • Linear regression gives interpretable coefficients and predictions.
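
To illustrate the prediction side, the sketch below uses the two-predictor model to predict mpg for a hypothetical car. The values wt = 3.0 and hp = 150 are made up for illustration, and `new_car` and `pred` are names introduced here.

```r
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical car: 3000 lbs, 150 horsepower (illustrative values only)
new_car <- data.frame(wt = 3.0, hp = 150)

# Point prediction with a 95% prediction interval (columns: fit, lwr, upr)
pred <- predict(fit_multi, newdata = new_car, interval = "prediction")
pred
```

The prediction interval is wider than a confidence interval for the mean response because it also accounts for the error term of a single new observation.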