The big idea

What we’re modeling

Multiple linear regression explains a numeric outcome (response) using two or more predictors.

We’ll use the built-in mtcars dataset and model:

  • Response: miles per gallon (mpg)
  • Predictors: car weight (wt) and horsepower (hp)

Why this is useful:

  • Predict outcomes
  • Measure how each predictor relates to the response (holding others fixed)
  • Do inference: confidence intervals, p-values, etc.

The model

Multiple linear regression model

We assume:

\[\displaystyle Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon\]

where:

  • Y is the response (here, mpg)

  • \(X_1\), \(X_2\) are predictors (here, wt and hp)

  • \(\varepsilon\) is random error, often assumed \(\varepsilon \sim \mathcal{N}(0, \sigma^2)\)

Interpretation example:

  • \(\beta_1\) = expected change in mpg for a +1 unit (1,000 lb) change in wt, holding hp constant.
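A quick worked example, plugging in the coefficient estimates reported later in this deck (intercept ≈ 37.23, wt ≈ −3.88, hp ≈ −0.032): a car with wt = 3 (3,000 lbs) and hp = 110 has predicted mileage

\[\displaystyle \hat{y} = 37.23 - 3.88(3) - 0.0318(110) \approx 22.1 \text{ mpg}\]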

Example data

                   mpg    wt  hp               car
Mazda RX4         21.0 2.620 110         Mazda RX4
Mazda RX4 Wag     21.0 2.875 110     Mazda RX4 Wag
Datsun 710        22.8 2.320  93        Datsun 710
Hornet 4 Drive    21.4 3.215 110    Hornet 4 Drive
Hornet Sportabout 18.7 3.440 175 Hornet Sportabout
Valiant           18.1 3.460 105           Valiant
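The slides don’t show the data-prep step; a minimal sketch that reproduces this table from the built-in dataset:

df <- mtcars
df$car <- rownames(df)  # keep the model names as an ordinary column
head(df[, c("mpg", "wt", "hp", "car")])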

We’ll fit the model mpg ~ wt + hp; the estimated coefficients:

# A tibble: 3 × 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  37.2      1.60        23.3  2.57e-20
2 wt           -3.88     0.633       -6.13 1.12e- 6
3 hp           -0.0318   0.00903     -3.52 1.45e- 3
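This table is broom::tidy() output. A minimal sketch of the fitting code, assuming the df built on the previous slide:

library(broom)

fit <- lm(mpg ~ wt + hp, data = df)  # least-squares fit with two predictors
tidy(fit)                            # coefficient table shown above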

ggplot: relationship in 2D

mpg vs weight (with a simple trend line)

Even though our final model uses two predictors, it helps to visualize one relationship first.

R code that makes the ggplot

The following code produced the ggplot on the previous slide:

library(ggplot2)

p1 <- ggplot(df, aes(x = wt, y = mpg)) +
  geom_point(size = 2) +
  geom_smooth(method = "lm", se = TRUE) +  # least-squares trend line with confidence band
  labs(
    title = "mpg vs weight (wt)",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon (mpg)"
  )

p1

Least Squares

How the “best” line/plane is chosen

The regression coefficients are chosen to minimize the sum of squared residuals:

\[\displaystyle SSE = \sum_{i=1}^n(y_i - \hat{y}_i)^2\]

With two predictors:

\[\displaystyle \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2}\]

Residual:

\[\displaystyle e_i = y_i - \hat{y}_i\]

Small residuals (on average) → better fit.
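These definitions are easy to verify in R (assuming the fit object from earlier):

e <- df$mpg - fitted(fit)               # residuals e_i = y_i - y_hat_i
sum(e^2)                                # SSE, the quantity least squares minimizes
all.equal(sum(e^2), sum(resid(fit)^2))  # matches lm's stored residuals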

ggplot: diagnostics

Residuals vs fitted values

A common diagnostic: residuals should look randomly scattered around 0.
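The slide’s plotting code isn’t shown; a minimal ggplot sketch for this diagnostic, assuming fit from earlier:

diag_df <- data.frame(fitted = fitted(fit), resid = resid(fit))

ggplot(diag_df, aes(x = fitted, y = resid)) +
  geom_point(size = 2) +
  geom_hline(yintercept = 0, linetype = "dashed") +  # residuals should scatter around this line
  labs(
    title = "Residuals vs fitted",
    x = "Fitted values",
    y = "Residuals"
  )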

Plotly (3D) regression plane

3D view: (wt, hp) → mpg

This is the same model, but visualized as a plane in 3D.

Create a grid of predictor values for the plane

wt_seq <- seq(min(df$wt), max(df$wt), length.out = 25)
hp_seq <- seq(min(df$hp), max(df$hp), length.out = 25)

grid <- expand.grid(wt = wt_seq, hp = hp_seq)
grid$mpg_hat <- predict(fit, newdata = grid)

Turn predictions into a matrix for plotly surface
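The reshaping and plotting code isn’t shown on the slide; a minimal sketch, assuming the grid built above. Because expand.grid varies wt fastest, filling the matrix column-major puts wt along rows and hp along columns:

z_mat <- matrix(grid$mpg_hat, nrow = length(wt_seq), ncol = length(hp_seq))

library(plotly)

plot_ly() %>%
  add_markers(data = df, x = ~wt, y = ~hp, z = ~mpg) %>%  # observed cars
  add_surface(x = wt_seq, y = hp_seq, z = t(z_mat),       # fitted plane; surface z is length(y) x length(x)
              opacity = 0.6, showscale = FALSE) %>%
  layout(scene = list(xaxis = list(title = "wt"),
                      yaxis = list(title = "hp"),
                      zaxis = list(title = "mpg")))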

Inference: p-values + confidence intervals

What do the results say?

coef(summary(fit))
               Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 37.22727012 1.59878754 23.284689 2.565459e-20
wt          -3.87783074 0.63273349 -6.128695 1.119647e-06
hp          -0.03177295 0.00902971 -3.518712 1.451229e-03

confint(fit)
                  2.5 %      97.5 %
(Intercept) 33.95738245 40.49715778
wt          -5.17191604 -2.58374544
hp          -0.05024078 -0.01330512

How to interpret:

  • p-value for a coefficient tests \(H_0 : \beta_j = 0\)
  • A 95% CI gives a plausible range of values for \(\beta_j\)
  • If the 95% CI excludes 0, the p-value is below 0.05; for these t-based intervals and tests, the two always agree.
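As a concrete check, the wt interval above can be reproduced from the coefficient table with a t critical value (assuming fit from earlier):

est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]
est + c(-1, 1) * qt(0.975, df.residual(fit)) * se  # reproduces the confint(fit) row for wt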

Takeaways

  • Multiple linear regression models a response using multiple predictors:

    • mpg decreases as weight increases (about 3.9 mpg per 1,000 lbs here, holding hp fixed)

    • mpg also tends to decrease as horsepower increases (holding weight fixed)

  • Always check diagnostics (residual plots) before trusting conclusions

  • 3D Plotly makes the “fitted plane” interpretation very intuitive