Overview

  • Topic: Simple linear regression
  • Example: Predict MPG from Weight (mtcars)
  • We will show two ggplot2 figures, one 3D plotly figure, two LaTeX math slides, and one code slide.

Model (LaTeX slide #1)

We model a response \(Y\) with a predictor \(X\):

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad \varepsilon_i \stackrel{iid}{\sim} N(0,\sigma^2) \]

Interpretation: \(\beta_1\) is the expected change in \(Y\) for a one-unit increase in \(X\), and \(\beta_0\) is the expected value of \(Y\) when \(X = 0\).
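For reference, the least-squares estimates of the two coefficients have a closed form. This is the standard textbook result for simple linear regression, written in the notation of this slide; \(\bar X\) and \(\bar Y\) (the sample means) and \(n\) (the sample size) are symbols introduced here.

```latex
\[
\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2},
\qquad
\hat\beta_0 = \bar Y - \hat\beta_1 \bar X
\]
```

The slope is the sample covariance of \(X\) and \(Y\) divided by the sample variance of \(X\), and the fitted line always passes through \((\bar X, \bar Y)\).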

Assumptions (LaTeX slide #2)

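  1. Linearity: the mean of \(Y\) is a linear function of \(X\)
  2. Independence: the errors \(\varepsilon_i\) are independent of one another
  3. Constant variance (homoscedasticity): \(\mathrm{Var}(\varepsilon_i) = \sigma^2\) for every \(i\)
  4. Normal errors: \(\varepsilon_i \sim N(0,\sigma^2)\)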
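One way the normality assumption might be checked is sketched below in base R. The block fits the deck's model directly on mtcars so that it runs on its own; the object names `fit`, `res`, and `sw` are illustrative and not taken from the original slides.

```r
# Sketch only: fit the same model as the deck, directly on mtcars.
fit <- lm(mpg ~ wt, data = mtcars)
res <- resid(fit)

# Shapiro-Wilk test on the residuals: a small p-value is evidence
# against normally distributed errors.
sw <- shapiro.test(res)
print(sw)

# Normal Q-Q plot: points should roughly follow the reference line.
qqnorm(res)
qqline(res)
```

With only 32 observations the test has limited power, so the Q-Q plot is usually worth reading alongside the p-value. Independence is a property of how the data were collected and cannot be confirmed from these outputs.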

Data

We use the built-in mtcars data set and keep the three columns we need: mpg (miles per gallon), wt (weight in 1000 lb), and hp (gross horsepower).

df <- dplyr::select(mtcars, mpg, wt, hp)
head(df)
##                    mpg    wt  hp
## Mazda RX4         21.0 2.620 110
## Mazda RX4 Wag     21.0 2.875 110
## Datsun 710        22.8 2.320  93
## Hornet 4 Drive    21.4 3.215 110
## Hornet Sportabout 18.7 3.440 175
## Valiant           18.1 3.460 105

ggplot #1 - Relationship
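The figure for this slide did not survive in the text. A minimal sketch of how such a plot could be produced with ggplot2 follows; the axis labels, title, and the name `p1` are assumptions, and `df` is rebuilt with base R so the block runs on its own.

```r
library(ggplot2)

# Same three columns as the deck's df, selected with base R.
df <- mtcars[, c("mpg", "wt", "hp")]

# Scatter of MPG against weight with a least-squares line and its
# confidence band. Styling is illustrative, not the original slide's.
p1 <- ggplot(df, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(x = "Weight (1000 lb)", y = "Miles per gallon",
       title = "MPG vs weight")
p1
```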


Fit the model (R code)

mod <- lm(mpg ~ wt, data = df)
summary(mod)
## 
## Call:
## lm(formula = mpg ~ wt, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
coef(mod)
## (Intercept)          wt 
##   37.285126   -5.344472
confint(mod)
##                 2.5 %    97.5 %
## (Intercept) 33.450500 41.119753
## wt          -6.486308 -4.202635
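As a cross-check, the coefficients can be reproduced from the closed-form least-squares formulas, and the fitted model can be used for prediction. This is a sketch: `b0`, `b1`, and `pred` are illustrative names, the 3,000 lb car is a hypothetical input, and `df` and `mod` are rebuilt with base R so the block runs on its own.

```r
df <- mtcars[, c("mpg", "wt", "hp")]   # same columns as the deck's df
mod <- lm(mpg ~ wt, data = df)

# Closed-form estimates for a single predictor:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
b1 <- cov(df$wt, df$mpg) / var(df$wt)
b0 <- mean(df$mpg) - b1 * mean(df$wt)
c(b0 = b0, b1 = b1)

# Should agree with coef(mod) up to floating-point error.
all.equal(unname(coef(mod)), c(b0, b1))

# Predicted mean MPG for a hypothetical 3,000 lb car (wt = 3),
# with a 95% confidence interval for that mean.
pred <- predict(mod, newdata = data.frame(wt = 3), interval = "confidence")
pred
```

From the coefficients printed above, the point prediction is 37.285 - 5.344 × 3, or roughly 21.25 MPG.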

ggplot #2 - Residuals vs Fitted
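The figure for this slide is also missing from the text. One possible way to draw a residuals-vs-fitted plot with ggplot2 is sketched here; `diag_df`, `p2`, the loess smoother, and the labels are assumptions rather than the original slide's code.

```r
library(ggplot2)

df <- mtcars[, c("mpg", "wt", "hp")]
mod <- lm(mpg ~ wt, data = df)

# One row per car: fitted value and residual from the model.
diag_df <- data.frame(fitted = fitted(mod), resid = resid(mod))

# Residuals should scatter evenly around the dashed zero line;
# a curved smoother hints at non-linearity, a funnel at non-constant variance.
p2 <- ggplot(diag_df, aes(x = fitted, y = resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Fitted MPG", y = "Residual", title = "Residuals vs fitted")
p2
```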


plotly (3D) - Interactive Exploration
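The interactive widget cannot be reproduced in text. A minimal sketch of a 3D scatter with plotly is given below, assuming the three axes are the three columns kept in `df` (wt, hp, mpg); that choice, the marker size, the axis titles, and the name `p3` are all assumptions.

```r
library(plotly)

df <- mtcars[, c("mpg", "wt", "hp")]

# 3D scatter: weight and horsepower on the horizontal axes, MPG on the vertical.
p3 <- plot_ly(df, x = ~wt, y = ~hp, z = ~mpg,
              type = "scatter3d", mode = "markers",
              marker = list(size = 4))

# Axis titles for a 3D plot are set under layout(scene = ...).
p3 <- layout(p3, scene = list(
  xaxis = list(title = "Weight (1000 lb)"),
  yaxis = list(title = "Horsepower"),
  zaxis = list(title = "MPG")
))
p3
```

In an HTML slide deck or the RStudio viewer the plot can be rotated and zoomed, which makes it easier to see how MPG varies with weight and horsepower together.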

Takeaways

  • \(\hat\beta_1 \approx -5.34 < 0\): each additional 1000 lb of weight is associated with about 5.3 fewer MPG (95% CI: \(-6.49\) to \(-4.20\)).
  • Always check diagnostics before trusting the model.
  • Interactive plots help explore multivariate structure.