1. Welcome & Agenda

  • Quick intro to Simple Linear Regression
  • Data example: car weight vs. fuel efficiency
  • Fit the model, test the slope, check assumptions
  • Plots and key takeaways

2. Why Regression?

  • Predict one numeric value from another.
  • Examples:
    • Fuel efficiency vs. car weight
    • Blood pressure vs. sodium intake
  • We’ll use the built-in mtcars dataset.

3. The Model

We model

\[ Y = \beta_0 + \beta_1 X + \varepsilon \]

where:

  • \(Y\): response (mpg)
  • \(X\): predictor (weight)
  • \(\beta_0\), \(\beta_1\): intercept and slope
  • \(\varepsilon\): random error
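As a rough illustration of what fitting this model involves, the least-squares estimates can be computed by hand in base R and compared with lm(). The names x, y, b0, b1, hand and by_lm are illustrative only and do not appear in the original slides.

```r
# Least-squares estimates for mpg = b0 + b1 * wt, computed by hand.
# x, y, b0, b1, hand, by_lm are illustrative names, not from the slides.
x <- mtcars$wt
y <- mtcars$mpg

b1 <- cov(x, y) / var(x)        # slope: sample covariance over variance of x
b0 <- mean(y) - b1 * mean(x)    # intercept: line passes through (mean x, mean y)

# Compare with the estimates returned by lm()
hand  <- c(b0, b1)
by_lm <- unname(coef(lm(mpg ~ wt, data = mtcars)))
print(rbind(hand, by_lm))
```

The two rows should agree up to floating-point error, since lm() minimises the same sum of squared residuals.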

4. Key Assumptions

  1. Linearity of mean response
  2. Constant variance of errors
  3. Independence of observations
  4. Errors approximately normal (needed for t-tests and confidence intervals)

5. Data & Code

library(tidyverse)
library(broom)
library(plotly)

data(mtcars)
df <- mtcars %>% select(mpg, wt, hp)
head(df)
##                    mpg    wt  hp
## Mazda RX4         21.0 2.620 110
## Mazda RX4 Wag     21.0 2.875 110
## Datsun 710        22.8 2.320  93
## Hornet 4 Drive    21.4 3.215 110
## Hornet Sportabout 18.7 3.440 175
## Valiant           18.1 3.460 105
fit <- lm(mpg ~ wt, data = df)
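Once the model is fitted, it can be used for the prediction task described earlier. The sketch below assumes a hypothetical 3000 lb car (new_car is not in the original slides) and refits the model on mtcars so that it runs on its own.

```r
# Refit so the snippet runs on its own (same model as in the slides).
fit <- lm(mpg ~ wt, data = mtcars)

# new_car is a hypothetical car weighing 3000 lb (wt is in 1000 lbs).
new_car <- data.frame(wt = 3)

# Point prediction plus a 95% prediction interval for a single new car.
pred <- predict(fit, newdata = new_car, interval = "prediction", level = 0.95)
print(pred)
```

The result is a one-row matrix with columns fit, lwr and upr. Using interval = "confidence" instead would give the narrower interval for the mean mpg of all cars at that weight.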

6. ggplot: Scatter + Line

ggplot(df, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title="Fuel Efficiency vs Weight",
       x="Weight (1000 lbs)", y="MPG")

7. Interpreting Slope

  • Slope \(\hat{\beta}_1\): estimated change in mean mpg for each 1000 lb increase in weight.
  • Example output (coefficient row for wt):
##      Estimate    Std. Error       t value      Pr(>|t|) 
## -5.344472e+00  5.591010e-01 -9.559044e+00  1.293959e-10
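The slide does not show the call that produced this row. One plausible way to obtain it, offered here as an assumption rather than the author's exact code, is to index the coefficient table from summary(). The names slope_row and slope_ci are illustrative.

```r
# Refit so the snippet runs on its own (same model as in the slides).
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient row for wt: estimate, standard error, t value, p-value.
slope_row <- summary(fit)$coefficients["wt", ]
print(slope_row)

# 95% confidence interval for the slope.
slope_ci <- confint(fit, "wt", level = 0.95)
print(slope_ci)
```

Read in context, an estimate of about -5.34 means each additional 1000 lb is associated with roughly 5.3 fewer miles per gallon on average.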

8. Hypothesis Test

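Test whether the slope differs from 0:

\[ H_0: \beta_1 = 0, \quad H_a: \beta_1 \neq 0 \]

t-statistic:

\[ t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} \]

Under \(H_0\), \(t\) follows a t distribution with \(n - 2\) degrees of freedom (30 for mtcars).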
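The test statistic can be reproduced by hand from the coefficient table. This is a minimal sketch in base R, with illustrative names (coefs, t_stat, df_res, p_val) that are not part of the original slides.

```r
# Refit so the snippet runs on its own (same model as in the slides).
fit <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients

# t = estimate / standard error
t_stat <- coefs["wt", "Estimate"] / coefs["wt", "Std. Error"]

# Residual degrees of freedom: n - 2 for simple linear regression.
df_res <- df.residual(fit)

# Two-sided p-value from the t distribution.
p_val <- 2 * pt(-abs(t_stat), df = df_res)
print(c(t = t_stat, df = df_res, p = p_val))
```

The values should match the t value and Pr(>|t|) columns that summary(fit) reports for wt.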

9. ggplot: Residual Plot

aug <- augment(fit)
ggplot(aug, aes(.fitted, .resid)) +
  geom_hline(yintercept=0, linetype="dashed") +
  geom_point() +
  labs(title="Residuals vs Fitted", x="Fitted", y="Residuals")
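The residuals-vs-fitted plot speaks mainly to linearity and constant variance. For the normality assumption, one common complement is a normal Q-Q plot, sketched here in base R. The Shapiro-Wilk test is added as an optional numeric check and is not something the original slides use. The names res and sw are illustrative.

```r
# Refit so the snippet runs on its own (same model as in the slides).
fit <- lm(mpg ~ wt, data = mtcars)
res <- resid(fit)

# Normal Q-Q plot: points close to the line suggest roughly normal errors.
qqnorm(res, main = "Normal Q-Q Plot of Residuals")
qqline(res, lty = 2)

# Shapiro-Wilk test as a numeric complement; with n = 32 it has limited power.
sw <- shapiro.test(res)
print(sw)
```

With only 32 observations, the plot is usually more informative than the test's p-value alone.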

10. Plotly 3D

plot_ly(df, x=~wt, y=~hp, z=~mpg,
        type="scatter3d", mode="markers") %>%
  layout(scene=list(
    xaxis=list(title="Weight"),
    yaxis=list(title="Horsepower"),
    zaxis=list(title="MPG")
  ))

11. Key Takeaways

  • SLR predicts a numeric outcome from one predictor.
  • Check assumptions with residual plots.
  • In our example: heavier cars → lower mpg.
  • Plotly's interactive 3D scatter shows mpg against weight and horsepower at once.