mtcars dataset and variables

We will model fuel efficiency using R's built-in mtcars dataset (32 cars).

  • Response: mpg (miles per gallon)
  • Predictor: wt (weight in 1000 lbs)
  • Extra variable for 3D plot: hp (horsepower)
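
Before modelling, it can help to look at the three variables listed above. The following is a minimal sketch; the object name `vars` is only an illustrative choice, not part of the original material.

```r
# mtcars ships with base R; keep only the three columns used in this section
vars <- mtcars[, c("mpg", "wt", "hp")]

dim(vars)      # number of cars (rows) and variables (columns)
summary(vars)  # min, quartiles, mean and max for each column
```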

Model (math)

We use simple linear regression:

\[ mpg_i = \beta_0 + \beta_1\,wt_i + \varepsilon_i \]

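with the usual assumptions: the errors \(\varepsilon_i\) are independent with \(E[\varepsilon_i]=0\) and constant variance \(\mathrm{Var}(\varepsilon_i)=\sigma^2\). The \(t\) test for the slope additionally assumes the errors are normally distributed.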
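
As a rough illustration of how this model maps to R, the sketch below fits it by least squares with `lm()` and extracts the two estimated coefficients. It assumes nothing beyond base R and the mtcars data.

```r
# Fit mpg_i = beta_0 + beta_1 * wt_i + eps_i by ordinary least squares
fit <- lm(mpg ~ wt, data = mtcars)

# Named vector: "(Intercept)" estimates beta_0, "wt" estimates beta_1
coef(fit)
```

On mtcars the estimated slope is negative (about -5.34), so each additional 1000 lbs of weight is associated with roughly 5.3 fewer miles per gallon.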

ggplot 1: mpg vs wt (with fitted line)

Inference (math): testing slope and p-value

To test whether weight is associated with mpg:

\[ H_0: \beta_1 = 0 \quad\text{vs}\quad H_a: \beta_1 \neq 0 \]

The test statistic is

\[ t = \frac{\hat\beta_1}{SE(\hat\beta_1)} \]

and the two-sided p-value is computed from the \(t\) distribution with \(n-2\) degrees of freedom (here \(n = 32\), so 30 degrees of freedom).
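
The test above can be reproduced by hand in R. The sketch below is one way to do it, not the only one: it takes the slope estimate and its standard error from `summary()`, forms the ratio, and looks up the two-sided tail probability with `pt()`. The names `coefs`, `b1`, `se_b1`, `t_stat`, `df_resid` and `p_value` are illustrative.

```r
fit   <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients  # matrix: Estimate, Std. Error, t value, Pr(>|t|)

b1    <- coefs["wt", "Estimate"]    # estimated slope
se_b1 <- coefs["wt", "Std. Error"]  # its standard error

t_stat   <- b1 / se_b1              # test statistic for H0: beta_1 = 0
df_resid <- df.residual(fit)        # n - 2 residual degrees of freedom

# Two-sided p-value: probability of a |t| at least this large under H0
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

c(t = t_stat, df = df_resid, p = p_value)

# Both should agree with the values that summary() reports
all.equal(t_stat,  coefs["wt", "t value"])
all.equal(p_value, coefs["wt", "Pr(>|t|)"])
```

For mtcars the p-value is far below 0.05, so the data give strong evidence against \(H_0: \beta_1 = 0\).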

ggplot 2: residuals vs fitted

plotly: interactive 3D scatter (mpg, wt, hp)

R code (all plots shown)

# ---- Setup ----
library(ggplot2)
library(plotly)
data(mtcars)

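# ---- ggplot 1: mpg vs wt (with fitted line) ----
# Scatter plot with the least-squares line and its 95% confidence band
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")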

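# ---- ggplot 2: residuals vs fitted ----
# Fit the simple linear regression, then collect fitted values and residuals
fit <- lm(mpg ~ wt, data = mtcars)
resid_df <- data.frame(fitted = fitted(fit), resid = resid(fit))  # avoids masking stats::df

# If the model is adequate, residuals scatter evenly around the zero line
ggplot(resid_df, aes(x = fitted, y = resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Fitted mpg", y = "Residual")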

# ---- plotly: interactive 3D scatter (mpg, wt, hp) ----
# Each marker is one car: weight on x, horsepower on y, mpg on z
plot_ly(
  mtcars,
  x = ~wt, y = ~hp, z = ~mpg,
  type = "scatter3d",
  mode = "markers"
)