We will be looking at the relationship between commute time and distance traveled. For this presentation, we have 500 observations from each of the following cities: Boston, Houston, Minneapolis, and Washington.
2026-02-08
\[\text{Time}_i = \beta_0 + \beta_1 \cdot \text{Distance}_i + \varepsilon_i\]
\[ H_0: \beta_1 = 0 \]
\[ H_a: \beta_1 \neq 0 \]
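The hypotheses above concern the slope \(\beta_1\). As a minimal sketch of how that test could be read off a fitted model in R, the helper below pulls the estimate, t statistic, p-value, and 95% confidence interval for the Distance coefficient from `summary()` and `confint()`. The name `slope_test` is hypothetical and not part of the original analysis, and the demo data are simulated stand-ins with made-up parameters, not the MetroCommutes observations.

```r
# Hypothetical helper (not from the original slides): fit Time ~ Distance
# and return the pieces needed to test H0: beta_1 = 0.
slope_test <- function(data) {
  fit <- lm(Time ~ Distance, data = data)
  coefs <- summary(fit)$coefficients  # rows: (Intercept), Distance
  list(
    estimate = coefs["Distance", "Estimate"],
    t_value  = coefs["Distance", "t value"],
    p_value  = coefs["Distance", "Pr(>|t|)"],
    conf_int = confint(fit, "Distance", level = 0.95)
  )
}

# Simulated stand-in data (assumed values, NOT the MetroCommutes data).
set.seed(1)
demo <- data.frame(Distance = runif(200, min = 1, max = 40))
demo$Time <- 10 + 1.2 * demo$Distance + rnorm(200, sd = 8)

result <- slope_test(demo)
result$p_value < 0.05  # TRUE here means H0 is rejected at the 5% level
```

With the real data, the same call would presumably be `slope_test(MetroCommutes)`, assuming the data frame has `Time` and `Distance` columns as in the model above.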
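library(ggplot2)

# Fit the simple linear regression of commute time on distance.
# MetroCommutes is assumed to be loaded already (for example, from the Lock5Data package).
g <- lm(Time ~ Distance, data = MetroCommutes)

# Plot residuals against fitted values to check linearity and constant variance.
ggplot(
  data.frame(fitted = fitted(g), residuals = resid(g)),
  aes(x = fitted, y = residuals)
) +
  geom_point(alpha = 0.4) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    title = "Residuals vs Fitted Values",
    x = "Fitted Commute Time",
    y = "Residuals"
  )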
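A residuals-versus-fitted plot is read by eye, so a numeric companion can be useful. The sketch below is one possible check and was not part of the original analysis: it compares the straight-line fit with a fit that adds a squared Distance term, using an F-test from `anova()`. The name `curvature_check` is hypothetical, and both demo data sets are simulated with made-up parameters rather than taken from MetroCommutes.

```r
# Hypothetical helper (not from the original slides): p-value of the F-test
# comparing Time ~ Distance against Time ~ Distance + Distance^2.
# A small p-value suggests curvature that the straight line misses.
curvature_check <- function(data) {
  linear    <- lm(Time ~ Distance, data = data)
  quadratic <- lm(Time ~ Distance + I(Distance^2), data = data)
  comparison <- anova(linear, quadratic)
  comparison[2, "Pr(>F)"]
}

# Simulated stand-in data (assumed values, NOT the MetroCommutes data).
set.seed(2)
distance <- runif(200, min = 1, max = 40)
straight <- data.frame(Distance = distance,
                       Time = 10 + 1.2 * distance + rnorm(200, sd = 8))
curved   <- data.frame(Distance = distance,
                       Time = 10 + 0.05 * distance^2 + rnorm(200, sd = 2))

p_straight <- curvature_check(straight)  # truly linear relationship
p_curved   <- curvature_check(curved)    # strongly curved relationship
```

A large p-value here does not prove linearity; it only means the squared term adds little beyond the straight line.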
Based on the available data, commute time generally increases as commute distance increases. We modeled this relationship with a simple linear regression of commute time on distance. The residuals-versus-fitted plot supports the use of a linear model for this data set.