Linear Regression: Interpretation

What this presentation covers

What linear regression is
The mathematical model
An example using real data
Diagnostics and prediction
Visualization with ggplot and Plotly

Why linear regression?

Linear regression is used to explain and predict a numeric outcome using one or more predictors.

Examples:
- Business: sales vs advertising
- Engineering: output vs system inputs
- Biology: growth vs dosage

In this presentation (using the built-in mtcars dataset):
- Response variable: mpg (miles per gallon)
- Predictor: wt (car weight, in units of 1000 lbs)

The linear regression model

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

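Where:
- \(\beta_0\) is the intercept: the expected value of \(Y\) when \(X = 0\)
- \(\beta_1\) is the slope: the expected change in \(Y\) for a one-unit increase in \(X\)
- \(\varepsilon_i\) is the random error, assumed to have mean zero and constant variance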

Least squares estimation (Math)

\[ \min_{\beta_0,\beta_1}\sum_{i=1}^{n}(Y_i-\beta_0-\beta_1X_i)^2 \]

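The minimizing values \(\hat{\beta}_0\) and \(\hat{\beta}_1\) define the “best-fitting” line: the one with the smallest sum of squared residuals.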
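
For a single predictor, this minimization has a well-known closed-form solution:

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y}-\hat{\beta}_1\bar{X} \]

As a minimal sketch, the formulas can be checked against lm() on the mtcars data. The variable names (x, y, b0, b1, closed_form, lm_coef) are illustrative and not part of the original slides.

```r
# Closed-form least squares estimates for mpg ~ wt (built-in mtcars data)
x <- mtcars$wt
y <- mtcars$mpg

# Slope: sum of cross-deviations divided by sum of squared deviations of x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: chosen so the line passes through the point of means
b0 <- mean(y) - b1 * mean(x)

# Compare with lm(), which solves the same minimization problem
closed_form <- c(b0, b1)
lm_coef <- unname(coef(lm(mpg ~ wt, data = mtcars)))
print(all.equal(closed_form, lm_coef))
```

The two sets of coefficients should agree up to floating-point tolerance.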

mpg vs weight (ggplot)
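
The figure itself is not reproduced here. A plot of this kind could be drawn roughly as follows; this is a sketch that assumes the ggplot2 package is installed, and the object name p_scatter, the axis labels, and the choice to overlay the fitted line with a confidence band are illustrative guesses rather than the original slide's code.

```r
library(ggplot2)

# Scatter plot of mpg against weight with the least squares line overlaid
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # method = "lm" fits the same simple linear regression; se = TRUE adds a confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "mpg vs weight"
  )

print(p_scatter)
```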

Model fit and coefficient table

Linear regression results: mpg ~ wt
term	estimate	conf.low	conf.high	p.value
(Intercept)	37.2851	33.4505	41.1198	<0.001
wt	-5.3445	-6.4863	-4.2026	<0.001

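Interpretation:
- Negative slope → heavier cars have lower mpg: each additional 1000 lbs of weight is associated with about 5.3 fewer miles per gallon on average
- The p-value for wt tests the null hypothesis that the slope is zero; it is well below 0.001, so weight is a statistically significant predictor
- The 95% confidence interval for the slope excludes zero, which agrees with the p-value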
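
A table like the one above can be assembled with base R alone. The column names in the table resemble the output of broom::tidy(), but that is an assumption; the sketch below avoids the dependency, and coef_table is an illustrative name.

```r
# Fit the simple linear regression
fit1 <- lm(mpg ~ wt, data = mtcars)

# 95% confidence intervals for both coefficients
ci <- confint(fit1, level = 0.95)

# Collect estimates, interval bounds, and p-values in one data frame
coef_table <- data.frame(
  term      = names(coef(fit1)),
  estimate  = unname(coef(fit1)),
  conf.low  = ci[, 1],
  conf.high = ci[, 2],
  p.value   = summary(fit1)$coefficients[, "Pr(>|t|)"],
  row.names = NULL
)

print(coef_table)
```

Printing the unrounded p-values shows that they are very small but not exactly zero.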

Diagnostics: residuals vs fitted (ggplot)

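A well-specified model shows residuals scattered randomly around zero, with no visible trend (which would suggest nonlinearity) and no funnel shape (which would suggest non-constant variance).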
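
The diagnostic plot is not reproduced here. One way to draw a residuals-vs-fitted plot with ggplot2 is sketched below; it assumes ggplot2 is installed, and the names diag_df and p_resid as well as the added loess smoother are illustrative choices rather than the original slide's code.

```r
library(ggplot2)

fit1 <- lm(mpg ~ wt, data = mtcars)

# One row per car: fitted value and residual
diag_df <- data.frame(
  fitted = fitted(fit1),
  resid  = resid(fit1)
)

p_resid <- ggplot(diag_df, aes(x = fitted, y = resid)) +
  geom_point() +
  # Reference line at zero: residuals should scatter evenly around it
  geom_hline(yintercept = 0, linetype = "dashed") +
  # A smoother helps reveal any remaining trend in the residuals
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Fitted mpg", y = "Residual", title = "Residuals vs fitted")

print(p_resid)
```

A clearly curved smoother would hint that a straight line in wt does not capture the full relationship.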

Prediction and uncertainty

For a new value \(x_0\):

\[ \hat{Y}(x_0)=\hat{\beta}_0+\hat{\beta}_1 x_0 \]

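Two intervals:
- Confidence interval: uncertainty about the mean response at \(x_0\)
- Prediction interval: uncertainty about a single new observation at \(x_0\); it is wider because it also includes the error variance of that observation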
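
To make the difference concrete, the sketch below computes the point prediction by hand from the formula above and then requests both interval types from predict() at \(x_0 = 3.0\) (3000 lbs). The names new_car, by_hand, ci_mean, and pi_new are illustrative.

```r
fit1 <- lm(mpg ~ wt, data = mtcars)

# wt is recorded in units of 1000 lbs, so 3.0 means a 3000 lb car
new_car <- data.frame(wt = 3.0)

# Point prediction by hand: beta0_hat + beta1_hat * x0
by_hand <- unname(coef(fit1)[1] + coef(fit1)[2] * new_car$wt)

# Interval for the mean mpg of all cars at this weight
ci_mean <- predict(fit1, newdata = new_car, interval = "confidence")

# Interval for the mpg of one new car at this weight
pi_new <- predict(fit1, newdata = new_car, interval = "prediction")

# Both intervals share the same centre; only their widths differ
print(by_hand)
print(ci_mean)
print(pi_new)
print(c(
  ci_width = ci_mean[1, "upr"] - ci_mean[1, "lwr"],
  pi_width = pi_new[1, "upr"] - pi_new[1, "lwr"]
))
```

Both calls use the default 95% level; the level argument of predict() changes it.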

Code

# Fit the model
fit1 <- lm(mpg ~ wt, data = mtcars)

# Predict mpg for a 3000 lb car (wt is in units of 1000 lbs, so wt = 3.0),
# with a 95% prediction interval
predict(fit1, newdata = data.frame(wt = 3.0), interval = "prediction")

##        fit      lwr      upr
## 1 21.25171 14.92987 27.57355

3D Plotly visualization
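
The interactive figure is not reproduced here, and the slides do not state which variables it shows. As a hedged sketch, assuming the plotly package is installed and assuming horsepower (hp) as a second predictor, a 3D scatter plot with a fitted regression plane could be built as follows. The names fit2, wt_seq, hp_seq, z_mat, and p_3d are illustrative.

```r
library(plotly)

# Two-predictor model: the fitted surface is a plane (hp is an assumed choice)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

# Grid of predictor values spanning the observed ranges
wt_seq <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 30)
hp_seq <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 30)
grid <- expand.grid(wt = wt_seq, hp = hp_seq)  # wt varies fastest

# Surface traces expect z[i, j] at y[i], x[j]: rows index hp, columns index wt
z_mat <- matrix(
  predict(fit2, newdata = grid),
  nrow = length(hp_seq), ncol = length(wt_seq), byrow = TRUE
)

p_3d <- plot_ly() %>%
  add_markers(data = mtcars, x = ~wt, y = ~hp, z = ~mpg, name = "Cars") %>%
  add_surface(x = wt_seq, y = hp_seq, z = z_mat,
              opacity = 0.5, showscale = FALSE, name = "Fitted plane") %>%
  layout(scene = list(
    xaxis = list(title = "Weight (1000 lbs)"),
    yaxis = list(title = "Horsepower"),
    zaxis = list(title = "Miles per gallon")
  ))

p_3d
```

Rotating the plot shows how far individual cars sit above or below the fitted plane.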

Key takeaways

Linear regression models a numeric outcome as a linear function of one or more predictors
Least squares finds the best-fitting line or plane
ggplot and Plotly plots make the fit and its diagnostics easier to interpret and communicate