Slide 1: Introduction

What is Linear Regression?

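Linear regression models the relationship between a dependent variable (y) and one or more independent variables (x) as a linear function. With a single predictor, the fit is a straight line.

With one predictor, the model is given below, where \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon\) is the random error term: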

\[ y = \beta_0 + \beta_1 x + \epsilon \]
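
To make the roles of the symbols concrete, here is a minimal R sketch that simulates data from this model and then recovers the coefficients with lm(). The intercept, slope, noise level, and sample size are arbitrary values assumed for illustration; none of them come from the slides.

```r
set.seed(42)                             # make the simulation reproducible

n     <- 200                             # hypothetical sample size
beta0 <- 2                               # hypothetical intercept
beta1 <- 0.5                             # hypothetical slope

x       <- runif(n, min = 0, max = 10)   # independent variable
epsilon <- rnorm(n, mean = 0, sd = 1)    # random error term
y       <- beta0 + beta1 * x + epsilon   # dependent variable generated by the model

sim_fit <- lm(y ~ x)                     # fit a straight line to the simulated data
coef(sim_fit)                            # estimates should land near beta0 and beta1
```

Because of the noise term, the estimates will be close to the assumed values but not exactly equal to them.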

Slide 2: Example Dataset

We’ll use R’s built-in mtcars dataset to explore the relationship between car weight (wt, in 1,000 lb) and fuel efficiency in miles per gallon (mpg).

head(mtcars[, c("wt", "mpg")])
##                      wt  mpg
## Mazda RX4         2.620 21.0
## Mazda RX4 Wag     2.875 21.0
## Datsun 710        2.320 22.8
## Hornet 4 Drive    3.215 21.4
## Hornet Sportabout 3.440 18.7
## Valiant           3.460 18.1

Slide 3: Data Visualization (ggplot2)

We can visualize the relationship between wt and mpg.
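
The plot itself is not reproduced here. One way to draw it, as a minimal sketch with ggplot2 (the axis labels, title, and the object name scatter_plot are illustrative choices, not taken from the original slide):

```r
library(ggplot2)

# Scatter plot of fuel efficiency against weight, one point per car
scatter_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "Weight (1,000 lb)",
    y = "Miles per gallon",
    title = "Fuel efficiency vs. weight (mtcars)"
  )

scatter_plot
```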

Slide 4: Fitting the Model

We’ll fit a simple linear regression model predicting mpg from weight.

model <- lm(mpg ~ wt, data = mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
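
The key numbers in this output can also be pulled out programmatically instead of being read off the printout. A short sketch using standard accessors from base R (the object name fit_summary is an illustrative choice):

```r
model       <- lm(mpg ~ wt, data = mtcars)
fit_summary <- summary(model)

coef(model)                    # intercept and slope estimates
fit_summary$r.squared          # share of variance in mpg explained by wt
fit_summary$sigma              # residual standard error
confint(model, level = 0.95)   # 95% confidence intervals for both coefficients
```

An R-squared of about 0.75 means that weight alone accounts for roughly three quarters of the variation in mpg in this dataset.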

Slide 5: Regression Line (ggplot2)

Let’s add the regression line to the scatter plot.

## `geom_smooth()` using formula = 'y ~ x'
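
The figure is not reproduced here. A plot of this kind, and the message shown above, could come from a call like the following sketch (the labels and the object name line_plot are illustrative; the original chunk may have used different options):

```r
library(ggplot2)

# Scatter plot with the least squares line and its confidence band
line_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +   # fits mpg ~ wt, i.e. 'y ~ x'
  labs(x = "Weight (1,000 lb)", y = "Miles per gallon")

line_plot
```

The message appears because no formula is passed to geom_smooth(), so it reports the default it used.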

Slide 6: Model Equation and Interpretation

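From the model output, the fitted equation is:

\[ \hat{y} = 37.285 - 5.344x \]

Here \(\hat{y}\) is the predicted mpg and \(x\) is the weight in units of 1,000 lb. For each additional 1,000 lb of car weight, predicted fuel efficiency falls by about 5.34 mpg on average. The intercept, 37.285, is the predicted mpg at a weight of zero, which lies outside the range of the data and has no practical interpretation.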
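
As a worked check of this interpretation, the sketch below predicts mpg for two hypothetical cars weighing 3,000 lb and 4,000 lb. The weights are assumed values chosen for illustration.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Two hypothetical cars; wt is measured in units of 1,000 lb
new_cars      <- data.frame(wt = c(3, 4))
predicted_mpg <- predict(model, newdata = new_cars)

predicted_mpg         # predicted mpg at 3,000 lb and at 4,000 lb
diff(predicted_mpg)   # change per extra 1,000 lb, equal to the slope
```

The difference between the two predictions matches the wt coefficient, which is exactly what the slope means.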

Slide 7: 3D Visualization (Plotly)

Let’s visualize mpg, wt, and hp (horsepower) together in 3D.
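
The interactive figure is not reproduced here. A minimal sketch of a 3D scatter plot with plotly follows; the axis assignment, marker size, and axis titles are assumptions, since the original chunk is not shown.

```r
library(plotly)

# Interactive 3D scatter: weight and horsepower on the base, mpg on the vertical axis
fig <- plot_ly(
  data = mtcars,
  x = ~wt, y = ~hp, z = ~mpg,
  type = "scatter3d", mode = "markers",
  marker = list(size = 4)
)

fig <- layout(
  fig,
  scene = list(
    xaxis = list(title = "Weight (1,000 lb)"),
    yaxis = list(title = "Horsepower"),
    zaxis = list(title = "Miles per gallon")
  )
)

fig
```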

Slide 8: Math Behind Linear Regression

We minimize the Sum of Squared Errors (SSE):

\[ SSE = \sum_{i=1}^{n} (y_i - (\beta_0 + \beta_1 x_i))^2 \]

To find estimates:

\[ \frac{\partial SSE}{\partial \beta_0} = 0, \quad \frac{\partial SSE}{\partial \beta_1} = 0 \]
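
Written out, these two conditions become the normal equations. This is the standard expansion of the derivatives of the SSE defined above; it does not appear on the original slide.

\[ \frac{\partial SSE}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - (\beta_0 + \beta_1 x_i)) = 0 \]

\[ \frac{\partial SSE}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i (y_i - (\beta_0 + \beta_1 x_i)) = 0 \]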

Solving these gives the Least Squares Estimators for \(\beta_0\) and \(\beta_1\).
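
For one predictor, the solution has the well-known closed form \(\hat{\beta}_1 = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) / \sum_{i=1}^{n} (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The sketch below computes both by hand for the mtcars example and compares them with lm(). The names b0 and b1 are illustrative.

```r
x <- mtcars$wt    # predictor: weight in 1,000 lb
y <- mtcars$mpg   # response: miles per gallon

# Closed-form least squares estimates for a single predictor
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)       # hand-computed estimates
coef(lm(mpg ~ wt, data = mtcars))   # estimates from lm(), for comparison
```

Both approaches should agree up to floating-point error.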

Slide 9: Checking Model Residuals
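
The diagnostic plots for this slide are not reproduced here. A minimal sketch of two common residual checks in base R follows; the choice of plots and their labels is an assumption about what the slide showed.

```r
model           <- lm(mpg ~ wt, data = mtcars)
model_residuals <- resid(model)

# Residuals vs. fitted values: look for curvature or a funnel shape
plot(fitted(model), model_residuals,
     xlab = "Fitted mpg", ylab = "Residual",
     main = "Residuals vs. fitted values")
abline(h = 0, lty = 2)   # reference line at zero

# Normal Q-Q plot: points near the line suggest roughly normal residuals
qqnorm(model_residuals)
qqline(model_residuals)
```

Residuals scattered evenly around zero with no visible pattern support the linear model. A curved pattern would suggest that a straight line is too simple for these data.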

Slide 10: Conclusion

  • Linear regression helps us understand and predict relationships between variables.
  • In mtcars, heavier cars tend to have lower fuel efficiency: about 5.3 mpg less per additional 1,000 lb.
  • Visualization and diagnostics are key parts of model interpretation.

Thank you for viewing!

Slide 11: References

  • Dataset: mtcars (R datasets package, included with base R)
  • Packages: ggplot2, plotly, dplyr
  • Concept: Simple Linear Regression in Statistics