2025-03-24

Introduction

Simple Linear Regression is a statistical technique used to model the relationship between a dependent variable \(y\) and an independent variable \(x\).

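The model is represented by the equation: \[ y = \beta_0 + \beta_1 x + \varepsilon \]

Here \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is a random error term.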

Example Dataset

We will study the relationship between car weight and miles per gallon (mpg) using the built-in mtcars dataset.

# Load the data
library(datasets)
data <- mtcars

Scatter Plot with Regression Line (ggplot2)
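
The plot itself is not reproduced here. The following is a minimal sketch of how such a figure could be drawn, assuming the ggplot2 package is installed. The object name `scatter_plot`, the axis labels, and the title are illustrative choices, not taken from the original slide.

```r
library(ggplot2)

# Scatter plot of mpg against weight with a fitted least-squares line.
# geom_smooth(method = "lm") fits mpg ~ wt internally; se = TRUE adds a
# 95% confidence band around the line.
scatter_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "mpg vs. weight with fitted regression line"
  )

print(scatter_plot)
```

Setting `se = FALSE` would draw the line without the confidence band.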

Residual Plot (ggplot2)
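
The residual plot is likewise not reproduced here. One possible way to draw it, again assuming ggplot2 is installed, is to plot the residuals against the fitted values. The names `fit`, `resid_df`, and `residual_plot` are hypothetical.

```r
library(ggplot2)

# Fit the model and collect fitted values and residuals in a data frame.
fit <- lm(mpg ~ wt, data = mtcars)
resid_df <- data.frame(
  fitted = fitted(fit),
  residual = resid(fit)
)

# Residuals vs. fitted values; the dashed line marks zero.
residual_plot <- ggplot(resid_df, aes(x = fitted, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    x = "Fitted mpg",
    y = "Residual",
    title = "Residuals vs. fitted values"
  )

print(residual_plot)
```

Points scattered around zero with no clear pattern are consistent with the linearity assumption; a curve or funnel shape would suggest the straight-line model is missing something.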

Regression Equation in LaTeX

The fitted model has the form: \[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]

The estimated regression equation is: \[ \hat{\text{mpg}} = 37.29 - 5.34 \times \text{wt} \]

Where:

  • \(\text{wt}\): weight of the car (in 1000 lbs)
  • \(\hat{\text{mpg}}\): predicted miles per gallon
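
As a check on the estimated equation, the ordinary least squares coefficients for a single predictor can be computed from the closed-form formulas: slope = cov(x, y) / var(x), and intercept = mean(y) minus slope times mean(x). The sketch below uses the illustrative names `b0` and `b1`.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least-squares estimates for a single predictor.
b1 <- cov(x, y) / var(x)        # slope
b0 <- mean(y) - b1 * mean(x)    # intercept

# Rounded to two decimals for comparison with the equation in the text.
round(c(intercept = b0, slope = b1), 2)
```

The rounded values should agree with the 37.29 and -5.34 quoted in the estimated equation.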

Plotly Interactive Plot
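
The interactive plot does not survive in this text version. A rough sketch of an equivalent figure, assuming the plotly package is installed, is shown below. The names `fit`, `plot_df`, and `fig`, and the hover labels, are assumptions for illustration.

```r
library(plotly)

fit <- lm(mpg ~ wt, data = mtcars)

# Sort by weight so the fitted line is drawn left to right.
plot_df <- mtcars[order(mtcars$wt), ]
plot_df$fitted <- predict(fit, newdata = plot_df)
plot_df$car <- rownames(plot_df)

# Observed points (car name shown on hover) plus the fitted line.
fig <- plot_ly(plot_df, x = ~wt) %>%
  add_markers(y = ~mpg, text = ~car, name = "Observed") %>%
  add_lines(y = ~fitted, name = "Fitted line") %>%
  layout(
    xaxis = list(title = "Weight (1000 lbs)"),
    yaxis = list(title = "Miles per gallon")
  )

fig
```

An alternative would be to pass an existing ggplot2 object to `ggplotly()`, which converts it to an interactive plotly figure.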

R Code for Regression

# Fit the regression model
model <- lm(mpg ~ wt, data = data)

# Summary of the model
summary(model)

Output of Regression Summary

## 
## Call:
## lm(formula = mpg ~ wt, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
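
The quantities in this summary can also be pulled out of the fitted model programmatically. The sketch below refits the model on `mtcars` so that it is self-contained. The object names `new_car` and `predicted_mpg`, and the choice of a 3000 lb car, are hypothetical.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Pull individual quantities out of the fitted model.
coef(model)                    # intercept and slope
summary(model)$r.squared       # proportion of variance in mpg explained by wt
confint(model, level = 0.95)   # 95% confidence intervals for the coefficients

# Predicted mpg for a hypothetical car weighing 3000 lbs (wt = 3).
new_car <- data.frame(wt = 3)
predicted_mpg <- predict(model, newdata = new_car)
predicted_mpg
```

With the coefficients reported above, the prediction works out to roughly 37.29 - 5.34 × 3 ≈ 21.3 mpg.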

Conclusion

  • Simple Linear Regression models a linear relationship between two variables.
  • We explored the relationship between car weight and MPG using the mtcars dataset.
  • Visualization was done using ggplot2 and plotly; the model was fit with lm().
  • The estimated slope of -5.34 indicates that each additional 1000 lbs of weight is associated with a decrease of about 5.34 mpg on average.

Thank You!