2025-03-16

Introduction

Simple linear regression is a statistical method for modeling the relationship between two continuous variables:
- Independent Variable (X)
- Dependent Variable (Y)

We assume the relationship follows the equation:

\[ Y = \beta_0 + \beta_1 X + \epsilon \]

Where:
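- \(\beta_0\) is the intercept (the expected value of Y when X = 0)
- \(\beta_1\) is the slope (the expected change in Y for a one-unit increase in X)
- \(\epsilon\) is the error term (the variation in Y that the line does not explain)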
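
To make the roles of the intercept, slope, and error term concrete, the sketch below simulates data from this equation and recovers the two coefficients with the closed-form least-squares estimates. The true values (2 and 3), the sample size, the noise level, and all variable names are assumptions chosen purely for illustration.

```r
# Simulate data from Y = beta_0 + beta_1 * X + epsilon (illustrative values only)
set.seed(42)
n     <- 200
beta0 <- 2                               # assumed true intercept
beta1 <- 3                               # assumed true slope
x     <- runif(n, min = 0, max = 10)
eps   <- rnorm(n, mean = 0, sd = 1)      # error term
y     <- beta0 + beta1 * x + eps

# Closed-form ordinary least squares estimates
b1_hat <- cov(x, y) / var(x)             # slope
b0_hat <- mean(y) - b1_hat * mean(x)     # intercept

c(intercept = b0_hat, slope = b1_hat)
```

The estimates should land close to the assumed true values, and they agree with what coef(lm(y ~ x)) returns for the same data.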

Example: Predicting MPG

We want to predict miles per gallon (mpg) from horsepower (hp).

Dataset: mtcars
- X (hp): Horsepower of the car
- Y (mpg): Miles per gallon

Exploratory Data Analysis
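
A minimal sketch of what this step might look like, assuming the built-in mtcars data and the ggplot2 package: summarise the two variables, check their correlation, and plot mpg against hp. The object names cars, hp_mpg_cor, and eda_plot are illustrative.

```r
library(ggplot2)

# Keep only the two variables used in this example
cars <- mtcars[, c("hp", "mpg")]

# Numeric summaries and the linear correlation between hp and mpg
summary(cars)
hp_mpg_cor <- cor(cars$hp, cars$mpg)
hp_mpg_cor

# Scatter plot of the raw data
eda_plot <- ggplot(cars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(title = "MPG vs Horsepower",
       x = "Horsepower (hp)",
       y = "Miles per gallon (mpg)")
eda_plot
```

The correlation is negative: cars with more horsepower tend to have lower fuel economy, which is what motivates fitting a line.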

Fitting a Linear Model

Use the lm() function to fit a regression model of mpg on hp.
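
A minimal sketch of the fit, assuming mpg is regressed on hp from the built-in mtcars data and that the fitted object is called model (the name the residual plot in these slides refers to):

```r
# Fit mpg as a linear function of hp
model <- lm(mpg ~ hp, data = mtcars)

# Coefficient table, R-squared, and residual standard error
summary(model)

# Estimated intercept and slope
coef(model)
```

The slope estimate is negative, so the predicted mpg falls as horsepower rises.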

The estimated regression equation is:

\[ \widehat{\text{mpg}} = \hat{\beta}_0 + \hat{\beta}_1 \times \text{horsepower} \]
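
Plugging a horsepower value into the estimated equation gives a predicted mpg. The sketch below does this both by hand and with predict(); the value of 150 hp and the names new_car, by_hand, and by_predict are assumptions for illustration.

```r
# Refit the model so this block runs on its own
model <- lm(mpg ~ hp, data = mtcars)

# A hypothetical car with 150 horsepower
new_car <- data.frame(hp = 150)

# Prediction from the estimated equation, computed by hand
b <- coef(model)
by_hand <- b[["(Intercept)"]] + b[["hp"]] * new_car$hp

# The same prediction using predict()
by_predict <- unname(predict(model, newdata = new_car))

c(by_hand = by_hand, by_predict = by_predict)
```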

Visualizing the Regression Line
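
One way to draw the fitted line over the data, sketched here with ggplot2's geom_smooth(method = "lm"), which fits the same least-squares line internally. The titles, colours, and the name reg_plot are illustrative choices.

```r
library(ggplot2)

# Scatter plot of the data with the least-squares line and its confidence band
reg_plot <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, color = "blue") +
  labs(title = "Regression of MPG on Horsepower",
       x = "Horsepower (hp)",
       y = "Miles per gallon (mpg)")
reg_plot
```

Setting se = FALSE would drop the shaded confidence band and show only the line.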

Residual Analysis

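library(ggplot2)
# Assumes the simple regression of mpg on hp used throughout this example
model <- lm(mpg ~ hp, data = mtcars)

# Histogram of the residuals to check for rough symmetry around zero
ggplot(data.frame(residuals = residuals(model)), aes(x = residuals)) +
  geom_histogram(fill = "blue", color = "white", bins = 30) +
  labs(title = "Residuals Distribution",
       x = "Residuals",
       y = "Frequency")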
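
A histogram shows only the shape of the residual distribution. A common complementary check, sketched here as a suggestion, is to plot residuals against fitted values and look for curvature or a changing spread. The name diag_df and the plot labels are illustrative.

```r
library(ggplot2)

# Assumed model: mpg regressed on hp, refit so this block runs on its own
model <- lm(mpg ~ hp, data = mtcars)

# Collect fitted values and residuals in one data frame
diag_df <- data.frame(fitted = fitted(model),
                      residuals = residuals(model))

# Residuals vs fitted values, with a reference line at zero
ggplot(diag_df, aes(x = fitted, y = residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Residuals vs Fitted Values",
       x = "Fitted mpg",
       y = "Residuals")
```

If the linear model is adequate, the points should scatter around the dashed line without a clear pattern.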

3D Visualization