Simple Linear Regression

Kamalesh Reddi Arugunta

2025-11-10

Introduction

In this presentation, we explore Simple Linear Regression, one of the most widely used tools in Statistics and Machine Learning.
It helps us understand the relationship between two variables and make predictions.

What is Simple Linear Regression?

Simple linear regression models the relationship between a dependent variable \(y\) and an independent variable \(x\) using a straight line.

The model assumes: \[ y = \beta_0 + \beta_1 x + \epsilon \]

Where:
- \(\beta_0\): intercept
- \(\beta_1\): slope
- \(\epsilon\): random error term
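To make the roles of the three terms concrete, the sketch below simulates data from this model and then recovers the parameters with lm(). The true values (intercept 2, slope 3, error standard deviation 0.5), the sample size, and the seed are arbitrary choices for illustration, not values from this presentation.

```r
# Simulate data from y = beta_0 + beta_1 * x + epsilon.
# All numeric settings below are illustrative assumptions.
set.seed(42)
n      <- 200
beta_0 <- 2   # assumed true intercept
beta_1 <- 3   # assumed true slope

x       <- runif(n, min = 0, max = 10)
epsilon <- rnorm(n, mean = 0, sd = 0.5)  # random error term
y       <- beta_0 + beta_1 * x + epsilon

# Fit a straight line and compare the estimates with the true values
sim_fit <- lm(y ~ x)
coef(sim_fit)
```

The estimates should land close to 2 and 3 but not exactly on them, because the error term adds noise to every observation.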

Example Dataset: mtcars

We’ll use R’s built-in mtcars dataset.
It contains car performance data such as miles per gallon (mpg), horsepower, and weight.

We’ll predict mpg (miles per gallon) using weight (wt).
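A quick look at the two columns we need can be sketched as follows. The helper name cars_sub is hypothetical; mtcars itself ships with base R, and its wt column is recorded in units of 1000 lbs.

```r
# mtcars is part of the 'datasets' package that loads with R
data(mtcars)

# Keep only the response (mpg) and the predictor (wt)
cars_sub <- mtcars[, c("mpg", "wt")]

dim(mtcars)        # number of cars and number of variables
head(cars_sub)     # first few rows
summary(cars_sub)  # range and quartiles of each column
```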

Visualizing the Relationship

The scatter plot of mpg against wt shows a negative relationship: heavier cars tend to have lower fuel efficiency.
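The plotting code is not shown on this slide. One way such a scatter plot could be produced is sketched below, assuming the ggplot2 package is installed; the axis labels, title, and the object name scatter_plot are illustrative choices.

```r
library(ggplot2)

# Scatter plot of fuel efficiency against weight
scatter_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "MPG vs. weight"
  )
print(scatter_plot)

# The sample correlation puts a number on the negative trend
cor(mtcars$wt, mtcars$mpg)
```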

Fitting the Model

We can fit a linear model in R using the lm() function.

model <- lm(mpg ~ wt, data = mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

This code estimates the regression coefficients by least squares: \(\hat{\beta}_0 = 37.2851\) (the intercept) and \(\hat{\beta}_1 = -5.3445\) (the slope for wt).

Regression Equation

Rounding the coefficients in the output gives the estimated equation: \[ \hat{y} = 37.29 - 5.34x \]

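Interpretation:
- The intercept (37.29) is the predicted mpg for a car with zero weight. No such car exists, so this value is an extrapolation that positions the line rather than a meaningful prediction.
- The slope (-5.34) means that each additional 1000 lbs of weight is associated with a decrease of about 5.34 mpg on average (wt is recorded in units of 1000 lbs).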
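As a sketch of how the fitted equation is used for prediction, the example below refits the model and predicts mpg for a hypothetical car weighing 3000 lbs (wt = 3). The value 3 and the names new_car, pred, and manual are chosen here for illustration.

```r
# Refit the model so this example is self-contained
model <- lm(mpg ~ wt, data = mtcars)

# A hypothetical car weighing 3000 lbs (wt is in units of 1000 lbs)
new_car <- data.frame(wt = 3)

# Prediction from the fitted model
pred <- predict(model, newdata = new_car)
pred  # roughly 21.25 mpg

# The same number by plugging into the equation directly
manual <- coef(model)[["(Intercept)"]] + coef(model)[["wt"]] * 3
manual
```

Predictions are most trustworthy inside the range of weights observed in the data; the zero-weight intercept above is an example of what happens outside it.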

Regression Line Plot


This plot shows the fitted regression line drawn over the data, together with a confidence band around the line.
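The code behind this slide is not shown. A minimal sketch that would give a comparable figure is below, assuming ggplot2 is installed. The 95% level is ggplot2's default for geom_smooth() and is written out here as an assumption about the original plot; labels and the name line_plot are illustrative.

```r
library(ggplot2)

# Scatter plot with a least-squares line and its confidence band
line_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "Fitted regression line"
  )
print(line_plot)
```

Passing formula = y ~ x explicitly keeps geom_smooth() from printing its informational message about the formula it chose.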

3D Visualization with Plotly

Let’s visualize MPG, Weight, and Horsepower together in 3D.

This interactive 3D plot helps us visualize how both weight and horsepower impact fuel efficiency.
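The plotting code is not shown here either. A sketch of one way to build such a figure, assuming the plotly package is installed, is below; the marker size, axis titles, and the name fig are illustrative choices. Horsepower (hp) appears only in the picture, not in the simple regression model, which still uses weight alone.

```r
library(plotly)

# 3D scatter plot: weight and horsepower on the base, mpg on the vertical axis
fig <- plot_ly(
  data = mtcars,
  x = ~wt, y = ~hp, z = ~mpg,
  type = "scatter3d", mode = "markers",
  marker = list(size = 4)
)

# Label the three axes
fig <- layout(fig, scene = list(
  xaxis = list(title = "Weight (1000 lbs)"),
  yaxis = list(title = "Horsepower"),
  zaxis = list(title = "Miles per gallon")
))
fig
```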

Mathematical Foundation

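We estimate the parameters \(\beta_0\) and \(\beta_1\) using the least squares method, which chooses the estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize the sum of squared residuals: \[ \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \quad \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \]

The resulting estimates are: \[ \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]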
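These closed-form expressions can be checked directly against lm(). The sketch below applies them to the mtcars data; the variable names b0 and b1 are chosen here for illustration.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Slope: sum of cross-deviations over sum of squared deviations of x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: forces the line through the point (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)

# Compare with the coefficients that lm() reports
coef(lm(mpg ~ wt, data = mtcars))
```

Both approaches should agree up to floating-point rounding, since lm() solves the same least squares problem.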

Model Performance

Let’s check how well our model fits.

## [1] 0.7528328

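Here \(R^2 \approx 0.753\), so weight alone explains about 75% of the variation in mpg. An \(R^2\) value close to 1 means the fitted line accounts for most of the variation in the response.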
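The code that printed this value is not shown. A sketch of how \(R^2\) can be obtained, along with two equivalent hand calculations, is below; the variable names are chosen for illustration.

```r
# Refit the model so this example is self-contained
model <- lm(mpg ~ wt, data = mtcars)

# R-squared as reported by summary()
r2_summary <- summary(model)$r.squared

# By definition: 1 - (residual sum of squares / total sum of squares)
ss_res    <- sum(residuals(model)^2)
ss_tot    <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_manual <- 1 - ss_res / ss_tot

# With a single predictor, R-squared equals the squared correlation
r2_cor <- cor(mtcars$wt, mtcars$mpg)^2

c(summary = r2_summary, manual = r2_manual, correlation = r2_cor)
```

The last identity holds only for simple linear regression with an intercept; with more than one predictor, the squared pairwise correlation no longer equals \(R^2\).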

Conclusion
