2025-10-20

Simple Linear Regression: An Overview

We’ll explore simple linear regression, a foundational statistical method that models a response variable as a linear function of a single predictor.

3D Visualization (Plotly)

Regression Line (ggplot #1)

## `geom_smooth()` using formula = 'y ~ x'

Residual Distribution (ggplot #2)

Regression Coefficients

##             Estimate Std. Error  t value     Pr(>|t|)
## (Intercept) 2.028376 0.29338333  6.91374 4.840349e-10
## x           1.473764 0.05343931 27.57828 5.575170e-48
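As a quick check on how the columns of this table relate, the sketch below recomputes the t values and two-sided p-values from the reported estimates and standard errors. The numbers are copied from the table; the 98 residual degrees of freedom come from the model summary, and the variable names are illustrative only.

```r
# Values copied from the coefficient table in this document
est <- c(intercept = 2.028376, x = 1.473764)
se  <- c(intercept = 0.29338333, x = 0.05343931)
df_resid <- 98  # residual degrees of freedom reported in the model summary

# t value = estimate / standard error
t_val <- est / se

# Two-sided p-value from the t distribution with df_resid degrees of freedom
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

print(round(t_val, 3))
print(signif(p_val, 3))
```

Both p-values are far below 0.05, which is why the summary marks the two coefficients with `***`.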

Model Summary

## 
## Call:
## lm(formula = y ~ x, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9073 -0.6835 -0.0875  0.5806  3.2904 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.02838    0.29338   6.914 4.84e-10 ***
## x            1.47376    0.05344  27.578  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9707 on 98 degrees of freedom
## Multiple R-squared:  0.8859, Adjusted R-squared:  0.8847 
## F-statistic: 760.6 on 1 and 98 DF,  p-value: < 2.2e-16
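The data and code behind this summary are not shown, so the following is only a sketch of how output of this shape could be produced. It assumes simulated data with a hypothetical true intercept of 2, slope of 1.5, and unit error variance; the seed, the range of `x`, and those parameter values are assumptions, and its numbers will not match the summary above. Only `n = 100` is inferred from the source (98 residual degrees of freedom means n - 2 = 98).

```r
set.seed(1)  # arbitrary seed; the original seed is unknown
n <- 100     # inferred from the 98 residual degrees of freedom

# Hypothetical data-generating process (assumed, not taken from the source)
df <- data.frame(x = runif(n, min = 0, max = 10))
df$y <- 2 + 1.5 * df$x + rnorm(n, sd = 1)

fit <- lm(y ~ x, data = df)
s <- summary(fit)

print(coef(s))       # estimate, standard error, t value, p-value
print(s$sigma)       # residual standard error
print(s$r.squared)   # multiple R-squared
print(s$fstatistic)  # F statistic with its degrees of freedom

# With a single predictor, the F statistic equals the squared slope t value
f_val  <- unname(s$fstatistic["value"])
t_sq   <- coef(s)["x", "t value"]^2
print(all.equal(f_val, t_sq))
```

The same identity holds in the reported output: the slope's t value of 27.578 squared is about 760.6, the reported F-statistic.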

Regression Equation (Math in LaTeX)

The simple linear regression model is given by:

\[ Y = \beta_0 + \beta_1 X + \varepsilon \]

where:

- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope
- \(\varepsilon\) is the random error term
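To make the model concrete, here is a minimal sketch of how the intercept and slope can be estimated by ordinary least squares using the standard closed-form expressions, and how the result compares with `lm()`. The toy vectors `x` and `y` are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(3.4, 5.1, 6.3, 8.2, 9.1, 11.0)

# Closed-form least-squares estimates of the slope and intercept
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Residuals: the observed counterpart of the error term
res <- y - (b0 + b1 * x)

print(c(intercept = b0, slope = b1))
print(coef(lm(y ~ x)))  # should agree with the closed-form values
```

When an intercept is included, the least-squares residuals sum to zero up to floating-point error.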

Confidence Interval for Slope

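\[ \hat{\beta}_1 \pm t_{\alpha/2, n-2} \cdot SE(\hat{\beta}_1) \]

Here \(\hat{\beta}_1\) is the estimated slope and \(t_{\alpha/2, n-2}\) is the critical value of the t distribution with \(n-2\) degrees of freedom. With \(\alpha = 0.05\), this is a 95% confidence interval: across repeated samples, intervals built this way would contain the true slope \(\beta_1\) about 95% of the time.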
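As a worked example, the sketch below applies this formula to the slope estimate and standard error reported in the coefficient table. It assumes `n = 100`, inferred from the 98 residual degrees of freedom; the variable names are illustrative.

```r
# Values copied from the coefficient table in this document
b1_hat <- 1.473764
se_b1  <- 0.05343931
n      <- 100   # inferred: 98 residual degrees of freedom = n - 2
alpha  <- 0.05  # for a 95% interval

# Critical value t_{alpha/2, n-2}
t_crit <- qt(1 - alpha / 2, df = n - 2)

# Lower and upper limits of the interval
ci <- b1_hat + c(-1, 1) * t_crit * se_b1
print(round(ci, 3))

# With the fitted model object available, the equivalent call would be:
# confint(fit, "x", level = 0.95)   # `fit` is a hypothetical lm object
```

The interval comes out at roughly 1.37 to 1.58. It excludes zero, consistent with the very small p-value for the slope.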