2026-02-08

Simple Linear Regression Definition

A simple linear regression model is a linear equation relating two quantitative variables:

  • An independent (predictor) variable - x

  • A dependent (response) variable - y

Examples

  • Relationship between height and weight

  • Relationship between salary and expenditure

  • Relationship between car payment and car insurance

Mathematical Model

The simple linear regression model\(^{(1)}\):

\[ \hat{y} = b_0 + b_1 x \]

  • x = independent variable

  • \(\hat{y}\) = predicted value of the dependent variable y

  • \(b_0\) = intercept

  • \(b_1\) = slope
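
As a rough illustration of how the model is used, the sketch below fits a line to R's built-in faithful dataset and predicts a response for one new x value. It assumes R's lm() function as the fitting tool; the names fit, x_new, y_hat_manual, and y_hat_predict are illustrative only and do not come from the slides.

```r
# Sketch: fit y-hat = b0 + b1 * x with lm() on the built-in faithful dataset
# (x = eruption duration, y = waiting time; variable names here are illustrative)
fit <- lm(waiting ~ eruptions, data = faithful)

# Pull out the intercept (b0) and slope (b1) from the fitted model
b  <- coef(fit)
b0 <- b[["(Intercept)"]]
b1 <- b[["eruptions"]]

# Predict the waiting time for a hypothetical 3.5-minute eruption, two ways
x_new <- 3.5
y_hat_manual  <- b0 + b1 * x_new                                    # plug into the equation
y_hat_predict <- predict(fit, newdata = data.frame(eruptions = x_new))  # built-in helper

print(c(manual = y_hat_manual, predict = unname(y_hat_predict)))
```

Both values should agree, since predict() evaluates the same equation with the fitted coefficients.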

Least-Squares Regression Line

The equation is \[ \hat{y} = b_0 + b_1 x \]

  • \(b_0 = \bar{y} - b_1 \bar{x}\) (intercept)
  • \(b_1 = r \left( \frac{s_y}{s_x} \right)\) (slope)
  • r = correlation coefficient
  • \(\bar{x}, \bar{y}\) = sample means; \(s_x, s_y\) = sample standard deviations

Alternate computational equation \[ b_1 = \frac{ \sum xy - \frac{(\sum x)(\sum y)}{n} }{ \sum x^2 - \frac{(\sum x)^2}{n} } = \frac{S_{xy}}{S_{xx}} \]
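
The two slope formulas can be checked against each other numerically. The following is a minimal sketch on the built-in faithful dataset, assuming eruptions as x and waiting as y; the names b1_corr, b1_comp, Sxy, and Sxx are illustrative, and lm() is used only as an independent cross-check.

```r
# Sketch: compute the least-squares coefficients by hand and compare with lm()
x <- faithful$eruptions   # independent variable (assumed choice)
y <- faithful$waiting     # dependent variable (assumed choice)
n <- length(x)

# Slope from the correlation form: b1 = r * (s_y / s_x)
r       <- cor(x, y)
b1_corr <- r * sd(y) / sd(x)

# Slope from the computational form: b1 = S_xy / S_xx
Sxy     <- sum(x * y) - sum(x) * sum(y) / n
Sxx     <- sum(x^2) - sum(x)^2 / n
b1_comp <- Sxy / Sxx

# Intercept: b0 = y-bar - b1 * x-bar
b0 <- mean(y) - b1_comp * mean(x)

# Cross-check against R's built-in least-squares fit
fit <- lm(waiting ~ eruptions, data = faithful)
print(c(b0 = b0, b1_corr = b1_corr, b1_comp = b1_comp))
print(coef(fit))
```

Up to floating-point rounding, both slope values and the coefficients reported by lm() should match.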

Plotly Plot: Trees dataset

GGPlot #1: Air Quality Dataset

GGPlot #2: Faithful dataset

R Code: Waiting Time vs. Eruption Duration with Faithful dataset

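library(ggplot2)

# Scatter plot of the built-in faithful dataset with a fitted regression line
ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point(color = "red") +
  # method = "lm" draws the least-squares line; se = FALSE hides the confidence band
  geom_smooth(method = "lm", se = FALSE, color = "cyan") +
  labs(
    title = "faithful: Waiting Time vs. Eruption Duration",
    x = "Eruption Duration (min)",
    y = "Waiting Time (min)"
  )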

Sources