2026-02-09

Simple Linear Regression – Title Slide

Understanding relationships between two variables using statistics.

What is Simple Linear Regression?

Simple linear regression models the relationship between:

- One predictor variable \(x\)
- One response variable \(y\)

It assumes a linear relationship between the two.

Model Equation

The simple linear regression model is:

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

Where:

- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope
- \(\varepsilon\) is the random error term

Example Dataset

We generate simulated data where:

- \(x\) represents hours studied
- \(y\) represents exam score

The response follows a linear trend, \(y = 50 + 5x\), plus normally distributed noise with mean 0 and standard deviation 5.

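# Fix the random seed so the simulated noise is reproducible
set.seed(123)

# Predictor: 50 evenly spaced values of hours studied, from 1 to 10
data <- data.frame(
  x = seq(1, 10, length.out = 50)
)

# Response: true intercept 50, true slope 5, plus N(0, 5^2) noise
data$y <- 50 + 5 * data$x + rnorm(50, 0, 5)

# Fit the simple linear regression of y on x by ordinary least squares
model <- lm(y ~ x, data = data)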
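As a brief illustration of what the fitted object holds, the sketch below repeats the simulation and then reads off the estimated coefficients and one prediction. The names `est`, `new_hours`, and `pred`, and the choice of 6 hours, are illustrative assumptions rather than part of the original slides.

```r
# Recreate the simulated data and the fitted model from the slide
set.seed(123)
data <- data.frame(x = seq(1, 10, length.out = 50))
data$y <- 50 + 5 * data$x + rnorm(50, 0, 5)
model <- lm(y ~ x, data = data)

# Estimated intercept and slope; these should land near the true
# simulation values (50 and 5) but will not match them exactly
est <- coef(model)
print(est)

# Predicted exam score for a hypothetical student who studies 6 hours
new_hours <- data.frame(x = 6)
pred <- predict(model, newdata = new_hours)
print(pred)
```

Because the data are simulated, the estimates can be checked against the known true values, which is not possible with real data.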

Least Squares Estimation

The slope estimator is:

\[ \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})} {\sum (x_i - \bar{x})^2} \]

The intercept estimator is:

\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
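The two estimators can be checked numerically. The following sketch, assuming the same simulated data as in the example dataset, computes \(\hat{\beta}_1\) and \(\hat{\beta}_0\) directly from the formulas and compares them with the coefficients returned by `lm()`. The variable names (`x_bar`, `beta1_hat`, and so on) are chosen here for illustration.

```r
# Same simulated data as in the example dataset
set.seed(123)
x <- seq(1, 10, length.out = 50)
y <- 50 + 5 * x + rnorm(50, 0, 5)

# Sample means of the predictor and the response
x_bar <- mean(x)
y_bar <- mean(y)

# Slope: sum of cross-deviations divided by sum of squared x-deviations
beta1_hat <- sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)

# Intercept: forces the fitted line through the point (x_bar, y_bar)
beta0_hat <- y_bar - beta1_hat * x_bar

# Compare the hand-computed estimates with lm()
fit <- lm(y ~ x)
agree <- all.equal(unname(coef(fit)), c(beta0_hat, beta1_hat))
print(agree)
```

The comparison uses `all.equal()` rather than `==` because `lm()` solves the problem through a QR decomposition, so the two results can differ by floating-point rounding.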

Slide 7 — ggplot #1

Slide 8 — ggplot #2

Slide 9 — Plotly plot