2026-02-09

What is simple linear regression?

Simple linear regression models the relationship between:

  • A predictor variable \(x\)
  • A response variable \(y\)

The goal is to explain or predict \(y\) using a straight line.

The model (math)

\[ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^2) \]

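  • \(\beta_0\): intercept, the expected value of \(Y\) when \(x = 0\)
  • \(\beta_1\): slope, the expected change in \(Y\) for a one-unit increase in \(x\)
  • \(\varepsilon_i\): random error, normally distributed with mean 0 and variance \(\sigma^2\)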
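
The slides do not say how the coefficients are estimated. Assuming ordinary least squares (the method R's lm() uses), the estimates minimise the sum of squared residuals and have a closed form, where \(\bar{x}\) and \(\bar{y}\) are the sample means:

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

The second formula implies that the fitted line always passes through the point \((\bar{x}, \bar{y})\).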

Example data

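We use R's built-in airquality dataset (daily air quality measurements in New York, May to September 1973). Wind (average wind speed, mph) is the predictor and Temp (maximum daily temperature, degrees Fahrenheit) is the response.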

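library(ggplot2)  # static plots
library(plotly)   # interactive plots

# Load the data and keep only rows where both Temp and Wind are observed
data(airquality)
df <- airquality
df <- df[complete.cases(df$Temp, df$Wind), ]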

Fit the linear model

We fit a simple linear regression model predicting temperature from wind speed.
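
The code chunk that produced the summary is not shown. A call along the following lines would give output of that form; the object name `fit` is an assumption, while the formula and data argument are taken from the `Call:` line.

```r
# Rebuild the analysis data frame so this sketch runs on its own
data(airquality)
df <- airquality[complete.cases(airquality$Temp, airquality$Wind), ]

# Fit Temp as a linear function of Wind by ordinary least squares
fit <- lm(Temp ~ Wind, data = df)

# Print residual quantiles, the coefficient table, and fit statistics
summary(fit)
```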

Call:
lm(formula = Temp ~ Wind, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-23.291  -5.723   1.709   6.016  19.199 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  90.1349     2.0522  43.921  < 2e-16 ***
Wind         -1.2305     0.1944  -6.331 2.64e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.442 on 151 degrees of freedom
Multiple R-squared:  0.2098,    Adjusted R-squared:  0.2045 
F-statistic: 40.08 on 1 and 151 DF,  p-value: 2.642e-09
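
Since the stated goal is to explain or predict \(y\), it may help to see how the fitted model is used. The sketch below pulls out the coefficients, 95% confidence intervals, and a prediction for a hypothetical day; the names `fit`, `new_day`, and `pred`, and the choice of 10 mph, are illustrative assumptions.

```r
# Refit the model so this sketch runs on its own
data(airquality)
df <- airquality[complete.cases(airquality$Temp, airquality$Wind), ]
fit <- lm(Temp ~ Wind, data = df)

# Point estimates of the intercept and slope
coef(fit)

# 95% confidence intervals for both coefficients
confint(fit, level = 0.95)

# Predicted temperature, with a 95% prediction interval,
# for a hypothetical day with 10 mph average wind
new_day <- data.frame(Wind = 10)
pred <- predict(fit, newdata = new_day, interval = "prediction")
pred
```

With the estimates in the summary, the point prediction is roughly 90.13 - 1.23 × 10, or about 77.8 degrees Fahrenheit. The slope says each additional mph of wind is associated with a temperature about 1.23 degrees lower on average, and the R-squared of 0.21 says wind speed alone accounts for about a fifth of the variation in temperature.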

Scatter plot + regression line (ggplot)
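
The figure itself is not reproduced here. One way such a plot could be drawn is sketched below; the styling, axis labels, and the name `p` are assumptions rather than the original chunk.

```r
library(ggplot2)

# Rebuild the analysis data frame so this sketch runs on its own
data(airquality)
df <- airquality[complete.cases(airquality$Temp, airquality$Wind), ]

# Scatter plot of the raw data with the least-squares line overlaid;
# geom_smooth(method = "lm") fits Temp ~ Wind and draws a 95% confidence band
p <- ggplot(df, aes(x = Wind, y = Temp)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Wind (mph)",
    y = "Temp (degrees F)",
    title = "Temperature vs wind speed"
  )

p
```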

Residuals vs fitted values (ggplot)
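
The figure is not reproduced here. A residuals-vs-fitted plot could be built roughly as follows; `res_df` and `p_res` are hypothetical names, and the styling is an assumption.

```r
library(ggplot2)

# Refit the model so this sketch runs on its own
data(airquality)
df <- airquality[complete.cases(airquality$Temp, airquality$Wind), ]
fit <- lm(Temp ~ Wind, data = df)

# Collect fitted values and residuals in one data frame
res_df <- data.frame(
  fitted = fitted(fit),
  resid  = resid(fit)
)

# Residuals against fitted values, with a dashed reference line at zero
p_res <- ggplot(res_df, aes(x = fitted, y = resid)) +
  geom_point(alpha = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Fitted values", y = "Residuals")

p_res
```

If the straight-line model with constant error variance is adequate, the points should scatter evenly around zero with no clear trend or funnel shape.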

Scatter plot + regression line (plotly)
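
The interactive figure is not reproduced here. A minimal sketch using plotly's own interface is shown below; the original may instead have converted the ggplot figure, so the approach and the names `fig` and `fitted` are assumptions.

```r
library(plotly)

# Refit the model so this sketch runs on its own
data(airquality)
df <- airquality[complete.cases(airquality$Temp, airquality$Wind), ]
fit <- lm(Temp ~ Wind, data = df)

# Store fitted values and sort by Wind so the line is drawn left to right
df$fitted <- fitted(fit)
df <- df[order(df$Wind), ]

# Observed points as markers, fitted values as a line
fig <- plot_ly(df, x = ~Wind, y = ~Temp,
               type = "scatter", mode = "markers", name = "Observed")
fig <- add_lines(fig, x = ~Wind, y = ~fitted, name = "Fitted line")
fig <- layout(fig,
              xaxis = list(title = "Wind (mph)"),
              yaxis = list(title = "Temp (degrees F)"))

fig
```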