What is Simple Linear Regression?

We are modeling the relationship between:

  • predictor \(x\)
  • response \(y\)

The idea is to describe how the average value of \(y\) changes as \(x\) changes.

Model in LaTeX

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

  • \(\beta_0\): intercept
  • \(\beta_1\): slope
  • \(\varepsilon\): random error (noise), assumed to average zero so that \(\beta_0 + \beta_1 x\) is the mean of \(y\) at a given \(x\)

Least Squares Idea

We choose \(\hat{\beta}_0\) and \(\hat{\beta}_1\) to minimize the sum of squared errors:

\[ SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]

where \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\).
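Setting the partial derivatives of \(SSE\) with respect to \(\hat{\beta}_0\) and \(\hat{\beta}_1\) to zero gives the standard closed-form solution. It is stated here for reference, with \(\bar{x}\) and \(\bar{y}\) denoting the sample means (notation introduced only for this formula):

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]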

Example Data

##          x         y
## 1 2.875775  2.940276
## 2 7.883051 15.500151
## 3 4.089769  8.441400
## 4 8.830174 12.968987
## 5 9.404673 18.614639
## 6 0.455565  3.536276

Scatter Plot (ggplot)
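A scatter plot of this kind could be drawn roughly as follows. This is a minimal sketch, not the code behind the original figure: the name `df_head` is hypothetical, and it holds only the six rows printed in the example data, which are assumed to be the first rows of a larger dataset.

```r
library(ggplot2)

# Six rows copied from the printed example data (assumed to be only the first rows)
df_head <- data.frame(
  x = c(2.875775, 7.883051, 4.089769, 8.830174, 9.404673, 0.455565),
  y = c(2.940276, 15.500151, 8.441400, 12.968987, 18.614639, 3.536276)
)

# One point per observation: predictor on the horizontal axis, response on the vertical
p_scatter <- ggplot(df_head, aes(x = x, y = y)) +
  geom_point() +
  labs(x = "x (predictor)", y = "y (response)")

print(p_scatter)
```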

Regression Line (ggplot)
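One way to overlay the least squares line is `geom_smooth(method = "lm")`, sketched below under the same assumptions: `df_head` is a hypothetical name for the six printed rows only, so the line drawn here need not match the one fitted to the full data.

```r
library(ggplot2)

# Six rows copied from the printed example data (assumed to be only the first rows)
df_head <- data.frame(
  x = c(2.875775, 7.883051, 4.089769, 8.830174, 9.404673, 0.455565),
  y = c(2.940276, 15.500151, 8.441400, 12.968987, 18.614639, 3.536276)
)

# Points plus the fitted straight line; se = FALSE hides the confidence band
p_line <- ggplot(df_head, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "x (predictor)", y = "y (response)")

print(p_line)
```

Setting `se = TRUE` instead would add a shaded confidence band around the line.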

Linear Model

We fit the regression model using ordinary least squares.

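# Fit y on x by ordinary least squares; df is the example data frame shown earlier
model <- lm(y ~ x, data = df)
# Report coefficient estimates, standard errors, and R-squared
summary(model)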
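To see what `lm()` computes, the sketch below fits the same formula to the six printed rows and compares the result with the textbook closed-form estimates (slope equal to the sum of cross-deviations divided by the sum of squared deviations of \(x\)). The name `df_head` and the new predictor value `x = 5` are illustrative assumptions, and because only six rows are used, the estimates will differ from those of the full model.

```r
# Six rows copied from the printed example data (assumed to be only the first rows)
df_head <- data.frame(
  x = c(2.875775, 7.883051, 4.089769, 8.830174, 9.404673, 0.455565),
  y = c(2.940276, 15.500151, 8.441400, 12.968987, 18.614639, 3.536276)
)

# Fit by ordinary least squares, using the same formula as the slide
fit <- lm(y ~ x, data = df_head)

# Closed-form estimates: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar
sxy <- sum((df_head$x - mean(df_head$x)) * (df_head$y - mean(df_head$y)))
sxx <- sum((df_head$x - mean(df_head$x))^2)
b1 <- sxy / sxx
b0 <- mean(df_head$y) - b1 * mean(df_head$x)

# Both routes should agree up to floating-point error
print(coef(fit))
print(c(b0 = b0, b1 = b1))

# SSE is the sum of squared residuals of the fitted model
sse <- sum(resid(fit)^2)
print(sse)

# Estimated average response at a new, arbitrarily chosen predictor value
predict(fit, newdata = data.frame(x = 5))
```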

3D Visualization (plotly)
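One 3D view that fits the least squares idea, offered here as an assumption about what such a figure might show, is the SSE drawn as a surface over candidate intercepts and slopes. The sketch below uses the hypothetical `df_head` (the six printed rows only) and arbitrarily chosen grid ranges; the lowest point of the surface sits at the least squares estimates for those rows.

```r
library(plotly)

# Six rows copied from the printed example data (assumed to be only the first rows)
df_head <- data.frame(
  x = c(2.875775, 7.883051, 4.089769, 8.830174, 9.404673, 0.455565),
  y = c(2.940276, 15.500151, 8.441400, 12.968987, 18.614639, 3.536276)
)

# Grid of candidate coefficients (ranges chosen arbitrarily for illustration)
b0_grid <- seq(-5, 10, length.out = 40)
b1_grid <- seq(0, 3, length.out = 40)

# SSE for a single candidate (intercept, slope) pair
sse_fun <- function(b0, b1) sum((df_head$y - (b0 + b1 * df_head$x))^2)

# plotly surfaces pair z[i, j] with y[i] and x[j],
# so rows index the slope (y axis) and columns index the intercept (x axis)
sse_grid <- outer(b1_grid, b0_grid,
                  Vectorize(function(b1, b0) sse_fun(b0, b1)))

fig <- plot_ly(x = b0_grid, y = b1_grid, z = sse_grid, type = "surface")
fig <- layout(fig, scene = list(
  xaxis = list(title = "intercept"),
  yaxis = list(title = "slope"),
  zaxis = list(title = "SSE")
))
fig
```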