The simple linear regression model is defined as:
\[Y = \beta_0 + \beta_1 X + \epsilon\]
Where:
The coefficients are estimated using the least squares method, which minimizes the sum of squared residuals.
\[\hat{\beta}_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}\]
\[\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\]
The Faithful dataset records waiting time between eruptions and eruption duration for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. It contains two variables, eruptions and waiting, both measured in minutes, with 272 observations total.
## eruptions waiting
## Min. :1.600 Min. :43.0
## 1st Qu.:2.163 1st Qu.:58.0
## Median :4.000 Median :76.0
## Mean :3.488 Mean :70.9
## 3rd Qu.:4.454 3rd Qu.:82.0
## Max. :5.100 Max. :96.0
## `geom_smooth()` using formula = 'y ~ x'