- Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous variables.
- One variable, denoted \(x\), is regarded as the predictor, explanatory, or independent variable.
- The other variable, denoted \(y\), is regarded as the response, outcome, or dependent variable.
The simple linear regression model is defined by the following equation:
\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i\]
Where:
- \(y_i\) is the response variable for the \(i^{th}\) observation.
- \(x_i\) is the explanatory variable for the \(i^{th}\) observation.
- \(\beta_0\) is the y-intercept.
- \(\beta_1\) is the slope of the regression line.
- \(\epsilon_i\) is the random error term.
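
To make the model concrete, the sketch below generates data from it. The function name `simulate_linear_data`, the uniform range for \(x_i\), and the choice of normally distributed errors with mean zero and constant standard deviation `sigma` are all assumptions made for illustration; the model equation itself only says that \(\epsilon_i\) is random.

```python
import random


def simulate_linear_data(n, beta0, beta1, sigma, seed=0):
    """Draw n observations from y_i = beta0 + beta1 * x_i + eps_i.

    Assumption (not part of the model statement): eps_i is drawn from a
    normal distribution with mean 0 and standard deviation sigma, and
    x_i is drawn uniformly from [0, 10].
    """
    rng = random.Random(seed)  # seeded generator so runs are repeatable
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


# Hypothetical parameter values chosen only for the demonstration.
xs, ys = simulate_linear_data(n=50, beta0=2.0, beta1=0.5, sigma=1.0)
print(len(xs), len(ys))
```

Setting `sigma=0` removes the error term, so every simulated point falls exactly on the line \(y = \beta_0 + \beta_1 x\); larger values scatter the points further from it.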
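To find the line of best fit, we use the method of least squares: we choose the values \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize the sum of squared residuals, \(\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2\). The resulting estimates are: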
\[\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\]
\[\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\]
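
A brief sketch of where these two expressions come from, using the symbol \(S\) (introduced here only for illustration) for the sum of squared residuals and assuming the \(x_i\) are not all equal:

\[S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2\]

Setting both partial derivatives to zero gives two equations:

\[\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0\]

\[\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0\]

Dividing the first equation by \(-2n\) yields \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). Substituting this into the second equation gives

\[\sum_{i=1}^{n} x_i \left[(y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x})\right] = 0\]

Because \(\sum_{i=1}^{n} (y_i - \bar{y}) = 0\) and \(\sum_{i=1}^{n} (x_i - \bar{x}) = 0\), each \(x_i\) outside the brackets can be replaced by \(x_i - \bar{x}\) without changing the sums, which leads to

\[\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\]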
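These formulas give the estimated slope \(\hat{\beta}_1\) and intercept \(\hat{\beta}_0\), which together define the fitted line \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\).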
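
The two least squares formulas translate almost directly into code. The following is a minimal sketch in plain Python; the function name `fit_simple_linear_regression` and the five data points are made up for illustration and do not come from any real data set.

```python
def fit_simple_linear_regression(x, y):
    """Return (beta0_hat, beta1_hat) from the least squares formulas."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Numerator and denominator of the slope formula.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")

    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


# Made-up example data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

beta0_hat, beta1_hat = fit_simple_linear_regression(x, y)
print(f"intercept = {beta0_hat:.2f}, slope = {beta1_hat:.2f}")
# prints: intercept = 2.20, slope = 0.60
```

For this data \(\bar{x} = 3\) and \(\bar{y} = 4\), the numerator of the slope formula is 6 and the denominator is 10, so \(\hat{\beta}_1 = 0.6\) and \(\hat{\beta}_0 = 4 - 0.6 \times 3 = 2.2\).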