2023-10-13
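The simple linear regression model represents the relationship between a dependent variable (Y) and an independent variable (X) with a straight line, where \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is the error term.

\[ Y = \beta_0 + \beta_1X + \varepsilon \]

- Equation in terms of the errors, where \(\hat{Y} = \beta_0 + \beta_1X\) is the value on the line:

\[Y - \hat{Y} = \varepsilon\]

- Equation in terms of the slope and the error term (for \(X \neq 0\)):

\[\beta_1 = \frac{Y - \beta_0 - \varepsilon}{X}\]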
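To make the pieces of the model concrete, here is a minimal sketch in R. It assumes the built-in `mtcars` data with `mpg` as Y and `hp` as X (the same variables used in the example later in this document). The names `fit`, `b0`, `b1`, `y_hat`, and `e` are illustrative only. Since the true \(\beta_0\), \(\beta_1\), and \(\varepsilon\) are unknown, the code works with their estimates: fitted values on the estimated line and residuals as the sample counterpart of the errors.

```r
# Fit the simple linear regression of mpg on hp (built-in mtcars data)
fit <- lm(mpg ~ hp, data = mtcars)

# Estimated intercept and slope
b0 <- coef(fit)[["(Intercept)"]]
b1 <- coef(fit)[["hp"]]

# Fitted values: points on the estimated line
y_hat <- b0 + b1 * mtcars$hp

# Residuals: observed value minus fitted value
e <- mtcars$mpg - y_hat

# Both quantities agree with what R reports for the same model
all.equal(unname(fitted(fit)), y_hat)
all.equal(unname(resid(fit)), e)
```

Both comparisons should return `TRUE`, since `fitted()` and `resid()` compute the same quantities internally.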
The coefficient \(\beta_1\) represents the slope of the regression line, which indicates how much the dependent variable changes, on average, for a one-unit change in the independent variable. Its least-squares estimate is

\[\hat{\beta_1} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^{2}}\]

Equation in terms of the means of X and Y, where \(\overline{XY}\) and \(\overline{X^{2}}\) are the sample means of \(X_iY_i\) and \(X_i^{2}\):

\[\hat{\beta_1} = \frac{\bar{X}\bar{Y} - \overline{XY}}{(\bar{X})^{2} - \overline{X^{2}}}\]
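The two expressions for the slope estimate can be checked numerically. The sketch below assumes the built-in `mtcars` data with `hp` as X and `mpg` as Y, and the variable names (`b1_dev`, `b1_means`, `b1_lm`) are illustrative. The first version sums the products of deviations from the means. The second uses only sample means, with numerator \(\bar{X}\bar{Y} - \overline{XY}\) and denominator \((\bar{X})^{2} - \overline{X^{2}}\).

```r
x <- mtcars$hp
y <- mtcars$mpg

# Slope from sums of deviations about the means
b1_dev <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Slope from sample means only: (Xbar * Ybar - mean(XY)) / (Xbar^2 - mean(X^2))
b1_means <- (mean(x) * mean(y) - mean(x * y)) / (mean(x)^2 - mean(x^2))

# Slope reported by lm() for the same regression
b1_lm <- coef(lm(mpg ~ hp, data = mtcars))[["hp"]]

# All three should agree up to floating-point error
all.equal(b1_dev, b1_lm)
all.equal(b1_means, b1_lm)
```

The deviation form is usually the safer one to compute in practice, because the means form subtracts two large, nearly equal numbers and can lose precision when the values of X are large relative to their spread.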
This example, which uses R's built-in mtcars data set, helps visualize how the residuals from a regression of miles per gallon (mpg) on horsepower (hp) behave, providing insight into the goodness of fit of the linear regression model.
By visualizing the patterns, we can assess whether the linear regression model adequately captures the relationship between horsepower and miles per gallon. In the context of this specific example, it helps us evaluate how well the model predicts car fuel efficiency (mpg) based on engine power (horsepower).
    library(ggplot2)

    # Fit mpg on horsepower and extract the residuals
    lm_model <- lm(mpg ~ hp, data = mtcars)
    residuals <- resid(lm_model)

    # Normal Q-Q plot of the residuals with a reference line
    ggplot(data = data.frame(Residuals = residuals), aes(sample = Residuals)) +
      geom_qq() +
      geom_qq_line() +
      labs(title = "Q-Q Plot for Residuals")
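A Q-Q plot compares the residuals with a normal distribution, but it does not show how they change across horsepower values. One common way to look at that is to plot the residuals against the predictor. The following is a minimal sketch under the same assumptions as the code above (the `mtcars` data and the `mpg ~ hp` model). The names `resid_df` and `p` are illustrative and do not come from the original example.

```r
library(ggplot2)

lm_model <- lm(mpg ~ hp, data = mtcars)

# Pair each car's horsepower with its residual from the fitted line
resid_df <- data.frame(
  hp = mtcars$hp,
  Residuals = resid(lm_model)
)

# Scatter plot of residuals against horsepower, with a reference line at zero
p <- ggplot(resid_df, aes(x = hp, y = Residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    title = "Residuals vs. Horsepower",
    x = "Horsepower (hp)",
    y = "Residual (mpg)"
  )

p
```

If the straight-line model is adequate, the points should scatter around the dashed zero line without a clear pattern. A curved or funnel-shaped pattern would suggest that the relationship is not linear or that the error variance is not constant.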