January 17, 2021

Simple Linear Regression

Linear regression attempts to model the relationship between two or more variables by fitting a linear equation \(y = \beta_o + \beta_1x\) to observed data.

Simple Linear Regression on mtcars

Our goal is to identify which predictor best predicts gas mileage. We simply fit a simple linear regression model, and use the variable that has the strongest positive correlation to mpg.

The model

Let’s now fit a linear regression model that shows the relationship between mpg, our response variable, and drat, our predictor.

Residuals

Let’s see how the residuals look like on a scatter plot to better visualize how good our model is. To assess, the points should be close to zero and doesn’t show any pattern.

Conclusion and recommendations

  • Linear regression assumes a strong relationship (not necessarily positive) of the predictor and outcome
  • Although our chosen variable has the highest positive correlation, it still didn’t do well based on the residual plots as there’s a lot of variability
  • We could set a threshold on the strength of relationship, either positive or negative, and use those variables, hence, a multivariate regression