Regression Part 2
POLS 3316: Statistics for Political Scientists

Tom Hanna

2023-11-14

Regression (Part 2)

  • Review

  • More on some terms that have been used occasionally: fit, fitted, predicted, residual, predictors, estimates (plus estimator and estimand)

  • Regression with two or more Xs: multiple linear regression

  • Interpreting Regression Results

Review

\(y = \alpha + \beta X + \epsilon\)

Assumptions (Testable)

  • Linearity
  • Normality
  • Independence
  • Homoskedasticity
  • No perfect multicollinearity (a quick sketch of checking these assumptions in R follows this list)
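
These assumptions can be checked with standard diagnostics. Below is a minimal sketch in R using simulated data; the variable names are hypothetical, not the course project's variables.

    # Simulated data purely for illustration; "x" and "y" are hypothetical names
    set.seed(42)
    x <- rnorm(100)
    y <- 2 + 3 * x + rnorm(100)

    model <- lm(y ~ x)

    # Base R diagnostic plots for a fitted model: residuals vs. fitted values
    # (linearity, homoskedasticity), a normal Q-Q plot of the residuals
    # (normality), and more
    plot(model)

    # Perfect multicollinearity can only arise with two or more Xs; with several
    # predictors, cor() on the X variables (or car::vif()) is a common check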

The method of least squares: minimizing the sum of squared residuals

  • Similar to the process of finding variance by measuring squared distances to the mean
  • Here we measure the squared vertical distances from each observation to the line given by the equation
  • The method chooses the line that minimizes the sum of those squared distances - the sum of squared residuals (see the sketch after this list)
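
A minimal sketch of that idea in R, using simulated data (all names hypothetical): lm() picks the intercept and slope whose sum of squared residuals is at least as small as that of any other line.

    set.seed(1)
    x <- rnorm(50)
    y <- 1 + 2 * x + rnorm(50)

    fit <- lm(y ~ x)

    # Sum of squared residuals for the least-squares line
    ssr_ols <- sum(resid(fit)^2)

    # Sum of squared residuals for some other candidate line (intercept 0, slope 1)
    ssr_other <- sum((y - (0 + 1 * x))^2)

    ssr_ols < ssr_other   # TRUE - the least-squares line beats this candidate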

New terms (or new definitions)

  • fit, fitted, predicted - the model’s prediction for the expected value of y for a given value of x
  • predictors - x variables
  • estimates (plus estimator and estimand) - the estimates are the values of the parameters (alpha and beta) produced by the model; the estimator is the rule used to produce them (here, least squares); the estimand is the true quantity being estimated
  • residual - the vertical distance between the observed value of y and the fitted (predicted) value given by the estimated parameters (illustrated in the sketch after this list)
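
These terms map directly onto R functions. A minimal sketch with simulated data (hypothetical names, not the project variables):

    set.seed(2)
    x <- runif(30)
    y <- 5 - 2 * x + rnorm(30)

    fit <- lm(y ~ x)

    coef(fit)     # estimates: the estimated alpha (intercept) and beta (slope)
    fitted(fit)   # fitted / predicted values: the model's expected y at each observed x
    resid(fit)    # residuals: observed y minus fitted y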

Multiple Linear Regression

  • During the lecture on causation, I said that causes aren’t simple - there are often multiple causes

  • So how do we analyze 2 (or 3 or 20) explanatory (X) variables?

Multiple Linear Regression: Answer

With OLS regression.

When we add a second X, we add a new axis, so we no longer have a line - we have its 3D equivalent, a plane.
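
With two X variables the equation simply gains a second slope term:

\(y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon\)

Each slope is now interpreted holding the other X constant.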

Multiple Linear Regression: Answer

[Figure: multiple regression plane]

Multiple Linear Regression: Answer

We can’t really visualize more than two X variables geometrically, but the idea is the same: with more predictors, least squares fits the (hyper)plane that minimizes the sum of squared residuals.
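
A minimal sketch of fitting a two-X model in R with simulated data (variable names hypothetical):

    set.seed(3)
    x1 <- rnorm(100)
    x2 <- rnorm(100)
    y  <- 1 + 0.5 * x1 - 2 * x2 + rnorm(100)

    # Adding predictors on the right-hand side of the formula adds Xs to the model
    multi_fit <- lm(y ~ x1 + x2)
    coef(multi_fit)   # one intercept plus one coefficient per X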

Interpreting and writing regression results

  • Example: working through the sample project and your own project data

https://github.com/tomhanna-uh/demonstration_semester_project

and in Posit Cloud:

https://posit.cloud/spaces/422701/join?access_code=i9VTqnd2IhMmH1vkEjNADBKf7bYILcAtEJxjiO0Q

Interpreting and writing regression results

  • The regression equation for a single X variable is:

    \(y = \alpha + \beta X + \epsilon\)

  • \(\alpha\) is the intercept or Constant

  • \(\beta\) is the slope or coefficient for X given by the “Coefficient” in the model summary

  • \(\epsilon\) is the error term and is random (you don’t have to do anything with it); a sketch of reading \(\alpha\) and \(\beta\) from the model output follows below
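
A sketch of where these pieces show up in R's output, using simulated data (your project's variable names will differ):

    set.seed(4)
    x <- rnorm(100)
    y <- 3 + 1.5 * x + rnorm(100)

    model <- lm(y ~ x)
    summary(model)

    # In the Coefficients table of summary(model):
    #   the "(Intercept)" row's Estimate is alpha, the constant
    #   the "x" row's Estimate is beta, the slope/coefficient for X
    # The residuals (the estimated epsilons) are summarized at the top of the
    # output; they are not reported as part of the equation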

Authorship, License, Credits

Creative Commons License