Correlation and Linear Regression
What is Correlation?
- Correlation measures the strength and direction of the linear relationship between two interval-level variables.
- Pearson’s r ranges from −1 to +1:
  - +1 = perfect positive relationship
  - −1 = perfect negative relationship
  - 0 = no linear relationship
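As a rough illustration of the three reference values, here is a minimal sketch that computes Pearson's r from deviations around the means. The function name `pearson_r` and all data values are made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (not from the lecture)
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # points on a rising line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # points on a falling line
r_zero = pearson_r(x, [4, 1, 0, 1, 4])   # symmetric U-shape, no linear trend

print(r_pos, r_neg, r_zero)
```

The U-shaped case is a reminder that r = 0 rules out a linear relationship only; the two variables there are still clearly related.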
Example: Education and Turnout
What is Regression?
- Regression quantifies how much Y changes for a one-unit change in X.
- Simple regression equation:
\[
\hat{y} = \hat{a} + \hat{b}x
\]
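One common way to obtain the intercept and slope is ordinary least squares. The sketch below assumes that method and uses a hypothetical helper `fit_simple_regression` with made-up data that lie exactly on a known line, so the result is easy to check by hand.

```python
def fit_simple_regression(xs, ys):
    """Least-squares intercept (a) and slope (b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx               # slope: change in y per one-unit change in x
    a = mean_y - b * mean_x     # intercept: the line passes through the means
    return a, b

# Made-up points lying exactly on y = 2 + 3x
xs = [0, 1, 2, 3]
ys = [2, 5, 8, 11]
a_hat, b_hat = fit_simple_regression(xs, ys)
print(a_hat, b_hat)
```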
Regression Example: Studying and Exam Scores
- Data from 8 students show that more hours studied is associated with higher scores.
- Regression line:
\[
\hat{y} = 55 + 6x
\]
- A student who studies 3 hours is predicted to score:
\[
55 + (6 \times 3) = 73
\]
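The same arithmetic can be written as a small function. This is only a sketch: the name `predict_score` is made up, while the intercept (55) and slope (6) are the values from the regression line above.

```python
def predict_score(hours, intercept=55.0, slope=6.0):
    """Predicted exam score from the fitted line y-hat = 55 + 6x."""
    return intercept + slope * hours

predicted_3h = predict_score(3)
# The slope is the predicted gain from one additional hour of study
gain_per_hour = predict_score(4) - predict_score(3)

print(predicted_3h, gain_per_hour)
```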
Testing Significance
- Use a t-test to assess whether the slope is statistically distinguishable from zero (that is, from no relationship).
- Formula:
\[
t = \frac{\hat{b} - 0}{SE_{\hat{b}}}
\]
- Example: t = 11.5 → highly significant (p < .001)
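To show where the pieces of the formula come from, here is a minimal sketch that computes the slope, its standard error, and t for a simple regression. It assumes the usual textbook standard error (residual variance with n − 2 degrees of freedom, divided by the sum of squared deviations of x), which the slides do not spell out. The data are made up and are unrelated to the t = 11.5 example.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (b, se_b, t) for the slope of a simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, using n - 2 degrees of freedom
    se_b = math.sqrt((sse / (n - 2)) / sxx)
    # Test against the null hypothesis that the true slope is 0
    t = (b - 0) / se_b
    return b, se_b, t

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, se_b, t = slope_t_statistic(xs, ys)
print(round(b, 3), round(se_b, 3), round(t, 3))
```

Turning t into a p-value requires the t distribution with n − 2 degrees of freedom, which statistical software normally reports alongside the coefficient.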
R-Squared: Model Fit
- R-squared tells us what share of the variation in Y is explained by X.
\[
R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}}
\]
- Example: R² = 0.30 → education explains 30% of turnout variation.
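The ratio above can be computed directly from a fitted line. The sketch below uses a hypothetical function `r_squared` and made-up data; "explained variation" is taken to be the squared deviations of the predicted values from the mean of Y, and "total variation" the squared deviations of the observed values.

```python
def r_squared(xs, ys):
    """Explained variation divided by total variation for a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Total variation: how far observed y values fall from their mean
    total = sum((y - mean_y) ** 2 for y in ys)
    # Explained variation: how far predicted values fall from that mean
    explained = sum(((a + b * x) - mean_y) ** 2 for x in xs)
    return explained / total

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(round(r2, 3))
```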
Multiple Regression: Adding Variables
- Multiple regression lets us estimate the effect of one variable while controlling for other factors.
- Equation:
\[
\hat{y} = \hat{a} + \hat{b}_1x_1 + \hat{b}_2x_2
\]
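- Example: turnout modeled as a function of education and battleground status.
- After controlling for the other variable: a one-unit increase in education predicts 0.67 more points of turnout; battleground states are predicted to have turnout 5.74 points higher.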
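For readers curious how two coefficients are estimated at once, here is a sketch that solves the least-squares normal equations in plain Python. The helper names (`solve_linear_system`, `fit_ols`) are hypothetical, and the data are made up to lie exactly on y = 10 + 2·x1 + 5·x2; they are not the turnout data. In practice a statistics package would do this step.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients [a, b1, b2, ...] for y on the given predictors."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant for the intercept
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)

# Made-up data: x1 is continuous, x2 is a 0/1 indicator
rows = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0), (6, 1)]
y = [12, 19, 16, 23, 20, 27]
a, b1, b2 = fit_ols(rows, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Each coefficient is the predicted change in Y for a one-unit change in that predictor with the other held fixed, which is what "controlling for" means here.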
Dummy Variables
- Use 0/1 indicators for categories.
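- Example: Voter ID laws → 4 dummies for 5 categories (one category is left out as the base).
- Intercept = predicted value for the base category when all other predictors equal 0; each dummy's coefficient is the difference from that base.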
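A minimal sketch of the coding step: five categories become four 0/1 indicators, with the base category represented by all zeros. The category labels (`level_0` … `level_4`) and the function `make_dummies` are placeholders invented for this example; the slides do not name the five voter ID categories.

```python
def make_dummies(values, categories, base):
    """Return one dict of 0/1 indicators per observation, omitting the base."""
    non_base = [c for c in categories if c != base]
    return [{c: int(v == c) for c in non_base} for v in values]

# Placeholder labels for a five-category variable
categories = ["level_0", "level_1", "level_2", "level_3", "level_4"]
observations = ["level_0", "level_3", "level_1", "level_4"]

dummies = make_dummies(observations, categories, base="level_0")
for obs, row in zip(observations, dummies):
    print(obs, row)
```

Including all five indicators together with an intercept would make the predictors perfectly collinear, which is why one category is dropped.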
Interaction Effects
- Interaction = effect of X1 depends on X2.
- Include a term like:
\[
\text{Education} \times \text{Battleground}
\]
- If the interaction term is significant, the effect of education differs between battleground and non-battleground states.
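To make "the effect depends on context" concrete, the sketch below evaluates a model with an interaction term and reads off the education slope in each context. All coefficient values are hypothetical and chosen only for illustration; `predict_turnout` and `education_slope` are made-up names.

```python
def predict_turnout(education, battleground, a, b1, b2, b3):
    """y-hat = a + b1*education + b2*battleground + b3*(education*battleground)."""
    return a + b1 * education + b2 * battleground + b3 * education * battleground

def education_slope(battleground, b1, b3):
    """Effect of one more unit of education, given battleground status (0 or 1)."""
    return b1 + b3 * battleground

# Hypothetical coefficients (not estimates from the lecture)
a, b1, b2, b3 = 40.0, 0.5, 2.0, 0.25

slope_non_battleground = education_slope(0, b1, b3)
slope_battleground = education_slope(1, b1, b3)
print(slope_non_battleground, slope_battleground)
```

With these made-up numbers, the interaction coefficient b3 is exactly the gap between the two slopes.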
Why It Matters
- These tools help us answer big questions:
  - Who votes and why?
  - Do policies work?
  - What explains political attitudes?
- Regression lets us test our theories against data.
Wrap-Up Questions
- What does Pearson’s r tell us?
- What does the slope in regression mean?
- What is R²?
- Why use multiple regression?
Thanks!
See you next time!