Teaching Notes: Day 5 - Multiple Regression Application Lab
Session Overview
Duration: 2 × 45-minute sessions
Objective: Apply multiple regression concepts to analyze wage determinants, test model specifications, and interpret results.
Tools: GRETL, Wage Dataset
Session 1: Model Estimation & Interpretation (45 min)
1. Review of Key Concepts
1. The Fundamental Problem: Why Simple Regression Isn’t Enough
A. The Naïve Approach: Wage vs. Gender (Simple Regression)
- Question: “Do women earn less than
men?”
- Simple Regression Approach:
\[
\log(\text{Wage}) = \beta_0 + \beta_1 \text{Female} + \epsilon
\]
- Finds: \(\beta_1 \approx -0.25\), i.e. women earn about 22% less on average (\(e^{-0.25} - 1 \approx -0.22\)).
- But is this all discrimination?
- Problem: This bundles all gender-related differences (education, job type, experience) into one coefficient.
B. The Reality: Gender Affects Wage Through Multiple
Channels
C. The Need for Multiple Regression
- Goal: Isolate the direct effect of gender,
holding other factors constant.
- Analogy:
- “Comparing wages of men and women with the same education and
job status.”
- Like a controlled experiment, where we “adjust” for
the factors that affect wage (dependent variable).
2. Total Effect vs. Partial Effect: Two Different Research Questions
(Policy vs. Economic Analysis)
A. When to Use Simple Regression (Total
Effect)
- Question: “What is the overall wage gap society
observes?”
- Useful for policy debates (e.g., “Do we need gender
equity laws?”).
- Example: If women earn less because they
choose lower-paying careers, should policymakers intervene?
B. When to Use Multiple Regression (Partial
Effect)
- Question: “Is there wage discrimination after
accounting for qualifications?”
- Used in legal cases (e.g., suing for pay
discrimination).
- Example: If women with the same education and
job status still earn less, this suggests bias.
C. Real-World Parallel: The “College Premium”
Debate
- Simple Regression: College grads earn more.
- Multiple Regression: But what if they also come
from wealthier families?
- Key Insight: Multiple regression helps separate the effect of interest from correlated factors, provided those factors are observed and included in the model.
3. Omitted Variable Bias: The Core Issue
A. Visualizing Bias
- Draw:
- X-axis: Female (0 = Male, 1 = Female)
- Y-axis: Log(Wage)
- Regression line: Downward slope (-0.25).
- Now overlay:
- Education: Women cluster at lower education → pulls
their wages down further.
- Part-Time: More women work part-time → also lowers
wages.
B. The Two Biases at Play
- Downward Bias from Education
- Women have less education → lowers wages.
- If we don’t control for education, the gender gap looks
worse than it is.
- Downward Bias from Part-Time Work
- Women work part-time more often → lowers wages.
- If we don’t control for part-time status, the gender gap looks
worse than it is.
C. Net Effect in Simple Regression
- The observed -0.25 is a mixture:
- True discrimination effect (unknown).
- Minus education bias (makes gap seem bigger).
- Minus part-time bias (makes gap seem bigger).
- Conclusion: We can’t trust the simple
regression!
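The mixture described above can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration: the data are synthetic, the "true" effects (-0.10 for gender, 0.20 per education level) are invented, and there is no noise, so the arithmetic can be checked by hand. It is not the lab's wage dataset. With a single dummy regressor, the simple regression slope equals the difference in group means, which makes the bias easy to compute.

```python
# Synthetic illustration of omitted variable bias (all numbers are invented).
# Assumed true model: log_wage = 3.0 + 0.20 * educ - 0.10 * female (no noise).

TRUE_GENDER_EFFECT = -0.10
TRUE_EDUC_EFFECT = 0.20


def mean(values):
    return sum(values) / len(values)


def log_wage(educ, female):
    return 3.0 + TRUE_EDUC_EFFECT * educ + TRUE_GENDER_EFFECT * female


# Hypothetical sample in which women cluster at lower education levels.
men_educ = [2, 3, 3, 4]
women_educ = [1, 2, 2, 3]

men_wage = [log_wage(e, 0) for e in men_educ]
women_wage = [log_wage(e, 1) for e in women_educ]

# With one dummy regressor, the OLS slope is the difference in group means.
simple_slope = mean(women_wage) - mean(men_wage)

# Bias from omitting education = (education effect) x (education gap).
educ_gap = mean(women_educ) - mean(men_educ)
bias = TRUE_EDUC_EFFECT * educ_gap

print(f"simple regression slope: {simple_slope:+.2f}")
print(f"true gender effect:      {TRUE_GENDER_EFFECT:+.2f}")
print(f"bias from omitting educ: {bias:+.2f}")
```

In this toy sample the simple regression slope is the true gender effect plus the education bias, so the gap looks three times larger than the effect that was built into the data.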
4. Preparing for GRETL: What We’ll Test
(Linking Theory to Lab)
A. Residual Analysis: Detecting Omitted
Variables
- If residuals correlate with education → education was
omitted!
- If residuals correlate with part-time → part-time was
omitted!
B. Next Steps in Lab
- Run simple regression (gender only).
- Check residuals against education & part-time. Interpret the
coefficient values. Discuss.
- If patterns exist → multiple regression is
needed!
- Each extra level of education raises the unexplained (residual) wage by 22% (expected).
- Having a part-time job raises the unexplained wage by 10% (unexpected).
- Why? If part-time jobs are more common among employees with higher education, part-time status may be picking up part of the education effect.
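The residual check in the lab steps can be sketched outside GRETL as follows. This is a minimal illustration on synthetic, noise-free data with invented coefficients (0.20 per education level, -0.10 for gender); the helper `simple_ols` is a hypothetical name written for this sketch, not a GRETL function.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical sample: four men followed by four women.
female = [0, 0, 0, 0, 1, 1, 1, 1]
educ = [2, 3, 3, 4, 1, 2, 2, 3]
log_wage = [3.0 + 0.20 * e - 0.10 * f for e, f in zip(educ, female)]

# Step 1: simple regression of log wage on gender only.
intercept, gender_slope = simple_ols(female, log_wage)

# Step 2: residuals are the part of log wage that gender does not explain.
residuals = [y - (intercept + gender_slope * f) for y, f in zip(log_wage, female)]

# Step 3: regress the residuals on the candidate omitted variable.
_, resid_slope = simple_ols(educ, residuals)

print(f"gender-only slope:           {gender_slope:+.3f}")
print(f"residual slope on education: {resid_slope:+.3f}")
```

A clearly non-zero residual slope signals that education belongs in the model. Because education is correlated with gender in this sample, the residual slope is smaller than the 0.20 built into the data, so it is a diagnostic rather than an estimate of the education effect.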
Summary
| Aspect | Simple Regression | Multiple Regression |
|--------|-------------------|---------------------|
| What it measures | Total gender gap | Partial (direct) effect |
| Use case | Policy discussions | Legal discrimination |
| Main problem | Omitted variable bias | Requires more data |
| Gender gap interpretation | “Women earn 22% less” | “Women earn X% less for the same qualifications” |
Transition to GRETL:
“Now, let’s see this in action with real data!”
2. Lab Exercise 1: Multiple Regression in GRETL (20 min)
Task: Run the model:
\[
\log(\text{Wage}) = 3.05 - 0.04\text{Female} + 0.03\text{Age} +
0.23\text{Educ} - 0.37\text{Parttime} + e
\]
Steps:
1. Load dataset in GRETL.
2. Model > Ordinary Least Squares > log(Wage) ~ Female + Age + Educ + Parttime.
Regression Outcomes Discussion:
| Variable | Coeff. | Interpretation (Relative Effect) |
|----------|--------|----------------------------------|
| Female | -0.041 | Women earn 4% less (p=0.097, NS at 5%) |
| Age | 0.031 | +3% per year (p=0.000) |
| Educ | 0.233 | +26% per level (p=0.000) |
| Parttime | -0.365 | -31% (p=0.000) |
Key Points:
- Gender effect shrinks from a coefficient of -0.25 (about -22%) in the simple regression to -0.041 (about -4%) in the multiple regression.
- Education and part-time are highly significant confounders.
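The percentages in the table follow from the log-wage coefficients via \(100 \cdot (e^{\beta} - 1)\). A short sketch of that conversion, using the coefficients reported above (the function name `pct_effect` is just an illustrative choice):

```python
import math

# Coefficients from the log-wage regression reported in the table.
coefficients = {
    "Female": -0.041,
    "Age": 0.031,
    "Educ": 0.233,
    "Parttime": -0.365,
}


def pct_effect(beta):
    """Exact percentage change in wage for a one-unit change in the regressor."""
    return 100 * (math.exp(beta) - 1)


for name, beta in coefficients.items():
    exact = pct_effect(beta)
    approx = 100 * beta  # quick approximation, reasonable only for small beta
    print(f"{name:9s} beta={beta:+.3f}  exact={exact:+.1f}%  approx={approx:+.1f}%")
```

For small coefficients such as Age and Female the approximation \(100\beta\) is close to the exact value; for larger ones such as Educ and Parttime the two diverge, which is why the table uses the exponential form.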
Session 2: Model Extensions & Testing (45 min)
3. Absolute vs. Relative Effects (10 min)
Mathematical Interpretation:
- Log-wage model (relative effects):
\[
\beta_j = \frac{\partial \log(\text{Wage})}{\partial x_j} = \frac{\partial \text{Wage}}{\partial x_j} \cdot
\frac{1}{\text{Wage}} \approx \frac{\% \Delta \text{Wage}}{100}
\]
- e.g., \(e^{0.233} - 1 \approx 26\%\) wage increase per education level.
- Linear wage model (absolute effects):
\[
\text{Wage} = -77.87 - 2.12\text{Female} + 29.47\text{Educ} + \dots
\]
- Education adds $29.47 per level (vs. 26% in log model).
Conceptual Discussion:
- “Use log models for % or relative effects, linear for $
effects.”
4. Testing Education Effects (20 min)
Task: Test if education returns are constant across
levels.
Steps:
1. Create dummies (DE2, DE3, DE4) for education levels.
2. Estimate:
\[
\log(\text{Wage}) = 3.32 - 0.03\text{Female} + 0.03\text{Age} +
0.17\text{DE2} + 0.38\text{DE3} + 0.77\text{DE4} - 0.37\text{Parttime} +
e
\]
3. F-test:
- \(H_0\): Linear education effects (\(\beta_5 = 2\beta_4, \beta_6 = 3\beta_4\), where \(\beta_4, \beta_5, \beta_6\) are the coefficients on DE2, DE3, DE4).
- Compute \(F = \frac{(0.716 - 0.704)/2}{(1-0.716)/493} = 10.4\), which exceeds the 5% critical value of \(F(2, 493) \approx 3.0\).
- Reject \(H_0\): Returns are non-linear (47% gain for highest level).
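The F computation above can be reproduced with a few lines of arithmetic. The sketch assumes, based on the form of the formula, that 0.716 is the R² of the unrestricted model (education dummies), 0.704 the R² of the restricted model (linear education), 2 the number of restrictions, and 493 the residual degrees of freedom of the unrestricted model. The function name is illustrative.

```python
def f_statistic(r2_unrestricted, r2_restricted, num_restrictions, resid_df):
    """F statistic for linear restrictions, computed from the two R-squared values."""
    numerator = (r2_unrestricted - r2_restricted) / num_restrictions
    denominator = (1 - r2_unrestricted) / resid_df
    return numerator / denominator


# Values taken from the F-test step in the notes.
f_stat = f_statistic(
    r2_unrestricted=0.716,
    r2_restricted=0.704,
    num_restrictions=2,
    resid_df=493,
)

print(f"F = {f_stat:.1f}")
```

The R² form of the test is valid here because both models have the same dependent variable, log(Wage).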
Wage Effects by Education Level:
- Level 1 → 2: +19%
- Level 2 → 3: +23%
- Level 3 → 4: +47%
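These step-by-step gains come from differences between adjacent dummy coefficients, converted with \(e^{\Delta} - 1\). The sketch below uses the two-decimal coefficients printed in the estimated equation (0.17, 0.38, 0.77) and treats level 1 as the base category with coefficient 0, which is an assumption consistent with dummies DE2 to DE4. Because those coefficients are rounded, the computed gains can differ from the figures listed above by up to about one percentage point.

```python
import math

# Dummy coefficients from the estimated equation; level 1 is the assumed base.
dummy_coefs = {1: 0.00, 2: 0.17, 3: 0.38, 4: 0.77}

step_gains = {}
for level in (2, 3, 4):
    # Log-wage difference between this level and the one below it.
    step = dummy_coefs[level] - dummy_coefs[level - 1]
    step_gains[level] = 100 * (math.exp(step) - 1)
    print(f"Level {level - 1} -> {level}: {step_gains[level]:+.1f}%")
```

Under the linear specification all three steps would be equal, so the widening gaps are what the F-test picks up.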
Discussion:
- “Higher education yields increasing returns, especially at top
levels.”