Week 6 Hour 1: Regression Adjustment for Causal Inference
Statistical Literacy - Synchronous Session
Overview
Learning Goal: Understand regression adjustment as an alternative to IPTW for controlling confounding.
Key Questions:
- How does regression “adjust” for confounding?
- When does regression give us causal effects?
- How does regression compare to IPTW?
Dataset: NHEFS smoking cessation study (males 25-35 years old)
- Treatment: Quit smoking between 1971 and 1982
- Outcome: Change in BMI
- Confounders:
- Comorbidities: Baseline BMI, cholesterol, diabetes, high blood pressure
- Lifestyle Variables: Exercise frequency, alcohol consumption
- Demographics: Age, education (school years), race, marital status
Section A: IPTW Bridge & Review
Recap: What is IPTW?
Inverse Probability of Treatment Weighting (IPTW):
- Goal: Create balance on observed confounders by reweighting
- Method: Weight by inverse of propensity score
Propensity Score: \(e(X) = P(A=1 | X)\)
- Probability of treatment given confounders
- Estimated from a logistic regression model
IPTW Weights:
- Treated: \(w = \frac{1}{e(X)}\)
- Control: \(w = \frac{1}{1-e(X)}\)
Result: Creates pseudo-population where treatment is independent of confounders
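A minimal Python sketch of the weight formulas above. The function name iptw_weights and the four-person toy data are illustrative assumptions, not NHEFS values; in practice e(X) would come from a fitted propensity score model.

```python
def iptw_weights(treated, propensity):
    """Return one IPTW weight per person.

    treated:    0/1 treatment indicators (A)
    propensity: estimated e(X) = P(A=1 | X), strictly between 0 and 1
    """
    weights = []
    for a, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity score must be strictly between 0 and 1")
        # Treated: 1 / e(X).  Control: 1 / (1 - e(X)).
        weights.append(1.0 / e if a == 1 else 1.0 / (1.0 - e))
    return weights


# Hypothetical toy data (not from NHEFS)
treated = [1, 1, 0, 0]
propensity = [0.8, 0.25, 0.5, 0.2]
weights = iptw_weights(treated, propensity)

for a, e, w in zip(treated, propensity, weights):
    print(f"A={a}  e(X)={e:.2f}  weight={w:.2f}")
```

People whose observed treatment was unlikely given their confounders get the largest weights: the treated person with e(X) = 0.25 counts four times in the pseudo-population.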
DAG: Confounders → Treatment
Initial Imbalance: Love Plot
Key Observation: Several variables show substantial imbalance (SMD > 0.1) before weighting, indicating confounding.
Propensity Score Distributions
Key Insight: IPTW reweights observations so that propensity score distributions become similar between treatment groups.
Balance After IPTW
Key Result: After IPTW, all variables have SMD < 0.1, indicating good balance.
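The SMD behind a Love plot can be sketched in a few lines of Python. This is a sketch under stated assumptions: the helper names and toy data are hypothetical, and the denominator convention (pooled standard deviation of the unweighted groups) is one common choice, not necessarily the one used for the plots in this session.

```python
from math import sqrt


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def smd(x, treated, weights=None):
    """Standardized mean difference of covariate x between treatment groups.

    Assumed convention: (weighted) difference in means divided by the
    pooled standard deviation of the unweighted groups.
    """
    if weights is None:
        weights = [1.0] * len(x)
    x_t = [v for v, a in zip(x, treated) if a == 1]
    x_c = [v for v, a in zip(x, treated) if a == 0]
    w_t = [w for w, a in zip(weights, treated) if a == 1]
    w_c = [w for w, a in zip(weights, treated) if a == 0]
    pooled_sd = sqrt((sample_variance(x_t) + sample_variance(x_c)) / 2)
    return (weighted_mean(x_t, w_t) - weighted_mean(x_c, w_c)) / pooled_sd


# Hypothetical toy data: a binary confounder that is more common among the treated
treated = [1, 1, 1, 1, 0, 0, 0, 0]
x = [1, 1, 1, 0, 1, 0, 0, 0]
# Propensity implied by the toy data: P(A=1 | x=1) = 0.75, P(A=1 | x=0) = 0.25
propensity = [0.75 if v == 1 else 0.25 for v in x]
weights = [1 / e if a == 1 else 1 / (1 - e) for a, e in zip(treated, propensity)]

smd_before = smd(x, treated)
smd_after = smd(x, treated, weights)
print(f"SMD before weighting: {smd_before:.3f}")
print(f"SMD after weighting:  {smd_after:.3f}")
```

In this toy example the SMD is 1.0 before weighting and essentially 0 after, because the weights make the confounder equally common in both groups.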
Key Insight: PS Model Matters
What IPTW models: Treatment assignment \(P(A|X)\)
Critical point: IPTW can only be expected to balance variables that are included in the PS model!
Let’s see this with NHEFS data…
Balance Check: Love Plots
Observation:
- Model 1 balances comorbidities ✓, but NOT lifestyle or demographics
- Model 2 balances comorbidities + lifestyle ✓, but NOT demographics
- Key point: You can only balance variables included in the PS model!
IPTW Estimates
| | Naive | IPTW: Comorbidities | IPTW: +Lifestyle | IPTW: +Demographics |
|---|---|---|---|---|
| Quit Smoking | 1.505* | 1.826** | 1.915** | 2.213** |
| | (0.584) | (0.681) | (0.703) | (0.757) |
| Num.Obs. | 218 | 218 | 218 | 218 |

Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
Key point: Different PS models → different estimates
Now let’s see a different approach: Regression adjustment
Section B: Regression Mechanics
Simple Regression: Concept
Linear regression model: \(Y = \beta_0 + \beta_1 X + \varepsilon\)
- \(\beta_1\) = expected change in Y for 1-unit change in X
- Fitted line minimizes squared residuals
- This is descriptive association, not causal
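A minimal sketch of how the fitted line is computed, using the closed-form least-squares solution. The function name simple_ols and the toy data are hypothetical, not NHEFS values.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for Y = b0 + b1 * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sample covariance of X and Y divided by sample variance of X
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data (not from NHEFS)
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
b0, b1 = simple_ols(x, y)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

The slope is just a rescaled covariance, which is why it describes association and carries no causal meaning on its own.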
Let’s see two examples with NHEFS data…
Example 1: BMI Over Time
Question: How does baseline BMI predict follow-up BMI?
Result: \(\hat{\beta}_1 \approx\) 0.95 (p < .001)
Interpretation: A 1-unit increase in 1971 BMI is associated with a 0.95-unit increase in 1982 BMI on average.
Example 2: BMI Change
Question: How does baseline BMI predict BMI change?
Result: \(\hat{\beta}_1 \approx\) -0.053 (p = 0.119)
Interpretation: Weak, non-significant association between baseline BMI and BMI change.
Key point: Different outcome → different relationship
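Why the two slopes differ by almost exactly 1: assuming BMI change is defined as 1982 BMI minus 1971 BMI and both examples use the same sample, regressing the change on baseline BMI simply subtracts 1 from the Example 1 slope. This is an algebraic property of least squares:
\[\hat{\beta}_1^{\text{change}} = \frac{\widehat{\text{Cov}}(\text{BMI}_{82} - \text{BMI}_{71},\ \text{BMI}_{71})}{\widehat{\text{Var}}(\text{BMI}_{71})} = \hat{\beta}_1^{\text{follow-up}} - 1 \approx 0.95 - 1 = -0.05\]
This agrees with the reported -0.053 up to rounding of the 0.95 slope.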
Multiple Regression: Concept
Multiple regression: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + ... + \beta_p X_p + \varepsilon\)
Critical interpretation: Each \(\beta_j\) represents the association between \(X_j\) and Y “holding other variables constant”
This “holding constant” language is KEY to adjustment strategy!
Connection to Week 5: Like stratification, but allows continuous variables
Adding Treatment Variable
Now add quit smoking as a predictor:
Model: BMI Change = \(\beta_0 + \beta_1 \times\) Quit Smoking \(+ \beta_2 \times\) Baseline BMI
Results:
- Quit smoking: \(\hat{\beta}_1 \approx\) 1.53 (p = 0.009)
- Baseline BMI: \(\hat{\beta}_2 \approx\) -0.056 (p = 0.1)
Interpretation:
- After adjusting for baseline BMI, quitting smoking is associated with a 1.53-unit larger BMI change on average
- Coefficient for quit smoking is now “adjusted” for baseline BMI
Visualizing Multiple Regression
Key insights:
- Two parallel lines (quitters vs non-quitters)
- Vertical distance = treatment effect (1.53 units)
- Parallel slopes = assumes constant effect across all BMI levels
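The parallel-lines picture can be checked with plain arithmetic. The sketch below plugs in the rounded coefficients reported in this section (the intercept 4.494 is taken from the Baseline column of the Progressive Adjustment table); the function name and the three baseline BMI values are illustrative assumptions.

```python
# Rounded coefficients reported for: BMI Change ~ Quit Smoking + Baseline BMI
INTERCEPT = 4.494
B_QUIT = 1.530
B_BASELINE_BMI = -0.056


def predicted_bmi_change(quit, baseline_bmi):
    """Model prediction; quit is 1 for quitters and 0 for non-quitters."""
    return INTERCEPT + B_QUIT * quit + B_BASELINE_BMI * baseline_bmi


gaps = []
for bmi in (20, 25, 30):  # illustrative baseline BMI values
    quitter = predicted_bmi_change(1, bmi)
    non_quitter = predicted_bmi_change(0, bmi)
    gaps.append(quitter - non_quitter)
    print(f"baseline BMI {bmi}: quitter {quitter:.3f}, "
          f"non-quitter {non_quitter:.3f}, gap {quitter - non_quitter:.3f}")
```

The gap equals the quit smoking coefficient at every baseline BMI. That is the constant-effect assumption built into a model without an interaction term.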
Progressive Adjustment
What if we add MORE confounders?
| | Baseline | +Comorbidities | +Lifestyle | +Demographics |
|---|---|---|---|---|
| (Intercept) | 4.494** | 6.212*** | 5.615** | 4.630 |
| | (1.511) | (1.809) | (1.858) | (3.112) |
| qsmkTRUE | 1.530** | 1.800** | 1.887** | 2.071*** |
| | (0.582) | (0.598) | (0.596) | (0.609) |
| bmi71 | -0.056+ | -0.045 | -0.040 | -0.032 |
| | (0.034) | (0.035) | (0.035) | (0.035) |
| cholesterol | | -0.009 | -0.008 | -0.008 |
| | | (0.007) | (0.007) | (0.007) |
| diabetes | | 2.108 | 2.274 | 1.989 |
| | | (1.536) | (1.527) | (1.526) |
| hbp | | -2.465 | -2.618+ | -2.359 |
| | | (1.564) | (1.556) | (1.554) |
| exercise1 | | | -0.320 | -0.252 |
| | | | (0.564) | (0.563) |
| exercise2 | | | 1.158+ | 1.108+ |
| | | | (0.630) | (0.632) |
| alcoholfreq | | | 0.055 | 0.039 |
| | | | (0.213) | (0.219) |
| marital | | | | 0.231 |
| | | | | (0.230) |
| race1 | | | | 1.287 |
| | | | | (0.799) |
| school | | | | -0.106 |
| | | | | (0.091) |
| age | | | | 0.040 |
| | | | | (0.078) |
| Num.Obs. | 218 | 218 | 218 | 218 |
| R2 | 0.042 | 0.066 | 0.092 | 0.117 |

Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
Observation: The quit smoking coefficient rises from 1.53 to 2.07 as confounders are added
This is omitted variable bias in action!
Section C: Omitted Variable Bias
The Problem
Omitted Variable Bias (OVB): What happens when we leave out a confounder?
Setup:
- True model: \(Y = \beta_0 + \beta_1 A + \beta_2 Z + \varepsilon\)
- Fitted model: \(Y = \beta_0 + \beta_1 A + \varepsilon\) (omit Z)
- Result: \(\hat{\beta}_1\) is BIASED
When does bias occur? Need BOTH:
- Z affects Y (\(\beta_2 \neq 0\))
- Z correlated with A (Z is a confounder!)
OVB Direction Formula
Direction of bias:
\[\text{Sign(Bias)} = \text{Sign}(\beta_2) \times \text{Sign}(\text{Corr}(A, Z))\]
| \(\beta_2\) (Z → Y) | Corr(A,Z) | Bias Direction |
|---|---|---|
| Positive (+) | Positive (+) | Positive (overestimate) |
| Positive (+) | Negative (−) | Negative (underestimate) |
| Negative (−) | Positive (+) | Negative (underestimate) |
| Negative (−) | Negative (−) | Positive (overestimate) |
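The sign rule comes from an exact least-squares identity: the coefficient from the short regression (Z omitted) equals the long-regression coefficient plus \(\hat{\beta}_2\) times the slope of Z on A, and that slope has the same sign as Corr(A, Z). A minimal Python sketch on hypothetical toy data (not NHEFS) that checks the identity; all variable names are illustrative.

```python
def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


# Hypothetical toy data: Z is higher among the treated and raises Y
A = [0, 0, 0, 0, 1, 1, 1, 1]
Z = [1, 2, 1, 2, 2, 3, 2, 4]
noise = [0.1, -0.1, -0.1, 0.1, 0.0, 0.2, -0.2, 0.0]
Y = [1 + 2 * a + 0.5 * z + e for a, z, e in zip(A, Z, noise)]

a, z, y = centered(A), centered(Z), centered(Y)
s_aa, s_zz, s_az = dot(a, a), dot(z, z), dot(a, z)
s_ay, s_zy = dot(a, y), dot(z, y)

# Short regression (Z omitted): Y ~ A
beta1_short = s_ay / s_aa

# Long regression: Y ~ A + Z, solved from the 2x2 normal equations
det = s_aa * s_zz - s_az ** 2
beta1_long = (s_ay * s_zz - s_az * s_zy) / det
beta2_long = (s_aa * s_zy - s_az * s_ay) / det

# Slope of Z on A; its sign matches the sign of Corr(A, Z)
delta = s_az / s_aa

bias = beta1_short - beta1_long
print(f"short: {beta1_short:.3f}  long: {beta1_long:.3f}  bias: {bias:.3f}")
print(f"beta2 * delta: {beta2_long * delta:.3f}")
```

Here both \(\hat{\beta}_2\) and the slope of Z on A are positive, so omitting Z overestimates the coefficient on A, matching the first row of the table.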
Let’s test this with NHEFS data…
NHEFS Example: Omitting Exercise
Scenario: What if we omit “exercise” from the model?
Results:
- \(\beta_2\) (exercise → outcome): 0.476 (POSITIVE)
- Corr(quit smoking, exercise): -0.013 (NEGATIVE)
- Predicted bias direction: NEGATIVE (underestimate)
- Observed bias: -0.079 (NEGATIVE)
- Match? YES ✓
Verification Table
| | Without Exercise | With Exercise |
|---|---|---|
| Quit Smoking | 1.800** | 1.880** |
| | (0.598) | (0.594) |
| Num.Obs. | 218 | 218 |
| R2 | 0.066 | 0.091 |

Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
Key takeaways:
- OVB formula correctly predicts bias direction
- This is the regression version of unmeasured confounding
- If we don’t measure or include a confounder → biased estimate
- Same problem IPTW faces!
NHEFS Example 2: Omitting Age
New question: What if we omit age from our model?
Analysis:
- \(\beta_2\) (age → outcome): 0.08 (POSITIVE)
- Corr(quit smoking, age): 0.081 (POSITIVE)
- Predicted bias direction: POSITIVE
- Observed bias: 0.04 (POSITIVE)
- Match? YES ✓
Interpretation:
- Age is positively associated with BMI change in this sample (older men gained more weight)
- Age is positively correlated with quitting smoking
- Omitting age creates POSITIVE bias in our treatment effect estimate
NHEFS Example 3: Omitting Baseline BMI
New question: What if we omit baseline BMI from our model?
Note: This is interesting because baseline BMI is a baseline measure of our outcome variable!
Analysis:
- \(\beta_2\) (baseline BMI → outcome): -0.053 (NEGATIVE)
- Corr(quit smoking, baseline BMI): 0.026 (POSITIVE)
- Predicted bias direction: NEGATIVE
- Observed bias: -0.025 (NEGATIVE)
- Match? YES ✓
Interpretation:
- Baseline BMI has a negative effect on BMI change (regression to the mean)
- Higher baseline BMI is positively correlated with quitting
- Omitting baseline BMI creates NEGATIVE bias
Why this matters: Baseline measures of the outcome are usually important confounders to include!
Comparison Table: Multiple OVB Examples
| Model term | Omitted Variable | β₂ (Z → Y) | Corr(A, Z) | Predicted Bias | Observed Bias | Formula Works? |
|---|---|---|---|---|---|---|
| as.numeric(exercise) | Exercise | 0.476 (+) | -0.013 (-) | NEGATIVE (underestimate) | -0.079 (-) | ✓ YES |
| age | Age | 0.08 (+) | 0.081 (+) | POSITIVE (overestimate) | 0.04 (+) | ✓ YES |
| bmi71 | Baseline BMI | -0.053 (-) | 0.026 (+) | NEGATIVE (underestimate) | -0.025 (-) | ✓ YES |
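The three rows can be re-checked mechanically. The sketch below applies the sign rule to the numbers reported in the table; the variable and function names are illustrative.

```python
# (omitted variable, beta2, Corr(A, Z), observed bias) as reported in the table
examples = [
    ("Exercise", 0.476, -0.013, -0.079),
    ("Age", 0.08, 0.081, 0.04),
    ("Baseline BMI", -0.053, 0.026, -0.025),
]


def sign(value):
    """Return +1, -1, or 0."""
    return (value > 0) - (value < 0)


matches = {}
for name, beta2, corr, observed_bias in examples:
    predicted = sign(beta2) * sign(corr)  # Sign(Bias) = Sign(beta2) x Sign(Corr)
    matches[name] = predicted == sign(observed_bias)
    direction = "POSITIVE" if predicted > 0 else "NEGATIVE"
    print(f"{name}: predicted {direction}, match = {matches[name]}")
```

The rule predicts only the direction of the bias. The correlations here are small, so the observed biases are small too.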
Key Takeaways:
- OVB formula works consistently across different types of confounders
- Different confounders can create bias in different directions
- Baseline measures of the outcome are critical confounders
- All measured confounders should be included in the model!
Section D: IPTW Makes Regression Robust
The Key Insight
Powerful discovery: When we use IPTW weighting, the treatment effect estimate stays stable regardless of which confounders we include in the regression!
What this means:
- With IPTW weights: qsmkTRUE coefficient ≈ 2.2 whether we control for 0, 1, or 10 confounders
- Without IPTW weights: coefficient shifts as we add/remove confounders (1.505 with none, 2.071 with all)
Why this matters:
- IPTW has already balanced the confounders through weighting
- Regression doesn’t need to “work hard” to adjust for them
- This provides double protection against confounding
Let’s demonstrate this with our smoking data…
Four Weighted Regression Models
We’ll fit 4 models with progressively more confounders, all using IPTW weights:
| Model Specification | qsmkTRUE Estimate | Std Error | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|
| 1. No confounders | 2.213 | 0.757 | 0.729 | 3.697 |
| 2. + Baseline BMI | 2.222 | 0.758 | 0.735 | 3.709 |
| 3. + Comorbidities | 2.192 | 0.734 | 0.753 | 3.630 |
| 4. + All confounders | 2.305 | 0.670 | 0.993 | 3.617 |
KEY OBSERVATION:
- All four estimates cluster around 2.23
- Range across models: Only 0.113
- Confidence intervals all overlap substantially
- Conclusion: With IPTW weighting, which confounders we include doesn’t matter much!
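The summary numbers above can be reproduced from the table. The sketch assumes the intervals are normal-approximation intervals (estimate ± 1.96 × standard error); that assumption matches the reported limits up to rounding, but the source does not state how they were computed.

```python
# (estimate, standard error, reported CI lower, reported CI upper) from the table
models = {
    "1. No confounders": (2.213, 0.757, 0.729, 3.697),
    "2. + Baseline BMI": (2.222, 0.758, 0.735, 3.709),
    "3. + Comorbidities": (2.192, 0.734, 0.753, 3.630),
    "4. + All confounders": (2.305, 0.670, 0.993, 3.617),
}

Z_CRIT = 1.96  # assumption: normal-approximation 95% interval

intervals = {}
for name, (estimate, std_error, _, _) in models.items():
    intervals[name] = (estimate - Z_CRIT * std_error, estimate + Z_CRIT * std_error)
    lower, upper = intervals[name]
    print(f"{name}: {estimate:.3f} [{lower:.3f}, {upper:.3f}]")

estimates = [values[0] for values in models.values()]
spread = max(estimates) - min(estimates)
average = sum(estimates) / len(estimates)
print(f"range = {spread:.3f}, mean = {average:.3f}")
```

The range across models (0.113) is small relative to any single standard error (about 0.7), which is why the intervals overlap so heavily.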
Why This Works: Double Robustness
The key insight: When we achieve balance through IPTW, the weighted data mimics randomization.
Two complementary approaches to remove confounding:
- IPTW: Creates balance by reweighting so confounders are uncorrelated with treatment
- Regression adjustment: Directly controls for confounders in the outcome model
Together = “Double Robustness”: Using IPTW + regression provides extra protection. We saw this! Weighted Model 1 (no confounders) ≈ Weighted Model 4 (all confounders) because IPTW already removed the confounding from the measured variables.
This approach is called doubly robust estimation. It is popular in practice because, provided all confounders are measured, model misspecification biases the estimate only if BOTH the PS model and the outcome model are misspecified.
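A toy illustration of why a weighted model with no covariates can already recover the effect. This is a sketch on hypothetical data with a known true effect of 1 and known propensity scores, not a re-analysis of NHEFS; the helper names are illustrative. A weighted regression of Y on treatment alone yields the same coefficient as the weighted difference in means computed here.

```python
def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def difference_in_means(y, treated, weights=None):
    """Weighted mean of y among the treated minus weighted mean among controls."""
    if weights is None:
        weights = [1.0] * len(y)
    y_t = [v for v, a in zip(y, treated) if a == 1]
    y_c = [v for v, a in zip(y, treated) if a == 0]
    w_t = [w for w, a in zip(weights, treated) if a == 1]
    w_c = [w for w, a in zip(weights, treated) if a == 0]
    return weighted_mean(y_t, w_t) - weighted_mean(y_c, w_c)


# Hypothetical toy data: binary confounder x, more common among the treated
treated = [1, 1, 1, 1, 0, 0, 0, 0]
x = [1, 1, 1, 0, 1, 0, 0, 0]
# Outcome built with a true treatment effect of 1 and a confounder effect of 2
y = [1 * a + 2 * v for a, v in zip(treated, x)]

# Propensity implied by the toy data: P(A=1 | x=1) = 0.75, P(A=1 | x=0) = 0.25
propensity = [0.75 if v == 1 else 0.25 for v in x]
weights = [1 / e if a == 1 else 1 / (1 - e) for a, e in zip(treated, propensity)]

naive = difference_in_means(y, treated)
weighted = difference_in_means(y, treated, weights)
print(f"naive difference:    {naive:.3f}")
print(f"weighted difference: {weighted:.3f}")
```

The naive difference (2) mixes the treatment effect with the confounder's effect; the weighted difference recovers the true effect (1) without any covariate in the outcome model.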
Comparison: RCT vs Observational Study with IPTW
| Aspect | RCT (Randomized Trial) | Observational Study + IPTW |
|---|---|---|
| Balance | Created by randomization | Created by reweighting |
| Exchangeability | Holds by design | Assumed (conditional on X) |
| Confounding | None (by construction) | Removed if all confounders measured |
| Regression sensitivity | Low (already balanced) | Low (after IPTW balancing) |
| Model dependence | Minimal | Minimal (after good balance) |
| Key assumption | Random assignment worked | Unconfoundedness + correct PS model |
Main insight: IPTW attempts to recreate the balance that randomization provides naturally in RCTs.
Section E: Synthesis & Comparison
When Does Regression Give Causal Effects?
Case 1: Randomized Controlled Trial
If quitting smoking were randomly assigned:
- Randomization → quit smoking ⊥ (all confounders)
- Exchangeability holds by design
What happens with regression:
- Simple model (Y ~ A): Already unbiased
- Multiple model (Y ~ A + X): Increases precision, still unbiased
- Functional form doesn’t matter much (we just demonstrated this!)
Bottom line: In an RCT, regression is “nice to have” not “must have”
Case 2: Observational Study
Requirements for unbiased estimate:
Requirement 1: Unconfoundedness \((Y_1, Y_0) \perp A | X\)
- Must measure and include ALL confounders
- No unmeasured confounding
- Same as IPTW!
Requirement 2: Correct functional form
- Linear relationships (or include polynomials/interactions)
- No missing interactions
- Correct link function
Critical point: Regression requires unconfoundedness (like IPTW) PLUS correct outcome model
IPTW vs Regression: Side-by-Side
| Aspect | IPTW | Regression |
|---|---|---|
| Strategy | Reweight to create balance | Model outcome to adjust |
| What you model | Treatment assignment P(A|X) | Outcome E[Y|A,X] |
| Unconfoundedness | ✓ Required | ✓ Required |
| Additional assumption | Correct PS model | Correct outcome model |
| Can check | Covariate balance | Residual diagnostics |
| Vulnerable when | PS model misspecified | Outcome model misspecified |
| What to get right | Treatment selection | Outcome prediction |
| Robust to functional form? | Yes, if balanced | No, unless balanced |
Key Takeaways
1. Both methods adjust for confounding
- IPTW: Adjusts via reweighting
- Regression: Adjusts via modeling (“holding constant”)
- Both aim to remove confounding
2. Both require same fundamentals
- Unconfoundedness: (Y₁, Y₀) ⊥ A | X
- Only adjust for OBSERVED confounders
- Neither fixes unmeasured confounding
- Results still observational, not experimental
3. Different modeling requirements
- IPTW: Must correctly model P(A|X)
- Regression: Must correctly model E[Y|A,X]
- Both vulnerable to misspecification
4. Balance helps both methods
- IPTW creates balance by design (when PS model correct)
- Regression benefits from balance (less sensitive to functional form)
- We demonstrated this empirically!
5. Critical reading skills
When reading papers, look for:
- “Adjusted for…” → Regression
- “Propensity score weighted” → IPTW
- “Doubly robust” → Combines both
Always ask:
- Is unconfoundedness plausible?
- Were balance/diagnostics checked?
- Could unmeasured confounding explain results?
Summary
Today we learned:
- Regression mechanics: Simple → multiple → omitted variable bias
- OVB formula: Predicts direction of bias from omitted confounders
- Balance matters: Functional form misspecification less critical when balanced
- IPTW vs Regression: Different strategies, same fundamental requirements
Next class: Practice applying these concepts to real studies