Week 6 Hour 1: Regression Adjustment for Causal Inference

Statistical Literacy - Synchronous Session


Overview

Learning Goal: Understand regression adjustment as an alternative to IPTW for controlling confounding.

Key Questions:

  1. How does regression “adjust” for confounding?
  2. When does regression give us causal effects?
  3. How does regression compare to IPTW?

Dataset: NHEFS smoking cessation study (males 25-35 years old)

  • Treatment: Quit smoking between 1971 and 1982
  • Outcome: Change in BMI
  • Confounders:
    • Comorbidities: Baseline BMI, cholesterol, diabetes, high blood pressure
    • Lifestyle Variables: Exercise frequency, alcohol consumption
    • Demographics: Age, education (school years), race, marital status

Section A: IPTW Bridge & Review

Recap: What is IPTW?

Inverse Probability of Treatment Weighting (IPTW):

  • Goal: Create balance on observed confounders by reweighting
  • Method: Weight by inverse of propensity score

Propensity Score: \(e(X) = P(A=1 | X)\)

  • Probability of treatment given confounders
  • Calculated from logistic regression model

IPTW Weights:

  • Treated: \(w = \frac{1}{e(X)}\)
  • Control: \(w = \frac{1}{1-e(X)}\)

Result: Creates pseudo-population where treatment is independent of confounders
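
A minimal sketch of the weight formulas above, assuming the propensity scores have already been estimated. The function name `iptw_weights` and the four example values are made up for illustration and are not NHEFS results.

```python
import numpy as np

def iptw_weights(treated, propensity):
    """Return 1/e(X) for treated units and 1/(1 - e(X)) for controls."""
    treated = np.asarray(treated, dtype=bool)
    propensity = np.asarray(propensity, dtype=float)
    return np.where(treated, 1.0 / propensity, 1.0 / (1.0 - propensity))

# Made-up propensity scores for four hypothetical people (not NHEFS values).
a = np.array([1, 1, 0, 0])
e = np.array([0.8, 0.2, 0.5, 0.2])
w = iptw_weights(a, e)
print(w)  # approximately [1.25, 5.0, 2.0, 1.25]
```

The treated person with a low propensity score (0.2) gets the largest weight: people who received a treatment they were unlikely to receive count for more in the pseudo-population.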


DAG: Confounders → Treatment


Initial Imbalance: Love Plot

Key Observation: Several variables show substantial imbalance (SMD > 0.1) before weighting, indicating confounding.


Propensity Score Distributions

Key Insight: IPTW reweights observations so that propensity score distributions become similar between treatment groups.


Balance After IPTW

Key Result: After IPTW, all variables have SMD < 0.1, indicating good balance.
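
The standardized mean difference (SMD) behind a Love plot can be sketched as follows. This uses one common convention (weighted group means divided by the pooled unweighted standard deviation), which may differ from the software used for the plots. The function name `smd` and the toy numbers are illustrative assumptions.

```python
import numpy as np

def smd(x, treated, weights=None):
    """Standardized mean difference of covariate x between treatment groups.

    Numerator: difference in (optionally weighted) group means.
    Denominator: pooled unweighted group standard deviation.
    """
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    mean_t = np.average(x[treated], weights=w[treated])
    mean_c = np.average(x[~treated], weights=w[~treated])
    pooled_sd = np.sqrt((x[treated].var(ddof=1) + x[~treated].var(ddof=1)) / 2)
    return (mean_t - mean_c) / pooled_sd

# Toy covariate for four hypothetical people (not NHEFS values).
x = np.array([30.0, 40.0, 20.0, 30.0])
treated = np.array([1, 1, 0, 0])
toy_weights = np.array([3.0, 1.0, 1.0, 3.0])  # made-up weights that pull the group means together

smd_before = smd(x, treated)
smd_after = smd(x, treated, toy_weights)
print(round(smd_before, 3), round(smd_after, 3))
```

In this toy example the weights shrink the SMD but do not bring it below 0.1. A real check would compute the SMD for every confounder, before and after weighting.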


Key Insight: PS Model Matters

What IPTW models: Treatment assignment \(P(A|X)\)

Critical point: You can only balance variables included in PS model!

Let’s see this with NHEFS data…


Balance Check: Love Plots

Observation:

  • Model 1 balances comorbidities ✓, but NOT lifestyle or demographics
  • Model 2 balances comorbidities + lifestyle ✓, but NOT demographics
  • Key point: You can only balance variables included in the PS model!

IPTW Estimates

Effect of Quitting Smoking on BMI Change: IPTW Estimates
                Naive      IPTW: Comorbidities   IPTW: +Lifestyle   IPTW: +Demographics
Quit Smoking    1.505*     1.826**               1.915**            2.213**
                (0.584)    (0.681)               (0.703)            (0.757)
Num.Obs.        218        218                   218                218

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 (standard errors in parentheses)

Key point: Different PS models → different estimates

Now let’s see a different approach: Regression adjustment


Section B: Regression Mechanics

Simple Regression: Concept

Linear regression model: \(Y = \beta_0 + \beta_1 X + \varepsilon\)

  • \(\beta_1\) = expected change in Y for 1-unit change in X
  • Fitted line minimizes squared residuals
  • This is descriptive association, not causal
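
As a small illustration of "minimizes squared residuals", here is a sketch on synthetic data (not NHEFS). The closed-form slope Cov(X, Y) / Var(X) matches the slope from a least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 0.5 * x + rng.normal(size=200)  # synthetic data with true slope 0.5

# Closed-form least-squares solution for one predictor.
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()

# np.polyfit with deg=1 solves the same least-squares problem
# and returns the coefficients highest degree first.
slope_ls, intercept_ls = np.polyfit(x, y, deg=1)

print(round(slope, 3), round(slope_ls, 3))
```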

Let’s see two examples with NHEFS data…


Example 1: BMI Over Time

Question: How does baseline BMI predict follow-up BMI?

Result: \(\hat{\beta}_1 \approx\) 0.95 (p < .001)

Interpretation: A 1-unit increase in 1971 BMI is associated with a 0.95-unit increase in 1982 BMI on average.


Example 2: BMI Change

Question: How does baseline BMI predict BMI change?

Result: \(\hat{\beta}_1 \approx\) -0.053 (p = 0.119)

Interpretation: Weak, non-significant association between baseline BMI and BMI change.

Key point: Different outcome → different relationship


Multiple Regression: Concept

Multiple regression: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + ... + \beta_p X_p + \varepsilon\)

Critical interpretation: Each \(\beta_j\) is the expected change in Y for a 1-unit change in \(X_j\), “holding other variables constant”

This “holding constant” language is KEY to adjustment strategy!

Connection to Week 5: Like stratification, but allows continuous variables
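
One way to make “holding constant” concrete is the partialling-out (Frisch-Waugh-Lovell) result, sketched here on synthetic data. The coefficient on X1 in a multiple regression equals the slope obtained after removing the part of both Y and X1 that X2 explains. The helper `ols` and all variable names are illustrative assumptions.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)                  # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)  # synthetic outcome

ones = np.ones(n)
beta_full = ols(np.column_stack([ones, x1, x2]), y)  # multiple regression

# Partial out x2: keep only the variation in x1 and y that x2 cannot explain.
Z = np.column_stack([ones, x2])
x1_resid = x1 - Z @ ols(Z, x1)
y_resid = y - Z @ ols(Z, y)
beta_partial = (x1_resid @ y_resid) / (x1_resid @ x1_resid)

print(round(beta_full[1], 4), round(beta_partial, 4))  # the two values agree
```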


Adding Treatment Variable

Now add quit smoking as a predictor:

Model: BMI Change = \(\beta_0 + \beta_1 \times\) Quit Smoking \(+ \beta_2 \times\) Baseline BMI

Results:

  • Quit smoking: \(\hat{\beta}_1 \approx\) 1.53 (p = 0.009)
  • Baseline BMI: \(\hat{\beta}_2 \approx\) -0.056 (p = 0.1)

Interpretation:

  • After adjusting for baseline BMI, quitting smoking is associated with a 1.53-unit larger BMI change (greater gain) on average
  • Coefficient for quit smoking is now “adjusted” for baseline BMI

Visualizing Multiple Regression

Key insights:

  • Two parallel lines (quitters vs non-quitters)
  • Vertical distance = treatment effect (1.53 units)
  • Parallel slopes = assumes constant effect across all BMI levels

Progressive Adjustment

What if we add MORE confounders?

Watch the coefficient for 'Quit Smoking' change as we add confounders
              Baseline    +Comorbidities   +Lifestyle   +Demographics
(Intercept)   4.494**     6.212***         5.615**      4.630
              (1.511)     (1.809)          (1.858)      (3.112)
qsmkTRUE      1.530**     1.800**          1.887**      2.071***
              (0.582)     (0.598)          (0.596)      (0.609)
bmi71         -0.056+     -0.045           -0.040       -0.032
              (0.034)     (0.035)          (0.035)      (0.035)
cholesterol               -0.009           -0.008       -0.008
                          (0.007)          (0.007)      (0.007)
diabetes                  2.108            2.274        1.989
                          (1.536)          (1.527)      (1.526)
hbp                       -2.465           -2.618+      -2.359
                          (1.564)          (1.556)      (1.554)
exercise1                                  -0.320       -0.252
                                           (0.564)      (0.563)
exercise2                                  1.158+       1.108+
                                           (0.630)      (0.632)
alcoholfreq                                0.055        0.039
                                           (0.213)      (0.219)
marital                                                 0.231
                                                        (0.230)
race1                                                   1.287
                                                        (0.799)
school                                                  -0.106
                                                        (0.091)
age                                                     0.040
                                                        (0.078)
Num.Obs.      218         218              218          218
R2            0.042       0.066            0.092        0.117

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 (standard errors in parentheses)

Observation: The quit smoking coefficient rises from 1.53 to 2.07 as confounders are added

This is omitted variable bias in action!


Section C: Omitted Variable Bias

The Problem

Omitted Variable Bias (OVB): What happens when we leave out a confounder?

Setup:

  • True model: \(Y = \beta_0 + \beta_1 A + \beta_2 Z + \varepsilon\)
  • Fitted model: \(Y = \beta_0 + \beta_1 A + \varepsilon\) (omit Z)
  • Result: \(\hat{\beta}_1\) from the fitted model is a BIASED estimate of the true \(\beta_1\)

When does bias occur? Need BOTH:

  1. Z affects Y (\(\beta_2 \neq 0\))
  2. Z correlated with A (Z is a confounder!)
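
A short derivation shows why both conditions are needed. It is sketched for the two-variable setup above and assumes \(\varepsilon\) is uncorrelated with A. The symbol \(\tilde{\beta}_1\), for the slope of the fitted model that omits Z, is notation introduced here.

\[\tilde{\beta}_1 = \frac{\text{Cov}(A, Y)}{\text{Var}(A)} = \frac{\beta_1 \text{Var}(A) + \beta_2 \text{Cov}(A, Z)}{\text{Var}(A)} = \beta_1 + \beta_2 \, \frac{\text{Cov}(A, Z)}{\text{Var}(A)}\]

The bias term \(\beta_2 \, \text{Cov}(A, Z) / \text{Var}(A)\) vanishes if either \(\beta_2 = 0\) or \(\text{Cov}(A, Z) = 0\). When other covariates are also in the model, the same logic applies to the part of A and Z left over after adjusting for them.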

OVB Direction Formula

Direction of bias:

\[\text{Sign(Bias)} = \text{Sign}(\beta_2) \times \text{Sign}(\text{Corr}(A, Z))\]

\(\beta_2\) (Z → Y)    Corr(A, Z)      Bias Direction
Positive (+)           Positive (+)    Positive (overestimate)
Positive (+)           Negative (−)    Negative (underestimate)
Negative (−)           Positive (+)    Negative (underestimate)
Negative (−)           Negative (−)    Positive (overestimate)
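
The sign rule can be checked on synthetic data before turning to real data. This sketch simulates the second row of the table (positive Z → Y effect, negative correlation between A and Z). It also checks the exact in-sample identity that the change in the treatment coefficient equals the estimated Z coefficient times the slope from regressing Z on A. All names and numbers are illustrative assumptions.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)
n = 1000
z = rng.normal(size=n)
a = (rng.normal(size=n) - 0.8 * z > 0).astype(float)  # Corr(A, Z) < 0
y = 1.0 + 2.0 * a + 1.5 * z + rng.normal(size=n)      # beta_2 = 1.5 > 0

ones = np.ones(n)
b_long = ols(np.column_stack([ones, a, z]), y)  # includes Z
b_short = ols(np.column_stack([ones, a]), y)    # omits Z
delta = ols(np.column_stack([ones, a]), z)[1]   # slope of Z on A

observed_bias = b_short[1] - b_long[1]
predicted_bias = b_long[2] * delta

print(round(observed_bias, 3), round(predicted_bias, 3))  # equal, and negative
```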

Let’s test this with NHEFS data…


NHEFS Example: Omitting Exercise

Scenario: What if we omit “exercise” from the model?

Results:

  • \(\beta_2\) (exercise → outcome): 0.476 (POSITIVE)
  • Corr(quit smoking, exercise): -0.013 (NEGATIVE)
  • Predicted bias direction: NEGATIVE (underestimate)
  • Observed bias: -0.079 (NEGATIVE)
  • Match? YES ✓

Verification Table

Effect of Including vs Omitting Exercise
                Without Exercise   With Exercise
Quit Smoking    1.800**            1.880**
                (0.598)            (0.594)
Num.Obs.        218                218
R2              0.066              0.091

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 (standard errors in parentheses)

Key takeaways:

  1. OVB formula correctly predicts bias direction
  2. This is the regression version of unmeasured confounding
  3. If we don’t measure/include confounder → biased estimate
  4. Same problem IPTW faces!

NHEFS Example 2: Omitting Age

New question: What if we omit age from our model?

Analysis:

  • \(\beta_2\) (age → outcome): 0.08 (POSITIVE)
  • Corr(quit smoking, age): 0.081 (POSITIVE)
  • Predicted bias direction: POSITIVE
  • Observed bias: 0.04 (POSITIVE)
  • Match? YES ✓

Interpretation:

  • Age has a positive effect on BMI change (older people gain more weight)
  • Age is positively correlated with quitting smoking
  • Omitting age creates POSITIVE bias in our treatment effect estimate

NHEFS Example 3: Omitting Baseline BMI

New question: What if we omit baseline BMI from our model?

Note: This is interesting because baseline BMI is a baseline measure of our outcome variable!

Analysis:

  • \(\beta_2\) (baseline BMI → outcome): -0.053 (NEGATIVE)
  • Corr(quit smoking, baseline BMI): 0.026 (POSITIVE)
  • Predicted bias direction: NEGATIVE
  • Observed bias: -0.025 (NEGATIVE)
  • Match? YES ✓

Interpretation:

  • Baseline BMI has a negative effect on BMI change (regression to the mean)
  • Baseline BMI is positively correlated with quitting
  • Omitting baseline BMI creates NEGATIVE bias

Why this matters: Baseline measures of the outcome are often important confounders, so include them whenever they are available!


Comparison Table: Multiple OVB Examples

Omitted Variable   β₂ (Z → Y)    Corr(A, Z)    Predicted Bias             Observed Bias   Formula Works?
Exercise           0.476 (+)     -0.013 (-)    NEGATIVE (underestimate)   -0.079 (-)      ✓ YES
Age                0.08 (+)      0.081 (+)     POSITIVE                   0.04 (+)        ✓ YES
Baseline BMI       -0.053 (-)    0.026 (+)     NEGATIVE                   -0.025 (-)      ✓ YES

Key Takeaways:

  1. OVB formula works consistently across different types of confounders
  2. Different confounders can create bias in different directions
  3. Baseline measures of the outcome are critical confounders
  4. All measured confounders should be included in the model!

Section D: IPTW Makes Regression Robust

The Key Insight

Powerful discovery: When we use IPTW weighting, the treatment effect estimate stays stable regardless of which confounders we include in the regression!

What this means:

  • With IPTW weights: qsmkTRUE coefficient ≈ 2.2 whether we control for 0, 1, or 10 confounders
  • Without IPTW weights: coefficient shifts noticeably (1.53 → 2.07) as we add confounders

Why this matters:

  • IPTW has already balanced the confounders through weighting
  • Regression doesn’t need to “work hard” to adjust for them
  • This provides double protection against confounding

Let’s demonstrate this with our smoking data…


Four Weighted Regression Models

We’ll fit 4 models with progressively more confounders, all using IPTW weights:

Weighted Regression: Stable Estimates Across Model Specifications
Model Specification    qsmkTRUE Estimate   Std Error   95% CI Lower   95% CI Upper
1. No confounders      2.213               0.757       0.729          3.697
2. + Baseline BMI      2.222               0.758       0.735          3.709
3. + Comorbidities     2.192               0.734       0.753          3.630
4. + All confounders   2.305               0.670       0.993          3.617

KEY OBSERVATION:

  • All four estimates cluster around 2.23
  • Range across models: Only 0.113
  • Confidence intervals all overlap substantially
  • Conclusion: With IPTW weighting, which confounders we include doesn’t matter much!
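
The summary numbers above can be reproduced directly from the table. This sketch also computes the range of the unweighted qsmkTRUE estimates from the Progressive Adjustment table. The two sets of models are not identical specifications, so the comparison is only indicative.

```python
# qsmkTRUE estimates copied from the weighted regression table above.
weighted = [2.213, 2.222, 2.192, 2.305]
# qsmkTRUE estimates copied from the unweighted Progressive Adjustment table.
unweighted = [1.530, 1.800, 1.887, 2.071]

weighted_mean = sum(weighted) / len(weighted)
weighted_range = max(weighted) - min(weighted)
unweighted_range = max(unweighted) - min(unweighted)

print(round(weighted_mean, 3))     # 2.233
print(round(weighted_range, 3))    # 0.113
print(round(unweighted_range, 3))  # 0.541
```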

Why This Works: Double Robustness

The key insight: When we achieve balance through IPTW, the weighted data mimics randomization.

Two complementary approaches to remove confounding:

  1. IPTW: Creates balance by reweighting so confounders are uncorrelated with treatment
  2. Regression adjustment: Directly controls for confounders in the outcome model

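Together = “Double Robustness”: Using IPTW + regression provides extra protection. We saw this! Weighted Model 1 (no confounders, 2.213) ≈ Weighted Model 4 (all confounders, 2.305) because IPTW had already removed most of the measured confounding. Combining a propensity score model with an outcome model is called doubly robust estimation. It is popular in practice because the estimate stays consistent if at least one of the two models is correctly specified, so you need to misspecify BOTH models to get bias from misspecification. Unconfoundedness is still required.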
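
The pattern can be illustrated with a small simulation. This is a sketch on synthetic data, not NHEFS. It assumes a single confounder and uses the true propensity score, which is known only because the data are simulated; in practice it would be estimated. The helper `wls` and all parameter values are illustrative assumptions.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: scale rows by sqrt(w), then solve ordinary least squares."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(size=n)                      # single confounder
e = 1.0 / (1.0 + np.exp(-0.8 * x))          # true propensity score P(A=1 | X)
a = (rng.uniform(size=n) < e).astype(float)
y = 2.0 * a + 1.5 * x + rng.normal(size=n)  # true treatment effect = 2

ones = np.ones(n)
w = np.where(a == 1, 1.0 / e, 1.0 / (1.0 - e))  # IPTW weights

naive = wls(np.column_stack([ones, a]), y, ones)[1]       # unweighted, no adjustment
iptw_only = wls(np.column_stack([ones, a]), y, w)[1]      # weighted, no covariates
iptw_plus_x = wls(np.column_stack([ones, a, x]), y, w)[1] # weighted, adjusted for x

print(round(naive, 2), round(iptw_only, 2), round(iptw_plus_x, 2))
```

With these settings the unweighted, unadjusted estimate is clearly too large, while both weighted estimates land close to the true effect of 2 whether or not the confounder is also in the regression.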

Comparison: RCT vs Observational Study with IPTW

Aspect                   RCT (Randomized Trial)     Observational Study + IPTW
Balance                  Created by randomization   Created by reweighting
Exchangeability          Holds by design            Assumed (conditional on X)
Confounding              None (by construction)     Removed if all confounders measured
Regression sensitivity   Low (already balanced)     Low (after IPTW balancing)
Model dependence         Minimal                    Minimal (after good balance)
Key assumption           Random assignment worked   Unconfoundedness + correct PS model

Main insight: IPTW attempts to recreate the balance that randomization provides naturally in RCTs.


Section E: Synthesis & Comparison

When Does Regression Give Causal Effects?

Case 1: Randomized Controlled Trial

If quit smoking was randomly assigned:

  • Randomization → quit smoking ⊥ (all confounders)
  • Exchangeability holds by design

What happens with regression:

  • Simple model (Y ~ A): Already unbiased
  • Multiple model (Y ~ A + X): Increases precision, still unbiased
  • Functional form doesn’t matter much (the IPTW-weighted models above showed the same pattern in balanced data)

Bottom line: In RCT, regression is “nice to have” not “must have”


Case 2: Observational Study

Requirements for unbiased estimate:

Requirement 1: Unconfoundedness \((Y_1, Y_0) \perp A | X\)

  • Must measure and include ALL confounders
  • No unmeasured confounding
  • Same as IPTW!

Requirement 2: Correct functional form

  • Linear relationships (or include polynomials/interactions)
  • No missing interactions
  • Correct link function

Critical point: Regression requires unconfoundedness (like IPTW) PLUS correct outcome model
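
Requirement 2 can be illustrated with a simulation on synthetic data, not NHEFS. Here the confounder is measured and included, but it acts through its square. A model that adjusts for it only linearly stays biased, while adding the squared term recovers the true effect. The helper `ols` and all parameter values are illustrative assumptions.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
n = 20_000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-(x**2 - 1.0)))        # treatment probability depends on x squared
a = (rng.uniform(size=n) < e).astype(float)
y = 2.0 * a + 2.0 * x**2 + rng.normal(size=n)  # outcome depends on x squared; true effect = 2

ones = np.ones(n)
linear_fit = ols(np.column_stack([ones, a, x]), y)[1]           # wrong functional form
quadratic_fit = ols(np.column_stack([ones, a, x, x**2]), y)[1]  # correct functional form

print(round(linear_fit, 2), round(quadratic_fit, 2))
```

Both models "adjust for x", yet only the second one removes the confounding. This is the sense in which regression needs a correct outcome model on top of unconfoundedness.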


IPTW vs Regression: Side-by-Side

Aspect                        IPTW                          Regression
Strategy                      Reweight to create balance    Model outcome to adjust
What you model                Treatment assignment P(A|X)   Outcome E[Y|A,X]
Unconfoundedness              ✓ Required                    ✓ Required
Additional assumption         Correct PS model              Correct outcome model
Can check                     Covariate balance             Residual diagnostics
Vulnerable when               PS model misspecified         Outcome model misspecified
What to get right             Treatment selection           Outcome prediction
Robust to functional form?    Yes, if balanced              No, unless balanced

Key Takeaways

1. Both methods adjust for confounding

  • IPTW: Adjusts via reweighting
  • Regression: Adjusts via modeling (“holding constant”)
  • Both aim to remove confounding

2. Both require same fundamentals

  • Unconfoundedness: (Y₁, Y₀) ⊥ A | X
  • Only adjust for OBSERVED confounders
  • Neither fixes unmeasured confounding
  • Results still observational, not experimental

3. Different modeling requirements

  • IPTW: Must correctly model P(A|X)
  • Regression: Must correctly model E[Y|A,X]
  • Both vulnerable to misspecification

4. Balance helps both methods

  • IPTW creates balance by design (when PS model correct)
  • Regression benefits from balance (less sensitive to functional form)
  • We demonstrated this empirically!

5. Critical reading skills

When reading papers, look for:

  • “Adjusted for…” → Regression
  • “Propensity score weighted” → IPTW
  • “Doubly robust” → Combines both

Always ask:

  • Is unconfoundedness plausible?
  • Were balance/diagnostics checked?
  • Could unmeasured confounding explain results?

Summary

Today we learned:

  1. Regression mechanics: Simple → multiple → omitted variable bias
  2. OVB formula: Predicts direction of bias from omitted confounders
  3. Balance matters: Functional form misspecification less critical when balanced
  4. IPTW vs Regression: Different strategies, same fundamental requirements

Next class: Practice applying these concepts to real studies