Essentials of Causal Inference using Observational Data

Alex Deng
8/7/2013

Notes

  • Deck for the original full 2-hour course: link
  • This one-hour version focuses on concepts, with as few technical details as I could manage
  • If you have problems with the math display, try Firefox or Chrome

Outline

Outline: Key Concepts

  • Overview
    • Simpson's Paradox: Correlation vs. Causation
    • Limitation: “Tobacco gene” and Fisher's comment
    • Causality vs. predictive modelling
  • Potential outcome framework and missing data in data analysis
  • Ignorability of treatment assignment
  • The propensity function \( e(X) \) quantifies the probability of being treated given the covariates \( X \).
  • \( e(X) \) can be estimated in many ways: generalized linear models (e.g. logistic regression), decision trees, genetic matching, etc.
  • We can use \( e(X) \) to calculate a likelihood ratio: for example, when estimating the treatment effect on the treated, each untreated sample should be weighted proportionally to \( e(X)/(1-e(X)) \).
  • We can also match on the propensity function (matching on a continuous variable requires kernel density estimation).
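
To make the weighting idea concrete, here is a minimal sketch on a hypothetical toy dataset (not from the course). It assumes a single binary covariate, so \( e(X) \) can be estimated by the treated fraction in each stratum instead of a fitted model. All names and numbers are illustrative.

```python
from collections import defaultdict

# Hypothetical toy data: (x, treated, outcome), built as y = 10*x + 2*treated,
# so the true treatment effect is 2 for every unit.
data = (
    [(0, 1, 2.0)] * 1 + [(0, 0, 0.0)] * 3 +
    [(1, 1, 12.0)] * 3 + [(1, 0, 10.0)] * 1
)

def estimate_propensity(data):
    """Estimate e(x) = P(treated | X = x) as the treated fraction per stratum."""
    n, n_treated = defaultdict(int), defaultdict(int)
    for x, t, _ in data:
        n[x] += 1
        n_treated[x] += t
    return {x: n_treated[x] / n[x] for x in n}

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def naive_difference(data):
    """Unweighted treated mean minus untreated mean (confounded by x)."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def tt_by_weighting(data):
    """Effect on the treated: untreated samples weighted by e(x)/(1-e(x)).

    Assumes 0 < e(x) < 1 in every stratum that contains untreated samples.
    """
    e = estimate_propensity(data)
    treated = [y for _, t, y in data if t == 1]
    control = [(y, e[x] / (1.0 - e[x])) for x, t, y in data if t == 0]
    control_mean = weighted_mean([y for y, _ in control], [w for _, w in control])
    return sum(treated) / len(treated) - control_mean

print("naive difference:", naive_difference(data))
print("weighted TT estimate:", round(tt_by_weighting(data), 6))
```

In this toy data the treated units are concentrated in the x = 1 stratum, so the naive comparison mixes the effect of x with the effect of treatment. The reweighting makes the untreated group resemble the treated group in x.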

Caveats

Directly using the propensity function to weight samples usually does not perform well:

  • Model imperfection: e.g. logistic regression assumes the log-odds of treatment are linear in the covariates.
  • \( e(X) \) can be close to 1 for some samples, in which case the weight \( e(X)/(1-e(X)) \) approaches infinity and the estimate of the treatment effect on the treated (TT) becomes unstable. Similarly, for the treatment effect on the untreated (TUT), when \( e(X) \) is close to 0 the weight \( (1-e(X))/e(X) \) approaches infinity and TUT cannot be estimated accurately. The estimator is unbiased but has high variance.
  • Matching on the propensity function is better.
  • But there are still better ways…
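
A small sketch of the instability described above, using made-up propensity values: the weight \( e/(1-e) \) grows without bound as \( e \) approaches 1, so a single untreated sample can dominate the weighted average.

```python
def odds_weight(e):
    """Weight e/(1-e) applied to an untreated sample when estimating TT."""
    return e / (1.0 - e)

# The weight explodes as the propensity approaches 1.
for e in (0.5, 0.9, 0.99, 0.999):
    print(f"e = {e}: weight = {odds_weight(e):.1f}")

# Hypothetical untreated group: four typical samples and one with e(X) = 0.99.
propensities = [0.5, 0.5, 0.5, 0.5, 0.99]
weights = [odds_weight(e) for e in propensities]

# Fraction of the total weight carried by the single extreme sample.
share = weights[-1] / sum(weights)
print(f"share of total weight held by the extreme sample: {share:.1%}")
```

Here one sample out of five carries roughly 96% of the total weight, so the estimate depends almost entirely on that sample's outcome.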

Regression with Weighted Samples

The weighted-samples perspective naturally leads to other applications. One extension of the regression model is to fit it on weighted samples rather than giving every sample the same weight.

This leads to the so-called doubly robust estimator:

  1. If the regression model is correct, this approach gives an unbiased estimator.
  2. If the weights from the propensity function/likelihood ratio are correct, this approach also gives an unbiased estimator.
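
The two properties can be illustrated with a minimal sketch of weighted least squares on a hypothetical toy dataset. This is one reading of the weighted-regression idea, not necessarily the exact estimator used in the course. The data, the propensity values and the helper `wls` are all assumptions made for illustration.

```python
def wls(rows, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X'WX | X'Wy].
    a = [
        [sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(k)]
        + [sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w))]
        for i in range(k)
    ]
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(a[i][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for i in range(k):
            if i != c:
                factor = a[i][c]
                a[i] = [vi - factor * vc for vi, vc in zip(a[i], a[c])]
    return [a[i][k] for i in range(k)]

# Hypothetical toy data: (x, treated, outcome), built as y = 10*x + 2*treated.
data = (
    [(0, 1, 2.0)] * 1 + [(0, 0, 0.0)] * 3 +
    [(1, 1, 12.0)] * 3 + [(1, 0, 10.0)] * 1
)
e = {0: 0.25, 1: 0.75}  # assumed propensities (the treated fraction per stratum)

y = [yi for _, _, yi in data]
uniform = [1.0] * len(data)
# TT-style weights: treated samples get 1, untreated samples get e/(1-e).
tt_weights = [1.0 if t == 1 else e[x] / (1.0 - e[x]) for x, t, _ in data]

with_x = [(1.0, float(t), float(x)) for x, t, _ in data]  # correct regression model
without_x = [(1.0, float(t)) for _, t, _ in data]         # misspecified: omits x

# In each fit, the coefficient at index 1 is the estimated treatment effect.
effect_model_right = wls(with_x, y, uniform)[1]         # right model, wrong weights
effect_weights_right = wls(without_x, y, tt_weights)[1]  # wrong model, right weights
effect_both_wrong = wls(without_x, y, uniform)[1]        # wrong model, wrong weights

print(round(effect_model_right, 6), round(effect_weights_right, 6),
      round(effect_both_wrong, 6))
```

In this toy data, getting either the regression model or the weights right recovers the true effect of 2. Getting both wrong falls back to the confounded naive difference.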