The idea and data is borrowed from this post by University of Wisconsin. Please refer to this post for more details. I summarize and compare two common commands in Stata to conduct propensity score matching: psmatch2
and teffects psmatch
.
In addition, I use some definitions and explanations from Ani Katchova in the theoretical part of this report.
Assume that a doctor is interested in compare the distress level of two groups: treat
(t=1
) and control
(t=0
). This comparison may be important because he wants to see whether the new drug used in treat group is efficient or not. To do so, he collects data of HDL Cholesterol (i.e., good cholesterol) of treat patients (who used drug) and all control patients (all other patients). As we expect, he conducts a t-test
: ttest y, by(t)
.
## . ttest y, by(t)
##
## Two-sample t test with equal variances
## ------------------------------------------------------------------------------
## Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
## ---------+--------------------------------------------------------------------
## 0 | 667 -.4232434 .0634689 1.63917 -.5478667 -.2986201
## 1 | 333 1.891074 .0876176 1.598872 1.718718 2.063429
## ---------+--------------------------------------------------------------------
## combined | 1,000 .3474242 .0619004 1.957462 .2259545 .4688939
## ---------+--------------------------------------------------------------------
## diff | -2.314317 .1090943 -2.528398 -2.100236
## ------------------------------------------------------------------------------
## diff = mean(0) - mean(1) t = -21.2139
## Ho: diff = 0 degrees of freedom = 998
##
## Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
## Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
We see that the p-value
is very small. Can we conclude that there is a significant difference? Let’s see the frequency table of patients:
## . tab t
##
## t | Freq. Percent Cum.
## ------------+-----------------------------------
## 0 | 667 66.70 66.70
## 1 | 333 33.30 100.00
## ------------+-----------------------------------
## Total | 1,000 100.00
We have 333 treat patients but up to 667 control patients. In this case, we should think how to match each treat patient to a control patient (i.e., match 1:1).
A patient needs to use drug to treat for his illness. We will measure the level of HDL Cholesterol (i.e., y
) to check his health after use the drug.
Treatment effect of an individual \(i\) is measured as:
\[ y_{1i} - y_{0i} \] where \(y_{1}\) is the value of \(y\) if he used drug and \(y_{0i}\) is the value of he did not use drug. The treatment effect measures the causal effect of using the drug (i.e., the treatment).
In a population, there are two groups: treat and control group. The Average treatment effect (ATE) of the entire group is:
\[ATE = E[y_{1i}- y_{0i}]=\frac{1}{N} \sum_{i=1}^{N} (y_{1i}- y_{0i})\] where N is the sample size.
Next, we may want to know the Average treatment effect on the treated (ATT/ATET), which is restricted only in patients exposed to the treatment (i.e., \(t = 1\)).
\[ATT = \frac{1}{N} \sum_{i=1|t=1}^{N} (y_{1i}- y_{0i})\]
The propensity score model is a probit/logit model with \(D_{Treat}\) as the dependent variable and some characteristics as independent variables. \(D_{Treat}\) is a dummy receives two values: 1 for the treated and 0 for the control group. So we run the following regression:
\[ D_{Treat} = Constant + Control \; variables + \epsilon \] The propensity score is the conditional (predicted) probability of receiving treatment given pre-treatment characteristics. Match observations from treated and control groups based on their propensity scores. The most common method to match is nearest neighbor matching. In nearest neighbor matching, the match and the treat has closest score.
Figure 1: Nearest neighbor matching
In addition, we have other methods:
i
is matched with several control observations, with weights inversely proportional to the distance between treated and control observations.Figure 2: Kernel matching
In Stata, we can use teffects psmatch
(default by Stata 13 or above) or psmatch2
(written by Edwin Leuven and Barbara Sianesi) to conduct PSM. To install psmatch2
:
ssc install psmatch2, replace
Each command has its own advantages so we will compare both commands. At the end, there are some suggestions how to use it.
We use a data from University of Wisconsin as an example:
use http://ssc.wisc.edu/sscc/pubs/files/psm, replace
use psm, clear
This data includes:
y
x1
and x2
t
We use two characteristics of patients (x1
and x2
) to estimate the propensity score and match. In Stata, we run as follows: psmatch2 Treat Age Sex, logit n(1)
teffects psmatch (y) (t x1 x2), gen(match)
## . teffects psmatch (y) (t x1 x2), gen(match)
##
## Treatment-effects estimation Number of obs = 1,000
## Estimator : propensity-score matching Matches: requested = 1
## Outcome model : matching min = 1
## Treatment model: logit max = 1
## ------------------------------------------------------------------------------
## | AI Robust
## y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATE |
## t |
## (1 vs 0) | 1.019367 .1164694 8.75 0.000 .7910912 1.247643
## ------------------------------------------------------------------------------
We can add some predictions to see the propensity score (i.e., option , ps
), potential outcome (i.e., option , po
), and treatment effect (i.e., default or option , te
) for each observation:
predict ps0 ps1, ps
predict y0 y1, po
predict te
The first prediction gives the propensity score for control and treat observation. We see that the observation 1 (a control observation) is matched with observation 467 (a treat observation), so their propensity score is very similar.
The second prediction gives the potential outcome (i.e., \(y\)) for each matching. For example, in observation 1, it is a control observation so \(y_0\) is the \(y\) value of observation 1 and \(y_1\) is the \(y\) value of observation 467. In case of observation 467, it is a treat observation so \(y_1\) is the \(y\) value of observation 467, but \(y_0\) is the \(y\) value of observation 781.
The third one is the treatment effect for each observation. It is \(y_1 - y_0\) for each observation.
From the formula, we can calculate ATE and ATT for this sample as follows:
su te
su te if t == 1
## . su te
##
## Variable | Obs Mean Std. Dev. Min Max
## -------------+---------------------------------------------------------
## te | 1,000 1.057346 1.48425 -3.86259 6.669581
## . su te if t == 1
##
## Variable | Obs Mean Std. Dev. Min Max
## -------------+---------------------------------------------------------
## te | 333 1.019685 1.472275 -3.127719 5.280228
To calculate ATE, these two commands are similar as follows:
psmatch2 t x1 x2, out(y) ate
teffects psmatch (y) (t x1 x2, probit)
## . psmatch2 t x1 x2, out(y) ate
##
## Probit regression Number of obs = 1,000
## LR chi2(2) = 222.39
## Prob > chi2 = 0.0000
## Log likelihood = -525.08947 Pseudo R2 = 0.1748
##
## ------------------------------------------------------------------------------
## t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## x1 | .5305378 .0496371 10.69 0.000 .4332509 .6278247
## x2 | .4772386 .0465109 10.26 0.000 .3860789 .5683984
## _cons | -.5090219 .0455374 -11.18 0.000 -.5982737 -.4197702
## ------------------------------------------------------------------------------
## -------------------------------------------------------------------------------
## > ---------
## Variable Sample | Treated Controls Difference S.E.
## > T-stat
## ----------------------------+--------------------------------------------------
## > ---------
## y Unmatched | 1.8910736 -.423243358 2.31431696 .109094342
## > 21.21
## ATT | 1.8910736 .871388246 1.01968536 .173034999
## > 5.89
## ATU |-.423243358 .652904533 1.07614789 .
## > .
## ATE | 1.05734587 .
## > .
## ----------------------------+--------------------------------------------------
## > ---------
## Note: S.E. does not take into account that the propensity score is estimated.
##
## | psmatch2:
## psmatch2: | Common
## Treatment | support
## assignment | On suppor | Total
## -----------+-----------+----------
## Untreated | 667 | 667
## Treated | 333 | 333
## -----------+-----------+----------
## Total | 1,000 | 1,000
## . teffects psmatch (y) (t x1 x2, probit)
##
## Treatment-effects estimation Number of obs = 1,000
## Estimator : propensity-score matching Matches: requested = 1
## Outcome model : matching min = 1
## Treatment model: probit max = 1
## ------------------------------------------------------------------------------
## | AI Robust
## y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATE |
## t |
## (1 vs 0) | 1.057346 .1159899 9.12 0.000 .8300097 1.284682
## ------------------------------------------------------------------------------
To calculate ATT, these two commands are similar as follows:
psmatch2 t x1 x2, out(y)
teffects psmatch (y) (t x1 x2, probit), atet
## . psmatch2 t x1 x2, out(y)
##
## Probit regression Number of obs = 1,000
## LR chi2(2) = 222.39
## Prob > chi2 = 0.0000
## Log likelihood = -525.08947 Pseudo R2 = 0.1748
##
## ------------------------------------------------------------------------------
## t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## x1 | .5305378 .0496371 10.69 0.000 .4332509 .6278247
## x2 | .4772386 .0465109 10.26 0.000 .3860789 .5683984
## _cons | -.5090219 .0455374 -11.18 0.000 -.5982737 -.4197702
## ------------------------------------------------------------------------------
## -------------------------------------------------------------------------------
## > ---------
## Variable Sample | Treated Controls Difference S.E.
## > T-stat
## ----------------------------+--------------------------------------------------
## > ---------
## y Unmatched | 1.8910736 -.423243358 2.31431696 .109094342
## > 21.21
## ATT | 1.8910736 .871388246 1.01968536 .173034999
## > 5.89
## ----------------------------+--------------------------------------------------
## > ---------
## Note: S.E. does not take into account that the propensity score is estimated.
##
## | psmatch2:
## psmatch2: | Common
## Treatment | support
## assignment | On suppor | Total
## -----------+-----------+----------
## Untreated | 667 | 667
## Treated | 333 | 333
## -----------+-----------+----------
## Total | 1,000 | 1,000
## . teffects psmatch (y) (t x1 x2, probit), atet
##
## Treatment-effects estimation Number of obs = 1,000
## Estimator : propensity-score matching Matches: requested = 1
## Outcome model : matching min = 1
## Treatment model: probit max = 1
## ------------------------------------------------------------------------------
## | AI Robust
## y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATET |
## t |
## (1 vs 0) | 1.019685 .1227801 8.30 0.000 .7790407 1.26033
## ------------------------------------------------------------------------------
Although these results show that the coefficients of the ATE/ATT in two commands are similar, but the standard errors are different. In particular, teffects psmatch
give smaller SE than psmatch2
, so we should use the teffects psmatch
if we consider this ATE/ATT.
We should use the psmatch2
command if we need the post-regression between treat and matched, since this command created a new useful variable, called _weight
:
The question is why we need _weight
here? This is because some control observation is matched to more than one treated observation. For example, let’s see this frequency table:
tab _weight if !t
## . tab _weight if !t
##
## _weight | Freq. Percent Cum.
## ------------+-----------------------------------
## 1 | 119 63.64 63.64
## 2 | 33 17.65 81.28
## 3 | 14 7.49 88.77
## 4 | 10 5.35 94.12
## 5 | 7 3.74 97.86
## 6 | 1 0.53 98.40
## 8 | 2 1.07 99.47
## 9 | 1 0.53 100.00
## ------------+-----------------------------------
## Total | 187 100.00
We have 333 treat observations, but only 187 observations are chosen as controls. Thus, some observations appear more than one times. For example, the observation _id = 651
appears 9 times. We can restrict that no replacement with the option norepl
in psmatch2
.
In regression, we add [fweight = _weight]
to get the weight of the observations.
psmatch2 t x1 x2, out(y)
reg y x1 x2 t [fweight = _weight]
## . reg y x1 x2 t [fweight = _weight]
##
## Source | SS df MS Number of obs = 666
## -------------+---------------------------------- F(3, 662) = 426.08
## Model | 1366.57216 3 455.524054 Prob > F = 0.0000
## Residual | 707.74744 662 1.06910489 R-squared = 0.6588
## -------------+---------------------------------- Adj R-squared = 0.6573
## Total | 2074.3196 665 3.11927759 Root MSE = 1.034
##
## ------------------------------------------------------------------------------
## y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## x1 | 1.119345 .0442644 25.29 0.000 1.032429 1.20626
## x2 | 1.098903 .0427949 25.68 0.000 1.014873 1.182933
## t | 1.020328 .0801565 12.73 0.000 .8629361 1.177719
## _cons | -.0636372 .0631996 -1.01 0.314 -.1877331 .0604586
## ------------------------------------------------------------------------------
teffects psmatch
psmatch2
is probit, while it is logit model in teffects psmatch
. This report uses probit model in most of the cases, but you can change the option to get the logit model.