The idea and data is borrowed from this post by University of Wisconsin. Please refer to this post for more details. I summarize and compare two common commands in Stata to conduct propensity score matching: psmatch2 and teffects psmatch.

In addition, I use some definitions and explanations from Ani Katchova in the theoretical part of this report.

1 Introduction

Assume that a doctor is interested in compare the distress level of two groups: treat (t=1) and control (t=0). This comparison may be important because he wants to see whether the new drug used in treat group is efficient or not. To do so, he collects data of HDL Cholesterol (i.e., good cholesterol) of treat patients (who used drug) and all control patients (all other patients). As we expect, he conducts a t-test: ttest y, by(t).

## . ttest y, by(t)
## 
## Two-sample t test with equal variances
## ------------------------------------------------------------------------------
##    Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
## ---------+--------------------------------------------------------------------
##        0 |     667   -.4232434    .0634689     1.63917   -.5478667   -.2986201
##        1 |     333    1.891074    .0876176    1.598872    1.718718    2.063429
## ---------+--------------------------------------------------------------------
## combined |   1,000    .3474242    .0619004    1.957462    .2259545    .4688939
## ---------+--------------------------------------------------------------------
##     diff |           -2.314317    .1090943               -2.528398   -2.100236
## ------------------------------------------------------------------------------
##     diff = mean(0) - mean(1)                                      t = -21.2139
## Ho: diff = 0                                     degrees of freedom =      998
## 
##     Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
##  Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

We see that the p-value is very small. Can we conclude that there is a significant difference? Let’s see the frequency table of patients:

## . tab t
## 
##           t |      Freq.     Percent        Cum.
## ------------+-----------------------------------
##           0 |        667       66.70       66.70
##           1 |        333       33.30      100.00
## ------------+-----------------------------------
##       Total |      1,000      100.00

We have 333 treat patients but up to 667 control patients. In this case, we should think how to match each treat patient to a control patient (i.e., match 1:1).

2 Average treatment effect (ATE) and Average treatment effect on the treated (ATT/ATET)

A patient needs to use drug to treat for his illness. We will measure the level of HDL Cholesterol (i.e., y) to check his health after use the drug.

Treatment effect of an individual \(i\) is measured as:

\[ y_{1i} - y_{0i} \] where \(y_{1}\) is the value of \(y\) if he used drug and \(y_{0i}\) is the value of he did not use drug. The treatment effect measures the causal effect of using the drug (i.e., the treatment).

In a population, there are two groups: treat and control group. The Average treatment effect (ATE) of the entire group is:

\[ATE = E[y_{1i}- y_{0i}]=\frac{1}{N} \sum_{i=1}^{N} (y_{1i}- y_{0i})\] where N is the sample size.

Next, we may want to know the Average treatment effect on the treated (ATT/ATET), which is restricted only in patients exposed to the treatment (i.e., \(t = 1\)).

\[ATT = \frac{1}{N} \sum_{i=1|t=1}^{N} (y_{1i}- y_{0i})\]

3 Propensity score matching

The propensity score model is a probit/logit model with \(D_{Treat}\) as the dependent variable and some characteristics as independent variables. \(D_{Treat}\) is a dummy receives two values: 1 for the treated and 0 for the control group. So we run the following regression:

\[ D_{Treat} = Constant + Control \; variables + \epsilon \] The propensity score is the conditional (predicted) probability of receiving treatment given pre-treatment characteristics. Match observations from treated and control groups based on their propensity scores. The most common method to match is nearest neighbor matching. In nearest neighbor matching, the match and the treat has closest score.

Figure 1: Nearest neighbor matching

Figure 1: Nearest neighbor matching

In addition, we have other methods:

  • Radius matching: the difference in score is within a specified range
  • Kernel matching: Each treated observation i is matched with several control observations, with weights inversely proportional to the distance between treated and control observations.
Figure 2: Kernel matching

Figure 2: Kernel matching

4 PSM in Stata

In Stata, we can use teffects psmatch (default by Stata 13 or above) or psmatch2 (written by Edwin Leuven and Barbara Sianesi) to conduct PSM. To install psmatch2:

ssc install psmatch2, replace

Each command has its own advantages so we will compare both commands. At the end, there are some suggestions how to use it.

We use a data from University of Wisconsin as an example:

use http://ssc.wisc.edu/sscc/pubs/files/psm, replace
use psm, clear

This data includes:

  • An output variable: y
  • 2 independent variables: x1 and x2
  • A treatment dummy: t

4.1 Propensity score and ATE/ATT

We use two characteristics of patients (x1 and x2) to estimate the propensity score and match. In Stata, we run as follows: psmatch2 Treat Age Sex, logit n(1)

teffects psmatch (y) (t x1 x2), gen(match)
## . teffects psmatch (y) (t x1 x2), gen(match)
## 
## Treatment-effects estimation                   Number of obs      =      1,000
## Estimator      : propensity-score matching     Matches: requested =          1
## Outcome model  : matching                                     min =          1
## Treatment model: logit                                        max =          1
## ------------------------------------------------------------------------------
##              |              AI Robust
##            y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATE          |
##            t |
##    (1 vs 0)  |   1.019367   .1164694     8.75   0.000     .7910912    1.247643
## ------------------------------------------------------------------------------

We can add some predictions to see the propensity score (i.e., option , ps), potential outcome (i.e., option , po), and treatment effect (i.e., default or option , te) for each observation:

predict ps0 ps1, ps
predict y0 y1, po
predict te

The first prediction gives the propensity score for control and treat observation. We see that the observation 1 (a control observation) is matched with observation 467 (a treat observation), so their propensity score is very similar.

The second prediction gives the potential outcome (i.e., \(y\)) for each matching. For example, in observation 1, it is a control observation so \(y_0\) is the \(y\) value of observation 1 and \(y_1\) is the \(y\) value of observation 467. In case of observation 467, it is a treat observation so \(y_1\) is the \(y\) value of observation 467, but \(y_0\) is the \(y\) value of observation 781.

The third one is the treatment effect for each observation. It is \(y_1 - y_0\) for each observation.

From the formula, we can calculate ATE and ATT for this sample as follows:

su te
su te if t == 1
## . su te
## 
##     Variable |        Obs        Mean    Std. Dev.       Min        Max
## -------------+---------------------------------------------------------
##           te |      1,000    1.057346     1.48425   -3.86259   6.669581
## . su te if t == 1
## 
##     Variable |        Obs        Mean    Std. Dev.       Min        Max
## -------------+---------------------------------------------------------
##           te |        333    1.019685    1.472275  -3.127719   5.280228

4.2 Compare two methods

To calculate ATE, these two commands are similar as follows:

psmatch2 t x1 x2, out(y) ate
teffects psmatch (y) (t x1 x2, probit)
## . psmatch2 t x1 x2, out(y) ate
## 
## Probit regression                               Number of obs     =      1,000
##                                                 LR chi2(2)        =     222.39
##                                                 Prob > chi2       =     0.0000
## Log likelihood = -525.08947                     Pseudo R2         =     0.1748
## 
## ------------------------------------------------------------------------------
##            t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##           x1 |   .5305378   .0496371    10.69   0.000     .4332509    .6278247
##           x2 |   .4772386   .0465109    10.26   0.000     .3860789    .5683984
##        _cons |  -.5090219   .0455374   -11.18   0.000    -.5982737   -.4197702
## ------------------------------------------------------------------------------
## -------------------------------------------------------------------------------
## > ---------
##         Variable     Sample |    Treated     Controls   Difference         S.E.
## >    T-stat
## ----------------------------+--------------------------------------------------
## > ---------
##                y  Unmatched |  1.8910736  -.423243358   2.31431696   .109094342
## >     21.21
##                         ATT |  1.8910736   .871388246   1.01968536   .173034999
## >      5.89
##                         ATU |-.423243358   .652904533   1.07614789            .
## >         .
##                         ATE |                           1.05734587            .
## >         .
## ----------------------------+--------------------------------------------------
## > ---------
## Note: S.E. does not take into account that the propensity score is estimated.
## 
##            | psmatch2:
##  psmatch2: |   Common
##  Treatment |  support
## assignment | On suppor |     Total
## -----------+-----------+----------
##  Untreated |       667 |       667 
##    Treated |       333 |       333 
## -----------+-----------+----------
##      Total |     1,000 |     1,000
## . teffects psmatch (y) (t x1 x2, probit)
## 
## Treatment-effects estimation                   Number of obs      =      1,000
## Estimator      : propensity-score matching     Matches: requested =          1
## Outcome model  : matching                                     min =          1
## Treatment model: probit                                       max =          1
## ------------------------------------------------------------------------------
##              |              AI Robust
##            y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATE          |
##            t |
##    (1 vs 0)  |   1.057346   .1159899     9.12   0.000     .8300097    1.284682
## ------------------------------------------------------------------------------

To calculate ATT, these two commands are similar as follows:

psmatch2 t x1 x2, out(y)
teffects psmatch (y) (t x1 x2, probit), atet
## . psmatch2 t x1 x2, out(y)
## 
## Probit regression                               Number of obs     =      1,000
##                                                 LR chi2(2)        =     222.39
##                                                 Prob > chi2       =     0.0000
## Log likelihood = -525.08947                     Pseudo R2         =     0.1748
## 
## ------------------------------------------------------------------------------
##            t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##           x1 |   .5305378   .0496371    10.69   0.000     .4332509    .6278247
##           x2 |   .4772386   .0465109    10.26   0.000     .3860789    .5683984
##        _cons |  -.5090219   .0455374   -11.18   0.000    -.5982737   -.4197702
## ------------------------------------------------------------------------------
## -------------------------------------------------------------------------------
## > ---------
##         Variable     Sample |    Treated     Controls   Difference         S.E.
## >    T-stat
## ----------------------------+--------------------------------------------------
## > ---------
##                y  Unmatched |  1.8910736  -.423243358   2.31431696   .109094342
## >     21.21
##                         ATT |  1.8910736   .871388246   1.01968536   .173034999
## >      5.89
## ----------------------------+--------------------------------------------------
## > ---------
## Note: S.E. does not take into account that the propensity score is estimated.
## 
##            | psmatch2:
##  psmatch2: |   Common
##  Treatment |  support
## assignment | On suppor |     Total
## -----------+-----------+----------
##  Untreated |       667 |       667 
##    Treated |       333 |       333 
## -----------+-----------+----------
##      Total |     1,000 |     1,000
## . teffects psmatch (y) (t x1 x2, probit), atet
## 
## Treatment-effects estimation                   Number of obs      =      1,000
## Estimator      : propensity-score matching     Matches: requested =          1
## Outcome model  : matching                                     min =          1
## Treatment model: probit                                       max =          1
## ------------------------------------------------------------------------------
##              |              AI Robust
##            y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## ATET         |
##            t |
##    (1 vs 0)  |   1.019685   .1227801     8.30   0.000     .7790407     1.26033
## ------------------------------------------------------------------------------

Although these results show that the coefficients of the ATE/ATT in two commands are similar, but the standard errors are different. In particular, teffects psmatch give smaller SE than psmatch2, so we should use the teffects psmatch if we consider this ATE/ATT.

4.3 Post-Regression

We should use the psmatch2 command if we need the post-regression between treat and matched, since this command created a new useful variable, called _weight:

The question is why we need _weight here? This is because some control observation is matched to more than one treated observation. For example, let’s see this frequency table:

tab _weight if !t
## . tab _weight if !t
## 
##     _weight |      Freq.     Percent        Cum.
## ------------+-----------------------------------
##           1 |        119       63.64       63.64
##           2 |         33       17.65       81.28
##           3 |         14        7.49       88.77
##           4 |         10        5.35       94.12
##           5 |          7        3.74       97.86
##           6 |          1        0.53       98.40
##           8 |          2        1.07       99.47
##           9 |          1        0.53      100.00
## ------------+-----------------------------------
##       Total |        187      100.00

We have 333 treat observations, but only 187 observations are chosen as controls. Thus, some observations appear more than one times. For example, the observation _id = 651 appears 9 times. We can restrict that no replacement with the option norepl in psmatch2.

In regression, we add [fweight = _weight] to get the weight of the observations.

psmatch2 t x1 x2, out(y)
reg y x1 x2 t [fweight = _weight]
## . reg y x1 x2 t [fweight = _weight]
## 
##       Source |       SS           df       MS      Number of obs   =       666
## -------------+----------------------------------   F(3, 662)       =    426.08
##        Model |  1366.57216         3  455.524054   Prob > F        =    0.0000
##     Residual |   707.74744       662  1.06910489   R-squared       =    0.6588
## -------------+----------------------------------   Adj R-squared   =    0.6573
##        Total |   2074.3196       665  3.11927759   Root MSE        =     1.034
## 
## ------------------------------------------------------------------------------
##            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##           x1 |   1.119345   .0442644    25.29   0.000     1.032429     1.20626
##           x2 |   1.098903   .0427949    25.68   0.000     1.014873    1.182933
##            t |   1.020328   .0801565    12.73   0.000     .8629361    1.177719
##        _cons |  -.0636372   .0631996    -1.01   0.314    -.1877331    .0604586
## ------------------------------------------------------------------------------

5 Some suggestions

  1. If we want to get the correct SE, we should use teffects psmatch
  2. If we want to run post-regression, we can combine both commands
  3. The default model in psmatch2 is probit, while it is logit model in teffects psmatch. This report uses probit model in most of the cases, but you can change the option to get the logit model.