Impact Evaluation - Problem Set 2

#####Reading in and viewing data
library(haven); library(dplyr); library(MatchIt)  # read_dta(); pipes and summaries; matchit()
lalonde_df <- read_dta("lalonde.dta")
psid_df <- read_dta("psid.dta")

#View the structure of the data
str(lalonde_df)

## tibble [445 × 11] (S3: tbl_df/tbl/data.frame)
##  $ treat  : num [1:445] 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ age    : num [1:445] 37 22 30 27 33 22 23 32 22 33 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ educ   : num [1:445] 11 9 12 11 8 9 12 11 16 12 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ black  : num [1:445] 1 0 1 1 1 1 1 1 1 0 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ hisp   : num [1:445] 0 1 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ married: num [1:445] 1 0 0 0 0 0 0 0 0 1 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re74   : num [1:445] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re75   : num [1:445] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re78   : num [1:445] 9930 3596 24910 7506 290 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ u75    : num [1:445] 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ u78    : num [1:445] 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "format.stata")= chr "%9.0g"

head(lalonde_df)

## # A tibble: 6 × 11
##   treat   age  educ black  hisp married  re74  re75   re78   u75   u78
##   <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1     1    37    11     1     0       1     0     0  9930.     1     1
## 2     1    22     9     0     1       0     0     0  3596.     1     1
## 3     1    30    12     1     0       0     0     0 24910.     1     1
## 4     1    27    11     1     0       0     0     0  7506.     1     1
## 5     1    33     8     1     0       0     0     0   290.     1     1
## 6     1    22     9     1     0       0     0     0  4056.     1     1

summary(lalonde_df)

##      treat             age             educ          black       
##  Min.   :0.0000   Min.   :17.00   Min.   : 3.0   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:20.00   1st Qu.: 9.0   1st Qu.:1.0000  
##  Median :0.0000   Median :24.00   Median :10.0   Median :1.0000  
##  Mean   :0.4157   Mean   :25.37   Mean   :10.2   Mean   :0.8337  
##  3rd Qu.:1.0000   3rd Qu.:28.00   3rd Qu.:11.0   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :55.00   Max.   :16.0   Max.   :1.0000  
##       hisp            married            re74              re75      
##  Min.   :0.00000   Min.   :0.0000   Min.   :    0.0   Min.   :    0  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:    0.0   1st Qu.:    0  
##  Median :0.00000   Median :0.0000   Median :    0.0   Median :    0  
##  Mean   :0.08764   Mean   :0.1685   Mean   : 2102.3   Mean   : 1377  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:  824.4   3rd Qu.: 1221  
##  Max.   :1.00000   Max.   :1.0000   Max.   :39570.7   Max.   :25142  
##       re78            u75              u78        
##  Min.   :    0   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:    0   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median : 3702   Median :1.0000   Median :1.0000  
##  Mean   : 5301   Mean   :0.7326   Mean   :0.6494  
##  3rd Qu.: 8125   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :60308   Max.   :1.0000   Max.   :1.0000

#445 observations (185 got treatment, 260 no treatment)

str(psid_df)

## tibble [2,675 × 11] (S3: tbl_df/tbl/data.frame)
##  $ treat: num [1:2675] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "label")= chr "Treatment indicator"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ age  : num [1:2675] 40 54 47 32 47 32 55 29 46 48 ...
##   ..- attr(*, "label")= chr "Age"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ educ : num [1:2675] 8 0 10 8 8 10 8 10 11 6 ...
##   ..- attr(*, "label")= chr "Years of schooling"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ black: num [1:2675] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "label")= chr "Black"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ hisp : num [1:2675] 0 1 0 0 0 0 0 0 0 1 ...
##   ..- attr(*, "label")= chr "Hispanic"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ marr : num [1:2675] 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "label")= chr "Married"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re74 : num [1:2675] 50941 49229 48198 47022 44667 ...
##   ..- attr(*, "label")= chr "1974 real earnings"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re75 : num [1:2675] 55500 44221 47968 67137 33837 ...
##   ..- attr(*, "label")= chr "1975 real earnings"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ u74  : num [1:2675] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "label")= chr "Unemployed in 1974"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ u75  : num [1:2675] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "label")= chr "Unemployed in 1975"
##   ..- attr(*, "format.stata")= chr "%9.0g"
##  $ re78 : num [1:2675] 53198 20540 55710 59109 38569 ...
##   ..- attr(*, "label")= chr "1978 Real earnings"
##   ..- attr(*, "format.stata")= chr "%9.0g"

head(psid_df)

## # A tibble: 6 × 11
##   treat   age  educ black  hisp  marr   re74   re75   u74   u75   re78
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl> <dbl>  <dbl>
## 1     0    40     8     0     0     1 50941. 55500      0     0 53198.
## 2     0    54     0     0     1     1 49228. 44221      0     0 20540.
## 3     0    47    10     0     0     1 48198  47968.     0     0 55710.
## 4     0    32     8     0     0     1 47022. 67137.     0     0 59109.
## 5     0    47     8     0     0     1 44667. 33837.     0     0 38569.
## 6     0    32    10     0     0     1 43104. 39387.     0     0 36943.

summary(psid_df)

##      treat              age             educ           black       
##  Min.   :0.00000   Min.   :17.00   Min.   : 0.00   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:25.00   1st Qu.:10.00   1st Qu.:0.0000  
##  Median :0.00000   Median :32.00   Median :12.00   Median :0.0000  
##  Mean   :0.06916   Mean   :34.23   Mean   :11.99   Mean   :0.2916  
##  3rd Qu.:0.00000   3rd Qu.:43.50   3rd Qu.:14.00   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :55.00   Max.   :17.00   Max.   :1.0000  
##       hisp              marr             re74             re75       
##  Min.   :0.00000   Min.   :0.0000   Min.   :     0   Min.   :     0  
##  1st Qu.:0.00000   1st Qu.:1.0000   1st Qu.:  8817   1st Qu.:  7605  
##  Median :0.00000   Median :1.0000   Median : 17438   Median : 17008  
##  Mean   :0.03439   Mean   :0.8194   Mean   : 18230   Mean   : 17851  
##  3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.: 25471   3rd Qu.: 25584  
##  Max.   :1.00000   Max.   :1.0000   Max.   :137149   Max.   :156653  
##       u74              u75              re78       
##  Min.   :0.0000   Min.   :0.0000   Min.   :     0  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:  9243  
##  Median :0.0000   Median :0.0000   Median : 19432  
##  Mean   :0.1346   Mean   :0.1293   Mean   : 20502  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.: 28816  
##  Max.   :1.0000   Max.   :1.0000   Max.   :121174

#2,675 observations (185 treatment plus full control group)

##Part I

In this part, you are asked to analyze a subset of the data used by Lalonde (1986). The data is available on the files lalonde.dta and psid.dta. The first file (lalonde.dta) includes the true experimental data. Altogether, 445 individuals participated in the experiment: 185 were randomly selected to receive the treatment (participated in a training program), and 260 did not receive treatment. The second file (psid.dta) includes the same 185 observations from the experimental treatment group, plus 2490 observations (the control group) from people who did not participate in the training program. The control group is taken from the PSID, a large representative sample of the US population. The pre-treatment variables are age (in years), educ (education in years), black (1 if black, 0 otherwise), hisp (1 if hispanic, 0 otherwise), marr (1 if married, 0 otherwise), re74 (real yearly earnings for 1974), re75 (real yearly earnings for 1975), u74 (1 if unemployed in 1974 and 0 otherwise), and u75 (1 if unemployed in 1975 and 0 otherwise), the treatment indicator is treat (1 if received training, 0 if not), and the outcome variable re78 (real yearly earnings for 1978).

1. Estimate the average treatment effect by calculating the difference in 1978 earnings between the experimental treatment and control groups (i.e., using file lalonde.dta). What assumption is needed on the selection mechanism and on the potential outcomes for this to be a valid estimator of the treatment effect?

ATE_londe_re78 = mean(lalonde_df$re78[lalonde_df$treat == 1],na.rm = TRUE) - mean(lalonde_df$re78[lalonde_df$treat == 0],na.rm = TRUE) 
print(round(ATE_londe_re78,3))

## [1] 1794.343

The estimated ATE from the experimental data is an increase of about $1,794 in 1978 real earnings for the group that received training relative to the control group. This difference in means is a valid estimator because assignment to treatment was random, so treatment status is independent of potential outcomes and there is no selection bias. In other words, the treatment and control groups should be similar, on average, in both observable and unobservable characteristics. This estimate will serve as our benchmark for the rest of Part I.
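As a minimal sketch, the difference in means can be paired with a standard error so that its precision is visible alongside the point estimate. The helper name `diff_in_means` and the toy vectors are illustrative assumptions and are not part of the problem set.

```r
# Difference in means with an unequal-variance (Welch-type) standard error.
# `y` is the outcome and `d` a 0/1 treatment indicator (illustrative names).
diff_in_means <- function(y, d) {
  y1 <- y[d == 1]
  y0 <- y[d == 0]
  est <- mean(y1) - mean(y0)
  se  <- sqrt(var(y1) / length(y1) + var(y0) / length(y0))
  c(estimate = est, std_error = se, t_stat = est / se)
}

# Toy data for illustration only (not the Lalonde sample)
toy_y <- c(10, 12, 14, 3, 5, 7)
toy_d <- c(1, 1, 1, 0, 0, 0)
toy_result <- diff_in_means(toy_y, toy_d)
print(toy_result)

# With the experimental data this would be called as:
# diff_in_means(lalonde_df$re78, lalonde_df$treat)
```

The point estimate is the same number the simple subtraction above produces; the standard error only adds a sense of how noisy that number is.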

The estimate of the average treatment effect in part (1) will be your benchmark. From now on, you only need to use the non-experimental data, psid.dta.

2. Estimate the average treatment effect by calculating the difference in 1978 earnings between the treatment and the control group. What assumption is needed on the selection mechanism and on the potential outcomes for this to be a valid estimator of the treatment effect? Is this a valid assumption in this context? Why?

ATE_psid_re78 = mean(psid_df$re78[psid_df$treat == 1],na.rm = TRUE) - mean(psid_df$re78[psid_df$treat == 0],na.rm = TRUE) 
round(ATE_psid_re78,3)

## [1] -15204.78

The average treatment effect estimated by this calculation is approximately -$15,205, which is very different from the experimental estimate. For the simple difference in means to be valid with the PSID data, treatment would have to be independent of potential outcomes, meaning there are no differences between the treatment and control groups, observed or unobserved, that also affect earnings. That assumption is not credible here: we know from the lectures that those who received treatment had much lower pre-treatment earnings on average (see the evidence below). The estimate is therefore biased downward relative to the true effect.

psid_df %>%  
  group_by(treat) %>% 
  summarise( avg_re74 = mean(re74, na.rm = TRUE))

## # A tibble: 2 × 2
##   treat avg_re74
##   <dbl>    <dbl>
## 1     0   19429.
## 2     1    2096.

As we can see, average 1974 earnings were about $17,300 lower for the treatment group than for the control group.
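The same comparison can be put on a common scale across covariates with a standardized mean difference. The sketch below scales the gap by the treated-group standard deviation, which is one common convention; the helper name `std_mean_diff` and the toy vectors are illustrative assumptions.

```r
# Standardized mean difference: (treated mean - control mean) / treated SD.
# `x` is a covariate and `d` a 0/1 treatment indicator (illustrative names).
std_mean_diff <- function(x, d) {
  (mean(x[d == 1]) - mean(x[d == 0])) / sd(x[d == 1])
}

# Toy data for illustration only
toy_x <- c(1, 2, 3, 5, 6, 7)
toy_d <- c(1, 1, 1, 0, 0, 0)
toy_smd <- std_mean_diff(toy_x, toy_d)
print(toy_smd)

# With the PSID data, several covariates could be checked at once:
# sapply(psid_df[c("age", "educ", "re74", "re75")], std_mean_diff, d = psid_df$treat)
```

Values far from zero indicate that the two groups differ substantially on that covariate before any adjustment.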

3. Estimate the average treatment effect by running an OLS regression of 1978 earnings on the treatment indicator and all the control variables (e.g., age educ black hisp marr re74 re75 u74 u75). What assumptions are needed on the selection mechanism and on the potential outcomes for this to be a valid estimator of the treatment effect?

#regress real earnings 1978 on all control variables
model_psid_re78 <- psid_df %>% lm(re78~.,.)
summary(model_psid_re78)

## 
## Call:
## lm(formula = re78 ~ ., data = .)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -65427  -4347   -369   3783 111501 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.536e+02  1.371e+03   0.696   0.4866    
## treat        1.154e+02  1.007e+03   0.115   0.9088    
## age         -8.977e+01  2.194e+01  -4.091 4.41e-05 ***
## educ         5.141e+02  7.644e+01   6.726 2.13e-11 ***
## black       -4.542e+02  4.969e+02  -0.914   0.3607    
## hisp         2.197e+03  1.092e+03   2.013   0.0442 *  
## marr         1.205e+03  5.855e+02   2.058   0.0397 *  
## re74         3.126e-01  3.163e-02   9.883  < 2e-16 ***
## re75         5.436e-01  3.090e-02  17.592  < 2e-16 ***
## u74         -1.462e+03  9.472e+02  -1.543   0.1228    
## u75          2.390e+03  1.024e+03   2.333   0.0197 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10060 on 2664 degrees of freedom
## Multiple R-squared:  0.5871, Adjusted R-squared:  0.5855 
## F-statistic: 378.8 on 10 and 2664 DF,  p-value: < 2.2e-16

The coefficient on treat implies a treatment effect of about $115, though it is not statistically significant in this model. This estimate is well below our benchmark ATE from question 1. For the OLS estimate to be valid we need conditional independence: after controlling for the covariates, treatment must be uncorrelated with the error term, and the linear specification must be correct. We have reason to believe this fails here, because the treated started from much lower earnings than the controls, so treatment is likely correlated with unobserved determinants of 1978 earnings. In question 4 we address this by using earnings growth as the dependent variable.
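The assumptions behind this regression can be stated compactly. The following is a standard textbook formulation, not notation from the problem set: $Y_{0i}$ and $Y_{1i}$ are potential 1978 earnings without and with training, $D_i$ is treat, $X_i$ collects the controls, and $\tau$ is an assumed constant treatment effect.

$$
(Y_{0i}, Y_{1i}) \perp D_i \mid X_i
$$

$$
re78_i = \alpha + \tau D_i + X_i'\beta + u_i, \qquad E[u_i \mid D_i, X_i] = 0
$$

The first line is the selection-on-observables condition; the second adds the linear functional form that OLS relies on. If either fails, the coefficient on treat need not equal the treatment effect.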

4. Estimate the average treatment effect using the simple difference in wages between 1978 and 1975: $re78 - re75 = \gamma_1 + \gamma_2\, treated + u$. What assumptions are needed for this estimator to be a consistent estimator of the average treatment effect?

#create new variable for difference between 78 and 75 earnings in real terms
psid_diff_df <- psid_df %>%  mutate(diff_78_75 = re78 - re75)

#Regress new variable for earning growth against treatment
model_psid_diff_1 <- psid_diff_df %>% lm(diff_78_75~treat,.)
summary(model_psid_diff_1)

## 
## Call:
## lm(formula = diff_78_75 ~ treat, data = .)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -73924  -3911   -956   3888 118683 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2490.6      214.0  11.637  < 2e-16 ***
## treat         2326.5      813.9   2.859  0.00429 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10680 on 2673 degrees of freedom
## Multiple R-squared:  0.003048,   Adjusted R-squared:  0.002675 
## F-statistic: 8.172 on 1 and 2673 DF,  p-value: 0.004288

#ATE mean difference calculation same as OLS
ATE_psid_redif = mean(psid_diff_df$diff_78_75[psid_df$treat == 1],na.rm = TRUE) - mean(psid_diff_df$diff_78_75[psid_df$treat == 0],na.rm = TRUE) 
round(ATE_psid_redif,3)

## [1] 2326.505

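The ATE estimate from regressing earnings growth between 1975 and 1978 on treatment is about $2,327, which is statistically significant and much closer to the experimental result of $1,794. For this estimator to be consistent, earnings in the two groups would have had to grow by the same amount in the absence of training (parallel trends). The estimate is likely overstated, though, because confounding factors that affect earnings growth are not included as covariates.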
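To make the identifying assumption explicit, the regression coefficient can be written as a difference in differences of group means. The notation is a standard formulation rather than something given in the problem set: subscripts $T$ and $C$ denote the treated and the PSID control group, and $Y^{78}_{0}$ is 1978 earnings in the absence of training.

$$
\hat{\gamma}_2 = \left(\overline{re78}_T - \overline{re75}_T\right) - \left(\overline{re78}_C - \overline{re75}_C\right)
$$

$$
E\left[Y^{78}_{0} - re75 \mid treated = 1\right] = E\left[Y^{78}_{0} - re75 \mid treated = 0\right]
$$

Under the second condition, $\gamma_2$ identifies the average effect of training on the treated. Using the rounded group means reported in the Part II summary table, the first expression gives approximately $(6349 - 1532) - (21554 - 19063) = 4817 - 2491 = 2326$, in line with the regression output.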

5. Estimate the average treatment effect using the regression-adjusted differences in wage growth between 1978 and 1975: $re78 - re75 = \gamma_1 + \gamma_2\, treated + \delta' X + u$.

Note that you do not need to control for re75 in X, as it’s already included on the left-hand side.

#regress real earnings difference between 1975 and 1978 on all control variables except re 1975
model_psid_diff_2 <- psid_diff_df %>% lm(diff_78_75~treat + age + educ + black + hisp + marr + re74 +u74 + u75,.)
summary(model_psid_diff_2)

## 
## Call:
## lm(formula = diff_78_75 ~ treat + age + educ + black + hisp + 
##     marr + re74 + u74 + u75, data = .)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -69217  -4434   -405   3939 114605 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.596e+03  1.425e+03   1.120  0.26274    
## treat        1.029e+03  1.045e+03   0.985  0.32488    
## age         -1.111e+02  2.277e+01  -4.879 1.13e-06 ***
## educ         3.791e+02  7.892e+01   4.803 1.65e-06 ***
## black       -1.763e+02  5.163e+02  -0.341  0.73280    
## hisp         2.803e+03  1.134e+03   2.470  0.01356 *  
## marr         1.091e+03  6.088e+02   1.792  0.07322 .  
## re74        -5.531e-02  2.026e-02  -2.729  0.00639 ** 
## u74          4.835e+03  8.795e+02   5.498 4.21e-08 ***
## u75         -2.643e+03  1.005e+03  -2.631  0.00857 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10470 on 2665 degrees of freedom
## Multiple R-squared:  0.04552,    Adjusted R-squared:  0.0423 
## F-statistic: 14.12 on 9 and 2665 DF,  p-value: < 2.2e-16

The estimated treatment effect is no longer significant in this model. The coefficient on treat is $1,029, a little over half of the benchmark estimate. Note, however, that if we restrict the covariates to age, education, black, and hisp, the resulting coefficient ($1,791) is almost identical to the experimental result.

model_psid_diff_3 <- psid_diff_df %>% lm(diff_78_75~treat + age + educ + black + hisp,.)
summary(model_psid_diff_3)

## 
## Call:
## lm(formula = diff_78_75 ~ treat + age + educ + black + hisp, 
##     data = .)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -72223  -3997   -516   3703 118912 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2850.96    1353.14   2.107  0.03522 *  
## treat        1791.19     875.04   2.047  0.04076 *  
## age          -112.53      20.61  -5.460 5.21e-08 ***
## educ          287.69      74.26   3.874  0.00011 ***
## black         -89.39     512.63  -0.174  0.86159    
## hisp         3011.76    1143.63   2.633  0.00850 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10560 on 2669 degrees of freedom
## Multiple R-squared:  0.02734,    Adjusted R-squared:  0.02551 
## F-statistic:    15 on 5 and 2669 DF,  p-value: 1.493e-14

##Part II Matching

In this part, you are asked to analyze a subset of the data used by Lalonde (1986) as you examined in Part I, and subsequently reanalyzed by Dehejia and Wahba (1999). The data (psid.dta) includes the same 185 observations from the experimental treatment group, plus 2490 observations (the control group) from people who did not participate in the training program. The control group is taken from the PSID, a large representative sample of the US population. The pre-treatment variables are age (in years), educ (education in years), black (1 if black, 0 otherwise), hisp (1 if hispanic, 0 otherwise), marr (1 if married, 0 otherwise), re74 (real yearly earnings for 1974), re75 (real yearly earnings for 1975), u74 (1 if unemployed in 1974 and 0 otherwise), and u75 (1 if unemployed in 1975 and 0 otherwise), the treatment indicator is treat (1 if received training, 0 if not), and the outcome variable re78 (real yearly earnings for 1978).

1. Now use pscore command in STATA to estimate the probability of treatment (the propensity score) as a function of all the pre-treatment variables (e.g., age educ black hisp marr re75 re74 u74 u75), without the comsup (common support) option. Is the balancing property for the propensity score satisfied?

probit_psid <- psid_df %>% glm(treat ~ age + educ + black + hisp + marr + re74 + u74 + u75,., family = binomial(link = "probit"))

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

summary(probit_psid)

## 
## Call:
## glm(formula = treat ~ age + educ + black + hisp + marr + re74 + 
##     u74 + u75, family = binomial(link = "probit"), data = .)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4807  -0.1114  -0.0191  -0.0018   3.8567  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  6.924e-01  4.856e-01   1.426  0.15395    
## age         -6.232e-02  8.858e-03  -7.036 1.98e-12 ***
## educ        -7.259e-02  2.789e-02  -2.602  0.00926 ** 
## black        1.383e+00  1.710e-01   8.085 6.19e-16 ***
## hisp         1.267e+00  2.979e-01   4.253 2.11e-05 ***
## marr        -9.490e-01  1.349e-01  -7.036 1.98e-12 ***
## re74        -3.850e-05  1.349e-05  -2.854  0.00432 ** 
## u74          1.313e-01  2.020e-01   0.650  0.51589    
## u75          1.498e+00  2.375e-01   6.310 2.80e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1345.30  on 2674  degrees of freedom
## Residual deviance:  471.77  on 2666  degrees of freedom
## AIC: 489.77
## 
## Number of Fisher Scoring iterations: 9

No, the balancing property is not satisfied: the treatment and control groups still differ substantially in observable characteristics, as the group means below show.

pscores_psid <- probit_psid %>% predict.glm(.,type = "response")
pscores_df <- cbind(psid_df,pscores_psid)


pscores_df %>%  
  group_by(treat) %>% 
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)))

## # A tibble: 2 × 12
##   treat   age  educ black   hisp  marr   re74   re75   u74    u75   re78 pscor…¹
##   <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>  <dbl>  <dbl> <dbl>  <dbl>  <dbl>   <dbl>
## 1     0  34.9  12.1 0.251 0.0325 0.866 19429. 19063.   0.1 0.0863 21554.  0.0271
## 2     1  25.8  10.3 0.843 0.0595 0.189  2096.  1532.   0.6 0.708   6349.  0.632 
## # … with abbreviated variable name ¹pscores_psid
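A balance check that is loosely in the spirit of a block-wise test compares treated and control covariate means within strata of the propensity score rather than over the whole sample. The sketch below is an assumption-laden illustration: the helper name `block_balance`, the use of equal-sized quantile blocks, and the toy data are all hypothetical choices, and the function assumes the scores are not all identical.

```r
# Within-block balance: cut the propensity score into quantile blocks and
# compare treated vs. control means of a covariate inside each block.
block_balance <- function(x, d, pscore, n_blocks = 5) {
  probs  <- seq(0, 1, length.out = n_blocks + 1)
  breaks <- unique(quantile(pscore, probs = probs))
  block  <- cut(pscore, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  rows <- lapply(sort(unique(block)), function(b) {
    in_b <- block == b
    data.frame(
      block     = b,
      n_treated = sum(d[in_b] == 1),
      n_control = sum(d[in_b] == 0),
      # NaN if a block contains no treated or no control observations
      mean_diff = mean(x[in_b & d == 1]) - mean(x[in_b & d == 0])
    )
  })
  do.call(rbind, rows)
}

# Toy data for illustration only
toy_pscore <- (1:20) / 21
toy_d <- rep(c(0, 1), times = 10)
toy_x <- 10 * toy_pscore
toy_balance <- block_balance(toy_x, toy_d, toy_pscore, n_blocks = 2)
print(toy_balance)

# With the estimated scores this could be called as, for example:
# block_balance(pscores_df$age, pscores_df$treat, pscores_df$pscores_psid)
```

Blocks with few or no treated observations are themselves informative, since they show where the two groups barely overlap.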

2. Reestimate the propensity score using the comsup (common support) option. In R, you can construct the comsup variable, which is defined to be the observations including all treated plus those controls in the region of common support (this is the STATA default). Alternatively, you may also consider only overlapping regions. The R counterpart of kdensity in Stata is density(). Ignore blockid for now.

Comment on the differences between the two distributions and on what this implies for your estimate of the average treatment effect estimated in Part I above, where you estimated the average treatment effect by calculating the difference in 1978 earnings between the treatment and the control group.

#range of propensity scores among the treated (region of common support)
comsup_range <- range(pscores_df$pscores_psid[pscores_df$treat == 1])
comsup_range

## [1] 0.0005889806 0.9841137182

#create indicator for whether each observation falls in that range
pscores_df$comsup <- ifelse((pscores_df$pscores_psid >= comsup_range[1] & pscores_df$pscores_psid <= comsup_range[2]),1,0)
summary(pscores_df)

##      treat              age             educ           black       
##  Min.   :0.00000   Min.   :17.00   Min.   : 0.00   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:25.00   1st Qu.:10.00   1st Qu.:0.0000  
##  Median :0.00000   Median :32.00   Median :12.00   Median :0.0000  
##  Mean   :0.06916   Mean   :34.23   Mean   :11.99   Mean   :0.2916  
##  3rd Qu.:0.00000   3rd Qu.:43.50   3rd Qu.:14.00   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :55.00   Max.   :17.00   Max.   :1.0000  
##       hisp              marr             re74             re75       
##  Min.   :0.00000   Min.   :0.0000   Min.   :     0   Min.   :     0  
##  1st Qu.:0.00000   1st Qu.:1.0000   1st Qu.:  8817   1st Qu.:  7605  
##  Median :0.00000   Median :1.0000   Median : 17438   Median : 17008  
##  Mean   :0.03439   Mean   :0.8194   Mean   : 18230   Mean   : 17851  
##  3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.: 25471   3rd Qu.: 25584  
##  Max.   :1.00000   Max.   :1.0000   Max.   :137149   Max.   :156653  
##       u74              u75              re78         pscores_psid      
##  Min.   :0.0000   Min.   :0.0000   Min.   :     0   Min.   :0.0000000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:  9243   1st Qu.:0.0000079  
##  Median :0.0000   Median :0.0000   Median : 19432   Median :0.0004575  
##  Mean   :0.1346   Mean   :0.1293   Mean   : 20502   Mean   :0.0689564  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.: 28816   3rd Qu.:0.0186811  
##  Max.   :1.0000   Max.   :1.0000   Max.   :121174   Max.   :0.9841137  
##      comsup      
##  Min.   :0.0000  
##  1st Qu.:0.0000  
##  Median :0.0000  
##  Mean   :0.4819  
##  3rd Qu.:1.0000  
##  Max.   :1.0000

#compare observations inside and outside the region of common support
pscores_df %>%  
  group_by(comsup) %>% 
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)))

## # A tibble: 2 × 13
##   comsup treat   age  educ  black   hisp  marr   re74   re75    u74   u75   re78
##    <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>  <dbl>  <dbl>  <dbl> <dbl>  <dbl>
## 1      0 0      37.8  12.8 0.0519 0.0159 0.964 26350. 25182. 0.0303 0     27261.
## 2      1 0.144  30.4  11.1 0.549  0.0543 0.664  9499.  9968. 0.247  0.268 13236.
## # … with 1 more variable: pscores_psid <dbl>

ATE_pscores_df = mean(pscores_df$re78[pscores_df$comsup == 1],na.rm = TRUE) - mean(pscores_df$re78[pscores_df$comsup == 0],na.rm = TRUE) 
round(ATE_pscores_df,3)

## [1] -14024.85

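The two distributions (control and treatment) show somewhat more balance in observable characteristics once we restrict to the region of common support, but the comparison still appears plagued by bias: most PSID controls have propensity scores near zero, while the treated are spread across almost the whole range. Note that the difference computed above compares mean 1978 earnings for observations inside versus outside the common support, not treated versus control. At about -$14,025 it is very close to the simple difference in means from Part I, which suggests that the Part I estimate largely reflects a comparison of trainees with dissimilar individuals rather than the effect of training.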
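A treated-versus-control comparison restricted to the common support can be sketched as follows. This is a hedged illustration rather than the required answer: the helper name `diff_on_support` and the toy data are hypothetical, and common support is taken to be the range of scores among the treated, mirroring the comsup definition used above.

```r
# Treated-minus-control difference in an outcome, keeping only observations
# whose propensity score lies within the range observed among the treated.
diff_on_support <- function(y, d, pscore) {
  support <- range(pscore[d == 1])
  keep <- pscore >= support[1] & pscore <= support[2]
  mean(y[keep & d == 1]) - mean(y[keep & d == 0])
}

# Toy data for illustration only: two controls fall outside the treated range
toy_pscore <- c(0.01, 0.02, 0.30, 0.50, 0.40, 0.60)
toy_d      <- c(0, 0, 0, 0, 1, 1)
toy_y      <- c(100, 90, 20, 30, 40, 50)
toy_diff <- diff_on_support(toy_y, toy_d, toy_pscore)
print(toy_diff)

# With the estimated scores:
# diff_on_support(pscores_df$re78, pscores_df$treat, pscores_df$pscores_psid)
#
# The two score distributions can be compared visually with density():
# plot(density(pscores_df$pscores_psid[pscores_df$treat == 1]))
# lines(density(pscores_df$pscores_psid[pscores_df$treat == 0]), lty = 2)
```

In the toy data the unrestricted difference is negative while the restricted one is positive, which illustrates how controls far outside the treated range can drive a naive comparison.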

3a) Estimate the average treatment effect for the treated using nearest neighbor matching based on the propensity score.

matchit_psid <- matchit(treat ~ age + educ + black + hisp + marr + re74 + u74 + u75 + re78, psid_df, link = "probit", replace = TRUE, method = "nearest")
summary(matchit_psid)

## 
## Call:
## matchit(formula = treat ~ age + educ + black + hisp + marr + 
##     re74 + u74 + u75 + re78, data = psid_df, method = "nearest", 
##     link = "probit", replace = TRUE)
## 
## Summary of Balance for All Data:
##          Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
## distance        0.6307        0.0269          1.9247    11.7301    0.4818
## age            25.8162       34.8506         -1.2627     0.4696    0.2317
## educ           10.3459       12.1169         -0.8808     0.4255    0.1091
## black           0.8432        0.2506          1.6301          .    0.5926
## hisp            0.0595        0.0325          0.1139          .    0.0269
## marr            0.1892        0.8663         -1.7287          .    0.6771
## re74         2095.5740    19428.7453         -3.5471     0.1329    0.4681
## u74             0.6000        0.1000          1.0206          .    0.5000
## u75             0.7081        0.0863          1.3676          .    0.6218
## re78         6349.1454    21553.9213         -1.9326     0.2558    0.3521
##          eCDF Max
## distance   0.8636
## age        0.3771
## educ       0.4029
## black      0.5926
## hisp       0.0269
## marr       0.6771
## re74       0.7292
## u74        0.5000
## u75        0.6218
## re78       0.5915
## 
## Summary of Balance for Matched Data:
##          Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
## distance        0.6307        0.6310         -0.0009     0.9396    0.0016
## age            25.8162       23.7405          0.2901     0.9642    0.0843
## educ           10.3459       10.5892         -0.1210     0.4869    0.0550
## black           0.8432        0.7622          0.2230          .    0.0811
## hisp            0.0595        0.1676         -0.4572          .    0.1081
## marr            0.1892        0.1622          0.0690          .    0.0270
## re74         2095.5740     2500.8936         -0.0829     1.1251    0.0174
## u74             0.6000        0.3405          0.5296          .    0.2595
## u75             0.7081        0.6000          0.2378          .    0.1081
## re78         6349.1454     4831.2552          0.1929     1.4236    0.0405
##          eCDF Max Std. Pair Dist.
## distance   0.1189          0.0373
## age        0.3514          0.9685
## educ       0.2054          1.2178
## black      0.0811          0.8474
## hisp       0.1081          0.8229
## marr       0.0270          0.4554
## re74       0.1351          0.5350
## u74        0.2595          0.9930
## u75        0.1081          0.5707
## re78       0.2378          0.8871
## 
## Sample Sizes:
##               Control Treated
## All           2490.       185
## Matched (ESS)   18.02     185
## Matched         64.       185
## Unmatched     2426.         0
## Discarded        0.         0

#Find the effect on the treated from the matched means of re78 in the balance table above
Mean_treat_matchit = 6349.1454
Mean_control_matchit = 4831.2552
ATE_matchit_psid = Mean_treat_matchit - Mean_control_matchit
ATE_matchit_psid

## [1] 1517.89

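Nearest-neighbor matching gives an estimated treatment effect on the treated of $1,517.89, within about $280 of the benchmark. One caveat: the matching formula above includes re78, the outcome, which is why its matched means appear in the balance table. A propensity score should be built from pre-treatment variables only, so this figure should be read with caution.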
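For intuition, nearest-neighbor matching with replacement can also be written out by hand. The sketch below is a minimal base-R illustration, not the MatchIt implementation: the helper name `nn_match_att` and the toy data are hypothetical, and the propensity score passed in is assumed to come from a model that uses pre-treatment covariates only.

```r
# Nearest-neighbor matching on the propensity score, with replacement.
# For each treated unit, pick the control with the closest score and
# average the outcome gaps to estimate the effect on the treated.
nn_match_att <- function(y, d, pscore) {
  treated  <- which(d == 1)
  controls <- which(d == 0)
  matched <- vapply(treated, function(i) {
    controls[which.min(abs(pscore[controls] - pscore[i]))]
  }, integer(1))
  mean(y[treated] - y[matched])
}

# Toy data for illustration only
toy_pscore <- c(0.80, 0.60, 0.79, 0.10, 0.58)
toy_d      <- c(1, 1, 0, 0, 0)
toy_y      <- c(10, 8, 7, 20, 6)
toy_att <- nn_match_att(toy_y, toy_d, toy_pscore)
print(toy_att)

# With the probit scores estimated earlier:
# nn_match_att(pscores_df$re78, pscores_df$treat, pscores_df$pscores_psid)
```

Because matching is done with replacement, the same control can serve as the match for several treated units, which is consistent with the small number of distinct matched controls in the output above.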

4. Which estimation methods, 3a) or 3b), do a good job of recovering the experimental average treatment effect and why?

I am missing the response to 3b, but I would expect kernel matching to produce a better estimate than nearest-neighbor matching. Because the PSID data has such a large control group, it should be more accurate to compare each treated observation with a weighted average of the controls within a range of propensity scores rather than with a single observation, as is done in 3a.
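Since 3b is not answered here, the following is only a sketch of what a kernel matching estimator does, under stated assumptions: a Gaussian kernel, an illustrative bandwidth of 0.06, and the hypothetical helper name `kernel_match_att`. It is not output from the problem set and makes no claim about what the real data would show.

```r
# Kernel matching on the propensity score: each treated unit is compared with
# a weighted average of all controls, with weights that decline as the
# control's score moves away from the treated unit's score.
kernel_match_att <- function(y, d, pscore, bandwidth = 0.06) {
  treated  <- which(d == 1)
  controls <- which(d == 0)
  gaps <- vapply(treated, function(i) {
    w <- dnorm((pscore[controls] - pscore[i]) / bandwidth)  # Gaussian weights
    y[i] - sum(w * y[controls]) / sum(w)
  }, numeric(1))
  mean(gaps)
}

# Toy data for illustration only: every control has the same outcome,
# so the weighted control average is 5 regardless of the weights.
toy_pscore <- c(0.80, 0.60, 0.79, 0.10, 0.58)
toy_d      <- c(1, 1, 0, 0, 0)
toy_y      <- c(10, 8, 5, 5, 5)
toy_kernel_att <- kernel_match_att(toy_y, toy_d, toy_pscore)
print(toy_kernel_att)

# With the probit scores estimated earlier:
# kernel_match_att(pscores_df$re78, pscores_df$treat, pscores_df$pscores_psid)
```

Compared with nearest-neighbor matching, this uses information from many controls per treated unit, which tends to lower variance at the cost of some bias when distant controls receive non-negligible weight; the bandwidth governs that trade-off.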

7. Evaluate the two econometricians’ claims in light of the results in Part II Questions 3 through 6.

Econometrician A: “Propensity score matching methods are remarkably effective in recovering the average treatment effect, and should become a standard element in the econometrician’s toolbox for the analysis and estimation of causal effects. However, one should take care that all important covariates are included in the control set. Propensity score matching methods can be badly misleading if important variables are omitted from the control set.”

Econometrician B: “The main importance of the propensity score is that it gives the researcher a useful tool for constructing comparison groups that are similar to the treatment group. Once such a comparison group has been constructed, most econometric estimators are equally effective in recovering the treatment effect.”

I would be inclined to agree with Econometrician B, as both matching and OLS are prone to omitted variable bias. Matching is useful as a robustness check and can help avoid the extrapolation to extreme ranges that OLS relies on, but once a similar comparison group has been constructed, the estimated average treatment effect will tend towards the same result.

3. Evaluations of Articles

1. Case et al. (2005)

The Case et al. study explores how childhood health and circumstances affect long-term health and socio-economic status. Based on the data used, I’d like to start with a note on external validity. The sample consists of individuals born in a specific birth month (March), and some research suggests that the month of birth may itself have an impact on health. It would be better if the sample were spread more evenly across birth months. The study also limited its consideration of employment and socio-economic status to men because half of the women were not working at age 42. I would assume that men are more likely to be engaged in physically demanding work, where poor health would be more detrimental to earnings; the study may therefore overstate the effect of childhood health on status later in life, especially when applied to women.

On the internal validity question, the study alludes to the fact that other control factors are likely missing. For example, location and relative air quality could be a cause of chronic respiratory disease, and a family history of chronic or genetic conditions could help explain health status. Omitting either would lead to an overstatement of the causal effect in the findings. Self-reporting is also a key feature of the underlying data. Is it possible that mothers who smoked during pregnancy had a propensity to underreport it, so that the true share of heavy smokers was greater than 12%? In that case there would be non-classical measurement error, with the error increasing the more a mother smoked. Unfortunately, it’s unclear whether this leads to overstated or understated results.

I don’t know how to classify this last point, but I notice that the number of observations remains static at 5,439 across the longitudinal results. Are deaths accounted for as a health outcome? Or were those individuals removed from the analysis for missing values? If the latter, I wonder if we are missing some key observations.

2. Gilligan and Sergenti (2007)

The statistical study by Gilligan and Sergenti assesses the causal case for UN peacekeeping efforts using matching, which represents a methodological contribution to the question. Previous research suffered from bias because of the non-random assignment of peacekeeping missions. Gilligan and Sergenti assert that no single model can predict counterfactuals wherein the UN is present without influence. Rather, they find control cases that best match (with replacement) the cases where the UN has intervened. The advantages of this method are transparency, the use of actual rather than extrapolated cases, and a larger n than in most studies.

With the above advantages noted, there are some immediate internal validity concerns. The authors note plainly that they have no way of controlling for unobserved variables that might affect both war outcomes and the likelihood of UN intervention, as one could do with a Heckman model. Despite their confidence in their abilities as political scientists, there are likely confounding variables not captured. For example, religion may strengthen commitment to factions or the valuation of peace. I also notice a lack of consideration of economic factors, besides a note on why growth after conflict was not usable. I would suspect that countries facing economic hardship are more inclined to fight. I would also suspect that natural resource endowments within a country may lead to greater conflict, like oil in Middle Eastern countries. These unobserved variables could bias the estimates based on the covariates used.

I also suspect this study could be prone to classical measurement error, as it is likely very difficult to precisely measure peacetime in months during fluid outbreaks of violence, especially considering that much of this fighting could be happening in remote and under-observed areas. I would further think it a mistake to treat UN intervention as a binary selection. It would be more accurate to measure intervention by level of engagement, monetary commitment, and/or consensus among voting members rather than expecting all missions to be equal.

As a final note, the authors admit that the UN is less likely to intervene in conflicts involving large and powerful countries. The leverage just isn’t there, which means a few things. First, even if a causal effect is shown for UN intervention, it would likely apply only to a certain subset of the world’s countries, so the result cannot be extrapolated globally. Further, it could mean that the effect is overstated due to unobserved heterogeneity.


Thomas Laffey

2023-05-30
