Q1

1

Describing the data

For each person, the data include years of schooling, birth cohort, birth region, and log wage, together with data about the person's region: the average number of schools, whether the program was intense in the region, and the number of children that went to school in 1971.

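library(haven)       # read_dta()
library(dplyr)       # %>% and count()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

path <- "C:/Users/dorgo/Documents/R/Indo_Schooling.dta"
data1 <- read_dta(path)
data1$after <- data1$birth_year > 62  # before/after indicator: TRUE for cohorts born after 62

# one row per region with its region-level variables
regions <- unique(data1[c("birth_region", "num_schools", "program_intensity", "children71")])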

The size of each cohort is:

data1 %>% 
  count(birth_year) %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = F) 
birth_year n
50 2003
51 1455
52 1920
53 2118
54 2102
55 2576
56 2140
57 2369
58 2604
59 2616
60 3536
61 2388
62 2875
68 3114
69 3072
70 3473
71 2529
72 2734

The size of each birth region is:

data1 %>% 
  count(birth_region)%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = F)
birth_region n
1101 109
1102 52
1103 110
1104 41
1105 87
1106 96
1107 150
1108 130
1171 122
1172 52
1201 61
1202 165
1203 83
1204 256
1205 153
1206 248
1207 271
1208 71
1209 72
1210 280
1211 211
1271 112
1272 110
1273 187
1274 130
1275 612
1276 135
1301 131
1302 120
1303 61
1304 127
1305 173
1306 115
1307 136
1308 74
1371 342
1372 54
1373 99
1374 56
1375 120
1376 93
1401 70
1402 51
1403 213
1404 90
1405 176
1471 104
1472 35
1501 96
1502 62
1503 71
1504 69
1505 40
1571 114
1601 85
1602 96
1603 108
1604 112
1605 74
1606 65
1607 200
1608 126
1671 296
1672 127
1701 97
1702 69
1703 24
1771 59
1801 128
1802 87
1803 56
1804 27
1871 152
3171 821
3172 863
3173 786
3174 808
3175 430
3201 161
3202 125
3203 319
3204 276
3205 234
3206 372
3207 291
3208 261
3209 242
3210 198
3211 351
3212 187
3213 205
3214 234
3215 238
3216 135
3217 221
3218 104
3219 194
3220 247
3271 197
3272 139
3273 406
3274 161
3275 99
3301 299
3302 374
3303 186
3304 165
3305 362
3306 337
3307 187
3308 278
3309 234
3310 390
3311 182
3312 320
3313 173
3314 190
3315 188
3316 142
3317 148
3318 295
3319 283
3320 253
3321 196
3322 286
3323 146
3324 217
3325 210
3326 252
3327 236
3328 267
3329 286
3371 203
3372 324
3373 142
3374 386
3375 155
3376 196
3401 126
3402 243
3403 186
3404 261
3471 300
3501 157
3502 222
3503 134
3504 230
3505 220
3506 349
3507 363
3508 213
3509 298
3510 321
3511 123
3512 128
3513 216
3514 242
3515 273
3516 272
3517 301
3518 250
3519 236
3520 184
3521 185
3522 149
3523 149
3524 194
3525 165
3526 122
3527 103
3528 89
3529 131
3571 176
3572 110
3573 185
3574 145
3575 130
3576 140
3577 131
3578 463
5101 94
5102 198
5103 185
5104 210
5105 117
5106 117
5107 100
5108 183
5171 135
5201 116
5202 117
5203 144
5204 73
5205 69
5206 166
5271 102
5301 45
5302 38
5303 108
5304 58
5305 51
5306 76
5307 85
5308 110
5309 45
5310 102
5311 69
5312 74
6101 101
6102 168
6103 75
6104 59
6105 66
6106 62
6171 148
6201 56
6202 54
6203 68
6204 77
6205 42
6271 28
6301 45
6302 72
6303 78
6304 36
6305 64
6306 98
6307 85
6308 71
6309 78
6371 171
6401 38
6402 101
6403 71
6404 85
6471 118
6472 82
7101 142
7102 67
7103 241
7104 183
7171 132
7172 165
7173 38
7201 73
7202 115
7203 152
7204 114
7301 57
7302 93
7303 39
7304 85
7305 62
7306 109
7307 73
7308 76
7309 82
7310 66
7311 132
7312 84
7313 64
7314 84
7315 63
7316 62
7317 93
7318 144
7319 85
7320 62
7321 31
7371 360
7372 81
7401 142
7402 77
7403 74
7404 30
8101 111
8102 131
8103 117
8104 68
8171 164
8201 53
8202 39
8203 82
8204 35
8205 36
8206 50
8207 43
8208 78
8209 93
8271 14

Summary of the data:

summary(data1)%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover")) 
education birth_year birth_region log_wage num_schools program_intensity children71 after
Min. : 0.000 Min. :50.00 Min. :1101 Min. : 9.21 Min. :0.5908 Min. :0.0000 Min. : 3796 Mode :logical
1st Qu.: 6.000 1st Qu.:55.00 1st Qu.:3173 1st Qu.:11.74 1st Qu.:1.3171 1st Qu.:0.0000 1st Qu.: 63580 FALSE:30702
Median : 9.000 Median :60.00 Median :3319 Median :12.18 Median :1.7603 Median :0.0000 Median :159434 TRUE :14922
Mean : 9.347 Mean :60.96 Mean :3670 Mean :12.12 Mean :2.0262 Mean :0.4156 Mean :162622 NA
3rd Qu.:12.000 3rd Qu.:69.00 3rd Qu.:3573 3rd Qu.:12.58 3rd Qu.:2.3986 3rd Qu.:1.0000 3rd Qu.:221623 NA
Max. :19.000 Max. :72.00 Max. :8271 Max. :16.15 Max. :8.5983 Max. :1.0000 Max. :542835 NA

The average education level in the sample is 9.3471857 years.

Number of schools in each region is:

regions[c("birth_region", "num_schools")]%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"),full_width = F)
birth_region num_schools
1101 2.7295361
1102 2.6737239
1103 2.3670490
1104 2.0621350
1105 2.4550869
1106 2.3862851
1107 2.6452501
1108 2.4007659
1171 4.2678442
1172 5.0011368
1201 2.8798370
1202 2.1565940
1203 6.3137689
1204 1.6434790
1205 2.5201330
1206 2.3662519
1207 2.3666439
1208 2.7791309
1209 2.8007390
1210 1.7145150
1211 1.6756830
1271 4.8361540
1272 3.3309050
1273 2.7290239
1274 7.7355838
1275 1.1905100
1276 5.1110229
1301 1.5348140
1302 1.4699860
1303 1.7952110
1304 1.3995460
1305 1.3971270
1306 1.4346910
1307 1.8653700
1308 0.7795604
1371 1.1695690
1372 2.3205221
1373 4.2149630
1374 2.4052920
1375 1.3021600
1376 1.3560090
1401 2.3033280
1402 2.6923530
1403 2.1274519
1404 2.2996349
1405 1.8801950
1471 1.2654840
1472 2.1274519
1501 2.5409589
1502 3.4376070
1503 3.5463350
1504 2.3805931
1505 3.7855811
1571 1.8936890
1601 2.2052779
1602 1.9399470
1603 1.9793310
1604 1.7560660
1605 2.3951440
1606 2.0425949
1607 1.8233089
1608 2.0670049
1671 1.1948800
1672 1.6418860
1701 2.2461729
1702 2.4714079
1703 2.6943281
1771 2.8562009
1801 2.5934949
1802 2.4646139
1803 3.0068729
1804 3.0068729
1871 2.8613350
3171 1.0884160
3172 1.0172660
3173 1.0067750
3174 1.1077690
3175 1.0343610
3201 2.1341860
3202 3.0801890
3203 1.5842730
3204 2.1243589
3205 1.7952410
3206 1.4221630
3207 1.7423950
3208 1.3101290
3209 1.3752080
3210 1.5409280
3211 1.6889070
3212 1.7470860
3213 1.7089300
3214 3.0490570
3215 2.0846210
3216 2.3832610
3217 1.8563091
3218 2.2266030
3219 1.6635849
3220 2.0320580
3271 2.5647359
3272 3.7019899
3273 0.6890698
3274 2.4293089
3275 1.6635849
3301 2.1549530
3302 1.2609030
3303 1.9701350
3304 2.1974881
3305 1.7169050
3306 1.5067199
3307 1.9186161
3308 1.6306280
3309 1.6773560
3310 1.1128130
3311 1.5905020
3312 1.5373360
3313 1.6841180
3314 1.7702270
3315 1.7602950
3316 2.1317220
3317 2.8792040
3318 2.1859889
3319 2.3458500
3320 2.5842540
3321 1.9162300
3322 1.5911850
3323 1.9042790
3324 1.7280720
3325 2.7725649
3326 1.8434210
3327 2.0638101
3328 1.3900610
3329 2.5278530
3371 2.2286930
3372 1.3171149
3373 2.9230270
3374 1.3237309
3375 3.1954820
3376 2.7156489
3401 1.4292470
3402 1.4927810
3403 1.1108890
3404 1.3131150
3471 1.9011170
3501 1.0978611
3502 1.8671300
3503 1.2078190
3504 1.1902070
3505 0.8884923
3506 1.0577960
3507 1.7812400
3508 1.9329630
3509 1.9194790
3510 1.4854010
3511 2.9950581
3512 3.5144720
3513 2.7770450
3514 1.3776720
3515 1.2027540
3516 1.4826070
3517 1.4451070
3518 1.6200269
3519 1.3576649
3520 1.2807170
3521 1.6804140
3522 1.8170160
3523 2.2335050
3524 1.9502480
3525 1.7752399
3526 2.7881260
3527 3.4313951
3528 2.6238761
3529 3.7579989
3571 1.3503670
3572 2.7800300
3573 0.9724348
3574 4.1374002
3575 4.4542098
3576 1.8904819
3577 1.3537910
3578 1.0447520
5101 6.2169509
5102 3.8386741
5103 1.9794390
5104 2.4221449
5105 2.2933781
5106 2.7326550
5107 2.5184560
5108 5.0397038
5171 6.2169509
5201 2.6824999
5202 2.5176351
5203 2.1165099
5204 2.4011450
5205 3.8011401
5206 2.4745700
5271 2.6824999
5301 1.5901910
5302 2.1041999
5303 1.4917210
5304 1.0987900
5305 1.2669050
5306 1.3587980
5307 1.3002290
5308 1.3847899
5309 1.0491490
5310 1.2710381
5311 1.7286550
5312 1.3011520
6101 3.0580201
6102 3.5343959
6103 4.2833261
6104 3.7351811
6105 4.0835981
6106 4.8514628
6171 3.5148201
6201 5.9337578
6202 1.2420820
6203 3.1296151
6204 3.9591200
6205 2.9761910
6271 5.8611360
6301 2.5307620
6302 3.4769270
6303 3.1039579
6304 2.9740570
6305 3.7328010
6306 2.6002550
6307 1.4256949
6308 3.0846801
6309 2.7469950
6371 2.7410600
6401 4.5569620
6402 2.7785671
6403 8.2856102
6404 3.1951880
6471 2.0191040
6472 2.5428770
7101 1.1304560
7102 2.7679579
7103 1.0272530
7104 1.8386230
7171 3.8657529
7172 2.3266089
7173 1.0272530
7201 2.6525199
7202 2.3825841
7203 2.3786600
7204 3.0813611
7301 2.2497699
7302 1.4332870
7303 1.7116520
7304 1.1004590
7305 1.5964080
7306 0.9575184
7307 1.4452670
7308 0.5908243
7309 2.9157190
7310 1.0450490
7311 5.9082479
7312 1.3693269
7313 1.4170830
7314 1.4583380
7315 1.2177920
7316 1.3776170
7317 1.6752900
7318 1.2380700
7319 1.4001040
7320 1.8684980
7321 2.4706609
7371 1.1563500
7372 1.7667850
7401 2.1162281
7402 3.2918561
7403 2.3998170
7404 3.3388979
8101 1.1649840
8102 1.3271520
8103 8.5982695
8104 8.5982695
8171 2.1352310
8201 1.5582010
8202 1.1430660
8203 2.7032320
8204 1.3381300
8205 3.0836079
8206 1.7664779
8207 2.7515249
8208 2.2697790
8209 2.3986039
8271 2.7032320

2

a

beta1 is the effect of one additional year of schooling on an individual's log wage at the time of the test.

b

The assumptions that should hold are: 1. E(schools_year*epsilon)=0, i.e. years of schooling are uncorrelated with the error term; 2. the observations are i.i.d.

c

Assumption 1 probably does not hold. For example, years of schooling are correlated with parents' wages, which are not controlled for and are likely to affect the individual's own wage.

lm_model<-lm(log_wage~ education, data=data1)
lm_model$coefficients
## (Intercept)   education 
## 11.40289321  0.07703306
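
Because the dependent variable is in logs, the education coefficient can be read as an approximate proportional change in wage per extra year of schooling. A minimal sketch of the conversion, using the coefficient printed above (the names `beta1`, `pct_change` and `approx_change` are illustrative and not part of the original analysis):

```r
# Coefficient on education from the regression output above
beta1 <- 0.07703306

# Exact percentage change in wage implied by one more year of education
pct_change <- (exp(beta1) - 1) * 100

# The usual approximation reads the coefficient directly as 100 * beta1 percent
approx_change <- 100 * beta1

round(c(exact = pct_change, approximate = approx_change), 2)
```

Both versions give a return of roughly 8% per year of schooling. Given the doubts about assumption 1, this should be read as an association and not as a causal return.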

3

The average number of schools in low-intensity regions is:

## [1] 1.884218

The average number of schools in high-intensity regions is:

mean(int1$X1.num_schools)
## [1] 2.818086

b

int_levels<- split(data1, data1$program_intensity)
low_int<-as.data.frame(int_levels[1])
mean(low_int$X0.education)
## [1] 9.856125
high_int<-as.data.frame(int_levels[2])
mean(high_int$X1.education)
## [1] 8.631579
summary(lm(education ~ program_intensity,data1))
## 
## Call:
## lm(formula = education ~ program_intensity, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.8561 -3.6316  0.3684  2.1439 10.3684 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        9.85612    0.02434  404.88   <2e-16 ***
## program_intensity -1.22455    0.03776  -32.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.975 on 45622 degrees of freedom
## Multiple R-squared:  0.02253,    Adjusted R-squared:  0.02251 
## F-statistic:  1052 on 1 and 45622 DF,  p-value: < 2.2e-16

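The difference cannot be interpreted as the causal effect of school construction on years of education because of possible reverse causality: the number of schools built in a region may itself depend on the region's education level. Here the high-intensity regions have fewer average years of education (8.63 versus 9.86), which is consistent with more schools having been built where education was lower.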
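
The column names `X0.education` and `X1.education` used above come from how the split list is converted to a data frame. The following sketch on made-up toy data (the object names `toy`, `groups`, `prefixed` and `plain` are illustrative) shows where the prefix comes from and an equivalent way to reach the same group mean without it:

```r
# Toy data standing in for data1 (values are made up for illustration)
toy <- data.frame(
  education         = c(6, 9, 12, 9, 6, 12),
  program_intensity = c(0, 0, 0, 1, 1, 1)
)

groups <- split(toy, toy$program_intensity)

# Single brackets keep a one-element list named "0"; as.data.frame() then
# prefixes every column with that name, repaired to the valid name "X0"
prefixed <- as.data.frame(groups[1])
names(prefixed)

# Double brackets extract the data frame itself, so names are unchanged
plain <- groups[["0"]]
names(plain)

# Both routes give the same group mean
mean(prefixed$X0.education)
mean(plain$education)
```

Either form works. The double-bracket form keeps the original variable names, which can make later formulas easier to read.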

c

summary(lm(education ~ after, data1))
## 
## Call:
## lm(formula = education ~ after, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.6128 -3.2181 -0.2181  2.7819  9.7819 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.21810    0.02292 402.172   <2e-16 ***
## afterTRUE    0.39469    0.04008   9.848   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.016 on 45622 degrees of freedom
## Multiple R-squared:  0.002121,   Adjusted R-squared:  0.002099 
## F-statistic: 96.98 on 1 and 45622 DF,  p-value: < 2.2e-16

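This difference cannot be interpreted as a causal effect either. As in the previous question, there may be reverse causality, i.e. the program was intense in regions with more (or fewer) years of education. A simple before/after comparison also picks up any general time trend in education.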
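
The regressions in b and c each use a single binary regressor, so their coefficients are just group means in disguise. In b, for example, the intercept 9.85612 is the low-intensity mean and the slope -1.22455 equals 8.631579 - 9.856125. A minimal sketch on made-up toy data (the names `toy`, `fit`, `mean_before` and `mean_after` are illustrative):

```r
# Toy data (made-up values): education by a binary before/after indicator
toy <- data.frame(
  education = c(8, 10, 9, 11, 12, 13),
  after     = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

fit <- lm(education ~ after, data = toy)

mean_before <- mean(toy$education[!toy$after])
mean_after  <- mean(toy$education[toy$after])

# Intercept = mean of the "before" group; slope = difference in group means
coef(fit)
c(mean_before, mean_after - mean_before)
```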

d

before_after<- split(data1, data1$after)
after_group<-as.data.frame(before_after[2])

coeff <- lm(TRUE.education ~ TRUE.program_intensity, after_group)%>% 
  summary() %>% coef() 
after_diff<-coeff[2,1]
after_diff
## [1] -1.18051

In this section the difference is computed only among the cohorts born after 62 (the `after` group), comparing high- and low-intensity regions. Even so, it cannot be read as a causal effect, because the gap may reflect pre-existing differences between the regions and not the program.

e

before_group<-as.data.frame(before_after[1])

coeff <- lm(FALSE.education ~ FALSE.program_intensity, before_group)%>% 
  summary() %>% coef() 
before_diff<-coeff[2,1]
before_diff
## [1] -1.256866

f

coeff <- lm(X0.education ~ X0.after, low_int)%>% 
  summary() %>% coef() 
low_int_diff<-coeff[2,1]
low_int_diff
## [1] 0.3856688
coeff <- lm(X1.education ~ X1.after, high_int)%>% 
  summary() %>% coef() 
high_int_diff<-coeff[2,1]
high_int_diff
## [1] 0.4620243
diff_in_diff_int<-high_int_diff-low_int_diff
diff_in_diff_b_a<-after_diff-before_diff
diff_in_diff_b_a-diff_in_diff_int
## [1] -6.518119e-13
diff_in_diff_int
## [1] 0.07635548
diff_in_diff_b_a
## [1] 0.07635548

Under some assumptions (noted in the next answer), these differences identify the causal effect. The sign makes sense: we expect the program's effect to be positive, and the estimate is positive.
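
The two routes computed above (`diff_in_diff_b_a` and `diff_in_diff_int`) are the same four group means subtracted in a different order, which is why they agree up to rounding. As a worked equation, writing $\bar{E}_{g,t}$ for mean education in intensity group $g \in \{H, L\}$ and cohort $t$ (this notation is introduced here for illustration and is not in the original):

```latex
\begin{aligned}
\hat{\delta}_{DD}
 &= \bigl(\bar{E}_{H,\text{after}} - \bar{E}_{L,\text{after}}\bigr)
  - \bigl(\bar{E}_{H,\text{before}} - \bar{E}_{L,\text{before}}\bigr)
  = -1.18051 - (-1.256866) \approx 0.0764 \\
 &= \bigl(\bar{E}_{H,\text{after}} - \bar{E}_{H,\text{before}}\bigr)
  - \bigl(\bar{E}_{L,\text{after}} - \bar{E}_{L,\text{before}}\bigr)
  = 0.4620243 - 0.3856688 \approx 0.0764
\end{aligned}
```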

h

The main assumption is:
common trends - without the program, the change in years of education between the before and after cohorts would have been the same in the high- and low-intensity regions.

i+j

summary(lm(education~ program_intensity + after + program_intensity*after, data=data1))
## 
## Call:
## lm(formula = education ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1184  -3.4759   0.5241   2.2673  10.5241 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  9.73272    0.02948 330.099  < 2e-16 ***
## program_intensity           -1.25687    0.04608 -27.277  < 2e-16 ***
## afterTRUE                    0.38567    0.05212   7.399 1.39e-13 ***
## program_intensity:afterTRUE  0.07636    0.08023   0.952    0.341    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.97 on 45620 degrees of freedom
## Multiple R-squared:  0.02493,    Adjusted R-squared:  0.02487 
## F-statistic: 388.8 on 3 and 45620 DF,  p-value: < 2.2e-16

The difference-in-differences estimate (the interaction coefficient, 0.076) is not significantly different from 0 (p = 0.341). To improve the estimate we can add control variables or use a fixed-effects regression.
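
The interaction coefficient in this regression (0.07636) is the same number as the difference in differences computed from the group means in f. A small simulation sketch of why that holds in a regression with both dummies and their interaction (all data below are simulated, and the names `sim`, `cell`, `did_means` and `did_lm` are illustrative):

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in for data1 (all values are made up)
n   <- 400
sim <- data.frame(
  program_intensity = rep(c(0, 1), each = n / 2),
  after             = rep(c(FALSE, TRUE), times = n / 2)
)
sim$education <- 9 - 1.2 * sim$program_intensity + 0.4 * sim$after +
  0.1 * sim$program_intensity * sim$after + rnorm(n)

# Cell means for the four intensity-by-cohort groups
cell <- with(sim, tapply(education, list(program_intensity, after), mean))
did_means <- (cell["1", "TRUE"] - cell["1", "FALSE"]) -
             (cell["0", "TRUE"] - cell["0", "FALSE"])

# Interaction coefficient from the regression with both dummies and their product
fit    <- lm(education ~ program_intensity * after, data = sim)
did_lm <- unname(coef(fit)["program_intensity:afterTRUE"])

all.equal(did_means, did_lm)
```

The regression form adds a standard error and a p-value, which the subtraction of group means does not provide.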

4

a

summary(lm(log_wage~ program_intensity + after + program_intensity*after, data=data1))
## 
## Call:
## lm(formula = log_wage ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.0687 -0.3606  0.0569  0.4001  4.2786 
## 
## Coefficients:
##                              Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)                 12.278991   0.004882 2515.116   <2e-16 ***
## program_intensity           -0.136504   0.007630  -17.891   <2e-16 ***
## afterTRUE                   -0.304179   0.008631  -35.243   <2e-16 ***
## program_intensity:afterTRUE  0.001172   0.013285    0.088     0.93    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6574 on 45620 degrees of freedom
## Multiple R-squared:  0.05498,    Adjusted R-squared:  0.05492 
## F-statistic: 884.8 on 3 and 45620 DF,  p-value: < 2.2e-16

The variable of interest is the interaction term, i.e. program_intensity*after.

b

summary(lm(education~ program_intensity + after + program_intensity*after, data=data1))
## 
## Call:
## lm(formula = education ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1184  -3.4759   0.5241   2.2673  10.5241 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  9.73272    0.02948 330.099  < 2e-16 ***
## program_intensity           -1.25687    0.04608 -27.277  < 2e-16 ***
## afterTRUE                    0.38567    0.05212   7.399 1.39e-13 ***
## program_intensity:afterTRUE  0.07636    0.08023   0.952    0.341    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.97 on 45620 degrees of freedom
## Multiple R-squared:  0.02493,    Adjusted R-squared:  0.02487 
## F-statistic: 388.8 on 3 and 45620 DF,  p-value: < 2.2e-16

c

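library(plm)  # plm() fixed-effects ("within") estimator

# interaction of regional school density with the before/after indicator
data1$schols_after <- data1$num_schools * data1$after

# region fixed effects; num_schools is constant within a region, so the within model drops it
fe_model <- plm(education ~ num_schools + after + schols_after,
                data = data1, model = "within", index = c("birth_region"))
summary(fe_model)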
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = education ~ num_schools + after + schols_after, 
##     data = data1, model = "within", index = c("birth_region"))
## 
## Unbalanced Panel: n = 290, T = 14-863, N = 45624
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -12.15798  -2.63126   0.36874   2.58778  11.58098 
## 
## Coefficients:
##              Estimate Std. Error t-value  Pr(>|t|)    
## afterTRUE    0.295037   0.079177  3.7263 0.0001946 ***
## schols_after 0.067900   0.034098  1.9913 0.0464558 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    638620
## Residual Sum of Squares: 636710
## R-Squared:      0.0029785
## Adj. R-Squared: -0.0034217
## F-statistic: 67.7116 on 2 and 45332 DF, p-value: < 2.22e-16
library(lmtest)  # coeftest()
# re-test the coefficients with standard errors clustered by birth_region
fe_model_clus <- coeftest(fe_model, vcov = vcovHC(fe_model, type = "HC1", cluster = "group"))
fe_model_clus
## 
## t test of coefficients:
## 
##              Estimate Std. Error t value Pr(>|t|)  
## afterTRUE    0.295037   0.120519  2.4481  0.01437 *
## schols_after 0.067900   0.041419  1.6393  0.10115  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The difference is that we allow a separate fixed effect for every region. The results are more significant: without clustering the interaction is significant at the 5% level (p = 0.046), and with clustered standard errors it is close to the 10% level (p = 0.101).
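
The within model above can be understood as subtracting each region's mean from every variable before running OLS, which is also why `num_schools` does not appear among the coefficients. A base-R sketch of that idea on simulated data (this is not the `plm` call used above; the names `region`, `z`, `x`, `y` and `demean` are illustrative):

```r
set.seed(2)  # arbitrary seed for reproducibility

# Simulated stand-in for the regional data (all values are made up)
region <- rep(1:5, each = 20)
z      <- rep(c(1.2, 2.5, 0.8, 3.1, 1.9), each = 20)  # region-level, like num_schools
x      <- rnorm(100)                                   # varies within region
y      <- 2 * region + 0.5 * x + rnorm(100)

# Within transformation: subtract the region mean from each variable
demean <- function(v, g) v - ave(v, g)
y_w <- demean(y, region)
x_w <- demean(x, region)
z_w <- demean(z, region)

# Slope from the demeaned regression ...
b_within <- unname(coef(lm(y_w ~ x_w - 1)))
# ... equals the slope from a regression with one dummy per region
b_lsdv <- unname(coef(lm(y ~ x + factor(region)))["x"])
all.equal(b_within, b_lsdv)

# A regressor that is constant within region is wiped out by demeaning
range(z_w)
```

The point estimates match in both forms. The sketch does not reproduce the degrees-of-freedom correction or the clustered standard errors reported above.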

d

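library(dummies)  # dummy()

# cohort (birth_year) dummies; sep is the separator placed before the cohort value in the column names
dumm_year1<-dummy(data1$birth_year, sep = ".scools.")
dumm_year2<-dummy(data1$birth_year, sep = ".c71.")
dumm_year3<-dummy(data1$birth_year, sep=".")

# interact the cohort dummies with the regional variables
first_inter<-(data1$num_schools)*dumm_year1
sec_inter<-(data1$children71)*dumm_year2

dumm_year1<-as.data.frame(dumm_year1)
b50<-dumm_year1$birth_year.scools.50

birth_region<-data1$birth_region

new_data<-cbind(data1,first_inter, sec_inter , dumm_year3)

# region fixed effects with cohort dummies and their interactions with num_schools and children71
model1<-plm(education~ first_inter+ sec_inter + dumm_year3 + num_schools + children71,data = new_data, model = "within", index = c("birth_region"))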
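
The cohort-by-region interactions built here can also be written with base R's `model.matrix()`. This is a swapped-in alternative to the `dummy()` calls, shown on made-up toy data, and it gives explicit control over the column names (the names `toy`, `cohort_dummies` and `schools_by_cohort` are illustrative):

```r
# Toy stand-in for data1 (made-up values)
toy <- data.frame(
  birth_year  = c(50, 50, 51, 51, 52, 52),
  num_schools = c(1.5, 2.0, 1.5, 2.0, 1.5, 2.0)
)

# One indicator column per cohort; "- 1" drops the intercept so no cohort is omitted
cohort_dummies <- model.matrix(~ factor(birth_year) - 1, data = toy)
colnames(cohort_dummies) <- paste0("birth_year.", sort(unique(toy$birth_year)))

# Row-wise product: each dummy column is scaled by that person's num_schools
schools_by_cohort <- cohort_dummies * toy$num_schools

schools_by_cohort
```

Each row of `schools_by_cohort` has a single non-zero entry, equal to that person's `num_schools`, in the column of their own cohort. In a regression, each column therefore lets the effect of `num_schools` differ by cohort.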

summary(model1)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = education ~ first_inter + sec_inter + dumm_year3 + 
##     num_schools + children71, data = new_data, model = "within", 
##     index = c("birth_region"))
## 
## Unbalanced Panel: n = 290, T = 14-863, N = 45624
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -12.18439  -2.66707   0.27045   2.54882  11.95078 
## 
## Coefficients: (3 dropped because of singularities)
##                                                                 Estimate
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 -2.4350e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 -2.8615e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 -2.3187e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 -1.2166e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 -2.2176e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 -1.1929e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 -2.9329e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 -1.9670e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 -2.1750e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 -2.7758e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 -8.1495e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 -1.2765e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 -2.3863e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  2.5139e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 -5.3796e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  3.5967e-03
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 -4.5069e-02
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      -2.5460e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      -5.1982e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      -4.0203e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      -4.4413e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      -2.8453e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      -4.0534e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      -4.2513e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      -2.8096e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      -5.0351e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      -4.7916e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      -2.7987e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      -4.6337e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      -3.4431e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      -1.2624e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      -1.5904e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      -1.7440e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      -1.5863e-06
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          4.0391e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          1.3251e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          7.6005e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          8.3522e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          8.5583e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          2.7077e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          1.0614e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          6.3223e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          9.7913e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          1.2362e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          1.8237e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          1.1585e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          1.4090e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          7.4368e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          7.6188e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          2.4310e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          5.4013e-01
##                                                               Std. Error
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50  1.1284e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51  1.1754e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52  1.1197e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53  1.0718e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54  1.0698e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55  1.0544e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56  1.0759e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57  1.0553e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58  1.0094e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59  9.9750e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60  9.5864e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61  1.1269e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62  1.0036e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  9.4736e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69  9.8607e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  9.3857e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71  1.0089e-01
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50       1.0945e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51       1.1985e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52       1.0945e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53       1.0835e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54       1.0808e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55       1.0275e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56       1.0894e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57       1.0633e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58       1.0242e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59       1.0319e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60       9.6296e-07
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61       1.0661e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62       1.0228e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68       1.0051e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69       1.0096e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70       9.7628e-07
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71       1.0456e-06
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          3.6274e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          3.8705e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          3.6216e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          3.5344e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          3.5044e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          3.4072e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          3.5343e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          3.4464e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          3.3125e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          3.3249e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          3.1570e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          3.5712e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          3.2771e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          3.2329e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          3.3086e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          3.2013e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          3.4066e-01
##                                                              t-value
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 -2.1578
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 -2.4346
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 -2.0709
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 -1.1350
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 -2.0729
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 -1.1314
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 -2.7260
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 -1.8640
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 -2.1548
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 -2.7828
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 -0.8501
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 -1.1328
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 -2.3778
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  0.2654
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 -0.5456
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  0.0383
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 -0.4467
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      -2.3262
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      -4.3374
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      -3.6731
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      -4.0989
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      -2.6325
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      -3.9450
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      -3.9022
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      -2.6422
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      -4.9163
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      -4.6435
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      -2.9063
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      -4.3464
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      -3.3664
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      -1.2560
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      -1.5754
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      -1.7864
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      -1.5172
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          1.1135
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          3.4237
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          2.0987
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          2.3631
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          2.4422
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          0.7947
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          3.0032
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          1.8345
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          2.9558
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          3.7180
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          0.5777
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          3.2441
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          4.2996
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          2.3003
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          2.3027
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          0.7594
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          1.5855
##                                                               Pr(>|t|)    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 0.0309474 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 0.0149131 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 0.0383708 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 0.2563741    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 0.0381917 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 0.2578829    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 0.0064131 ** 
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 0.0623279 .  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 0.0311828 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 0.0053915 ** 
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 0.3952713    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 0.2573245    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 0.0174194 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68 0.7907383    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 0.5853707    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70 0.9694320    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 0.6550917    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      0.0200147 *  
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      1.445e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      0.0002399 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      4.158e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      0.0084780 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      7.993e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      9.544e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      0.0082386 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      8.852e-07 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      3.435e-06 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      0.0036587 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      1.387e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      0.0007621 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      0.2091103    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      0.1151826    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      0.0740470 .  
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      0.1292217    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50         0.2654986    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51         0.0006184 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52         0.0358521 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53         0.0181264 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54         0.0146029 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55         0.4267816    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56         0.0026728 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57         0.0665874 .  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58         0.0031198 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59         0.0002010 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60         0.5634869    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61         0.0011790 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62         1.715e-05 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68         0.0214332 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69         0.0212989 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70         0.4476353    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71         0.1128519    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    638620
## Residual Sum of Squares: 632480
## R-Squared:      0.0096011
## Adj. R-Squared: 0.0021649
## F-statistic: 8.60751 on 51 and 45283 DF, p-value: < 2.22e-16

e

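# Keep only the cohort-by-schools interaction coefficients (names start with "first")
coeff1 <- as.data.frame(model1$coefficients)
coeff1 <- as.data.frame(t(coeff1))
coeff1 %<>% select(starts_with("first"))
coeff1 <- as.data.frame(t(coeff1))

# The cohort is whatever follows the last dot in the coefficient name
row_n <- row.names(coeff1)
row_n <- as.numeric(sub("^.*\\.", "", row_n))
graph <- cbind(coeff1, row_n)
colnames(graph) <- c("coefficient", "birth_year")

ggplot(data = graph) +
  geom_point(aes(x = birth_year, y = coefficient))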

The results are as we expect: the interaction coefficients differ significantly and positively between the birth cohorts that benefited from the program and those that did not. For the missing cohorts, we expect the interaction coefficients to lie between those of the cohorts that did not benefit from the program and those of the cohorts that did.

f

Because none of the cohorts born up to 62 benefited from the program, we do not need a separate interaction variable for each of them; a single interaction variable covering all of them (before) is enough.
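As a minimal sketch of this pooling, the toy data frame below builds one interaction for all pre-program cohorts instead of one per birth year. The values are hypothetical; only the column names birth_year, num_schools and after follow the data described in Q1, and the name before_schools is an assumption made for illustration.

```r
# Toy data: hypothetical values, column names as in data1
toy <- data.frame(
  birth_year  = c(50, 55, 62, 68, 72),
  num_schools = c(1.5, 2.0, 1.0, 2.5, 3.0)
)
toy$after <- toy$birth_year > 62   # same definition as in Q1

# One pooled interaction: num_schools for cohorts born up to 62, zero otherwise
toy$before_schools <- (!toy$after) * toy$num_schools

toy
```

In a regression, before_schools would then replace the thirteen separate interactions for the cohorts 50 to 62.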

g

  1. Long period between the control and treatment groups - because of the long gap (6 years), something else may have happened in those years that was correlated with the regions that received intense treatment and biased the results.

  2. Maybe the effect is a result of externalities of the program and not of the schools that were built.

Q2

1

First, we will define the parameters. Before doing so, we have to address two issues:

  1. Set seed - the “set seed” function works differently in R and Stata, and there is no reasonable way to imitate Stata’s function in R.

  2. We are treating X as deterministic, hence we are generating it only once.
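The setup described above could look like the following minimal sketch. The seed, the value of alpha and the distribution of x_i are assumptions made for illustration (they are not stated in the text); gamma = 0.25 is the value later passed to RMSE().

```r
set.seed(1234)            # arbitrary seed; it will not reproduce Stata's draws

n     <- 50               # sample size used inside monte_carlo()
alpha <- 1                # assumed intercept (not stated in the text)
gamma <- 0.25             # treatment effect, the value passed to RMSE()
x_i   <- rnorm(n, 0, 1)   # assumed distribution; drawn once and then held fixed
```

Because x_i is created outside the simulation function, every Monte Carlo replication reuses the same covariate values.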

Next, we will define a function that runs the Monte Carlo procedure as defined in the problem set. Afterward we will apply it, and then we will turn to the specific questions.

monte_carlo <- function(reg = "reg_2", beta = 1, sample = "none"){
  # New error term and treatment assignment on every call;
  # alpha, gamma and x_i are taken from the global environment
  u_i  <- rnorm(50, 0, 0.25)
  Ti_i <- rbinom(50, 1, 0.5)
  Y_i  <- alpha + beta*x_i + gamma*Ti_i + u_i
  data <- cbind(Y_i, x_i, Ti_i) %>% as.data.frame()

  if(sample == "none"){
    # Full sample: regression (i) controls for x_i, regression (ii) does not
    if(reg == "reg_1"){
      lm(Y_i ~ x_i + Ti_i, data = data)
    } else {
      lm(Y_i ~ Ti_i, data = data)
    }
  } else if(sample == "general"){
    # Drop 12 observations chosen at random
    temp_sample <- sample(c(1:50), replace = FALSE, size = 12)
    data <- data[-temp_sample,]
    lm(Y_i ~ Ti_i, data = data)
  } else if(sample == "low"){
    # Drop control observations below the first quartile of the control group
    control <- subset(data, Ti_i == 0)
    data %<>% subset(!(Ti_i == 0 & Y_i < quantile(control$Y_i, probs = 0.25)))
    lm(Y_i ~ Ti_i, data = data)
  } else if(sample == "defiers"){
    # Half of the treated whose outcome is below the control first quartile
    # plus 0.25 have 0.25 subtracted from their outcome
    control <- subset(data, Ti_i == 0)
    threshold <- quantile(control$Y_i, probs = 0.25) + 0.25
    sub_group <- data[data$Y_i < threshold & data$Ti_i == 1,]
    a <- sample_n(sub_group, 0.5*nrow(sub_group))
    b <- data[row.names(a),]
    b$Y_i <- b$Y_i - 0.25
    data[row.names(a),] <- b
    lm(Y_i ~ Ti_i, data = data)
  }
}
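# lapply() passes the iteration index as the first unnamed argument, so the
# calls are wrapped to keep the index from being matched to reg or beta
full_reg   <- lapply(1:200, function(i) monte_carlo(reg = "reg_1", sample = "none"))
unfull_reg <- lapply(1:200, function(i) monte_carlo(sample = "none"))
random_sam <- lapply(1:200, function(i) monte_carlo(beta = 0, sample = "general"))
censored   <- lapply(1:200, function(i) monte_carlo(beta = 0, sample = "low"))
def_sample <- lapply(1:200, function(i) monte_carlo(beta = 0, sample = "defiers"))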
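extracting <- function(reg, col = 3){
  # t-values of every coefficient, one row per simulation
  a <- lapply(reg, function(x)
    summary(x)[["coefficients"]][, "t value"]) %>% 
    as.data.frame() %>% t() %>% as.data.frame()
  # Coefficient estimates, one row per simulation
  reg %<>% lapply(function(x) coef(x)) %>% 
    as.data.frame() %>% t() %>% as.data.frame() 
  reg %<>%  cbind(a)
  rownames(reg) <- 1:nrow(reg)
  # Drop the intercept's t-value (column col); the t-value of Ti_i
  # is then available as Ti_i.1
  reg <- reg[,-col]
  return(reg)
}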
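RMSE <- function(outcome,gamma){
  # Average estimate of gamma across simulations and its bias
  average_gamma <- outcome[["Ti_i"]] %>% mean()
  bias <- average_gamma-gamma
  # Note: despite its name, this is the standard deviation of the estimates
  variance <- sqrt(sum((outcome[["Ti_i"]]-average_gamma)^2)/
                     nrow(outcome))
  # Root mean squared error around the true gamma
  RMSE <- sqrt((sum((bias^2) + 
                      ((outcome[["Ti_i"]]-average_gamma)^2)))/
                 nrow(outcome))
  a <- cbind(RMSE,variance,bias,average_gamma)
  return(a)
  
}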

full_reg <- extracting(full_reg, col = 4)
unfull_reg <- extracting(unfull_reg)
censored <- extracting(censored)
def_sample <- extracting(def_sample)
random_sam <- extracting(random_sam)

a)

We expect both estimators to be consistent, since the treated are randomly selected and X is not correlated with the treatment. The difference will be in the variance: we expect the variance of (ii) to be higher and, consequently, its rejection rate to be lower.
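The variance claim can be made explicit with a short derivation. This is a sketch under the setup above (fixed \(x_i\), random assignment \(T_i\), both groups non-empty); the group means \(\bar{x}_1, \bar{x}_0, \bar{u}_1, \bar{u}_0\), the group sizes \(n_1, n_0\) and \(\sigma_u^2 = \operatorname{Var}(u_i)\) are notation introduced here. The estimator from regression (ii) is the difference in group means:

\[
\hat{\gamma}_{(ii)} = \bar{Y}_1 - \bar{Y}_0 = \gamma + \beta(\bar{x}_1 - \bar{x}_0) + (\bar{u}_1 - \bar{u}_0)
\]

Random assignment gives \(E[\bar{x}_1 - \bar{x}_0] = 0\), so the estimator is unbiased, and by the law of total variance

\[
\operatorname{Var}(\hat{\gamma}_{(ii)}) = \sigma_u^2 \, E\left[\frac{1}{n_1} + \frac{1}{n_0}\right] + \beta^2 \operatorname{Var}(\bar{x}_1 - \bar{x}_0)
\]

The second term is the price of omitting \(x_i\); it vanishes only when \(\beta = 0\). Regression (i) controls for \(x_i\) and so removes this source of variation.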

b)

The rejection rates in regressions (i) and (ii) are:

(sum(full_reg$Ti_i.1>1.96)/nrow(full_reg)) %>% percent()
## [1] "94.0%"
(sum(unfull_reg$Ti_i.1>1.96)/nrow(unfull_reg)) %>% percent()
## [1] "79.0%"

The RMSE, the bias, the standard deviation (the column labeled variance) and the average estimate of gamma for (i) and (ii) are given in the following tables, respectively:

full_reg_RMSE <- RMSE(full_reg,gamma = 0.25)
unfull_reg_RMSE <- RMSE(unfull_reg,gamma = 0.25)
full_reg_RMSE%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))
RMSE variance bias average_gamma
0.0695974 0.0695905 0.0009833 0.2509833
unfull_reg_RMSE%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))
RMSE variance bias average_gamma
0.0928201 0.0926893 0.0049265 0.2549265

As we can see, the variance of (ii) is higher and its rejection rate is lower.
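As a quick arithmetic check on the two tables above, the reported RMSE should equal the square root of the squared bias plus the squared spread of the estimates. The sketch below copies the reported numbers and treats the column labeled variance as a standard deviation, which is how RMSE() computes it; the helper name implied is an assumption made for illustration.

```r
# Values copied from the two tables above; sd is the column labeled "variance"
reg_i  <- c(rmse = 0.0695974, sd = 0.0695905, bias = 0.0009833)
reg_ii <- c(rmse = 0.0928201, sd = 0.0926893, bias = 0.0049265)

# RMSE implied by the decomposition RMSE^2 = bias^2 + sd^2
implied <- function(v) sqrt(v[["bias"]]^2 + v[["sd"]]^2)

c(reg_i = implied(reg_i), reg_ii = implied(reg_ii))
```

Both implied values agree with the reported RMSE up to rounding, and in both regressions almost all of the RMSE comes from the spread of the estimates rather than from bias.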

2

a)

Now we will see that if a random subset of individuals is not observed, the estimator remains unbiased. We expect the variance to be higher and the rejection rate to be lower, since the sample is smaller and therefore noisier. The rejection rate and the RMSE in this case are:

(sum(random_sam$Ti_i.1>1.96)/nrow(random_sam)) %>% percent()
## [1] "86.0%"
random_sam_RMSE <- RMSE(random_sam,gamma = 0.25)
random_sam_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))
RMSE variance bias average_gamma
0.0830386 0.0830071 -0.0022881 0.2477119

b)

We expect the estimator to be biased downwards: since the bottom quartile of the control group does not appear in the data, the income of the control group appears to be higher than it truly is. Hence, the effect of the treatment appears to be lower than the real effect.
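The mechanism can be seen in a small sketch with hypothetical control-group incomes (the numbers are made up for illustration): dropping the observations below the first quartile raises the observed control mean, which shrinks the measured gap to the treated group.

```r
# Hypothetical control-group incomes, for illustration only
control <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Keep only observations at or above the first quartile, as in the "low" case
cutoff   <- quantile(control, probs = 0.25)
observed <- control[control >= cutoff]

mean(control)    # mean of the full control group
mean(observed)   # mean after truncation, which is higher
```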

#(sum(censored$Ti_i.1>1.96)/nrow(censored)) %>% percent()
censored_RMSE <- RMSE(censored,gamma = 0.25)
censored_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))
RMSE variance bias average_gamma
0.1335273 0.078478 -0.1080311 0.1419689

We see that the estimator is, indeed, biased downwards.

3

a)

The question is ambiguous, but we interpret it to mean that the researchers do not observe which individuals dropped out of the program. Therefore we expect the estimator to be biased downwards, because the low-income individuals who dropped out are attributed to the program although they did not participate in it:

#(sum(def_sample$Ti_i.1>1.96)/nrow(def_sample)) %>% percent()
def_sample_RMSE <- RMSE(def_sample,gamma = 0.25)
def_sample_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))
RMSE variance bias average_gamma
0.0907585 0.0850131 -0.0317783 0.2182217

b)

The meaning of the coefficient is vague. It is composed of the effect of the treatment, reduced by the counter-effect of the lower income of those who were mistakenly counted as treated.
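A back-of-the-envelope calculation gives a sense of the size of this mix. It is a sketch that assumes, as in the simulation with \(\beta = 0\), that the treated outcomes are the control outcomes shifted by \(\gamma = 0.25\), so that about a quarter of the treated fall below the threshold and half of those lose 0.25:

\[
E[\hat{\gamma}] \approx \gamma - 0.25 \cdot (0.5 \cdot 0.25) = 0.25 - 0.03125 = 0.21875
\]

This is close to the simulated average of 0.2182 reported in the table above.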

Q3

1

SUTVA may be violated because of externalities: the people who were assigned to the treatment (the treatment group) influence the people who were not (the control group). The channel through which the influence flows is the labor market. Under diminishing returns to scale the labor market cannot absorb all of the job-seekers, and a job-seeker who finds a job lowers the odds that another job-seeker finds one. A job program will mostly (and in extreme cases exclusively) cause a “rat race”: a situation in which the program helps the enrollees find a stable job at the expense of those who were not in the program. If this is correct, SUTVA fails. The resulting bias makes the effect of the program look bigger than it really is; the real effect may be negligible, because what we measure reflects the externalities rather than the program itself.

2

The randomization procedure is divided into two parts:

  1. First, the share of job-seekers entitled to participate in the program is randomized across districts: each district is randomly assigned a share of 0%, 25%, 50%, 75% or 100%.
  2. Within each district, the individuals who receive the entitlement are selected at random. Participation is not mandatory.

The researchers claim that each district's labor market is distinct from those of the other districts (except for one). The point of this experimental design is to address the concern about externalities: the first stage randomizes the intensity of the externalities, so that they can then be measured.
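The two stages can be sketched on made-up data as follows. The number of districts, the number of job-seekers per district and all object names are assumptions made for illustration, and the sketch models only the assignment, not the voluntary take-up.

```r
set.seed(1)

# Hypothetical sizes: 10 districts with 20 job-seekers each
n_districts <- 10
n_per_dist  <- 20
shares      <- c(0, 0.25, 0.5, 0.75, 1)

# Stage 1: each district randomly receives one of the five assignment shares
district_share <- sample(rep(shares, length.out = n_districts))

# Stage 2: within each district, that share of job-seekers is drawn at random
people <- data.frame(district = rep(seq_len(n_districts), each = n_per_dist))
people$share    <- district_share[people$district]
people$assigned <- ave(people$share, people$district, FUN = function(s) {
  n_assigned <- round(s[1] * length(s))
  sample(c(rep(1, n_assigned), rep(0, length(s) - n_assigned)))
})

# Realized share of assigned job-seekers per district equals the drawn share
tapply(people$assigned, people$district, mean)
```

Unassigned job-seekers in districts with a positive share can then be compared with job-seekers in the 0% districts, which is what identifies the externality.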

3

In equation (6) we have a coefficient on each assignment level and an interaction of each level with the individual's assignment status. This lets us compare the non-treated across different assignment levels and thereby obtain the influence of the treatment on the control group, which is the externality. The two kinds of coefficients are interpreted differently: the interaction coefficients measure the effect of the treatment on the treated, whereas the indicator coefficients measure the effect of the treatment on the untreated, that is, the externalities. The null hypothesis is that the indicator coefficients are zero (each of them in (6), all of them combined in (7)), meaning that there are no externalities.
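A hedged sketch of a specification with this structure is given below. The notation is ours and may differ from the paper's equation (6): \(Z_{ic}\) indicates that job-seeker \(i\) in district \(c\) was assigned to the program, and \(P_{xc}\) indicates that district \(c\) has assignment share \(x\).

\[
y_{ic} = \alpha + \sum_{x \in \{25, 50, 75, 100\}} \beta_x Z_{ic} P_{xc} + \sum_{x \in \{25, 50, 75\}} \delta_x P_{xc} + \varepsilon_{ic}
\]

In this sketch \(\delta_x\) compares unassigned job-seekers in districts with share \(x\) to job-seekers in the 0% districts, so it captures the externality, and \(\beta_x\) is the coefficient on the interaction for the assigned. There is no \(\delta_{100}\) because nobody is left unassigned in the 100% districts. The hypothesis of no externalities corresponds to \(\delta_{25} = \delta_{50} = \delta_{75} = 0\).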

4

The data come from three different sources.

  1. Data from ANPE, the French employment agency - for every unemployed person the data include age, postal address, number of months of unemployment and so on.
  2. Data from the counseling firms - these data tell the researchers who participated in the program.
  3. Follow-up surveys - the surveys were conducted several times, and the most important one took place 8 months after the job-seeker found a job. The surveys gave the researchers actual data on current employment.

5

The only group for which we can clearly see externalities is men in districts where 25% of job-seekers were assigned. We cannot see other externalities and, moreover, we do not see a difference between the 25%, 50% and 75% districts. Since the test in equation (7) seems too weak, the researchers change the test for externalities so that it compares the non-treated districts (0% assigned) with the treated ones (25%, 50% and 75% assigned). In this test the externalities become clear.

6

\(\kappa\) represents the percentage of eligible individuals among those with skills in the same field; for example, 30% of the economists are under 30, etc.

\(\sigma\) represents the percentage of job-seekers who were assigned to the program among those who were eligible.

\(\pi\) represents the percentage of workers who were assigned to the program among all the eligible job-seekers in a specific field.

Generally, we expect that a higher \(\pi\) will yield larger externalities, because the impact of a filled vacancy falls directly on competitors. Since we are assuming that \(\sigma\) is random, all the changes in \(\pi\) are channeled through \(\kappa\). Hence we see in the figure that the externalities rise with \(\kappa\) and \(\sigma\).