ecob_2

Q1

1 describing data

For each person, the data include years of schooling, birth cohort, birth region, and log wage, along with variables describing the birth region: the average number of schools (num_schools), whether the program was intense in the region (program_intensity), and the number of children who went to school in 1971 (children71).

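library(haven)      # read_dta()
library(dplyr)      # %>% and count()
library(knitr)      # kable()
library(kableExtra) # kable_styling()

path <- "C:/Users/dorgo/Documents/R/Indo_Schooling.dta"
data1 <- read_dta(path)
data1$after <- data1$birth_year > 62 # TRUE for cohorts born after 1962

# One row per birth region with its region-level variables
regions <- unique(data1[c("birth_region", "num_schools", "program_intensity", "children71")])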

The size of each cohort is:

data1 %>% 
  count(birth_year) %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = F)

birth_year	n
50	2003
51	1455
52	1920
53	2118
54	2102
55	2576
56	2140
57	2369
58	2604
59	2616
60	3536
61	2388
62	2875
68	3114
69	3072
70	3473
71	2529
72	2734

The size of each birth region is:

data1 %>% 
  count(birth_region)%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = F)

birth_region	n
1101	109
1102	52
1103	110
1104	41
1105	87
1106	96
1107	150
1108	130
1171	122
1172	52
1201	61
1202	165
1203	83
1204	256
1205	153
1206	248
1207	271
1208	71
1209	72
1210	280
1211	211
1271	112
1272	110
1273	187
1274	130
1275	612
1276	135
1301	131
1302	120
1303	61
1304	127
1305	173
1306	115
1307	136
1308	74
1371	342
1372	54
1373	99
1374	56
1375	120
1376	93
1401	70
1402	51
1403	213
1404	90
1405	176
1471	104
1472	35
1501	96
1502	62
1503	71
1504	69
1505	40
1571	114
1601	85
1602	96
1603	108
1604	112
1605	74
1606	65
1607	200
1608	126
1671	296
1672	127
1701	97
1702	69
1703	24
1771	59
1801	128
1802	87
1803	56
1804	27
1871	152
3171	821
3172	863
3173	786
3174	808
3175	430
3201	161
3202	125
3203	319
3204	276
3205	234
3206	372
3207	291
3208	261
3209	242
3210	198
3211	351
3212	187
3213	205
3214	234
3215	238
3216	135
3217	221
3218	104
3219	194
3220	247
3271	197
3272	139
3273	406
3274	161
3275	99
3301	299
3302	374
3303	186
3304	165
3305	362
3306	337
3307	187
3308	278
3309	234
3310	390
3311	182
3312	320
3313	173
3314	190
3315	188
3316	142
3317	148
3318	295
3319	283
3320	253
3321	196
3322	286
3323	146
3324	217
3325	210
3326	252
3327	236
3328	267
3329	286
3371	203
3372	324
3373	142
3374	386
3375	155
3376	196
3401	126
3402	243
3403	186
3404	261
3471	300
3501	157
3502	222
3503	134
3504	230
3505	220
3506	349
3507	363
3508	213
3509	298
3510	321
3511	123
3512	128
3513	216
3514	242
3515	273
3516	272
3517	301
3518	250
3519	236
3520	184
3521	185
3522	149
3523	149
3524	194
3525	165
3526	122
3527	103
3528	89
3529	131
3571	176
3572	110
3573	185
3574	145
3575	130
3576	140
3577	131
3578	463
5101	94
5102	198
5103	185
5104	210
5105	117
5106	117
5107	100
5108	183
5171	135
5201	116
5202	117
5203	144
5204	73
5205	69
5206	166
5271	102
5301	45
5302	38
5303	108
5304	58
5305	51
5306	76
5307	85
5308	110
5309	45
5310	102
5311	69
5312	74
6101	101
6102	168
6103	75
6104	59
6105	66
6106	62
6171	148
6201	56
6202	54
6203	68
6204	77
6205	42
6271	28
6301	45
6302	72
6303	78
6304	36
6305	64
6306	98
6307	85
6308	71
6309	78
6371	171
6401	38
6402	101
6403	71
6404	85
6471	118
6472	82
7101	142
7102	67
7103	241
7104	183
7171	132
7172	165
7173	38
7201	73
7202	115
7203	152
7204	114
7301	57
7302	93
7303	39
7304	85
7305	62
7306	109
7307	73
7308	76
7309	82
7310	66
7311	132
7312	84
7313	64
7314	84
7315	63
7316	62
7317	93
7318	144
7319	85
7320	62
7321	31
7371	360
7372	81
7401	142
7402	77
7403	74
7404	30
8101	111
8102	131
8103	117
8104	68
8171	164
8201	53
8202	39
8203	82
8204	35
8205	36
8206	50
8207	43
8208	78
8209	93
8271	14

Summary of the data:

summary(data1)%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

education	birth_year	birth_region	log_wage	num_schools	program_intensity	children71	after
Min. : 0.000	Min. :50.00	Min. :1101	Min. : 9.21	Min. :0.5908	Min. :0.0000	Min. : 3796	Mode :logical
1st Qu.: 6.000	1st Qu.:55.00	1st Qu.:3173	1st Qu.:11.74	1st Qu.:1.3171	1st Qu.:0.0000	1st Qu.: 63580	FALSE:30702
Median : 9.000	Median :60.00	Median :3319	Median :12.18	Median :1.7603	Median :0.0000	Median :159434	TRUE :14922
Mean : 9.347	Mean :60.96	Mean :3670	Mean :12.12	Mean :2.0262	Mean :0.4156	Mean :162622	NA
3rd Qu.:12.000	3rd Qu.:69.00	3rd Qu.:3573	3rd Qu.:12.58	3rd Qu.:2.3986	3rd Qu.:1.0000	3rd Qu.:221623	NA
Max. :19.000	Max. :72.00	Max. :8271	Max. :16.15	Max. :8.5983	Max. :1.0000	Max. :542835	NA

The average education level in the sample is 9.3471857 years of schooling.

Number of schools in each region is:

regions[c("birth_region", "num_schools")]%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"),full_width = F)

birth_region	num_schools
1101	2.7295361
1102	2.6737239
1103	2.3670490
1104	2.0621350
1105	2.4550869
1106	2.3862851
1107	2.6452501
1108	2.4007659
1171	4.2678442
1172	5.0011368
1201	2.8798370
1202	2.1565940
1203	6.3137689
1204	1.6434790
1205	2.5201330
1206	2.3662519
1207	2.3666439
1208	2.7791309
1209	2.8007390
1210	1.7145150
1211	1.6756830
1271	4.8361540
1272	3.3309050
1273	2.7290239
1274	7.7355838
1275	1.1905100
1276	5.1110229
1301	1.5348140
1302	1.4699860
1303	1.7952110
1304	1.3995460
1305	1.3971270
1306	1.4346910
1307	1.8653700
1308	0.7795604
1371	1.1695690
1372	2.3205221
1373	4.2149630
1374	2.4052920
1375	1.3021600
1376	1.3560090
1401	2.3033280
1402	2.6923530
1403	2.1274519
1404	2.2996349
1405	1.8801950
1471	1.2654840
1472	2.1274519
1501	2.5409589
1502	3.4376070
1503	3.5463350
1504	2.3805931
1505	3.7855811
1571	1.8936890
1601	2.2052779
1602	1.9399470
1603	1.9793310
1604	1.7560660
1605	2.3951440
1606	2.0425949
1607	1.8233089
1608	2.0670049
1671	1.1948800
1672	1.6418860
1701	2.2461729
1702	2.4714079
1703	2.6943281
1771	2.8562009
1801	2.5934949
1802	2.4646139
1803	3.0068729
1804	3.0068729
1871	2.8613350
3171	1.0884160
3172	1.0172660
3173	1.0067750
3174	1.1077690
3175	1.0343610
3201	2.1341860
3202	3.0801890
3203	1.5842730
3204	2.1243589
3205	1.7952410
3206	1.4221630
3207	1.7423950
3208	1.3101290
3209	1.3752080
3210	1.5409280
3211	1.6889070
3212	1.7470860
3213	1.7089300
3214	3.0490570
3215	2.0846210
3216	2.3832610
3217	1.8563091
3218	2.2266030
3219	1.6635849
3220	2.0320580
3271	2.5647359
3272	3.7019899
3273	0.6890698
3274	2.4293089
3275	1.6635849
3301	2.1549530
3302	1.2609030
3303	1.9701350
3304	2.1974881
3305	1.7169050
3306	1.5067199
3307	1.9186161
3308	1.6306280
3309	1.6773560
3310	1.1128130
3311	1.5905020
3312	1.5373360
3313	1.6841180
3314	1.7702270
3315	1.7602950
3316	2.1317220
3317	2.8792040
3318	2.1859889
3319	2.3458500
3320	2.5842540
3321	1.9162300
3322	1.5911850
3323	1.9042790
3324	1.7280720
3325	2.7725649
3326	1.8434210
3327	2.0638101
3328	1.3900610
3329	2.5278530
3371	2.2286930
3372	1.3171149
3373	2.9230270
3374	1.3237309
3375	3.1954820
3376	2.7156489
3401	1.4292470
3402	1.4927810
3403	1.1108890
3404	1.3131150
3471	1.9011170
3501	1.0978611
3502	1.8671300
3503	1.2078190
3504	1.1902070
3505	0.8884923
3506	1.0577960
3507	1.7812400
3508	1.9329630
3509	1.9194790
3510	1.4854010
3511	2.9950581
3512	3.5144720
3513	2.7770450
3514	1.3776720
3515	1.2027540
3516	1.4826070
3517	1.4451070
3518	1.6200269
3519	1.3576649
3520	1.2807170
3521	1.6804140
3522	1.8170160
3523	2.2335050
3524	1.9502480
3525	1.7752399
3526	2.7881260
3527	3.4313951
3528	2.6238761
3529	3.7579989
3571	1.3503670
3572	2.7800300
3573	0.9724348
3574	4.1374002
3575	4.4542098
3576	1.8904819
3577	1.3537910
3578	1.0447520
5101	6.2169509
5102	3.8386741
5103	1.9794390
5104	2.4221449
5105	2.2933781
5106	2.7326550
5107	2.5184560
5108	5.0397038
5171	6.2169509
5201	2.6824999
5202	2.5176351
5203	2.1165099
5204	2.4011450
5205	3.8011401
5206	2.4745700
5271	2.6824999
5301	1.5901910
5302	2.1041999
5303	1.4917210
5304	1.0987900
5305	1.2669050
5306	1.3587980
5307	1.3002290
5308	1.3847899
5309	1.0491490
5310	1.2710381
5311	1.7286550
5312	1.3011520
6101	3.0580201
6102	3.5343959
6103	4.2833261
6104	3.7351811
6105	4.0835981
6106	4.8514628
6171	3.5148201
6201	5.9337578
6202	1.2420820
6203	3.1296151
6204	3.9591200
6205	2.9761910
6271	5.8611360
6301	2.5307620
6302	3.4769270
6303	3.1039579
6304	2.9740570
6305	3.7328010
6306	2.6002550
6307	1.4256949
6308	3.0846801
6309	2.7469950
6371	2.7410600
6401	4.5569620
6402	2.7785671
6403	8.2856102
6404	3.1951880
6471	2.0191040
6472	2.5428770
7101	1.1304560
7102	2.7679579
7103	1.0272530
7104	1.8386230
7171	3.8657529
7172	2.3266089
7173	1.0272530
7201	2.6525199
7202	2.3825841
7203	2.3786600
7204	3.0813611
7301	2.2497699
7302	1.4332870
7303	1.7116520
7304	1.1004590
7305	1.5964080
7306	0.9575184
7307	1.4452670
7308	0.5908243
7309	2.9157190
7310	1.0450490
7311	5.9082479
7312	1.3693269
7313	1.4170830
7314	1.4583380
7315	1.2177920
7316	1.3776170
7317	1.6752900
7318	1.2380700
7319	1.4001040
7320	1.8684980
7321	2.4706609
7371	1.1563500
7372	1.7667850
7401	2.1162281
7402	3.2918561
7403	2.3998170
7404	3.3388979
8101	1.1649840
8102	1.3271520
8103	8.5982695
8104	8.5982695
8171	2.1352310
8201	1.5582010
8202	1.1430660
8203	2.7032320
8204	1.3381300
8205	3.0836079
8206	1.7664779
8207	2.7515249
8208	2.2697790
8209	2.3986039
8271	2.7032320

2 a

beta1 is the effect of one additional year of schooling on an individual's log wage at the time the wage is measured.

b

The assumptions that should hold:
1. Exogeneity: E(education*epsilon)=0, i.e. years of schooling are uncorrelated with the error term.
2. The observations are i.i.d.
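For reference, the model behind these assumptions can be written out. The equation itself is not shown in this report, so the form below is an assumption inferred from the call lm(log_wage ~ education) used in part c; the subscript i for individuals is introduced here for illustration.

```latex
\log(\text{wage}_i) = \beta_0 + \beta_1\,\text{education}_i + \varepsilon_i,
\qquad
E\left[\text{education}_i \cdot \varepsilon_i\right] = 0 .
```

If the second condition fails, for example because an omitted factor in the error term is correlated with education, the OLS estimate of beta1 no longer has a causal interpretation.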

c

Assumption 1 probably does not hold. For example, years of schooling are likely correlated with parents' wages, which are not in the regression and therefore end up in epsilon.

lm_model<-lm(log_wage~ education, data=data1)
lm_model$coefficients

## (Intercept)   education 
## 11.40289321  0.07703306
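Because the dependent variable is in logs, the education coefficient can be read as an approximate percentage change in wages. The short sketch below converts the estimate printed above into an exact percentage; it only illustrates how to read the number and says nothing about whether the estimate is causal.

```r
# Coefficient on education copied from the lm() output above
beta1 <- 0.07703306

# Exact percentage change in wage associated with one more year of schooling
pct_change <- (exp(beta1) - 1) * 100
print(round(pct_change, 2))
```

The result is close to 100 * beta1 because the coefficient is small.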

3

The average number of schools in regions with low program intensity is:

## [1] 1.884218

The average number of schools in regions with high program intensity is:

mean(int1$X1.num_schools)

## [1] 2.818086
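The code that produced the low-intensity figure is not shown in this report. As a hedged sketch, both group means can be computed in one step with tapply(). The data frame regions_toy below is made up for illustration and is not the Indo_Schooling data; on the real data the same call would be applied to the regions table.

```r
# Toy region-level table (made-up values, not the real data)
regions_toy <- data.frame(
  birth_region      = c(1, 2, 3, 4),
  num_schools       = c(1.0, 2.0, 3.0, 5.0),
  program_intensity = c(0, 0, 1, 1)
)

# Mean number of schools within each intensity group (0 = low, 1 = high)
mean_schools <- tapply(regions_toy$num_schools,
                       regions_toy$program_intensity,
                       mean)
print(mean_schools)
```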

b

int_levels<- split(data1, data1$program_intensity)
low_int<-as.data.frame(int_levels[1])
mean(low_int$X0.education)

## [1] 9.856125

high_int<-as.data.frame(int_levels[2])
mean(high_int$X1.education)

## [1] 8.631579

summary(lm(education ~ program_intensity,data1))

## 
## Call:
## lm(formula = education ~ program_intensity, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.8561 -3.6316  0.3684  2.1439 10.3684 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        9.85612    0.02434  404.88   <2e-16 ***
## program_intensity -1.22455    0.03776  -32.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.975 on 45622 degrees of freedom
## Multiple R-squared:  0.02253,    Adjusted R-squared:  0.02251 
## F-statistic:  1052 on 1 and 45622 DF,  p-value: < 2.2e-16

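The difference cannot be interpreted as the causal effect of school construction on years of education because of possible reverse causality, i.e. the level of education in a region may have determined how many schools were built there. The negative coefficient suggests that the program was more intense in regions where education was lower.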

c

summary(lm(education ~ after, data1))

## 
## Call:
## lm(formula = education ~ after, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.6128 -3.2181 -0.2181  2.7819  9.7819 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.21810    0.02292 402.172   <2e-16 ***
## afterTRUE    0.39469    0.04008   9.848   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.016 on 45622 degrees of freedom
## Multiple R-squared:  0.002121,   Adjusted R-squared:  0.002099 
## F-statistic: 96.98 on 1 and 45622 DF,  p-value: < 2.2e-16

The difference cannot be interpreted as a causal effect. As in the previous question, there may be reverse causality, i.e. the program may have been more intense in regions with more (or fewer) years of education.

d

before_after<- split(data1, data1$after)
after_group<-as.data.frame(before_after[2])

coeff <- lm(TRUE.education ~ TRUE.program_intensity, after_group)%>% 
  summary() %>% coef() 
after_diff<-coeff[2,1]
after_diff

## [1] -1.18051

In this section we calculated the difference only in the areas with high intensity, so reverse causality is ruled out. Even so, we cannot interpret the difference as a causal effect, because it may be the result of a time trend.

e

before_group<-as.data.frame(before_after[1])

coeff <- lm(FALSE.education ~ FALSE.program_intensity, before_group)%>% 
  summary() %>% coef() 
before_diff<-coeff[2,1]
before_diff

## [1] -1.256866

f

coeff <- lm(X0.education ~ X0.after, low_int)%>% 
  summary() %>% coef() 
low_int_diff<-coeff[2,1]
low_int_diff

## [1] 0.3856688

coeff <- lm(X1.education ~ X1.after, high_int)%>% 
  summary() %>% coef() 
high_int_diff<-coeff[2,1]
high_int_diff

## [1] 0.4620243

diff_in_diff_int<-high_int_diff-low_int_diff
diff_in_diff_b_a<-after_diff-before_diff
diff_in_diff_b_a-diff_in_diff_int

## [1] -6.518119e-13

diff_in_diff_int

## [1] 0.07635548

diff_in_diff_b_a

## [1] 0.07635548

Under some assumptions (noted in the next answer), this difference in differences indicates the causal effect. The sign makes sense: we expect the effect of the program to be positive, and the estimate is positive.
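The two routes computed above give the same number because they are rearrangements of the same four group means. The notation below is introduced here for illustration: E-bar denotes the average years of education in a group of regions (high or low intensity) for a cohort group (before or after). The numeric values are the ones printed above.

```latex
\begin{aligned}
\widehat{\delta}_{DD}
 &= \left(\bar{E}_{high,after} - \bar{E}_{low,after}\right)
  - \left(\bar{E}_{high,before} - \bar{E}_{low,before}\right)
  \approx -1.18051 - (-1.256866) \approx 0.07636 \\
 &= \left(\bar{E}_{high,after} - \bar{E}_{high,before}\right)
  - \left(\bar{E}_{low,after} - \bar{E}_{low,before}\right)
  \approx 0.4620243 - 0.3856688 \approx 0.07636
\end{aligned}
```

The tiny discrepancy of about -6.5e-13 reported above is floating-point rounding, not a real difference.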

h

The main assumption is:
common trends - without the program, the change in years of education between the before and after cohorts would have been the same in the two groups of regions.
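The common trends assumption can be stated more formally. The notation is an assumption added for illustration: E^0 denotes the years of education a cohort would have had without the program.

```latex
E\left[E^{0}_{after} - E^{0}_{before} \mid \text{high intensity}\right]
=
E\left[E^{0}_{after} - E^{0}_{before} \mid \text{low intensity}\right]
```

Under this condition, the change observed in the low-intensity regions stands in for the change the high-intensity regions would have experienced without the program.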

i+j

summary(lm(education~ program_intensity + after + program_intensity*after, data=data1))

## 
## Call:
## lm(formula = education ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1184  -3.4759   0.5241   2.2673  10.5241 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  9.73272    0.02948 330.099  < 2e-16 ***
## program_intensity           -1.25687    0.04608 -27.277  < 2e-16 ***
## afterTRUE                    0.38567    0.05212   7.399 1.39e-13 ***
## program_intensity:afterTRUE  0.07636    0.08023   0.952    0.341    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.97 on 45620 degrees of freedom
## Multiple R-squared:  0.02493,    Adjusted R-squared:  0.02487 
## F-statistic: 388.8 on 3 and 45620 DF,  p-value: < 2.2e-16

The difference-in-differences estimate (0.076) is not significantly different from 0 (p = 0.341). To address this, we can add control variables or estimate a fixed-effects regression.
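The interaction coefficient in the regression above is the same quantity as the difference in differences of the four group means. A minimal sketch on made-up data shows the equivalence; the data frame toy and its column names are hypothetical and unrelated to the real sample.

```r
# Toy data (made-up values): 2 groups x 2 periods, 3 people per cell
toy <- data.frame(
  high      = rep(c(0, 0, 1, 1), each = 3),
  after     = rep(c(0, 1, 0, 1), each = 3),
  education = c(9, 10, 11,  10, 11, 12,  7, 8, 9,  9, 10, 11)
)

# Route 1: difference in differences of the four cell means
cell <- with(toy, tapply(education, list(high, after), mean))
did_means <- (cell["1", "1"] - cell["1", "0"]) -
             (cell["0", "1"] - cell["0", "0"])

# Route 2: coefficient on the interaction term in a saturated regression
fit <- lm(education ~ high * after, data = toy)
did_reg <- unname(coef(fit)["high:after"])

print(c(did_means = did_means, did_reg = did_reg))
```

Both routes give 1 here, up to floating-point error. The regression route is convenient because it also reports a standard error for the estimate.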

4

a

summary(lm(log_wage~ program_intensity + after + program_intensity*after, data=data1))

## 
## Call:
## lm(formula = log_wage ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.0687 -0.3606  0.0569  0.4001  4.2786 
## 
## Coefficients:
##                              Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)                 12.278991   0.004882 2515.116   <2e-16 ***
## program_intensity           -0.136504   0.007630  -17.891   <2e-16 ***
## afterTRUE                   -0.304179   0.008631  -35.243   <2e-16 ***
## program_intensity:afterTRUE  0.001172   0.013285    0.088     0.93    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6574 on 45620 degrees of freedom
## Multiple R-squared:  0.05498,    Adjusted R-squared:  0.05492 
## F-statistic: 884.8 on 3 and 45620 DF,  p-value: < 2.2e-16

The variable of interest is the interaction term, i.e. program_intensity*after.

b

summary(lm(education~ program_intensity + after + program_intensity*after, data=data1))

## 
## Call:
## lm(formula = education ~ program_intensity + after + program_intensity * 
##     after, data = data1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1184  -3.4759   0.5241   2.2673  10.5241 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  9.73272    0.02948 330.099  < 2e-16 ***
## program_intensity           -1.25687    0.04608 -27.277  < 2e-16 ***
## afterTRUE                    0.38567    0.05212   7.399 1.39e-13 ***
## program_intensity:afterTRUE  0.07636    0.08023   0.952    0.341    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.97 on 45620 degrees of freedom
## Multiple R-squared:  0.02493,    Adjusted R-squared:  0.02487 
## F-statistic: 388.8 on 3 and 45620 DF,  p-value: < 2.2e-16

c

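library(plm) # plm() for the within (fixed-effects) estimator

# Interaction: number of schools x post-program cohort indicator
data1$schols_after <- data1$num_schools * data1$after

fe_model <- plm(education ~ num_schools + after + schols_after,
                data = data1, model = "within", index = c("birth_region"))
summary(fe_model)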

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = education ~ num_schools + after + schols_after, 
##     data = data1, model = "within", index = c("birth_region"))
## 
## Unbalanced Panel: n = 290, T = 14-863, N = 45624
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -12.15798  -2.63126   0.36874   2.58778  11.58098 
## 
## Coefficients:
##              Estimate Std. Error t-value  Pr(>|t|)    
## afterTRUE    0.295037   0.079177  3.7263 0.0001946 ***
## schols_after 0.067900   0.034098  1.9913 0.0464558 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    638620
## Residual Sum of Squares: 636710
## R-Squared:      0.0029785
## Adj. R-Squared: -0.0034217
## F-statistic: 67.7116 on 2 and 45332 DF, p-value: < 2.22e-16

library(lmtest) # coeftest()

# Re-test the coefficients with standard errors clustered by birth_region
fe_model_clus <- coeftest(fe_model, vcov = vcovHC(fe_model, type = "HC1", cluster = "group"))
fe_model_clus

## 
## t test of coefficients:
## 
##              Estimate Std. Error t value Pr(>|t|)  
## afterTRUE    0.295037   0.120519  2.4481  0.01437 *
## schols_after 0.067900   0.041419  1.6393  0.10115  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

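The difference is that we now allow a different fixed effect for every region. The results are more significant than before: without clustering, the interaction term is significant at the 5% level (p = 0.046), and with clustered standard errors it is almost significant at the 10% level (p = 0.101).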
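To see what the region fixed effects do, the sketch below compares two equivalent estimators on made-up data in base R: a regression with one dummy per region, and a regression on variables demeaned within region. It also shows why num_schools has no coefficient in the output above: it is constant within a region, so the within transformation removes it. The data and the helper demean() are hypothetical and for illustration only; this is not how plm is implemented internally.

```r
set.seed(1)

# Toy data: 4 regions, 50 people each, alternating before/after cohorts
toy <- data.frame(
  region = rep(1:4, each = 50),
  after  = rep(c(0, 1), times = 100)
)
toy$num_schools   <- c(1, 2, 3, 4)[toy$region]   # fixed within region
toy$schools_after <- toy$num_schools * toy$after
toy$education <- 8 + toy$region + 0.3 * toy$after +
  0.1 * toy$schools_after + rnorm(200)

# (1) Dummy-variable regression: one intercept per region
lsdv <- lm(education ~ after + schools_after + factor(region), data = toy)

# (2) Within transformation: subtract the region mean from every variable
demean <- function(x, g) x - ave(x, g)
within_fit <- lm(demean(education, region) ~ 0 +
                   demean(after, region) +
                   demean(schools_after, region),
                 data = toy)

b_lsdv   <- unname(coef(lsdv)[c("after", "schools_after")])
b_within <- unname(coef(within_fit))
print(cbind(b_lsdv, b_within))

# num_schools does not vary within a region, so demeaning leaves only zeros
print(all(demean(toy$num_schools, toy$region) == 0))
```

The two coefficient columns agree, which is why the fixed-effects model identifies the effect only from variation within each region.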

d

library(dummies) # dummy(): one indicator column per birth year
dumm_year1<-dummy(data1$birth_year, sep = ".scools.")
dumm_year2<-dummy(data1$birth_year, sep = ".c71.")
dumm_year3<-dummy(data1$birth_year, sep=".")

first_inter<-(data1$num_schools)*dumm_year1
sec_inter<-(data1$children71)*dumm_year2

dumm_year1<-as.data.frame(dumm_year1)
b50<-dumm_year1$birth_year.scools.50

birth_region<-data1$birth_region

new_data<-cbind(data1,first_inter, sec_inter , dumm_year3)

model1<-plm(education~ first_inter+ sec_inter + dumm_year3 + num_schools + children71,data = new_data, model = "within", index = c("birth_region"))

summary(model1)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = education ~ first_inter + sec_inter + dumm_year3 + 
##     num_schools + children71, data = new_data, model = "within", 
##     index = c("birth_region"))
## 
## Unbalanced Panel: n = 290, T = 14-863, N = 45624
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -12.18439  -2.66707   0.27045   2.54882  11.95078 
## 
## Coefficients: (3 dropped because of singularities)
##                                                                 Estimate
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 -2.4350e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 -2.8615e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 -2.3187e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 -1.2166e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 -2.2176e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 -1.1929e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 -2.9329e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 -1.9670e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 -2.1750e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 -2.7758e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 -8.1495e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 -1.2765e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 -2.3863e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  2.5139e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 -5.3796e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  3.5967e-03
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 -4.5069e-02
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      -2.5460e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      -5.1982e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      -4.0203e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      -4.4413e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      -2.8453e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      -4.0534e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      -4.2513e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      -2.8096e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      -5.0351e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      -4.7916e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      -2.7987e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      -4.6337e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      -3.4431e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      -1.2624e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      -1.5904e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      -1.7440e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      -1.5863e-06
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          4.0391e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          1.3251e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          7.6005e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          8.3522e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          8.5583e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          2.7077e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          1.0614e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          6.3223e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          9.7913e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          1.2362e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          1.8237e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          1.1585e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          1.4090e+00
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          7.4368e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          7.6188e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          2.4310e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          5.4013e-01
##                                                               Std. Error
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50  1.1284e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51  1.1754e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52  1.1197e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53  1.0718e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54  1.0698e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55  1.0544e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56  1.0759e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57  1.0553e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58  1.0094e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59  9.9750e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60  9.5864e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61  1.1269e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62  1.0036e-01
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  9.4736e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69  9.8607e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  9.3857e-02
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71  1.0089e-01
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50       1.0945e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51       1.1985e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52       1.0945e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53       1.0835e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54       1.0808e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55       1.0275e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56       1.0894e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57       1.0633e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58       1.0242e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59       1.0319e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60       9.6296e-07
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61       1.0661e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62       1.0228e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68       1.0051e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69       1.0096e-06
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70       9.7628e-07
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71       1.0456e-06
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          3.6274e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          3.8705e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          3.6216e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          3.5344e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          3.5044e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          3.4072e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          3.5343e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          3.4464e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          3.3125e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          3.3249e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          3.1570e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          3.5712e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          3.2771e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          3.2329e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          3.3086e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          3.2013e-01
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          3.4066e-01
##                                                              t-value
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 -2.1578
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 -2.4346
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 -2.0709
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 -1.1350
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 -2.0729
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 -1.1314
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 -2.7260
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 -1.8640
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 -2.1548
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 -2.7828
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 -0.8501
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 -1.1328
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 -2.3778
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68  0.2654
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 -0.5456
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70  0.0383
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 -0.4467
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      -2.3262
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      -4.3374
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      -3.6731
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      -4.0989
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      -2.6325
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      -3.9450
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      -3.9022
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      -2.6422
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      -4.9163
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      -4.6435
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      -2.9063
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      -4.3464
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      -3.3664
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      -1.2560
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      -1.5754
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      -1.7864
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      -1.5172
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50          1.1135
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51          3.4237
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52          2.0987
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53          2.3631
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54          2.4422
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55          0.7947
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56          3.0032
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57          1.8345
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58          2.9558
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59          3.7180
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60          0.5777
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61          3.2441
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62          4.2996
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68          2.3003
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69          2.3027
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70          0.7594
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71          1.5855
##                                                               Pr(>|t|)    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.50 0.0309474 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.51 0.0149131 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.52 0.0383708 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.53 0.2563741    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.54 0.0381917 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.55 0.2578829    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.56 0.0064131 ** 
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.57 0.0623279 .  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.58 0.0311828 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.59 0.0053915 ** 
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.60 0.3952713    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.61 0.2573245    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.62 0.0174194 *  
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.68 0.7907383    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.69 0.5853707    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.70 0.9694320    
## first_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.scools.71 0.6550917    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.50      0.0200147 *  
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.51      1.445e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.52      0.0002399 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.53      4.158e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.54      0.0084780 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.55      7.993e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.56      9.544e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.57      0.0082386 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.58      8.852e-07 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.59      3.435e-06 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.60      0.0036587 ** 
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.61      1.387e-05 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.62      0.0007621 ***
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.68      0.2091103    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.69      0.1151826    
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.70      0.0740470 .  
## sec_interC:/Users/dorgo/Documents/R/ecob_2_2.Rmd.c71.71      0.1292217    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.50         0.2654986    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.51         0.0006184 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.52         0.0358521 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.53         0.0181264 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.54         0.0146029 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.55         0.4267816    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.56         0.0026728 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.57         0.0665874 .  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.58         0.0031198 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.59         0.0002010 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.60         0.5634869    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.61         0.0011790 ** 
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.62         1.715e-05 ***
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.68         0.0214332 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.69         0.0212989 *  
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.70         0.4476353    
## dumm_year3C:/Users/dorgo/Documents/R/ecob_2_2.Rmd.71         0.1128519    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    638620
## Residual Sum of Squares: 632480
## R-Squared:      0.0096011
## Adj. R-Squared: 0.0021649
## F-statistic: 8.60751 on 51 and 45283 DF, p-value: < 2.22e-16

e

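# Keep only the first set of interaction coefficients (names starting with "first")
coeff1<-as.data.frame(model1$coefficients)
coeff1<-as.data.frame(t(coeff1))
coeff1 %<>% select( starts_with("first"))
coeff1<-as.data.frame(t(coeff1))

# Strip the variable prefix so that only the birth year remains
row_n<-row.names(coeff1)
row_n<-gsub("first_interbirth_year.scools.", "", row_n)
graph<-cbind(coeff1, row_n)
colnames(graph) <- c("coefficient", "cohort")

# Plot each interaction coefficient against its birth cohort
ggplot(data=graph) +
  geom_point(aes(x=cohort, y=coefficient))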

The results are as we expect: the interaction coefficients differ significantly and positively between the birth cohorts who benefited from the program and those who did not. For the missing cohorts, we expect the interaction coefficients to lie between those of the cohorts who did not benefit from the program and those of the cohorts who did.

f

Because none of the cohorts born in 62 or earlier benefited from the program, we don’t need a separate interaction variable for each of them; we can use one interaction variable for all of them (before).
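A minimal sketch of this pooling on simulated data. All variable names, sample sizes and effect sizes below are invented for illustration and are not taken from the Indonesian data; the point is only that a single `before` interaction replaces the thirteen cohort-specific ones.

```r
set.seed(1)
n <- 20000
# Hypothetical data: cohorts 50-62 are "before", cohorts 68-72 are "after"
birth_year  <- sample(c(50:62, 68:72), n, replace = TRUE)
num_schools <- runif(n, 1, 3)
before      <- as.numeric(birth_year <= 62)

# Assumed data-generating process: each school adds 0.2 years of schooling,
# but only for the cohorts exposed to the program (before == 0)
schooling <- 8 + 0.2 * num_schools * (1 - before) + rnorm(n)

# One interaction term for all unexposed cohorts instead of one per cohort
pooled <- lm(schooling ~ factor(birth_year) + num_schools + num_schools:before)

# Slope of schooling in num_schools for unexposed cohorts, relative to exposed ones
coef(pooled)["num_schools:before"]
```

In this simulated setup the interaction coefficient should come out close to -0.2, the negative of the assumed program effect, because the exposed cohorts are the reference group.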

g

Long gap between the control and treatment groups - because of the long gap (6 years), something else may have happened in those years that was correlated with the regions that received intense treatment, and this would bias the results.
The effect may also be a result of externalities of the program rather than of the schools that were built.

Q2

1

First, we define the parameters. Two issues have to be addressed at this stage:

Set seed - the “set seed” function works differently in R and Stata, and there is no reasonable way to imitate Stata’s function in R.
We treat X as deterministic, hence we generate it only once.

Next, we define a function that runs the Monte Carlo procedure described in the problem set. We then apply it and turn to the specific questions.

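# `i` only absorbs the replication index that lapply() passes as the first
# argument; without it the index would be matched to `reg` or `beta`.
# alpha, gamma and x_i are read from the global environment.
monte_carlo <- function(i = NULL, reg = "reg_2", beta = 1, sample = "none"){
  u_i  <- rnorm(50,0,0.25)
  Ti_i <- rbinom(50,1,0.5)
  Y_i  <- alpha + beta*x_i + gamma*Ti_i + u_i
  data <- cbind(Y_i,x_i,Ti_i) %>% as.data.frame()

  if(sample == "none"){
    if(reg == "reg_1"){
      lm(Y_i~x_i+Ti_i,data = data)   # regression (i): controls for x_i
    } else {
      lm(Y_i~Ti_i,data = data)       # regression (ii): treatment only
    }
  } else if(sample == "general"){
    # 12 randomly chosen individuals are not observed
    temp_sample <- sample(c(1:50), replace = FALSE, size = 12)
    data <- data[-temp_sample,]
    lm(Y_i~Ti_i,data = data)
  } else if(sample == "low"){
    # control observations below the control group's first quartile are not observed
    control <- subset(data,Ti_i == 0)
    data %<>% subset(!(Ti_i==0 & Y_i<quantile(control$Y_i, probs = 0.25)))
    lm(Y_i~Ti_i,data = data)
  } else if(sample == "defiers"){
    # a random half of the treated with low outcomes have their outcome lowered by 0.25
    control <- subset(data,Ti_i == 0)
    threshold <- quantile(control$Y_i,probs = 0.25)+0.25
    sub_group <- data[data$Y_i<threshold & data$Ti_i == 1,]
    # pick rows by name with base R, so the result does not depend on
    # whether sample_n() keeps the original row names
    picked <- sample(row.names(sub_group), floor(0.5*nrow(sub_group)))
    data[picked,"Y_i"] <- data[picked,"Y_i"]-0.25
    lm(Y_i~Ti_i,data = data)
  }
}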

full_reg   <-   lapply(1:200,monte_carlo, reg = "reg_1",
                       sample = "none")
unfull_reg <-   lapply(1:200,monte_carlo,
                       sample = "none")
random_sam <-   lapply(1:200,monte_carlo, beta = 0,
                       sample = "general")
censored   <-   lapply(1:200,monte_carlo, beta = 0,
                   sample = "low")
def_sample <-   lapply(1:200,monte_carlo, beta = 0,
                     sample = "defiers")

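# Collect coefficients and t values of all replications in one data frame.
# `col` is the position of the intercept's t value, which is dropped; the
# t value of Ti_i ends up in the column Ti_i.1 that is used below.
extracting <- function(reg, col = 3){
  
  a <- lapply(reg, function(x)
    summary(x)[["coefficients"]][, "t value"]) %>% 
    as.data.frame() %>% t() %>% as.data.frame()
  reg %<>% lapply(function(x) coef(x)) %>% 
    as.data.frame() %>% t() %>% as.data.frame() 
  reg %<>%  cbind(a)
  rownames(reg) <- 1:nrow(reg)
  reg <- reg[,-col]
}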
RMSE <- function(outcome,gamma){
  average_gamma <- outcome[["Ti_i"]] %>% mean()
  bias <- average_gamma-gamma
  # Note: despite its name, `variance` is the standard deviation of the estimates
  variance <- sqrt(sum((outcome[["Ti_i"]]-average_gamma)^2)/
                     nrow(outcome))
  # RMSE = sqrt(bias^2 + mean squared deviation from the average estimate)
  RMSE <- sqrt((sum((bias^2) + 
                      ((outcome[["Ti_i"]]-average_gamma)^2)))/
                 nrow(outcome))
  a <- cbind(RMSE,variance,bias,average_gamma)
  return(a)
}

full_reg <- extracting(full_reg, col = 4)
unfull_reg <- extracting(unfull_reg)
censored <- extracting(censored)
def_sample <- extracting(def_sample)
random_sam <- extracting(random_sam)

a)

We expect both estimators to be consistent, since the treated are randomly selected and X is not correlated with the treatment. The difference will be in the variance - we expect the variance of (ii) to be higher and consequently its rejection rate to be lower.
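A rough sketch of why, assuming that \(x_i\) is uncorrelated with \(T_i\) in the sample. Here \(N\) is the sample size, \(\bar T\) the share of treated, \(\sigma_u^2\) the error variance and \(\sigma_x^2\) the sample variance of \(x_i\); this notation is introduced here for illustration and is not from the problem set. In (ii) the omitted term \(\beta x_i\) becomes part of the error, so approximately

\[
\operatorname{Var}\big(\hat\gamma^{(i)}\big) \approx \frac{\sigma_u^2}{N\,\bar T(1-\bar T)},
\qquad
\operatorname{Var}\big(\hat\gamma^{(ii)}\big) \approx \frac{\sigma_u^2 + \beta^2\sigma_x^2}{N\,\bar T(1-\bar T)} .
\]

With \(N = 50\), \(\bar T \approx 0.5\) and \(\sigma_u = 0.25\), the values used in `monte_carlo`, the first expression gives a standard deviation of about \(0.25/\sqrt{12.5} \approx 0.071\).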

b)

The rejection rates in regressions (i) and (ii) are:

(sum(full_reg$Ti_i.1>1.96)/nrow(full_reg)) %>% percent()

## [1] "94.0%"

(sum(unfull_reg$Ti_i.1>1.96)/nrow(unfull_reg)) %>% percent()

## [1] "79.0%"

The RMSE, variance, bias and average estimate of gamma for (i) and (ii) are given in the following tables, respectively:

full_reg_RMSE <- RMSE(full_reg,gamma = 0.25)
unfull_reg_RMSE <- RMSE(unfull_reg,gamma = 0.25)
full_reg_RMSE%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

RMSE	variance	bias	average_gamma
0.0695974	0.0695905	0.0009833	0.2509833

unfull_reg_RMSE%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

RMSE	variance	bias	average_gamma
0.0928201	0.0926893	0.0049265	0.2549265

As we can see, the variance of (ii) is higher and its rejection rate is lower.
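As a quick consistency check on the two tables above, the reported RMSE can be recomputed from the other columns. The sketch below only copies the numbers from those tables; it relies on the fact that the `variance` column is computed as a standard deviation in `RMSE()`.

```r
# Values copied from the tables for regressions (i) and (ii)
sd_i  <- 0.0695905; bias_i  <- 0.0009833
sd_ii <- 0.0926893; bias_ii <- 0.0049265

# RMSE^2 = bias^2 + variance, hence RMSE = sqrt(bias^2 + sd^2)
rmse_i  <- sqrt(bias_i^2  + sd_i^2)
rmse_ii <- sqrt(bias_ii^2 + sd_ii^2)

c(rmse_i = rmse_i, rmse_ii = rmse_ii)
```

Both values agree with the RMSE column up to rounding, and in both regressions almost all of the RMSE comes from the spread of the estimates rather than from the bias.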

2

a)

If a random subset of individuals is not observed, the estimator remains unbiased, because the missing observations are unrelated to both the treatment and the outcome. We expect the variance to be higher and the rejection rate to be lower, since the sample is smaller. The rejection rate and the RMSE in this case are:

(sum(random_sam$Ti_i.1>1.96)/nrow(random_sam)) %>% percent()

## [1] "86.0%"

random_sam_RMSE <- RMSE(random_sam,gamma = 0.25)
random_sam_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

RMSE	variance	bias	average_gamma
0.0830386	0.0830071	-0.0022881	0.2477119

b)

We expect the estimator to be biased downwards, since the bottom quartile of the control group does not appear in the data and the income of the control group therefore appears to be higher than it truly is. Hence, the effect of the treatment appears to be lower than the real effect.

#(sum(censored$Ti_i.1>1.96)/nrow(censored)) %>% percent()
censored_RMSE <- RMSE(censored,gamma = 0.25)
censored_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

RMSE	variance	bias	average_gamma
0.1335273	0.078478	-0.1080311	0.1419689

We see that the estimator is, indeed, biased downwards.
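The size of this bias can be checked with a stripped-down simulation. This is a sketch under simplifying assumptions that are not in the problem set (intercept of zero, no X, 2000 replications, an arbitrary seed); it compares the simulated bias with the shift in the mean of a normal variable truncated at its first quartile.

```r
set.seed(123)
gamma_true <- 0.25   # the value passed as gamma to RMSE() above
sigma_u    <- 0.25   # error standard deviation used in monte_carlo

one_draw <- function() {
  t_i <- rbinom(50, 1, 0.5)
  y_i <- gamma_true * t_i + rnorm(50, 0, sigma_u)   # assumed: alpha = 0, beta = 0
  # control observations below the control group's first quartile are dropped
  cutoff <- quantile(y_i[t_i == 0], probs = 0.25)
  keep   <- !(t_i == 0 & y_i < cutoff)
  # difference in means, which equals the OLS coefficient on t_i
  mean(y_i[keep & t_i == 1]) - mean(y_i[keep & t_i == 0])
}

estimates <- replicate(2000, one_draw())
simulated_bias <- mean(estimates) - gamma_true

# Truncated-normal benchmark: the control mean rises by sigma * phi(z) / (1 - Phi(z))
z <- qnorm(0.25)
analytic_shift <- sigma_u * dnorm(z) / (1 - pnorm(z))

c(simulated_bias = simulated_bias, analytic_bias = -analytic_shift)
```

The benchmark is roughly -0.106, which is in line with the bias of about -0.108 reported in the table above.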

3

a)

The question is ambiguous; we interpret it to mean that the researchers do not observe which individuals dropped out of the program. We therefore expect the estimator to be biased downwards, because low-income individuals are attributed to the program although they did not participate in it:

#(sum(def_sample$Ti_i.1>1.96)/nrow(def_sample)) %>% percent()
def_sample_RMSE <- RMSE(def_sample,gamma = 0.25)
def_sample_RMSE %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover"))

RMSE	variance	bias	average_gamma
0.0907585	0.0850131	-0.0317783	0.2182217

b)

The meaning of the coefficient is unclear. It combines the effect of the treatment with an offsetting effect: the lower income of those who are mistakenly counted as treated.

Q3

1

SUTVA may be violated because of externalities: the people who were assigned to the treatment - the treatment group - influence the people who weren’t - the control group. The channel through which the influence flows is the labor market. Under diminishing returns to scale the labor market cannot absorb all of the job-seekers, so a job-seeker who finds a job reduces the odds that another job-seeker finds one. A job program will mostly (and in extreme cases exclusively) cause a “rat race” - a situation in which the program helps the enrollees find a stable job at the expense of those who weren’t in the program. If this is correct, SUTVA fails. The resulting bias makes the effect of the program look bigger than it really is; the real effect may be negligible, because what we measure is the externality rather than the program itself.

2

The randomization procedure is divided into two parts:

First, the researchers randomize the share of job-seekers entitled to participate in the program in each district. Each district randomly gets a share of 0%, 25%, 50%, 75% or 100% entitled.
Second, within each district the individuals who get the entitlement are randomly selected. Participation is not mandatory. The researchers claim that each district’s labor market is distinct from those of other districts (except for one). The point of this experimental design is to deal with the claims about externalities - the first stage randomizes the externalities, so that they can then be measured.
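The two stages can be sketched in a few lines of R. The number of districts and of job-seekers per district are hypothetical round numbers chosen for illustration, not the figures of the actual experiment.

```r
set.seed(42)
n_districts <- 20     # hypothetical number of districts
n_per_dist  <- 100    # hypothetical number of eligible job-seekers per district
shares      <- c(0, 0.25, 0.5, 0.75, 1)

# Stage 1: each district is randomly given one of the five assignment shares
district_share <- sample(rep(shares, length.out = n_districts))

# Stage 2: within each district, that share of job-seekers is assigned at random
design <- do.call(rbind, lapply(seq_len(n_districts), function(d) {
  n_assigned <- round(district_share[d] * n_per_dist)
  assigned   <- sample(c(rep(1, n_assigned), rep(0, n_per_dist - n_assigned)))
  data.frame(district = d, share = district_share[d], assigned = assigned)
}))

# Realized assignment rate for each district share
by_share <- aggregate(assigned ~ share, data = design, FUN = mean)
by_share
```

Because the share itself is random across districts, comparing non-assigned job-seekers in districts with different shares isolates the spillover from the program.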

3

In equation (6) we have a coefficient on each assignment level and an interaction between individual assignment and each level. This lets us compare the non-treated across different assignment levels and so obtain the influence of the treatment on the control group, which is the externality. The interpretation differs between the interaction coefficients and the indicator coefficients: the interaction coefficients measure the effect of the treatment on the treated, while the indicator coefficients measure the effect of the treatment on the untreated - the externalities. The hypothesis is that the indicator coefficients are zero (each of them in (6), all of them combined in (7)). The meaning of the hypothesis is that there are no externalities.
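A regression in the spirit of equation (6) can be sketched on simulated data. The variable names (`z`, `p25`, ...), the sample size and the effect sizes are all assumptions made for illustration; the exact specification and controls in the paper may differ.

```r
set.seed(7)
n_per <- 4000
share <- rep(c(0, 0.25, 0.5, 0.75, 1), each = n_per)   # hypothetical pooled district groups
z     <- rbinom(length(share), 1, share)                # individual assignment

# Assumed outcome: assignment raises y by 0.10, while the non-assigned
# in districts with a positive share lose 0.05 (the externality)
y <- 0.5 + 0.10 * z - 0.05 * (share > 0) * (1 - z) + rnorm(length(share), 0, 0.3)

d <- data.frame(y, z,
                p25  = as.numeric(share == 0.25),
                p50  = as.numeric(share == 0.5),
                p75  = as.numeric(share == 0.75),
                p100 = as.numeric(share == 1))

# Indicator terms (p25, p50, p75): effect on the non-assigned, i.e. the externality
# Interaction terms (z:p..): assigned versus non-assigned at the same share
unrestricted <- lm(y ~ z:p25 + z:p50 + z:p75 + z:p100 + p25 + p50 + p75, data = d)
restricted   <- lm(y ~ z:p25 + z:p50 + z:p75 + z:p100, data = d)

# Joint F-test of "no externalities": all indicator coefficients equal zero
f_test <- anova(restricted, unrestricted)
f_test
```

In this simulated setup the indicator coefficients should be close to -0.05 and the joint test should reject, because an externality was built into the data.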

4

The data come from three different sources:

Data from ANPE - the French employment agency - which include, for every unemployed person, age, postal address, number of months of unemployment and so on.
Data from the counseling firms - these tell the researchers who participated in the program.
Follow-up surveys - the surveys were conducted several times, and the most important one took place 8 months after the job-seeker found a job. The surveys gave the researchers actual data on current employment.

5

The only group in which we can clearly see externalities is men in districts with 25% assigned job-seekers. We cannot see other externalities, and moreover, we do not see a difference between the 25%, 50% and 75% districts. Since the test in equation (7) appears to be too weak, the researchers change the test for externalities so that it compares the non-treated districts (with 0% assigned) with the treated districts (with 25%, 50% and 75% assigned). In this test the externalities become clear.

6

\(\kappa\) represents the percentage of eligible individuals with abilities in the same field; for example, 30% of the economists are under 30, etc.

\(\sigma\) represents the percentage of job-seekers who were assigned to the program among those who were eligible.

\(\pi\) represents the percentage of workers who were assigned to the program among all the eligible job-seekers in a specific field.

Generally, we expect that a higher \(\pi\) will yield higher externalities, because the impact of a filled vacancy falls directly on competitors. Since we are assuming that \(\sigma\) is random, all the changes in \(\pi\) are channeled through \(\kappa\). Hence we see in the figure that the externalities rise with \(\kappa\) and \(\sigma\).

ecob_2

Matan Kolerman & Dor Goldenberg

May 1, 2019
