Readings

Krueger 1999

3.) What is the identification strategy?

  1. Krueger analyzes data from Tennessee's Project STAR, a randomized controlled trial in which students in participating schools were randomly assigned to one of three class types: small (13-17 students), regular (22-25 students), and regular with an aide (22-25 students). No baseline data on students' performance were available, so balance on the outcome variable could not be checked.
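A small simulation can illustrate why random assignment identifies the class-size effect even without baseline scores. This is only a sketch with made-up numbers, not the STAR data: the sample size, the `ability` variable, the score equation, and the assumed effect of 5 points are all hypothetical.

```r
# Hypothetical illustration: random assignment balances unobserved ability,
# so a simple difference in means recovers the (assumed) class-size effect.
set.seed(1)
n <- 6000
ability <- rnorm(n)                                   # unobserved by the researcher
class_type <- sample(c("small", "regular", "aide"), n, replace = TRUE)

true_effect <- 5                                      # assumed effect of a small class
score <- 50 + true_effect * (class_type == "small") +
  10 * ability + rnorm(n, sd = 5)

# Balance check on the unobservable (only possible inside a simulation)
balance_gap <- mean(ability[class_type == "small"]) -
  mean(ability[class_type == "regular"])

# Difference in mean scores: small vs. regular classes
effect_hat <- mean(score[class_type == "small"]) -
  mean(score[class_type == "regular"])

print(c(balance_gap = balance_gap, effect_hat = effect_hat))
```

Under these assumptions the ability gap between arms should be close to zero and the difference in means close to the assumed effect. In real data the ability gap cannot be checked directly, which is why missing baseline scores matter.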

4.) What are the assumptions/threats to this identification strategy?

  1. Differential attrition: students move in and out of treatment. A student who enters a small class may previously have been in a large class at a different school, and vice versa, which creates a bias of unknown direction.

Ashenfelter and Krueger 1994

3.) What is the identification strategy?

  1. The authors use identical twins to control for unobservables: each twin should be a strong counterfactual for the other, so comparing wages and schooling within a pair removes whatever the twins share.
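The within-pair logic can be written out as a short derivation. The notation here is an assumption made for illustration rather than the paper's own: $y_{ji}$ is the log wage of twin $j$ in family $i$, $S_{ji}$ is years of schooling, $\mu_i$ collects unobserved factors shared by both twins, and $\varepsilon_{ji}$ is an idiosyncratic error.

```latex
\begin{align*}
y_{1i} &= \alpha + \beta S_{1i} + \mu_i + \varepsilon_{1i} \\
y_{2i} &= \alpha + \beta S_{2i} + \mu_i + \varepsilon_{2i} \\
y_{1i} - y_{2i} &= \beta \,(S_{1i} - S_{2i}) + (\varepsilon_{1i} - \varepsilon_{2i})
\end{align*}
```

Subtracting the second equation from the first removes both $\alpha$ and $\mu_i$, so a regression of the wage difference on the schooling difference estimates $\beta$ without bias from the shared unobservables, provided the remaining error difference is uncorrelated with the schooling difference.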

4.) What are the assumptions/threats to this identification strategy?

  1. Measurement error in reported schooling, which differencing within twin pairs tends to amplify. The authors also claim to find no evidence that unobserved ability is positively related to the level of schooling completed; however, this seems unlikely.

  2. Another potential issue is sampling. The authors note that their sample contains more women and whites than the CPS sample, so it may not be representative. Furthermore, identical twins are not a perfect counterfactual for one another, which introduces biases of unknown direction.
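The measurement-error threat can be made concrete with a simulation. This is a hedged sketch, not the paper's data or method: all variances, the true return of 0.10, and the variable names are assumptions, and ability bias is deliberately switched off so that only classical measurement error in reported schooling is at work.

```r
# Hypothetical illustration: classical measurement error in schooling attenuates
# the estimated return, and more so after within-twin differencing.
set.seed(42)
n_pairs <- 20000
beta <- 0.10                                  # assumed true return to schooling

family <- rnorm(n_pairs, sd = 2)              # schooling component shared by twins
s1 <- 12 + family + rnorm(n_pairs)            # true schooling, twin 1
s2 <- 12 + family + rnorm(n_pairs)            # true schooling, twin 2

r1 <- s1 + rnorm(n_pairs, sd = 0.7)           # reported schooling with error
r2 <- s2 + rnorm(n_pairs, sd = 0.7)

w1 <- beta * s1 + rnorm(n_pairs, sd = 0.5)    # log wages (no ability term here)
w2 <- beta * s2 + rnorm(n_pairs, sd = 0.5)

# Pooled levels regression on reported schooling
wage_all <- c(w1, w2)
report_all <- c(r1, r2)
b_levels <- unname(coef(lm(wage_all ~ report_all))[2])

# Within-pair first-difference regression on reported schooling
d_wage <- w1 - w2
d_report <- r1 - r2
b_fd <- unname(coef(lm(d_wage ~ d_report))[2])

print(c(true = beta, levels = b_levels, first_diff = b_fd))
```

With these assumed variances the classical attenuation factor is about 0.91 in levels and about 0.67 in differences, because differencing removes most of the true variation in schooling while keeping all of the reporting noise.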

Replication

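library(stargazer)  # provides stargazer() for the regression tables

# Load the twins data (one row per twin pair) and name the columns
twins <- read.csv('./AshenfelterKrueger1994_twins.csv')
colnames(twins) <- c('famid', 'age', 'educ.T1', 'educ.T2', 'lwage.T1', 'lwage.T2',
                     'male.T1', 'male.T2', 'white.T1', 'white.T2')

# Reshape to long format: one row per individual twin
long <- reshape(twins, direction = 'long',
                varying = c('educ.T1', 'lwage.T1', 'male.T1', 'white.T1',
                            'educ.T2', 'lwage.T2', 'male.T2', 'white.T2'),
                timevar = 'twin',
                times = c('T1', 'T2'),
                v.names = c('educ', 'lwage', 'male', 'white'),
                idvar = c('famid', 'age'))
long <- long[order(long$famid), ]

# Within-pair differences in education and log wage
twins$D.Ed <- twins$educ.T1 - twins$educ.T2
twins$D.lw <- twins$lwage.T1 - twins$lwage.T2

# First-difference regression of the wage gap on the education gap
FD2 <- lm(D.lw ~ D.Ed, data = twins)
stargazer(FD2, type = 'html')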
                     Dependent variable: D.lw
D.Ed                 0.092***
                     (0.024)
Constant             -0.079*
                     (0.045)
Observations         149
R2                   0.092
Adjusted R2          0.086
Residual Std. Error  0.554 (df = 147)
F Statistic          14.914*** (df = 1; 147)
Note: *p<0.1; **p<0.05; ***p<0.01

The coefficient on education, B = 0.092, should be interpreted as follows: one more year of education leads to roughly a 9.2% increase in wages, holding constant everything that is shared within a twin pair.
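Because the dependent variable is a log wage, reading the coefficient directly as a percentage is an approximation. A minimal sketch of the exact conversion follows; the helper name `log_coef_to_pct` is made up for illustration.

```r
# Convert a coefficient from a log-wage regression into an exact percent change.
# log_coef_to_pct is a hypothetical helper, not part of any package.
log_coef_to_pct <- function(b) 100 * (exp(b) - 1)

pct_fd <- log_coef_to_pct(0.092)   # first-difference estimate from the table above
print(round(pct_fd, 2))
```

The exact figure is roughly 9.6%, slightly above the 9.2% approximation. The gap widens as coefficients get larger.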

long$age2<-long$age^2/100
Table3.1<-lm(lwage~educ+age+age2+male+white, data=long)
stargazer(Table3.1,type='html')
                     Dependent variable: lwage
educ                 0.084***
                     (0.014)
age                  0.088***
                     (0.019)
age2                 -0.087***
                     (0.023)
male                 0.204***
                     (0.063)
white                -0.410***
                     (0.127)
Constant             -0.471
                     (0.426)
Observations         298
R2                   0.272
Adjusted R2          0.260
Residual Std. Error  0.532 (df = 292)
F Statistic          21.860*** (df = 5; 292)
Note: *p<0.1; **p<0.05; ***p<0.01

The coefficient on the education variable should be interpreted as follows: one additional year of education is associated with a wage that is roughly 8.4% higher, holding the other covariates constant. The coefficients on the controls should not be interpreted closely: they are included to absorb other determinants of wages, and they may also pick up omitted factors such as innate ability, social skills, and social norms.