plot(PS4DATA$exper, PS4DATA$lfp, xlab="LFP (1=in LF, 0 otherwise)", ylab="Labour market experience")
I cannot tell whether there is a relationship between LFP and experience form this graph, because it is difficult to access the counts of participating women for each bin of experience.
If labour market participation were a continuous variable (say a probability observed between [0,1]) I might estimate the regression: \[y_i = \alpha + \beta_1 x_i + \epsilon_i,\] where \(y\) is LFP, \(\alpha\) and \(\beta_1\) constants, \(x\) labour market experience, \(\epsilon\) an error term, and \(i\) the observation’s index. We would run
fit1 <- lm(lfp ~ exper, PS4DATA) # note, I am not using heteroskedasticity-corrected errors
fit1_robust <- lm_robust(lfp ~ exper, PS4DATA, se_type="HC1") # I save this for later use
se_fit1 <- sqrt(diag(vcovHC.default(fit1, type = "HC1")))
stargazer(fit1, se = list(se_fit1, NULL), type = "text")
##
## ===============================================
## Dependent variable:
## ---------------------------
## lfp
## -----------------------------------------------
## exper 0.021***
## (0.002)
##
## Constant 0.345***
## (0.029)
##
## -----------------------------------------------
## Observations 753
## R2 0.117
## Adjusted R2 0.116
## Residual Std. Error 0.466 (df = 751)
## F Statistic 99.795*** (df = 1; 751)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
Where the coefficient .345 is the significant increase in the probability of LFP for a unit increase labour market experience. The intercept is the mean probability of LFP for individuals with no labour market experience.
fit1_45 <- predict(fit1_robust, newdata = list("exper" = c(45)))
fit1_45
## 1
## 1.291397
A woman with 45 years of work experience is associated a probability greater than 1 of participating in the work force. This is absurd, and results from the fact that the linear probability model(LPM) is unbounded. The plot below makes this clear. We can interpret this as by bounding all computed LPM to [0,1], such that a woman with 45 years of labour market experience has a probability 1 of LFP.
fit1_robust.stats <- summary(fit1_robust)
plot(PS4DATA$exper, PS4DATA$lfp, xlab="LFP (1=in LF, 0 otherwise)", ylab="Labour market experience")
abline(a=fit1_robust.stats$coefficients[1,1],
b=fit1_robust.stats$coefficients[2,1],
col="red", lwd=2)
legend(x="bottomright", legend="linear probability model", col="red", lty=1, lwd=2)
fit2 <- glm(lfp~exper,binomial(link = "probit"),data=PS4DATA)
fit2_45 <- predict(fit2, newdata = list("exper" = c(45)), type = "response")
fit2_45
## 1
## 0.9886487
sort.exper <- sort(PS4DATA$exper)
fit2_preds <- predict(fit2, newdata = list("exper" = sort.exper), type = "response")
plot(PS4DATA$exper, PS4DATA$lfp, xlab="LFP (1=in LF, 0 otherwise)", ylab="Labour market experience", xlim = c(-.25,50), ylim=c(-.25,1.5), pch=1)
abline(a=fit1_robust.stats$coefficients[1,1],
b=fit1_robust.stats$coefficients[2,1],
col="red", lwd=2)
lines(sort.exper, fit2_preds, col="cornflowerblue", lwd=2)
points(x=c(45, 45),y=c(fit1_45, fit2_45), pch="+", col=c("red", "blue"))
text(x=c(45-5, 45-5),y=c(fit1_45+.1, fit2_45-.1), c("LPM|Exper=45","probit|Exper=45"))
legend(x="bottomright", legend=c("LPM", "probit model"), col=c("red", "cornflowerblue"), lty=c(1,1), lwd=c(2,2))
fit3 <- lm(lfp~youngkids, data=PS4DATA)
se_fit3 <- sqrt(diag(vcovHC.default(fit3, type = "HC1")))
stargazer(fit3, se = list(se_fit3, NULL), type = "text")
##
## ===============================================
## Dependent variable:
## ---------------------------
## lfp
## -----------------------------------------------
## youngkids -0.258***
## (0.044)
##
## Constant 0.619***
## (0.020)
##
## -----------------------------------------------
## Observations 753
## R2 0.043
## Adjusted R2 0.041
## Residual Std. Error 0.485 (df = 751)
## F Statistic 33.512*** (df = 1; 751)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
The mean uniform proportion of labour force participants for the group of females with young children (youngkids=1) is significantly 25.8 percentage points lower than the group of females without.
In the Becker model, parents choose to have less children as the opportunity cost of child bearing increases. If discrimination lowers the labour-market prospects of subgroups, then discriminated subgroups of the population are both more likely to participate less in the labour market and have a lower opportunity cost of – and thus higher incentives to – bearing young children.
The OV would be discrimiation (D). LFP would be a negative function of D. Youngkids would be a positive function of D.
fit4 <- lm(lfp~youngkids + educ, data=PS4DATA)
se_fit4 <- sqrt(diag(vcovHC.default(fit4, type = "HC1")))
stargazer(fit4, se = list(se_fit4, NULL), type = "text")
##
## ===============================================
## Dependent variable:
## ---------------------------
## lfp
## -----------------------------------------------
## youngkids -0.281***
## (0.043)
##
## educ 0.045***
## (0.007)
##
## Constant 0.070
## (0.091)
##
## -----------------------------------------------
## Observations 753
## R2 0.085
## Adjusted R2 0.083
## Residual Std. Error 0.475 (df = 750)
## F Statistic 34.967*** (df = 2; 750)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
Raising young children decreases the mean probability of LFP for women by percentage points, all else equal.
fit4.preds <- predict(fit4, newdata=data.frame("youngkids"=c(1,1), "educ"=c(10,20)))
fit4.preds[2]-fit4.preds[1]
## 2
## 0.4502317
Holding youngkids=1 constant, increasing from 10 to 20 years of education increases the probability of LFP by 45.02 percentage points in the 2c) LPM.
fit5 <- glm(lfp~educ + youngkids, family=binomial("probit"), data=PS4DATA)
se_fit5 <- sqrt(diag(vcovHC.default(fit5, type = "HC1")))
stargazer(fit5, se = list(se_fit5, NULL), type = "text",
column.labels = c("robust se", "normal se"))
##
## =============================================
## Dependent variable:
## ---------------------------
## lfp
## robust se
## ---------------------------------------------
## educ 0.125***
## (0.021)
##
## youngkids -0.748***
## (0.120)
##
## Constant -1.205***
## (0.262)
##
## ---------------------------------------------
## Observations 753
## Log Likelihood -481.589
## Akaike Inf. Crit. 969.179
## =============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
fit5.stats <- summary(fit5)
fit5.coefstats <- fit5.stats$coefficients
beta1 <- fit5.coefstats[3,1]
beta1.sd <- fit5.coefstats[3,2]
beta1.ci <- c(beta1-1.96*beta1.sd,
beta1+1.96*beta1.sd)
fit5.preds <- predict(fit5, newdata=data.frame("youngkids"=c(1,1), "educ"=c(10,20)), type="response")
The 95% CI for beta 1 is
## [1] -0.9846188 -0.5112666
and increasing years of educated from 10 to 20 results in a percentage point increase of LFP of
## [1] 46.55717
```