1. What is the research question that the authors are interested in estimating?

Ans. The authors are interested in estimating the causal impact of childbearing on female labor supply and the implications for family labor allocation. Specifically, they aim to understand how additional children, beyond the second child, affect the labor force participation of women.

2. What is the OLS regression specification/functional form/estimating equation to answer the research question above that a novice Econometrician would run?

Ans. A novice econometrician would run the following OLS regression equation to estimate the relationship between childbearing (X) and female labor supply (Y).

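\(Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\)

where,

  • \(Y_i\) represents a woman’s labor supply.
  • \(X_i\) represents childbearing.
  • \(\beta_1\) is the coefficient of interest, the estimated association between childbearing and labor supply.
  • \(\epsilon_i\) is the error term.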

3. Why would running the above naive OLS specification cause bias? Please explain?

Ans. Running the above OLS regression is likely to give a biased estimate because childbearing (X) is endogenous. Childbearing choices are not randomly assigned; they are correlated with unobserved factors that also affect labor supply, such as a mother’s preference for working, and these factors end up in the error term (omitted variable bias). Reverse causation is also possible, since a woman’s labor market prospects can influence how many children she has, and measurement error in X would add further distortion. In each case X is correlated with the error term, so the OLS estimate is biased and inconsistent.
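The omitted variable argument above can be illustrated with a small simulation. This is only a sketch on made-up data, not the authors’ data or code: the variable `taste` (an unobserved preference for working), the true effect of -0.10, and every other coefficient are assumptions chosen for illustration.

```python
import numpy as np

def simulate_naive_ols(n=100_000, true_effect=-0.10, seed=0):
    """Return the naive OLS slope when an unobserved taste for work
    drives both childbearing (x) and labor supply (y)."""
    rng = np.random.default_rng(seed)
    taste = rng.normal(size=n)  # unobserved preference for working (assumed)
    # Women with a stronger taste for work are less likely to have another child.
    x = (rng.normal(size=n) - 0.8 * taste > 0).astype(float)
    # Labor supply depends on childbearing AND on the unobserved taste.
    y = 0.5 + true_effect * x + 0.2 * taste + rng.normal(scale=0.3, size=n)
    # Naive OLS of y on a constant and x; taste is left in the error term.
    design = np.column_stack([np.ones(n), x])
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta_hat[1]

naive_slope = simulate_naive_ols()
print(f"true effect: -0.100, naive OLS slope: {naive_slope:.3f}")
```

Because `taste` lowers childbearing and raises labor supply in this setup, the naive slope is more negative than the true effect, which is the direction of bias one would expect if career-oriented women have fewer children.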

4. What instrument (Z) can you use to get around and get true causal effects?

Ans. To obtain consistent estimates of the causal effect, the authors use the sex composition of a woman’s first two children (Z), that is, whether the two children are of the same sex, as an instrument for childbearing beyond the second child (X). This instrument rests on the assumption that the sex composition of the first two children affects the probability of having a third child but does not directly influence labor supply.

4.1. Please give the first-stage and second-stage regression equations, and explain them. Try to convince the reader this is the better study design and should work.

Ans. The first-stage regression relates childbearing (X) to the instrument (Z) and other control variables. It estimates how the gender of the first two children influences the likelihood of having a third child.

\(X_i = \pi_0 + \pi_1 Z_i + \pi_2 W_i + \nu_i\)

where,

  • \(X_i\) represents childbearing beyond the second child.
  • \(Z_i\) is an indicator equal to 1 if the first two children are of the same sex (instrument).
  • \(W_i\) includes other control variables.
  • \(\nu_i\) is the first-stage error term.

The second-stage regression examines the impact of childbearing (X) on female labor supply (Y) after accounting for the endogeneity issue. It uses the predicted values of childbearing from the first stage and control variables.

\(Y_i = \gamma_0 + \gamma_1 \hat{X}_i + \gamma_2 W_i + \epsilon_i\)

where,

  • \(Y_i\) represents a woman’s labor force participation.
  • \(\hat{X}_i\) is the predicted value of childbearing from the first stage.
  • \(W_i\) includes control variables.
  • \(\epsilon_i\) is the error term.

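The first stage isolates the part of the variation in childbearing that is driven only by the sex composition of the first two children, and the second stage uses only that variation to estimate the effect on labor supply. This addresses the endogeneity problem, provided the instrument is both relevant and exogenous; it does not by itself guarantee valid results. The article concludes that its findings add reliable empirical support for the idea that childbearing reduces women’s labor force participation.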
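The two stages described above can be sketched by running two least-squares regressions in sequence. This is an illustration on simulated data, not the authors’ implementation: the helper names `ols` and `two_stage_least_squares`, the control `w`, the unobserved `taste`, and all coefficients (including the true effect of -0.10) are assumptions.

```python
import numpy as np

def ols(design, target):
    """Least-squares coefficients of target on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def two_stage_least_squares(y, x, z, w):
    """Manual 2SLS with one endogenous regressor x, one instrument z, one control w."""
    const = np.ones(len(y))
    # First stage: regress x on a constant, the instrument, and the control.
    first_design = np.column_stack([const, z, w])
    first_stage_coef = ols(first_design, x)
    x_hat = first_design @ first_stage_coef  # predicted childbearing
    # Second stage: regress y on a constant, predicted x, and the control.
    second_design = np.column_stack([const, x_hat, w])
    second_stage_coef = ols(second_design, y)
    return first_stage_coef, second_stage_coef

# Simulated data (all numbers are made up for illustration).
rng = np.random.default_rng(1)
n = 200_000
taste = rng.normal(size=n)                    # unobserved taste for work
w = rng.normal(size=n)                        # hypothetical observed control
z = rng.integers(0, 2, size=n).astype(float)  # same-sex indicator, as good as random
x = (0.3 * z - 0.8 * taste + 0.1 * w + rng.normal(size=n) > 0).astype(float)
y = 0.5 - 0.10 * x + 0.2 * taste + 0.05 * w + rng.normal(scale=0.3, size=n)

first_stage_coef, second_stage_coef = two_stage_least_squares(y, x, z, w)
naive_coef = ols(np.column_stack([np.ones(n), x, w]), y)
print(f"first-stage coefficient on z: {first_stage_coef[1]:.3f}")
print(f"naive OLS effect of x:        {naive_coef[1]:.3f}")
print(f"2SLS effect of x:             {second_stage_coef[1]:.3f}  (true value -0.10)")
```

Running the second stage by hand gives the correct point estimate, but the standard errors from that second regression are not valid because they ignore that the regressor was itself estimated; in practice a dedicated IV routine would be used for inference.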

4.1.1. Why would the instrument be relevant (strongly correlated with the endogenous variable)? Give an intuitive explanation in words.

Ans. The instrument (Z), “Same sex” (i.e., either two boys or two girls), is relevant because it is correlated with the endogenous variable (More than 2 Kids). Many parents prefer to have at least one child of each sex, so families whose first two children are of the same sex are more likely to have an additional child. This makes the instrument a meaningful predictor of having a third child.
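Relevance is the one assumption that can be checked directly, by looking at the first-stage F-statistic on the instrument. The sketch below does this on simulated data; the function name `first_stage_f`, the baseline probability of 0.35, and the 6 percentage point effect of the same-sex indicator are assumptions made for illustration, not figures taken from the article. A common rule of thumb treats an F-statistic above 10 as a sign that the instrument is not weak.

```python
import numpy as np

def first_stage_f(x, z):
    """First-stage slope and F-statistic for a single instrument z.

    With one instrument, the F-statistic equals the squared t-statistic
    on z (homoskedastic standard errors are assumed here).
    """
    n = len(x)
    design = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    sigma2 = resid @ resid / (n - design.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)  # OLS covariance matrix
    t_stat = coef[1] / np.sqrt(cov[1, 1])
    return coef[1], t_stat ** 2

# Simulated data (numbers are made up for illustration).
rng = np.random.default_rng(2)
n = 50_000
same_sex = rng.integers(0, 2, size=n).astype(float)
more_than_two = (rng.random(n) < 0.35 + 0.06 * same_sex).astype(float)

slope, f_stat = first_stage_f(more_than_two, same_sex)
print(f"first-stage slope: {slope:.3f}, F-statistic: {f_stat:.1f}")
```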

4.1.2. Why would the instrument be exogenous (not affect the response variable in the second stage through any other channel except through the fitted value of the endogenous variables from the first stage)?

Ans. The instrument (Z) is considered exogenous because the sex of a child is effectively determined at random, so whether the first two children are of the same sex should be unrelated to a mother’s preferences, education, or other unobserved determinants of labor supply. The sex composition of the first two children is also unlikely to have a direct impact on a woman’s labor supply (Y). Its only plausible channel is through the likelihood of having an additional child, that is, through the fitted value of childbearing (X) from the first stage.

4.2. Can you give an example of when exogeneity could be potentially violated? Just a theoretical case/use your imagination. This is in a way to argue why your opponent’s instrument may not be good and thus cast doubt on their study’s conclusion.

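Ans. A potential violation of exogeneity could occur if the instrument (Z) affected female labor supply through a channel unrelated to childbearing. For example, if societal attitudes and expectations toward families with children of a particular sex led mothers to make different labor supply decisions, the exogeneity assumption would fail. Similarly, if same-sex siblings can share clothes and bedrooms, raising them may be cheaper, which could change how much the mother needs to work even if she has no additional child.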
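How much damage such a violation can do may be easier to see in a simulation. In the sketch below, the parameter `direct_effect` lets the instrument shift labor supply directly, bypassing childbearing. Everything here is hypothetical: the function names, the true effect of -0.10, and the size of the direct effect are assumptions, and the estimator is the simple Wald ratio (reduced form divided by first stage), which is the IV estimate when there is one binary instrument and no controls.

```python
import numpy as np

def wald_estimate(y, x, z):
    """IV estimate with one binary instrument: reduced form over first stage."""
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = x[z == 1].mean() - x[z == 0].mean()
    return reduced_form / first_stage

def simulate_iv(direct_effect, n=400_000, seed=3):
    """Simulate data in which z may affect y directly, then return the Wald estimate."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)  # same-sex indicator
    taste = rng.normal(size=n)      # unobserved taste for work
    x = (0.3 * z - 0.8 * taste + rng.normal(size=n) > 0).astype(float)
    # direct_effect * z is the exclusion-restriction violation.
    y = (0.5 - 0.10 * x + 0.2 * taste + direct_effect * z
         + rng.normal(scale=0.3, size=n))
    return wald_estimate(y, x, z)

valid_iv = simulate_iv(direct_effect=0.0)
violated_iv = simulate_iv(direct_effect=0.02)
print(f"IV estimate, exclusion holds:    {valid_iv:.3f}  (true value -0.10)")
print(f"IV estimate, exclusion violated: {violated_iv:.3f}")
```

The bias equals the direct effect divided by the first-stage coefficient, so when the first stage is small even a tiny direct effect of Z on Y is magnified into a large error in the estimated effect of childbearing.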