IV estimator
We present the second story of the IV estimator, told from the perspective of causal effects, and discuss Card's (1993) analysis of the returns to schooling.
We’ll cover LATE in the next class.
The presentation follows Chapter 3 of Adams (2020).
1 Confounding
We are interested in the effect of education years (\(X\)) on earnings (\(Y\)).
The problem with comparing the earnings of college attendees with the earnings of those who did not attend college is confounding: some unobserved characteristic \(U\) affects both \(X\) and \(Y\).
The figure above illustrates the confounding problem. Here \(U\) affects both the schooling years \(X\) and the earnings \(Y\). Specifically,
the directed arrows \(b\), \(c\), \(e\) represent the causal effects.
\(U\) can affect \(Y\) via two channels:
\(U \to Y\) and \(U \to X \to Y\).
If we regress \(Y\) on \(X\), the slope does not estimate \(b\) (the effect of \(X\) on \(Y\)), because it also picks up the association between \(X\) and \(Y\) that runs through \(U\). This is called the backdoor problem:
- There are two pathways connecting \(X\) to \(Y\).
- One is the “front door”, represented by the arrow directly from \(X\) to \(Y\) with a weight of \(b\).
- The second is the backdoor, which follows the backward arrow from \(X\) to \(U\) with a weight of \(1/c\), and then the forward arrow from \(U\) to \(Y\) with a weight of \(e\).
2 A confounded linear model
We specify the data generation process as follows:
\[ y_i = a + bx_i + e v_{1i} \tag{1}\] \[ x_i = f + dz_i + v_{2i} + c v_{1i} \tag{2}\]
Specifically,
- \(y_i\) is individual \(i\)’s income and \(x_i\) is their education years;
- \(v_{1i}\) and \(v_{2i}\) are unobserved characteristics while \(z_{i}\) is observed.
An R simulation:
set.seed(123456789)
N <- 1000
a <- 2; b <- 3; c <- 2
e <- 3; f <- -1; d <- 4
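z <- runif(N)                    # observed instrument z_i, uniform on (0, 1)
u_1 <- rnorm(N, mean=0, sd=3)    # unobserved confounder (v_1i in equations 1 and 2)
u_2 <- rnorm(N, mean=0, sd=1)    # unobserved shock to x only (v_2i in equation 2)
x <- f + d*z + u_2 + c*u_1       # equation (2)
y <- a + b*x + e*u_1             # equation (1)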
lm1 <- lm(y ~ x)
summary(lm1)$coefficients
#>              Estimate Std. Error    t value     Pr(>|t|)
#> (Intercept) 0.5113529 0.07064730   7.238109 9.068979e-13
#> x           4.4090257 0.01119617 393.797828 0.000000e+00
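The gap between the OLS slope (about 4.41) and the true \(b = 3\) can be worked out from equations (1) and (2). The following is a sketch under one added assumption: \(z_i\), \(v_{1i}\) and \(v_{2i}\) are mutually independent, as they are when drawn in the simulation. Here \(\sigma_1^2\) and \(\sigma_2^2\) are shorthand (not in the original notation) for the variances of \(v_{1i}\) and \(v_{2i}\).

\[ \operatorname{plim} \hat{b}_{OLS} = \frac{\mathrm{Cov}(x_i, y_i)}{\mathrm{Var}(x_i)} = b + \frac{e \, \mathrm{Cov}(x_i, v_{1i})}{\mathrm{Var}(x_i)} = b + \frac{e \, c \, \sigma_1^2}{d^2 \, \mathrm{Var}(z_i) + \sigma_2^2 + c^2 \sigma_1^2} \]

With the simulation values (\(b = 3\), \(c = 2\), \(e = 3\), \(d = 4\), \(\sigma_1^2 = 9\), \(\sigma_2^2 = 1\), and \(\mathrm{Var}(z_i) = 1/12\) for a uniform draw on \((0, 1)\)):

\[ 3 + \frac{3 \cdot 2 \cdot 9}{16/12 + 1 + 4 \cdot 9} = 3 + \frac{54}{38.33} \approx 4.41 \]

This is consistent with the regression output, so the OLS slope overstates \(b\) by roughly \(1.41\) in this design.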
2.1 IV Estimator
The following DAG illustrates the confounded linear model:
The DAG implies that we can estimate \(b\) in two steps.
First, estimate the relationship between \(Z\) and \(Y\). Because \(Z\) affects \(Y\) only through \(X\), that effect is \(b \times d\).
Second, estimate the relationship between \(Z\) and \(X\), which gives \(d\).
Dividing the first estimate by the second then gives an estimate of \(b\).
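bd_hat <- lm(y ~ z)$coef[2]   # slope of y on z, estimates b*d
d_hat <- lm(x ~ z)$coef[2]    # slope of x on z, estimates d
bd_hat/d_hat                  # ratio estimates b (true value 3)
#>        z 
#> 3.015432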
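The ratio above can be computed in several equivalent ways. The sketch below re-creates the simulated data (same seed and draw order as the simulation in Section 2) and compares three routes: the ratio of the two slopes, the ratio of sample covariances, and a manual two-stage regression. The object names `iv_ratio`, `iv_cov`, `iv_2sls` and `x_hat` are illustrative and not part of the original notes.

```r
# Re-create the simulated data from Section 2 (same seed and draw order).
set.seed(123456789)
N <- 1000
a <- 2; b <- 3; c <- 2
e <- 3; f <- -1; d <- 4
z <- runif(N)
u_1 <- rnorm(N, mean = 0, sd = 3)
u_2 <- rnorm(N, mean = 0, sd = 1)
x <- f + d * z + u_2 + c * u_1
y <- a + b * x + e * u_1

# (1) Ratio of the two regression slopes, as in the text.
iv_ratio <- unname(coef(lm(y ~ z))[2] / coef(lm(x ~ z))[2])

# (2) Ratio of sample covariances: var(z) cancels from both slopes.
iv_cov <- cov(z, y) / cov(z, x)

# (3) Manual two-stage least squares: regress y on first-stage fitted values.
x_hat <- fitted(lm(x ~ z))
iv_2sls <- unname(coef(lm(y ~ x_hat))[2])

print(c(ratio = iv_ratio, cov = iv_cov, two_stage = iv_2sls))
```

All three numbers coincide because there is a single instrument and a single regressor. Note that the standard errors reported by the second-stage `lm()` in route (3) are not valid IV standard errors, since they ignore that `x_hat` is itself estimated.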
2.2 What makes an IV
Below we describe the conditions for an IV using the language of causal effects. Compare this definition with the (sloppier) definition given earlier.
Suppose we want to estimate the effect of \(X\) on \(Y\) and there is an unobservable confounding variable \(U\). Then \(Z\) is an instrumental variable if:
1. \(Z\) directly affects the policy variable of interest \(X\): \(Z \to X\).
2. \(Z\) is independent of \(U\).
3. \(Z\) affects the policy variable independently of the unobserved effect: \(X = dZ + U\).
Condition 2 is called the independence assumption and Condition 3 is called the additivity assumption.
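To see why the independence assumption matters, the sketch below reuses the parameter values from the simulation in Section 2 and compares a valid instrument with one that is deliberately contaminated by the confounder. The contamination strength `g`, the helper `iv_estimate()`, the larger sample size and the seed are assumptions made for illustration only.

```r
set.seed(42)
N <- 10000
a <- 2; b <- 3; c <- 2
e <- 3; f <- -1; d <- 4
u_1 <- rnorm(N, mean = 0, sd = 3)   # unobserved confounder
u_2 <- rnorm(N, mean = 0, sd = 1)   # unobserved shock to x only

# Hypothetical helper: generate x and y from a given z, return the IV ratio.
iv_estimate <- function(z) {
  x <- f + d * z + u_2 + c * u_1
  y <- a + b * x + e * u_1
  cov(z, y) / cov(z, x)
}

z_valid <- runif(N)                 # independent of u_1
g <- 0.5                            # assumed strength of the violation
z_invalid <- runif(N) + g * u_1     # correlated with u_1: independence fails

iv_valid <- iv_estimate(z_valid)
iv_invalid <- iv_estimate(z_invalid)
print(c(valid = iv_valid, invalid = iv_invalid))
```

With the valid instrument the estimate should land near \(b = 3\). With the contaminated instrument the backdoor through \(U\) is reopened, and in this particular setup the ratio converges to roughly 3.7 rather than 3.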
3 Card (1995): Returns to schooling
Using simple linear regression, David Card, a labor economist, finds that an extra year of schooling is associated with approximately 7.5% higher income.
Card argues that this estimate is biased and proposes using “Distance to College” as an IV for “Education Years”.
In the figure above, a causal arrow goes from the unobserved characteristic to both income and education.
- That is, the unobserved characteristics of the young men may determine both the amount of education that they get and the income they earn.
- Possible \(U\) include family wealth, genes, etc.
The confounding effect of \(U\) is that a simple regression gives an inaccurate estimate of the return to schooling \(β\). A policy designed to increase education based on that estimate would then not have the expected effect on income.
3.1 Distance to College as IV
Card (1995) argues that:
- young men who grow up near a 4-year college will have lower costs of attending college and are thus more likely to get another year of education;
- growing up close to a 4-year college is unlikely to be determined by unobserved characteristics that also determine the amount of education that the young man gets and the income that he earns.
In the graph, the assumption is represented as an arrow from “distance to college” to education and no arrow from unobserved characteristics to “distance to college.”
Card (1995) uses the following model:
\[ \log \mathsf{wage}76_i = α_1 + β δ \mathsf{nearCollege}_i + γ_1 \mathsf{observables}_i + \mathsf{unobservables}_{i1} \] \[ ed_i = α_2 + δ \mathsf{nearCollege}_i + γ_2 \mathsf{observables}_i + \mathsf{unobservables}_{i2} \]
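- The return to schooling is given by \(β\), which measures the impact of an additional year of schooling on log wages. The coefficient on \(\mathsf{nearCollege}_i\) in the wage equation is \(β δ\) and in the education equation it is \(δ\), so \(β\) is recovered as the ratio of the two.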
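The two-equation structure can be sketched on synthetic data. The block below does not use Card's sample: every variable name (`near_college`, `ed`, `lwage`, `exper`, `ability`) and every coefficient value is invented for illustration, with `exper` standing in for the observables and `ability` for the unobservables.

```r
set.seed(2024)
n <- 20000
beta <- 0.10    # assumed true return to schooling
delta <- 0.5    # assumed effect of living near a college on education

exper <- runif(n, 0, 20)             # stand-in for observables
ability <- rnorm(n)                  # stand-in for the unobserved confounder
near_college <- rbinom(n, 1, 0.6)    # instrument, drawn independently of ability

ed <- 12 + delta * near_college + 0.05 * exper + ability + rnorm(n)
lwage <- 1 + beta * ed + 0.02 * exper + 0.3 * ability + rnorm(n, sd = 0.3)

# Wage equation in reduced form: the coefficient on near_college is beta * delta.
reduced_form <- lm(lwage ~ near_college + exper)
# Education equation: the coefficient on near_college is delta.
first_stage <- lm(ed ~ near_college + exper)

beta_iv <- unname(coef(reduced_form)["near_college"] /
                  coef(first_stage)["near_college"])
beta_ols <- unname(coef(lm(lwage ~ ed + exper))["ed"])
print(c(ols = beta_ols, iv = beta_iv))
```

In this made-up design the OLS coefficient is pushed away from `beta` by `ability`, while the ratio of the two `near_college` coefficients stays close to it. The direction and size of the OLS bias here are a product of the invented numbers and say nothing about Card's actual estimates.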
4 Instrumental validity
To use the instrumental variable method, we need an observed characteristic that satisfies three assumptions.
- First, the observed characteristic needs to causally affect the policy variable.
- This assumption is relatively easy to check. We need context or institutional knowledge to judge whether there is a direct causal relation from \(Z\) to \(X\), and we can also compute the correlation between \(X\) and \(Z\) in the data.
- Second, the observed characteristic is not affected by unobserved characteristics affecting the outcome of interest.
- It’s impossible to test this assumption empirically.
- We can only use institutional knowledge (or “common sense”) to argue that this assumption holds. For example, distance to college (\(Z\)) should be independent of family wealth (\(U\)).
- Third, the observed characteristic’s effect on the policy variable is additively separable from the unobserved characteristic’s effect.
- It’s impossible to test this assumption empirically.
- Some people argue that this assumption usually does not hold in reality, and that economists should accordingly use the LATE estimand rather than the 2SLS estimand.
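The first assumption is the only one of the three that can be examined in the data. A minimal sketch of such a check, using the simulated \(x\) and \(z\) from Section 2, computes the correlation between \(X\) and \(Z\) together with the first-stage t and F statistics. The object names are illustrative.

```r
# Re-create x and z from the Section 2 simulation (same seed and draw order).
set.seed(123456789)
N <- 1000
c <- 2; f <- -1; d <- 4
z <- runif(N)
u_1 <- rnorm(N, mean = 0, sd = 3)
u_2 <- rnorm(N, mean = 0, sd = 1)
x <- f + d * z + u_2 + c * u_1

# First stage: regress the policy variable on the instrument.
first_stage <- lm(x ~ z)
fs <- summary(first_stage)

cor_xz <- cor(x, z)                              # sample correlation of X and Z
t_stat <- fs$coefficients["z", "t value"]        # t statistic on the instrument
f_stat <- unname(fs$fstatistic["value"])         # first-stage F statistic

print(c(correlation = cor_xz, t = t_stat, F = f_stat))
```

With a single instrument the F statistic is the square of the t statistic, and larger values indicate a stronger first stage. A strong first stage only speaks to the first assumption, since the second and third cannot be checked this way.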