Instrumental variables (IVs) are a valuable tool that
allows us to run regression models with variables that would otherwise
be endogenous. IVs do so through two-stage least squares (2SLS). In the
first stage, an exogenous variable is used to predict the endogenous
variable within our initial model. In the second stage, the predicted
values, free of endogeneity, replace the endogenous variable within the
original model.
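The two stages can be illustrated on simulated data. The following is a minimal sketch, not an analysis of any real dataset: the data-generating process, the coefficient values, and the function names `ols`, `two_stage_least_squares`, and `simulate` are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Return least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, x, z):
    """Manual 2SLS with one endogenous regressor x and one instrument z."""
    const = np.ones(len(y))
    # Stage 1: regress the endogenous variable on the instrument.
    Z = np.column_stack([const, z])
    x_hat = Z @ ols(Z, x)
    # Stage 2: regress the outcome on the first-stage fitted values.
    X_hat = np.column_stack([const, x_hat])
    return ols(X_hat, y)

def simulate(n=50_000, true_effect=-0.5, seed=0):
    """Hypothetical data: c is an unobserved confounder, z a valid instrument."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    c = rng.normal(size=n)
    x = 1.0 * z + 1.0 * c + rng.normal(size=n)                # endogenous regressor
    y = 2.0 + true_effect * x - 1.5 * c + rng.normal(size=n)  # outcome
    return y, x, z

y, x, z = simulate()
naive = ols(np.column_stack([np.ones(len(y)), x]), y)[1]  # biased by the confounder
iv = two_stage_least_squares(y, x, z)[1]                  # close to true_effect
print(f"OLS slope:  {naive:.3f}")
print(f"2SLS slope: {iv:.3f}")
```

In this setup the OLS slope drifts away from the assumed true effect of -0.5, while the 2SLS slope stays close to it. Note that running the second stage by hand gives correct point estimates but not correct standard errors, so a dedicated IV routine is preferable for inference.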
To give an example of an IV, I
will review a paper written in 2006 by Jeffrey R. Kling titled
“Incarceration Length, Employment, and Earnings.” This paper, originally
published in the American Economic Review, investigates the impact of
the length of incarceration on earnings and job prospects after the
prisoner is released. This is an interesting question because the answer
could go either way. It is possible that longer time spent in prison
will lead to negative outcomes due to the individual losing touch with
the workforce and becoming “institutionalized.” On the flip side, it is
possible that prisoners become rehabilitated while incarcerated, using
the learning opportunities in prison and other resources to emerge in a
better situation than when they entered. The paper therefore investigates
whether incarceration leads to rehabilitation or is just a form of
punishment.
Firstly, we must address why running a simple OLS model is
not a viable way to answer the paper’s research question.
To explain the impracticality of OLS in this particular
example, we must remember the five Gauss-Markov assumptions required for
the OLS estimator to be BLUE. Specifically, we must think of the zero
conditional mean (ZCM) assumption, which requires the error term to be
unrelated to the regressors. Incarceration length, while correlated with
earnings, is more importantly correlated with the crime the person
committed. If a man robs a gas station, that crime will largely determine
his incarceration length. Additionally, a man who robs a gas station
likely has few other options to make money, which affects his earnings
directly. Because these circumstances sit in the error term and are
related to both sentence length and earnings, the incarceration length
variable is endogenous: OLS suffers from omitted variable bias, which is
a violation of ZCM.
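To make the omitted variable problem concrete, a stylized sketch may help. The symbols below are assumptions introduced for illustration, not Kling's notation: \(w_i\) stands for unobserved factors such as the severity of the crime or the person's outside earning options, and \(e_i\) is assumed to be uncorrelated with incarceration length.

\[
Earnings_i = \beta_0 + \beta_1 IncarcerationLength_i + \beta_2 w_i + e_i
\]

If \(w_i\) is left out of the regression, it is absorbed into the error term \(u_i = \beta_2 w_i + e_i\), and the OLS slope converges to

\[
\hat{\beta}_1^{OLS} \;\to\; \beta_1 + \beta_2 \, \frac{Cov(IncarcerationLength_i, w_i)}{Var(IncarcerationLength_i)} .
\]

The second term is zero only if the omitted factors do not affect earnings (\(\beta_2 = 0\)) or are uncorrelated with incarceration length, and the gas station example suggests that neither holds.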
To alleviate the
endogeneity issue in the model, Kling opts to use an instrumental
variable to eliminate potential bias. When selecting an IV, two
conditions must be met: instrument exogeneity and instrument relevance.
Instrument exogeneity means that the instrument does not suffer from the
same bias issues as the endogenous variable it stands in for.
Mathematically, this can be expressed as \(Cov(z, u) = 0\), where z is the
instrument and u is the error term. Exogeneity is important because it
ensures that the instrument does not affect the dependent variable via
some pathway included in the error term. The second condition is
relevance. The instrument is relevant if it is correlated with our
initial endogenous variable x. This can be expressed mathematically as
\(Cov(z, x) \neq 0\).
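A short derivation shows why both conditions are needed. As a sketch, assume a simple model \(y = \beta_0 + \beta_1 x + u\) with a single instrument z (the symbols y, \(\beta_0\), and \(\beta_1\) are introduced here for illustration). Taking the covariance of both sides with z gives

\[
Cov(z, y) = \beta_1 \, Cov(z, x) + Cov(z, u),
\]

so that

\[
\frac{Cov(z, y)}{Cov(z, x)} = \beta_1 + \frac{Cov(z, u)}{Cov(z, x)} .
\]

Relevance keeps the denominator away from zero, and exogeneity makes the last term vanish, leaving \(\beta_1\).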
Kling chose
“leniency of a judge” as an instrument. Leniency, in this case, is
measured by the sentence length a judge hands down for a given crime:
judges who, on average, imprison people for less time are
categorized as more lenient, and vice versa.
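One way such a leniency measure could be built is sketched below. This is a hypothetical construction for illustration only, not the procedure used in the paper: the case records, the function name `judge_severity`, and the choice to compare each sentence with the average for the same offense are all assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical case records: (judge, offense, sentence in years).
cases = [
    ("A", "robbery", 4.0), ("A", "fraud", 1.0),
    ("B", "robbery", 6.0), ("B", "fraud", 3.0),
]

def judge_severity(cases):
    """Average gap between a judge's sentences and the mean for the same offense."""
    by_offense = defaultdict(list)
    for _, offense, years in cases:
        by_offense[offense].append(years)
    offense_mean = {o: mean(v) for o, v in by_offense.items()}

    gaps = defaultdict(list)
    for judge, offense, years in cases:
        gaps[judge].append(years - offense_mean[offense])
    # Negative values mean shorter-than-average sentences, i.e. a more lenient judge.
    return {j: mean(g) for j, g in gaps.items()}

severity = judge_severity(cases)
print(severity)
```

Comparing sentences within the same offense keeps a judge from looking harsh simply because they happened to hear more serious cases.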
Leniency is an
exogenous instrument since judges are assigned randomly; a judge's
leniency is therefore not correlated with the defendant's earnings
potential or anything else in the error term. An
argument could be made that judges in low-income areas might become
pessimistic and harsher, leading to a correlation between income and
judge leniency; however, there is no evidence that this is the case.
Secondly, the instrument is relevant since there is a clear relationship
between judge leniency and incarceration length.
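Under these two conditions, the two stages could be written roughly as follows. This is a simplified sketch: the coefficient names, the variable \(JudgeLeniency_i\), and the error terms are assumptions made here, and any control variables used in the paper are omitted.

\[
IncarcerationLength_i = \pi_0 + \pi_1 JudgeLeniency_i + v_i
\]

\[
Earnings_i = \beta_0 + \beta_1 \widehat{IncarcerationLength}_{i} + u_i
\]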
In the second-stage regression, \(\widehat{IncarcerationLength}_{i}\) represents the
predicted values of incarceration length from the first stage of 2SLS.