RDD is used to estimate the impact of a treatment assigned by a single variable on an otherwise similar population. This is done by establishing a specific cutoff in a running variable and checking whether, at that exact threshold, there is a discrete jump in the outcome.
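A standard sharp-RD specification of the form being described (the coefficient names $\beta_0$, $\beta_1$, $\beta_2$ and the error term $\varepsilon_i$ are chosen here for illustration) is:

$$Y_i = \beta_0 + \beta_1 (X_i - c) + \tau_D D_i + \beta_2 D_i (X_i - c) + \varepsilon_i, \qquad D_i = \mathbf{1}[X_i \ge c]$$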
Where $\tau_D$ represents the discrete jump, $c$ represents the cutoff (in this case 50), and the final term represents the additional slope of the line after the treatment cutoff.
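A minimal sketch of how this specification could be estimated in R, using simulated data rather than anything from the text (the variable names and the true jump of 5 are made up):

```r
set.seed(42)
n   <- 1000
c0  <- 50                                  # cutoff
x   <- runif(n, 0, 100)                    # running variable
d   <- as.numeric(x >= c0)                 # treatment indicator
tau <- 5                                   # true discrete jump (assumed)
y   <- 2 + 0.3 * (x - c0) + tau * d + 0.1 * d * (x - c0) + rnorm(n)

# tau_D is the coefficient on d; the d:(x - c0) term is the extra slope
# to the right of the cutoff.
rd_fit <- lm(y ~ I(x - c0) + d + d:I(x - c0))
summary(rd_fit)
```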
RD can generate causal estimates if and only if the cutoff is exogenous and cannot be directly manipulated by the subjects it affects. When these conditions are met, an RD design is a powerful tool, since it essentially takes advantage of a natural experiment. An example of a well-suited RD question is whether being part of Phi Beta Kappa affects life outcomes. People can improve their grades, but the PBK cutoff changes every year, so they cannot precisely manipulate their position above or below it, making the cutoff exogenous and unmanipulated. With a sufficient number of observations very close to the cutoff, which mitigates the bias-versus-variance trade-off in choosing a bandwidth, RD can be very effective.
The opposite of a functioning RD arose when the government established COVID-era tax credits for small businesses below a certain number of employees. Since companies knew where the cutoff was, they could manipulate how many employees they had in order to qualify for the credit. It is in cases like this that RD is an ineffective design.
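A rough way to spot this kind of manipulation is to look at the density of the running variable near the cutoff; bunching just on the favorable side is a warning sign. The sketch below uses hypothetical firm-size numbers, not real data:

```r
set.seed(7)
employees <- c(rpois(800, 60),   # firms that did not adjust headcount
               rep(49, 150))     # firms that trimmed headcount to just below 50
cutoff <- 50

hist(employees, breaks = 30,
     main = "Density of the running variable near the cutoff",
     xlab = "Number of employees")
abline(v = cutoff, lty = 2)

# A visible spike just below 50 is the kind of bunching that invalidates RD;
# formal versions of this check (e.g., the McCrary density test) exist, but
# the histogram conveys the idea.
```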
In 2004 Lee, Moretti, and Butler used a sharp RD design in their paper “Do Voters Affect or Elect Policies? Evidence from the U.S. House.”
The goal of the paper was to see whether elections and the behavior of congressmen were dictated by divergence theory or convergence theory, or whether the results of close elections were irrelevant. They did so by developing an RD design, with Democratic vote share on the x-axis, spanning from 0 to 1, and ADA score on the y-axis, a metric that measures how liberal a representative's voting record is. The design looked something like this.
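The sketch below shows the general shape of such a plot using simulated numbers, not the paper's actual data: a scatter of ADA scores against Democratic vote share with a visible jump at the 0.5 cutoff.

```r
set.seed(1)
dem_share <- runif(2000, 0, 1)
dem_win   <- as.numeric(dem_share > 0.5)        # Democrat wins past 0.5
ada_score <- 20 + 15 * dem_share + 40 * dem_win + rnorm(2000, sd = 8)

plot(dem_share, ada_score, pch = 16, cex = 0.4, col = "grey",
     xlab = "Democratic vote share", ylab = "ADA score")
abline(v = 0.5, lty = 2)                        # cutoff where the jump occurs
```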
Where, of course, if Dem Vote Share > 0.5, a Democrat has won. The results are consistent with divergence theory: at the 0.5 line the ADA score jumps discretely, indicating that the winning party pursues its own preferred policies rather than moderating its platform to accumulate more votes. I do not believe I would have done anything differently in this experiment, since it is a textbook example of appropriate RD design: the dataset they use is large, so the bandwidth can be relatively narrow; the cutoff is not fuzzy; and it cannot be precisely manipulated.