RD Estimator Bias
Setup
Let \(Y,R\) denote the outcome and the running variable. Set the cutoff to \(c=0\) and assume a sharp RD treatment rule \(T=\boldsymbol{1}\left(R\geq0\right)\). Throughout, \(q=2\) (order of the polynomial used for bias estimation), \(p=1\) (order of the polynomial used for the outcome regression), \(\nu=0\) (we target the 0th-order derivative, i.e. the level of the regression function), and the observed data consist of \(n\) pairs \(\left(Y_{i},R_{i}\right)_{i=1}^{n}\).
Estimator
Say we are interested in estimating the intercept from above the cutoff, \(\mu_{+}\left(0\right)=\lim_{r\downarrow0}\mathbb{E}\left[Y\mid R=r\right]\).
Denote by \(I_{+,i}=\boldsymbol{1}\left(R_{i}\geq c\right)\) the indicator that observation \(i\) is above the cutoff, by \(\mathcal{I}_+=\left\{ i:I_{+,i}=1\right\}\) the set of units above the cutoff, and by \(n_{+}=\left|\mathcal{I}_{+}\right|\) its size. For ease of presentation, assume the units are ordered so that the first \(i=1,\dots,n_{+}\) observations are the ones above the cutoff, i.e. \(I_{+,i}=1\) for \(i\leq n_{+}\). Also denote the following matrices
\[\begin{equation} Y=\left[\begin{array}{c} Y_{1}\\ \vdots\\ Y_{n_{+}} \end{array}\right]_{n_{+}\times1},\quad X=\left[\begin{array}{cc} 1 & R_{1}\\ \vdots & \vdots\\ 1 & R_{n_{+}} \end{array}\right]_{n_{+}\times2} \end{equation}\]
\[\begin{equation} W_{+}\left(h\right)=\left[\begin{array}{ccc} K_{h}\left(R_{1}\right) &  & 0\\  & \ddots & \\ 0 &  & K_{h}\left(R_{n_{+}}\right) \end{array}\right]_{n_{+}\times n_{+}} \end{equation}\] where \(K_{h}\left(r\right)=K\left(r/h\right)/h\) for some kernel function \(K\) and bandwidth \(h\); we write \(W_{+}\) for \(W_{+}\left(h\right)\) when the bandwidth is clear from context.
We can write the weighted least squares estimator based on a first-order polynomial as \[\begin{equation} \widehat{\beta}_{+,1}=\arg\min_{a,b}\left\{ \sum_{i\in\mathcal{I}_{+}}K_{h}\left(R_{i}\right)\left[Y_{i}-a-bR_{i}\right]^{2}\right\} \end{equation}\]
and in matrix form
\[\begin{equation} \widehat{\beta}_{+,1}=\left(X^{\prime}W_{+}X\right)^{-1}X^{\prime}W_{+}Y \end{equation}\]
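To fix ideas, here is a minimal numpy sketch of this estimator. The triangular kernel, the simulated data-generating process, and the function names are illustrative assumptions, not part of the setup above.

```python
import numpy as np

def triangular_kernel(u):
    # K(u) = (1 - |u|) on [-1, 1], zero outside; an illustrative kernel choice
    return np.maximum(1 - np.abs(u), 0)

def local_linear_above(Y, R, h, c=0.0):
    # WLS fit of Y on (1, R) with weights K_h(R_i), using only units with R_i >= c
    above = R >= c
    Yp, Rp = Y[above], R[above]
    w = triangular_kernel(Rp / h) / h            # K_h(r) = K(r/h) / h
    X = np.column_stack([np.ones_like(Rp), Rp])  # design matrix with rows (1, R_i)
    XtW = X.T * w                                # X' W_+ without forming the diagonal matrix
    return np.linalg.solve(XtW @ X, XtW @ Yp)    # (X' W_+ X)^{-1} X' W_+ Y

rng = np.random.default_rng(0)
n = 2000
R = rng.uniform(-1, 1, n)                        # hypothetical running variable
Y = 1.0 + 0.5 * R + 0.3 * R**2 + rng.normal(scale=0.1, size=n)
print(local_linear_above(Y, R, h=0.3))           # first entry estimates the intercept 1.0
```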
Estimator in CCT (2014) Format
Recall that the units are ordered so that the first \(i=1,\dots,n_{+}\) observations are above the cutoff, i.e. \(I_{+,i}=1\) for \(i\leq n_{+}\). Denote the following vectors and matrices
\[\begin{align*} r_{p}\left(r\right) & =\left[1,r,...,r^{p}\right]^{\prime}\\ X_{+,p}\left(h\right) & =\left[r_{p}\left(R_{1}/h\right),...,r_{p}\left(R_{n_{+}}/h\right)\right]^{\prime}\\ S_{+,p}\left(h\right) & =\left[\begin{array}{c} \left(R_{1}/h\right)^{p}\\ \vdots\\ \left(R_{n_{+}}/h\right)^{p} \end{array}\right]_{n_{+}\times1} \end{align*}\] (note that \(X_{+,p}\left(h\right)\) is \(n_{+}\times\left(p+1\right)\), with rows \(r_{p}\left(R_{i}/h\right)^{\prime}\))
which together make up the matrices
\[\begin{align*} \Gamma_{+,p}\left(h\right) & =X_{+,p}\left(h\right)^{\prime}W_{+}\left(h\right)X_{+,p}\left(h\right)/n\\ \vartheta_{+,p,q}\left(h\right) & =X_{+,p}\left(h\right)^{\prime}W_{+}\left(h\right)S_{+,q}\left(h\right)/n \end{align*}\]
where we set \(p=1\), \(q=2\) unless stated otherwise.
Using this notation, we can re-write the estimator as
\[\begin{equation} \widehat{\beta}_{+,1}=H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)Y/n \end{equation}\]
where \[\begin{equation} H_{1}=\left[\begin{array}{cc} 1 & 0\\ 0 & h^{-1} \end{array}\right] \end{equation}\]
(It's a nice exercise to check that the two representations of the estimator are equivalent; a numerical check is sketched below.)
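As a quick numerical version of that exercise, the sketch below computes both representations on simulated data and checks that they agree; the triangular kernel and the data-generating process are again illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 500, 0.4
R = rng.uniform(-1, 1, n)
Y = 1.0 + 0.5 * R + rng.normal(size=n)
Rp, Yp = R[R >= 0], Y[R >= 0]
w = np.maximum(1 - np.abs(Rp / h), 0) / h            # triangular K_h, illustrative

# Form 1: WLS on the raw design with rows (1, R_i)
X = np.column_stack([np.ones_like(Rp), Rp])
beta_raw = np.linalg.solve((X.T * w) @ X, (X.T * w) @ Yp)

# Form 2 (CCT): WLS on the rescaled design r_1(R_i / h), then undo the scaling with H_1
Xh = np.column_stack([np.ones_like(Rp), Rp / h])     # X_{+,1}(h)
Gamma = (Xh.T * w) @ Xh / n                          # Gamma_{+,1}(h)
H1 = np.diag([1.0, 1.0 / h])
beta_cct = H1 @ np.linalg.solve(Gamma, (Xh.T * w) @ Yp / n)

print(np.allclose(beta_raw, beta_cct))               # True: the two forms coincide
```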
Taylor Expansion
Denote the conditional outcome mean by \(\mu\left(r\right)=\mathbb{E}\left[Y\mid R=r\right]\) and its \(v\)-th derivative by \(\mu^{\left(v\right)}\left(r\right)=d^{v}\mu\left(r\right)/dr^{v}\); a \(+\) subscript, as in \(\mu_{+}\), refers to the function from above the cutoff.
The third-order Taylor expansion to the right of the cutoff at \(r=c=0\) is given by \[\begin{equation} \mu_{+}\left(r\right)=\mu_{+}^{\left(0\right)}\left(0\right)\frac{r^{0}}{0!}+\mu_{+}^{\left(1\right)}\left(0\right)\frac{r^{1}}{1!}+\mu_{+}^{\left(2\right)}\left(0\right)\frac{r^{2}}{2!}+\mu_{+}^{\left(3\right)}\left(0\right)\frac{r^{3}}{3!}+TR \end{equation}\] where \(TR\) is the Taylor remainder, which is needed for the equation to hold with equality rather than only as an approximation.
For each unit \(i\) above the cutoff, the expansion reads (dropping the \(+\) subscript, since all units considered here are above the cutoff) \[\begin{equation} \mu\left(R_{i}\right)=\mu\left(0\right)+\mu^{\left(1\right)}\left(0\right)\frac{R_{i}^{1}}{1!}+\mu^{\left(2\right)}\left(0\right)\frac{R_{i}^{2}}{2!}+\mu^{\left(3\right)}\left(0\right)\frac{R_{i}^{3}}{3!}+TR_{i} \end{equation}\]
Stacking the expansions for the \(n_{+}\) units above the cutoff gives the vector \[\begin{equation} M=\mu\left(0\right)\left[\begin{array}{c} 1\\ \vdots\\ 1 \end{array}\right]+\mu^{\left(1\right)}\left(0\right)\left[\begin{array}{c} R_{1}\\ \vdots\\ R_{n_{+}} \end{array}\right]+\frac{\mu^{\left(2\right)}\left(0\right)}{2}\left[\begin{array}{c} R_{1}^{2}\\ \vdots\\ R_{n_{+}}^{2} \end{array}\right]+\frac{\mu^{\left(3\right)}\left(0\right)}{6}\left[\begin{array}{c} R_{1}^{3}\\ \vdots\\ R_{n_{+}}^{3} \end{array}\right]+\left[\begin{array}{c} TR_{1}\\ \vdots\\ TR_{n_{+}} \end{array}\right] \end{equation}\]
Multiplying and dividing each term by the matching power of \(h\) gives \[\begin{align*} M & =\mu\left(0\right)\left[\begin{array}{c} 1\\ \vdots\\ 1 \end{array}\right]+h\mu^{\left(1\right)}\left(0\right)\left[\begin{array}{c} R_{1}h^{-1}\\ \vdots\\ R_{n_{+}}h^{-1} \end{array}\right]+\frac{h^{2}\mu^{\left(2\right)}\left(0\right)}{2}\left[\begin{array}{c} R_{1}^{2}h^{-2}\\ \vdots\\ R_{n_{+}}^{2}h^{-2} \end{array}\right]+\frac{h^{3}\mu^{\left(3\right)}\left(0\right)}{6}\left[\begin{array}{c} R_{1}^{3}h^{-3}\\ \vdots\\ R_{n_{+}}^{3}h^{-3} \end{array}\right]+o_{p}\left(h^{3}\right)\\ & =X_{+,1}\left(h\right)\left[\begin{array}{c} \mu\left(0\right)\\ h\mu^{\left(1\right)}\left(0\right) \end{array}\right]+\frac{h^{2}\mu^{\left(2\right)}\left(0\right)}{2}S_{+,2}\left(h\right)+\frac{h^{3}\mu^{\left(3\right)}\left(0\right)}{6}S_{+,3}\left(h\right)+o_{p}\left(h^{3}\right) \end{align*}\] where the stacked remainders are collected into the \(o_{p}\left(h^{3}\right)\) term: under sufficient smoothness of \(\mu\), each \(TR_{i}\) is of smaller order than \(h^{3}\) for the observations the kernel actually weights, namely those with \(R_{i}=O\left(h\right)\).
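The rearrangement can be checked numerically. In the sketch below, \(\mu\) is taken to be a cubic polynomial, so the Taylor remainder is exactly zero and the decomposition of \(M\) holds exactly; the specific coefficients and simulated running variable are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
h = 0.5
Rp = rng.uniform(0, 1, 200)                         # units above the cutoff (hypothetical)

mu = lambda r: 1 + 2 * r - 1.5 * r**2 + 0.8 * r**3  # cubic mu: Taylor remainder is zero
d0, d1, d2, d3 = 1.0, 2.0, -3.0, 4.8                # mu^{(v)}(0) for v = 0, 1, 2, 3

M = mu(Rp)                                          # stacked mu(R_i)

X1h = np.column_stack([np.ones_like(Rp), Rp / h])   # X_{+,1}(h)
S2, S3 = (Rp / h)**2, (Rp / h)**3                   # S_{+,2}(h), S_{+,3}(h)
M_decomp = (X1h @ np.array([d0, h * d1])
            + h**2 * d2 / 2 * S2
            + h**3 * d3 / 6 * S3)

print(np.allclose(M, M_decomp))                     # True: the rearrangement is exact
```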
Bias Statement
For derivative order \(\nu=0\), polynomial order \(p=1\), and \(\mathcal{R}_{n}=\left[R_{1},...,R_{n}\right]^{\prime}\), we can characterize the conditional expectation of the WLS estimator. Since \(M\) is exactly the stacked vector of conditional means \(\left(\mu\left(R_{1}\right),...,\mu\left(R_{n_{+}}\right)\right)^{\prime}\), \[\begin{align*} \mathbb{E}\left[\widehat{\beta}_{+,1}\left(h\right)\mid\mathcal{R}_{n}\right] & =H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)M/n\\ & =H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)\\ & \quad\times\left(X_{+,1}\left(h\right)\left[\begin{array}{c} \mu\left(0\right)\\ h\mu^{\left(1\right)}\left(0\right) \end{array}\right]+\frac{h^{2}\mu^{\left(2\right)}\left(0\right)}{2}S_{+,2}\left(h\right)+\frac{h^{3}\mu^{\left(3\right)}\left(0\right)}{6}S_{+,3}\left(h\right)+o_{p}\left(h^{3}\right)\right)/n\\ & =H_{1}\left[\begin{array}{c} \mu\left(0\right)\\ h\mu^{\left(1\right)}\left(0\right) \end{array}\right]\\ & \quad+H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)S_{+,2}\left(h\right)/n\,\frac{h^{2}\mu^{\left(2\right)}\left(0\right)}{2}\\ & \quad+H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)S_{+,3}\left(h\right)/n\,\frac{h^{3}\mu^{\left(3\right)}\left(0\right)}{6}+o_{p}\left(h^{3}\right)\\ & =\left[\begin{array}{c} \mu\left(0\right)\\ \mu^{\left(1\right)}\left(0\right) \end{array}\right]\\ & \quad+H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}\vartheta_{+,1,2}\left(h\right)\frac{h^{2}\mu^{\left(2\right)}\left(0\right)}{2}\\ & \quad+H_{1}\left(\Gamma_{+,1}\left(h\right)\right)^{-1}\vartheta_{+,1,3}\left(h\right)\frac{h^{3}\mu^{\left(3\right)}\left(0\right)}{6}+o_{p}\left(h^{3}\right) \end{align*}\] where the third equality uses \(\left(\Gamma_{+,1}\left(h\right)\right)^{-1}X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)X_{+,1}\left(h\right)/n=I\), and the \(o_{p}\left(h^{3}\right)\) term is not divided by \(n\) because \(X_{+,1}\left(h\right)^{\prime}W_{+}\left(h\right)/n\) is \(O_{p}\left(1\right)\), just like \(\Gamma_{+,1}\left(h\right)\), so the order of the remainder is unchanged.
Denote \(B_{+,v,p,q}=e_{v}^{\prime}H_{p}\left(\Gamma_{+,p}\left(h\right)\right)^{-1}\vartheta_{+,p,q}\left(h\right)\frac{h^{q}\mu^{\left(q\right)}\left(0\right)}{q!}\), where \(e_{v}\) is the \(\left(v+1\right)\)-th standard basis vector. Then for the intercept estimator \(\widehat{\mu}_{+,1}^{\left(0\right)}\left(0\right)=e_{0}^{\prime}\widehat{\beta}_{+,1}\) we get \[\begin{equation} \mathbb{E}\left[\widehat{\mu}_{+,1}^{\left(0\right)}\left(0\right)\mid\mathcal{R}_{n}\right]=\mu_{+}\left(0\right)+B_{+,0,1,2}+B_{+,0,1,3}+o_{p}\left(h^{3}\right) \end{equation}\]
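A numerical sanity check of this statement: with a cubic \(\mu\), the remainder vanishes, so plugging the conditional means \(\mu\left(R_{i}\right)\) in place of \(Y_{i}\) must reproduce \(\mu\left(0\right)+B_{+,0,1,2}+B_{+,0,1,3}\) exactly. The kernel and the simulated running variable are illustrative choices.

```python
import math
import numpy as np

rng = np.random.default_rng(3)
n, h = 400, 0.5
R = rng.uniform(-1, 1, n)
Rp = R[R >= 0]

mu = lambda r: 1 + 2 * r - 1.5 * r**2 + 0.8 * r**3  # cubic, so the o_p(h^3) term vanishes
mu_derivs = {2: -3.0, 3: 4.8}                       # mu''(0) and mu'''(0)

w = np.maximum(1 - np.abs(Rp / h), 0) / h           # triangular K_h, illustrative
Xh = np.column_stack([np.ones_like(Rp), Rp / h])    # X_{+,1}(h)
Gamma = (Xh.T * w) @ Xh / n                         # Gamma_{+,1}(h)
H1 = np.diag([1.0, 1.0 / h])
e0 = np.array([1.0, 0.0])

# Conditional mean of the estimator: replace Y_i by mu(R_i)
Ebeta = H1 @ np.linalg.solve(Gamma, (Xh.T * w) @ mu(Rp) / n)

def B(q):
    # B_{+,0,1,q} = e_0' H_1 Gamma^{-1} vartheta_{+,1,q}(h) h^q mu^{(q)}(0) / q!
    theta = (Xh.T * w) @ (Rp / h)**q / n            # vartheta_{+,1,q}(h)
    return e0 @ (H1 @ np.linalg.solve(Gamma, theta)) * h**q * mu_derivs[q] / math.factorial(q)

print(np.isclose(e0 @ Ebeta, mu(0) + B(2) + B(3)))  # True: the bias statement is exact here
```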