vote:表示某次选举民主党在州参议院的席位占比,值从0到100,是RD分析中结局变量 margin:表示上次选举中获得相同参议院席位的边际收益,其中大于0表示民主党胜出,反之则为失败。 利用RDD估计民主党赢得参议院席位对其在下次选举中获得该席位的选票份额的影响
install.packages("rdrobust",repos = "http://cran.us.r-project.org")
## 程序包'rdrobust'打开成功,MD5和检查也通过
##
## 下载的二进制程序包在
## C:\Users\zhuoshen\AppData\Local\Temp\Rtmpk3GLs4\downloaded_packages里
install.packages("rddensity",repos = "http://cran.us.r-project.org")
## 程序包'rddensity'打开成功,MD5和检查也通过
##
## 下载的二进制程序包在
## C:\Users\zhuoshen\AppData\Local\Temp\Rtmpk3GLs4\downloaded_packages里
install.packages("tidyverse",repos = "http://cran.us.r-project.org")
## 程序包'tidyverse'打开成功,MD5和检查也通过
##
## 下载的二进制程序包在
## C:\Users\zhuoshen\AppData\Local\Temp\Rtmpk3GLs4\downloaded_packages里
install.packages("stargazer",repos = "http://cran.us.r-project.org")
## 程序包'stargazer'打开成功,MD5和检查也通过
##
## 下载的二进制程序包在
## C:\Users\zhuoshen\AppData\Local\Temp\Rtmpk3GLs4\downloaded_packages里
library(rdrobust)
library(rddensity)
library(magrittr)
library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::extract() masks magrittr::extract()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::set_names() masks magrittr::set_names()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(stargazer)
##
## Please cite as:
##
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
data(rdrobust_RDsenate)
head(rdrobust_RDsenate)
## margin vote
## 1 -7.6885610 36.09757
## 2 -3.9237082 45.46875
## 3 -6.8686604 45.59821
## 4 -27.6680565 48.47606
## 5 -8.2569685 51.74687
## 6 0.7324815 39.80264
驱动变量的直方图,若大致呈现光滑曲线,无明显的断点,说明不受到操控。:
输出结果中,最下面一行robust的p值大于0.05,就说明驱动变量未受到操纵。
## $Estl
## Call: lpdensity
##
## Sample size 640
## Polynomial order for point estimation (p=) 2
## Order of derivative estimated (v=) 1
## Polynomial order for confidence interval (q=) 3
## Kernel function triangular
## Scaling factor 0.460763138948884
## Bandwidth method user provided
##
## Use summary(...) to show estimates.
##
## $Estr
## Call: lpdensity
##
## Sample size 750
## Polynomial order for point estimation (p=) 2
## Order of derivative estimated (v=) 1
## Polynomial order for confidence interval (q=) 3
## Kernel function triangular
## Scaling factor 0.539956803455723
## Bandwidth method user provided
##
## Use summary(...) to show estimates.
##
## $Estplot
可见本例断点前后干预率0-1,基本可判断为sharp类型
rdrobust_RDsenate$treatment <- ifelse(rdrobust_RDsenate$margin < 0, 0, 1)
rdrobust_RDsenate$color <- ifelse(rdrobust_RDsenate$margin < 0, 'blue', 'red')
散点图初步判定关系
将驱动变量分成若干宽度(width)相等的区间(bin),但不能存在骑跨断点,断点两侧的区间数量可不相等 选择合适的区间和带宽。可以采用rdrobust包中的rdplot函数进行判断
## Call: rdplot
##
## Number of Obs. 1297
## Kernel Uniform
##
## Number of Obs. 595 702
## Eff. Number of Obs. 595 702
## Order poly. fit (p) 4 4
## BW poly. fit (h) 100.000 100.000
## Number of bins scale 1 1
##
## Bins Selected 20 20
## Average Bin Length 5.000 5.000
## Median Bin Length 5.000 5.000
##
## IMSE-optimal bins 8 9
## Mimicking Variance bins 15 35
##
## Relative to IMSE-optimal:
## Implied scale 2.500 2.222
## WIMSE variance weight 0.060 0.084
## WIMSE bias weight 0.940 0.916
rdrobust_RDsenate$margin_del <- rdrobust_RDsenate$margin - 0
#linear
fit_1 <- lm(vote ~ margin_del + treatment, data = rdrobust_RDsenate)
# linear interaction
fit_2 <- lm(vote ~ margin_del * treatment, data = rdrobust_RDsenate)
# quadratic
fit_3 <- lm(vote ~ margin_del + I(margin_del ^ 2) + treatment, data = rdrobust_RDsenate)
# quadratic interaction
fit_4 <- lm(vote ~ (margin_del + I(margin_del ^ 2)) * treatment, data = rdrobust_RDsenate)
#cubic
fit_5 <- lm(vote ~ margin_del + I(margin_del ^ 2) + I(margin_del ^ 3) + treatment, data = rdrobust_RDsenate)
#cubic interaction
fit_6 <- lm(vote ~ (margin_del + I(margin_del ^ 2) + I(margin_del ^ 3)) * treatment, data = rdrobust_RDsenate)
stargazer::stargazer(fit_1, fit_2, fit_3, fit_4, fit_5, fit_6, type = 'html', style = 'all')#输出表格形式的结果
##
## <table style="text-align:center"><tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="6"><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="6" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td colspan="6">vote</td></tr>
## <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td><td>(4)</td><td>(5)</td><td>(6)</td></tr>
## <tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">margin_del</td><td>0.348<sup>***</sup></td><td>0.216<sup>***</sup></td><td>0.285<sup>***</sup></td><td>0.421<sup>***</sup></td><td>0.337<sup>***</sup></td><td>0.285<sup>*</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.013)</td><td>(0.028)</td><td>(0.017)</td><td>(0.069)</td><td>(0.028)</td><td>(0.151)</td></tr>
## <tr><td style="text-align:left"></td><td>t = 26.078</td><td>t = 7.803</td><td>t = 16.873</td><td>t = 6.133</td><td>t = 12.175</td><td>t = 1.884</td></tr>
## <tr><td style="text-align:left"></td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.060</td></tr>
## <tr><td style="text-align:left">I(margin_del2)</td><td></td><td></td><td>0.001<sup>***</sup></td><td>0.003<sup>***</sup></td><td>0.001<sup>***</sup></td><td>-0.002</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td>(0.0002)</td><td>(0.001)</td><td>(0.0002)</td><td>(0.005)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td>t = 6.016</td><td>t = 3.254</td><td>t = 6.325</td><td>t = -0.349</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td>p = 0.000</td><td>p = 0.002</td><td>p = 0.000</td><td>p = 0.727</td></tr>
## <tr><td style="text-align:left">I(margin_del3)</td><td></td><td></td><td></td><td></td><td>-0.00001<sup>**</sup></td><td>-0.00004</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td>(0.00000)</td><td>(0.00004)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td>t = -2.386</td><td>t = -1.011</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td>p = 0.018</td><td>p = 0.313</td></tr>
## <tr><td style="text-align:left">treatment</td><td>4.785<sup>***</sup></td><td>6.044<sup>***</sup></td><td>6.663<sup>***</sup></td><td>4.935<sup>***</sup></td><td>5.238<sup>***</sup></td><td>7.319<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.923)</td><td>(0.942)</td><td>(0.963)</td><td>(1.256)</td><td>(1.131)</td><td>(1.593)</td></tr>
## <tr><td style="text-align:left"></td><td>t = 5.184</td><td>t = 6.414</td><td>t = 6.922</td><td>t = 3.929</td><td>t = 4.630</td><td>t = 4.595</td></tr>
## <tr><td style="text-align:left"></td><td>p = 0.00000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.0001</td><td>p = 0.00001</td><td>p = 0.00001</td></tr>
## <tr><td style="text-align:left">margin_del:treatment</td><td></td><td>0.170<sup>***</sup></td><td></td><td>-0.095</td><td></td><td>-0.229</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.032)</td><td></td><td>(0.087)</td><td></td><td>(0.195)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>t = 5.406</td><td></td><td>t = -1.088</td><td></td><td>t = -1.175</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>p = 0.00000</td><td></td><td>p = 0.277</td><td></td><td>p = 0.241</td></tr>
## <tr><td style="text-align:left">I(margin_del2):treatment</td><td></td><td></td><td></td><td>-0.002<sup>**</sup></td><td></td><td>0.010<sup>*</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td>(0.001)</td><td></td><td>(0.006)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td>t = -2.242</td><td></td><td>t = 1.770</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td>p = 0.026</td><td></td><td>p = 0.077</td></tr>
## <tr><td style="text-align:left">I(margin_del3):treatment</td><td></td><td></td><td></td><td></td><td></td><td>-0.00002</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td><td>(0.00004)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td><td>t = -0.411</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td><td>p = 0.682</td></tr>
## <tr><td style="text-align:left">Constant</td><td>47.331<sup>***</sup></td><td>44.904<sup>***</sup></td><td>45.486<sup>***</sup></td><td>46.740<sup>***</sup></td><td>45.997<sup>***</sup></td><td>46.029<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.542)</td><td>(0.699)</td><td>(0.616)</td><td>(0.896)</td><td>(0.651)</td><td>(1.138)</td></tr>
## <tr><td style="text-align:left"></td><td>t = 87.340</td><td>t = 64.221</td><td>t = 73.798</td><td>t = 52.153</td><td>t = 70.614</td><td>t = 40.457</td></tr>
## <tr><td style="text-align:left"></td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td><td>p = 0.000</td></tr>
## <tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,297</td><td>1,297</td><td>1,297</td><td>1,297</td><td>1,297</td><td>1,297</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.578</td><td>0.587</td><td>0.590</td><td>0.591</td><td>0.591</td><td>0.593</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.577</td><td>0.586</td><td>0.589</td><td>0.590</td><td>0.590</td><td>0.591</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>11.781 (df = 1294)</td><td>11.654 (df = 1293)</td><td>11.624 (df = 1293)</td><td>11.609 (df = 1291)</td><td>11.603 (df = 1292)</td><td>11.587 (df = 1289)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>886.433<sup>***</sup> (df = 2; 1294) (p = 0.000)</td><td>613.586<sup>***</sup> (df = 3; 1293) (p = 0.000)</td><td>619.091<sup>***</sup> (df = 3; 1293) (p = 0.000)</td><td>373.392<sup>***</sup> (df = 5; 1291) (p = 0.000)</td><td>467.426<sup>***</sup> (df = 4; 1292) (p = 0.000)</td><td>268.726<sup>***</sup> (df = 7; 1289) (p = 0.000)</td></tr>
## <tr><td colspan="7" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="6" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>
#比较AIC或者回归残差,较小者模型较优。
AIC(fit_1, fit_2, fit_3, fit_4, fit_5, fit_6)
## df AIC
## fit_1 4 10083.70
## fit_2 5 10056.71
## fit_3 5 10049.89
## fit_4 7 10048.72
## fit_5 6 10046.19
## fit_6 9 10045.74
利用rdrobust包的rdrobust函数进行拟合。 局部估计可直接利用rdrobust函数中的bdselect参数指定选择最优带宽的方法。 带宽选择还可以先利用rdbwselect函数计算出最优带宽,然后用rdrobust函数中的h参数手动指定左右的带宽。
结果中前缀mse更适用于进行点估计的带宽选择,而cer更适合区间估计的带宽选择。后缀rd指两侧区间相等,而two则不相等结果中h即为选择的左右最优带宽,而b给出的是用来进行敏感性分析时应该考虑的带宽。
rdbwselect(y = rdrobust_RDsenate$vote, x = rdrobust_RDsenate$margin, all = TRUE) %>% summary()
## Call: rdbwselect
##
## Number of Obs. 1297
## BW type All
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 595 702
## Order est. (p) 1 1
## Order bias (q) 2 2
## Unique Obs. 595 665
##
## =======================================================
## BW est. (h) BW bias (b)
## Left of c Right of c Left of c Right of c
## =======================================================
## mserd 17.754 17.754 28.028 28.028
## msetwo 16.170 18.126 27.104 29.344
## msesum 18.365 18.365 31.319 31.319
## msecomb1 17.754 17.754 28.028 28.028
## msecomb2 17.754 18.126 28.028 29.344
## cerrd 12.407 12.407 28.028 28.028
## certwo 11.299 12.667 27.104 29.344
## cersum 12.834 12.834 31.319 31.319
## cercomb1 12.407 12.407 28.028 28.028
## cercomb2 12.407 12.667 28.028 29.344
## =======================================================
#拟合,c用来指定cut-point,p用来指定局部加权回归的多项式幂次,kernel为选择的核函数
loc_fit_1 <- rdrobust(rdrobust_RDsenate$vote, rdrobust_RDsenate$margin, c = 0, p = 1,
kernel = 'triangular', bwselect = 'mserd',h=17.754)#
#如果有混杂因素需要控制,可以用covs = c('var1', 'var2')
loc_fit_2 <- rdrobust(rdrobust_RDsenate$vote, rdrobust_RDsenate$margin, c = 0, p = 2,
kernel = 'triangular', bwselect = 'msetwo',h=17.754)
summary(loc_fit_1)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1297
## BW type Manual
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 595 702
## Eff. Number of Obs. 360 323
## Order est. (p) 1 1
## Order bias (q) 2 2
## BW est. (h) 17.754 17.754
## BW bias (b) 17.754 17.754
## rho (h/b) 1.000 1.000
## Unique Obs. 595 702
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 7.414 1.459 5.083 0.000 [4.555 , 10.273]
## Robust - - 4.030 0.000 [4.274 , 12.368]
## =============================================================================
summary(loc_fit_2)#结果中coef即为效应值
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1297
## BW type Manual
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 595 702
## Eff. Number of Obs. 360 323
## Order est. (p) 2 2
## Order bias (q) 3 3
## BW est. (h) 17.754 17.754
## BW bias (b) 17.754 17.754
## rho (h/b) 1.000 1.000
## Unique Obs. 595 702
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 8.321 2.065 4.030 0.000 [4.274 , 12.368]
## Robust - - 4.167 0.000 [6.017 , 16.704]
## =============================================================================
对称剔除真实断点左右一定较小范围内的个体。 关于范围的大小同样没有明确的标准,通常选取多个范围比较即可。
rdrobust_RD_hole_1 <- subset(rdrobust_RDsenate, abs(margin) > 0.3)
rdrobust_RD_hole_2 <- subset(rdrobust_RDsenate, abs(margin) > 0.4)
sen_hole_1 <- rdrobust(rdrobust_RD_hole_1$vote, rdrobust_RD_hole_1$margin, c = 0, p = 2,
kernel = 'triangular', bwselect = 'msetwo') # #除c意外的其他参数应该与前面分析保持一致
sen_hole_2 <- rdrobust(rdrobust_RD_hole_2$vote, rdrobust_RD_hole_2$margin, c = 0, p = 2,
kernel = 'triangular', bwselect = 'msetwo') # #除c意外的其他参数应该与前面分析保持一致
summary(sen_hole_1)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1288
## BW type msetwo
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 592 696
## Eff. Number of Obs. 408 360
## Order est. (p) 2 2
## Order bias (q) 3 3
## BW est. (h) 22.386 21.767
## BW bias (b) 34.724 33.145
## rho (h/b) 0.645 0.657
## Unique Obs. 592 659
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 7.247 1.984 3.653 0.000 [3.358 , 11.135]
## Robust - - 3.314 0.001 [2.981 , 11.614]
## =============================================================================
summary(sen_hole_2)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1282
## BW type msetwo
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 590 692
## Eff. Number of Obs. 396 360
## Order est. (p) 2 2
## Order bias (q) 3 3
## BW est. (h) 21.721 22.272
## BW bias (b) 33.862 34.014
## rho (h/b) 0.641 0.655
## Unique Obs. 590 655
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 6.988 2.093 3.338 0.001 [2.885 , 11.091]
## Robust - - 3.009 0.003 [2.450 , 11.607]
## =============================================================================
#更换不同的带宽即可
loc_fit_3 <- rdrobust(rdrobust_RDsenate$vote, rdrobust_RDsenate$margin, c = 0, p = 2,
kernel = 'triangular', bwselect = 'mserd',h=28.028)# #如果有混杂因素需要控制,可以用covs = c('var1', 'var2')
loc_fit_4 <- rdrobust(rdrobust_RDsenate$vote, rdrobust_RDsenate$margin, c = 0, p = 2,
kernel = 'triangular', bwselect = 'msetwo',h=20.086)
summary(loc_fit_3)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1297
## BW type Manual
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 595 702
## Eff. Number of Obs. 465 437
## Order est. (p) 2 2
## Order bias (q) 3 3
## BW est. (h) 28.028 28.028
## BW bias (b) 28.028 28.028
## rho (h/b) 1.000 1.000
## Unique Obs. 595 702
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 7.418 1.717 4.320 0.000 [4.052 , 10.783]
## Robust - - 4.113 0.000 [4.739 , 13.369]
## =============================================================================
summary(loc_fit_4)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 1297
## BW type Manual
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 595 702
## Eff. Number of Obs. 391 346
## Order est. (p) 2 2
## Order bias (q) 3 3
## BW est. (h) 20.086 20.086
## BW bias (b) 20.086 20.086
## rho (h/b) 1.000 1.000
## Unique Obs. 595 702
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 8.157 1.954 4.175 0.000 [4.327 , 11.986]
## Robust - - 3.965 0.000 [5.189 , 15.332]
## =============================================================================