Regression Discontinuity Design (RDD) has surged in popularity in causal inference. Although the method has been around for decades, its use did not proliferate until the early 2000s. As the chart above shows, the use of RDD in published studies took off after 1999, and, not surprisingly, Angrist played a role. The main reason for RDD’s popularity is its ability to handle selection bias.
RDD is used in situations where a cutoff or threshold separates treatment from control. Fortunately, society has lots of arbitrary thresholds, such as GPA or test scores for school admission, blood alcohol limits, or body mass index cutoffs for determining medical treatment for COVID.
The main underpinning of RDD is the continuity assumption. To satisfy it, the expected outcomes must follow a continuous trend through the cutoff. This assumption allows the data just above and just below the cutoff to be treated as good as randomized. Below we simulate data and set a threshold at 50 to visualize the data on either side of it. As the resulting plot shows, the trend is continuous ABSENT treatment. Throughout this example, Y can be thought of as a student’s peak career salary and X as an important test score.
This walkthrough uses several packages, many of which may need to be installed first.
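The simulation described above can be sketched roughly as follows. The sample size, coefficients, and noise level are assumptions chosen for illustration, not the values behind the original plot, and the variable names (`above`, `jump`) are hypothetical. Because no treatment is applied, the estimated jump at the cutoff of 50 should be close to zero.

```r
# Illustrative simulation; sample size, coefficients, and noise are assumptions.
set.seed(1)
n <- 5000
x <- runif(n, min = 0, max = 100)       # X: test score
above <- as.integer(x > 50)             # side of the cutoff (no treatment is applied)
y <- 25 + 1.5 * x + rnorm(n, sd = 10)   # Y: peak salary (arbitrary units), smooth in X

# Estimated jump at the cutoff from a linear fit with a cutoff indicator;
# absent treatment it should be close to zero.
fit  <- lm(y ~ above + I(x - 50))
jump <- coef(fit)[["above"]]
jump

# Optional visual check when run interactively
if (interactive()) {
  plot(x, y, col = ifelse(above == 1, "steelblue", "grey40"), pch = 16, cex = 0.4)
  abline(v = 50, lty = 2)
}
```

If the continuity assumption failed, for example because something other than treatment also changed at 50, this coefficient would pick up that break as well.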
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(haven)
library(estimatr)
library(stats)
library(rdrobust)
library(rddensity)
library(rdd)
## Loading required package: sandwich
## Loading required package: lmtest
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Loading required package: AER
## Loading required package: car
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
##
## Loading required package: survival
## Loading required package: Formula
This post replicates an examination of political economy by Lee, Moretti, and Butler in their 2004 paper “Do Voters Affect or Elect Policies? Evidence from the U.S. House”. The research addresses a major question about the role of elections in candidates’ legislative voting behavior. Does competition for votes push candidates toward the center, driving policy compromise (Convergence Theory)? Or do voters simply elect policies, with the election serving as the process by which a policy option is chosen (Divergence Theory)?
Their findings support the latter. To examine this, they analyzed data from US House races from 1946 to 1995. Using these data, they estimate the effect of a Democratic candidate’s electoral strength on subsequent roll-call voting records, as measured by scores from the Americans for Democratic Action organization. Simply put: if candidates perform well in an election, how does that affect their legislative votes?
Selection bias creates serious hurdles for this analysis. The winners of a seat are endogenously determined by things like voter demographics, candidate quality, and candidate resources. Enter RDD. By focusing on the subset of elections near the winning threshold of 50% of the vote, the variation in electoral strength can be treated as exogenous. Selection bias is circumvented because, close to the cutoff, which side a race lands on is as good as random. The 50% margin is also where voter preferences are most similar.
Choosing the range for this subset of data is important. Here the subset is defined as races where the lagged Democratic vote share falls between 48% and 52%.
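# Helper: download a Stata file from the Mixtape repository
read_data <- function(df)
{
  full_path <- paste("https://raw.github.com/scunning1975/mixtape/master/",
                     df, sep = "")
  df <- read_dta(full_path)
  return(df)
}

lmb_data <- read_data("lmb-data.dta")

# Keep only races whose previous election was decided within 2 points of 50%
lmb_subset <- lmb_data %>%
  filter(lagdemvoteshare>.48 & lagdemvoteshare<.52)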
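To see why the width of the window matters, here is a minimal sketch on simulated data (not the LMB data). The data-generating process, the true jump of 20, and the helper name `window_estimate` are all assumptions made for illustration. It compares a simple difference in means inside windows of different half-widths around 0.5.

```r
# Simulated example; the data-generating process is an assumption.
set.seed(42)
n <- 20000
share <- runif(n)                        # hypothetical vote share
win   <- as.integer(share > 0.5)         # 1 if above the 50% cutoff
y <- 30 + 20 * win + 60 * (share - 0.5) + rnorm(n, sd = 10)  # true jump = 20

# Difference in mean outcomes between winners and losers within +/- h of 0.5
window_estimate <- function(h) {
  keep <- abs(share - 0.5) < h
  c(half_width = h,
    n = sum(keep),
    jump = mean(y[keep & win == 1]) - mean(y[keep & win == 0]))
}

res <- t(sapply(c(0.02, 0.05, 0.10, 0.50), window_estimate))
res
```

Narrow windows keep far fewer observations but stay close to the true jump, while wide windows fold the underlying trend in the running variable into the estimate. That trade-off is why the choice of range deserves attention.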
While the code below for each set of regressions is nearly identical, notice the difference in the data used. The local regressions use lmb_subset, the races within 0.02 of the 0.50 cutoff, while the global regressions use the entire dataset.
local_lm_1 <- lm_robust(score ~ lagdemocrat, data = lmb_subset, clusters = id)
local_lm_2 <- lm_robust(score ~ democrat, data = lmb_subset, clusters = id)
local_lm_3 <- lm_robust(democrat ~ lagdemocrat, data = lmb_subset, clusters = id)
summary(local_lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat, data = lmb_subset, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 31.20 1.334 23.39 5.880e-80 28.57 33.82 454.0
## lagdemocrat 21.28 1.951 10.91 3.988e-26 17.45 25.11 912.9
##
## Multiple R-squared: 0.1152 , Adjusted R-squared: 0.1142
## F-statistic: 119 on 1 and 914 DF, p-value: < 2.2e-16
summary(local_lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat, data = lmb_subset, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 18.75 0.8432 22.23 2.313e-75 17.09 20.40 470.0
## democrat 47.71 1.3560 35.18 7.434e-172 45.04 50.37 909.8
##
## Multiple R-squared: 0.5783 , Adjusted R-squared: 0.5779
## F-statistic: 1238 on 1 and 914 DF, p-value: < 2.2e-16
summary(local_lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat, data = lmb_subset,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 0.2418 0.02009 12.03 3.935e-29 0.2023 0.2812 454.0
## lagdemocrat 0.4843 0.02893 16.74 4.627e-55 0.4275 0.5411 912.9
##
## Multiple R-squared: 0.2348 , Adjusted R-squared: 0.2339
## F-statistic: 280.2 on 1 and 914 DF, p-value: < 2.2e-16
global_lm_1 <- lm_robust(score ~ lagdemocrat, data = lmb_data, clusters = id)
global_lm_2 <- lm_robust(score ~ democrat, data = lmb_data, clusters = id)
global_lm_3 <- lm_robust(democrat ~ lagdemocrat, data = lmb_data, clusters = id)
summary(global_lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 23.54 0.3375 69.75 0 22.88 24.20 5669
## lagdemocrat 31.51 0.4837 65.14 0 30.56 32.45 12211
##
## Multiple R-squared: 0.2267 , Adjusted R-squared: 0.2267
## F-statistic: 4243 on 1 and 13587 DF, p-value: < 2.2e-16
summary(global_lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 17.58 0.2626 66.94 0 17.06 18.09 5479
## democrat 40.76 0.4182 97.48 0 39.94 41.58 11758
##
## Multiple R-squared: 0.3756 , Adjusted R-squared: 0.3755
## F-statistic: 9502 on 1 and 13587 DF, p-value: < 2.2e-16
summary(global_lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat, data = lmb_data,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 0.1201 0.004318 27.82 9.396e-160 0.1116 0.1286 5669
## lagdemocrat 0.8179 0.005098 160.44 0.000e+00 0.8079 0.8279 12211
##
## Multiple R-squared: 0.6759 , Adjusted R-squared: 0.6759
## F-statistic: 2.574e+04 on 1 and 13587 DF, p-value: < 2.2e-16
# Center the running variable at the 50% cutoff
lmb_data <- lmb_data %>%
  mutate(demvoteshare_c = demvoteshare - 0.5)
lm_1 <- lm_robust(score ~ lagdemocrat + demvoteshare_c, data = lmb_data, clusters = id)
lm_2 <- lm_robust(score ~ democrat + demvoteshare_c, data = lmb_data, clusters = id)
lm_3 <- lm_robust(democrat ~ lagdemocrat + demvoteshare_c, data = lmb_data, clusters = id)
summary(lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat + demvoteshare_c, data = lmb_data,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 22.883 0.4433 51.616 0.000e+00 22.014 23.753 6255
## lagdemocrat 33.451 0.8482 39.436 4.502e-307 31.788 35.114 6936
## demvoteshare_c -5.626 1.8982 -2.964 3.056e-03 -9.347 -1.904 4626
##
## Multiple R-squared: 0.2274 , Adjusted R-squared: 0.2273
## F-statistic: 2116 on 2 and 13576 DF, p-value: < 2.2e-16
summary(lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat + demvoteshare_c, data = lmb_data,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 11.03 0.3363 32.81 2.384e-219 10.37 11.69 6820
## democrat 58.50 0.6564 89.13 0.000e+00 57.22 59.79 8717
## demvoteshare_c -48.94 1.6416 -29.81 9.262e-180 -52.16 -45.72 4955
##
## Multiple R-squared: 0.4242 , Adjusted R-squared: 0.4241
## F-statistic: 6192 on 2 and 13576 DF, p-value: < 2.2e-16
summary(lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat + demvoteshare_c,
## data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## (Intercept) 0.2117 0.005275 40.13 1.512e-313 0.2013 0.2220 6255
## lagdemocrat 0.5516 0.010324 53.43 0.000e+00 0.5314 0.5718 6936
## demvoteshare_c 0.7725 0.018758 41.18 3.721e-316 0.7358 0.8093 4626
##
## Multiple R-squared: 0.7354 , Adjusted R-squared: 0.7353
## F-statistic: 4.896e+04 on 2 and 13576 DF, p-value: < 2.2e-16
lm_1 <- lm_robust(score ~ lagdemocrat*demvoteshare_c,
                  data = lmb_data, clusters = id)
lm_2 <- lm_robust(score ~ democrat*demvoteshare_c,
                  data = lmb_data, clusters = id)
lm_3 <- lm_robust(democrat ~ lagdemocrat*demvoteshare_c,
                  data = lmb_data, clusters = id)
summary(lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat * demvoteshare_c, data = lmb_data,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 31.44 0.5411 58.10 0.000e+00 30.37
## lagdemocrat 30.51 0.8173 37.33 2.081e-273 28.91
## demvoteshare_c 66.04 3.1610 20.89 7.355e-80 59.84
## lagdemocrat:demvoteshare_c -96.47 3.8530 -25.04 1.623e-117 -104.03
## CI Upper DF
## (Intercept) 32.50 2241.8
## lagdemocrat 32.11 5781.1
## demvoteshare_c 72.25 935.8
## lagdemocrat:demvoteshare_c -88.92 1647.0
##
## Multiple R-squared: 0.2669 , Adjusted R-squared: 0.2668
## F-statistic: 1863 on 3 and 13576 DF, p-value: < 2.2e-16
summary(lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat * demvoteshare_c, data = lmb_data,
## clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 16.816 0.4186 40.174 1.076e-277 16.00
## democrat 55.431 0.6374 86.960 0.000e+00 54.18
## demvoteshare_c -5.683 2.6106 -2.177 2.976e-02 -10.81
## democrat:demvoteshare_c -55.152 3.2189 -17.134 5.886e-60 -61.47
## CI Upper DF
## (Intercept) 17.6367 2730.7
## democrat 56.6809 6390.3
## demvoteshare_c -0.5592 886.2
## democrat:demvoteshare_c -48.8376 1417.7
##
## Multiple R-squared: 0.4344 , Adjusted R-squared: 0.4343
## F-statistic: 4161 on 3 and 13576 DF, p-value: < 2.2e-16
summary(lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat * demvoteshare_c,
## data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 0.2869 0.007752 37.01 2.051e-234 0.2717
## lagdemocrat 0.5257 0.010453 50.30 0.000e+00 0.5052
## demvoteshare_c 1.4029 0.044367 31.62 7.355e-150 1.3159
## lagdemocrat:demvoteshare_c -0.8486 0.049019 -17.31 8.075e-62 -0.9448
## CI Upper DF
## (Intercept) 0.3021 2241.8
## lagdemocrat 0.5462 5781.1
## demvoteshare_c 1.4900 935.8
## lagdemocrat:demvoteshare_c -0.7525 1647.0
##
## Multiple R-squared: 0.7489 , Adjusted R-squared: 0.7488
## F-statistic: 2.519e+04 on 3 and 13576 DF, p-value: < 2.2e-16
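The models above center the running variable at 0.5 and interact it with the treatment indicator. A minimal sketch on simulated data (every number here is an illustrative assumption) suggests why the centering matters: with an interaction in the model, the coefficient on the indicator is the gap between the two fitted lines where the running variable equals zero, so centering moves that comparison to the cutoff.

```r
# Simulated example; every number here is an illustrative assumption.
set.seed(7)
n <- 5000
share <- runif(n)                       # running variable on [0, 1]
treat <- as.integer(share > 0.5)        # sharp assignment at the 0.5 cutoff
# True jump at the cutoff is 10; the slope differs on each side.
y <- 20 + 10 * treat + 30 * (share - 0.5) + 40 * treat * (share - 0.5) +
  rnorm(n, sd = 5)

share_c <- share - 0.5                  # centered running variable

fit_raw      <- lm(y ~ treat * share)   # uncentered
fit_centered <- lm(y ~ treat * share_c) # centered at the cutoff

jump_raw      <- coef(fit_raw)[["treat"]]       # gap between fitted lines at share = 0
jump_centered <- coef(fit_centered)[["treat"]]  # gap between fitted lines at share = 0.5
c(raw = jump_raw, centered = jump_centered)
```

Both fits describe the same pair of lines; only the point at which the gap is reported changes. The centered version reports it at the cutoff, which is the quantity RDD is after.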
lmb_data <- lmb_data %>%
  mutate(demvoteshare_sq = demvoteshare_c^2)
lm_1 <- lm_robust(score ~ lagdemocrat*demvoteshare_c + lagdemocrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
lm_2 <- lm_robust(score ~ democrat*demvoteshare_c + democrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
lm_3 <- lm_robust(democrat ~ lagdemocrat*demvoteshare_c + lagdemocrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
summary(lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat * demvoteshare_c + lagdemocrat *
## demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 33.55 0.7135 47.016 3.892e-226 32.15
## lagdemocrat 13.03 1.2856 10.135 4.366e-23 10.51
## demvoteshare_c 134.98 9.7861 13.793 7.598e-23 115.50
## demvoteshare_sq 212.13 22.7626 9.319 1.334e-14 166.86
## lagdemocrat:demvoteshare_c 57.05 15.4123 3.702 2.617e-04 26.70
## lagdemocrat:demvoteshare_sq -641.85 31.3309 -20.486 5.818e-53 -703.60
## CI Upper DF
## (Intercept) 34.95 752.38
## lagdemocrat 15.55 1040.83
## demvoteshare_c 154.45 79.97
## demvoteshare_sq 257.39 84.13
## lagdemocrat:demvoteshare_c 87.40 257.66
## lagdemocrat:demvoteshare_sq -580.10 220.74
##
## Multiple R-squared: 0.3707 , Adjusted R-squared: 0.3705
## F-statistic: 1526 on 5 and 13576 DF, p-value: < 2.2e-16
summary(lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat * demvoteshare_c + democrat *
## demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 15.61 0.5748 27.152 6.275e-140 14.48
## democrat 44.40 0.9087 48.865 0.000e+00 42.62
## demvoteshare_c -23.85 6.7132 -3.553 3.873e-04 -37.01
## demvoteshare_sq -41.73 14.6858 -2.841 4.578e-03 -70.55
## democrat:demvoteshare_c 111.90 9.7809 11.440 5.295e-30 92.72
## democrat:demvoteshare_sq -229.95 19.5462 -11.765 5.378e-31 -268.29
## CI Upper DF
## (Intercept) 16.73 2160
## democrat 46.18 4444
## demvoteshare_c -10.69 2961
## demvoteshare_sq -12.91 1040
## democrat:demvoteshare_c 131.07 6122
## democrat:demvoteshare_sq -191.62 2113
##
## Multiple R-squared: 0.4559 , Adjusted R-squared: 0.4557
## F-statistic: 2589 on 5 and 13576 DF, p-value: < 2.2e-16
summary(lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat * demvoteshare_c +
## lagdemocrat * demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 0.32965 0.01272 25.9114 2.627e-106 0.3047
## lagdemocrat 0.32168 0.01844 17.4459 5.475e-60 0.2855
## demvoteshare_c 2.79834 0.19629 14.2563 1.149e-23 2.4077
## demvoteshare_sq 4.29401 0.45554 9.4262 8.122e-15 3.3881
## lagdemocrat:demvoteshare_c 0.09094 0.24142 0.3767 7.067e-01 -0.3845
## lagdemocrat:demvoteshare_sq -8.80437 0.51723 -17.0223 4.614e-42 -9.8237
## CI Upper DF
## (Intercept) 0.3546 752.38
## lagdemocrat 0.3579 1040.83
## demvoteshare_c 3.1890 79.97
## demvoteshare_sq 5.1999 84.13
## lagdemocrat:demvoteshare_c 0.5663 257.66
## lagdemocrat:demvoteshare_sq -7.7850 220.74
##
## Multiple R-squared: 0.822 , Adjusted R-squared: 0.8219
## F-statistic: 8.973e+04 on 5 and 13576 DF, p-value: < 2.2e-16
# Note: this overwrites lmb_data, so everything from here on uses only
# races with demvoteshare between 0.45 and 0.55
lmb_data <- lmb_data %>%
  filter(demvoteshare > .45 & demvoteshare < .55) %>%
  mutate(demvoteshare_sq = demvoteshare_c^2)
lm_1 <- lm_robust(score ~ lagdemocrat*demvoteshare_c + lagdemocrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
lm_2 <- lm_robust(score ~ democrat*demvoteshare_c + democrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
lm_3 <- lm_robust(democrat ~ lagdemocrat*demvoteshare_c + lagdemocrat*demvoteshare_sq,
                  data = lmb_data, clusters = id)
summary(lm_1)
##
## Call:
## lm_robust(formula = score ~ lagdemocrat * demvoteshare_c + lagdemocrat *
## demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 37.121 0.9689 38.312 4.125e-185 35.219
## lagdemocrat 7.347 1.5872 4.629 4.024e-06 4.233
## demvoteshare_c 830.925 20.9558 39.651 2.272e-139 789.725
## demvoteshare_sq 5333.335 838.3250 6.362 4.513e-10 3686.254
## lagdemocrat:demvoteshare_c -156.876 35.7396 -4.389 1.343e-05 -227.067
## lagdemocrat:demvoteshare_sq -10116.678 1435.1301 -7.049 3.858e-12 -12933.700
## CI Upper DF
## (Intercept) 39.02 822.7
## lagdemocrat 10.46 1385.7
## demvoteshare_c 872.12 392.7
## demvoteshare_sq 6980.42 499.1
## lagdemocrat:demvoteshare_c -86.69 598.9
## lagdemocrat:demvoteshare_sq -7299.66 808.3
##
## Multiple R-squared: 0.4447 , Adjusted R-squared: 0.4435
## F-statistic: 469 on 5 and 2386 DF, p-value: < 2.2e-16
summary(lm_2)
##
## Call:
## lm_robust(formula = score ~ democrat * demvoteshare_c + democrat *
## demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 21.44 1.819 11.7874 4.848e-26 17.86
## democrat 45.19 2.679 16.8688 7.128e-51 39.93
## demvoteshare_c 450.85 161.352 2.7942 5.416e-03 133.79
## demvoteshare_sq 7878.90 2995.192 2.6305 8.763e-03 1995.55
## democrat:demvoteshare_c -688.34 247.711 -2.7788 5.570e-03 -1174.51
## democrat:demvoteshare_sq -3887.82 4802.371 -0.8096 4.184e-01 -13311.05
## CI Upper DF
## (Intercept) 25.02 263.6
## democrat 50.45 500.2
## demvoteshare_c 767.91 469.4
## demvoteshare_sq 13762.26 552.5
## democrat:demvoteshare_c -202.18 894.8
## democrat:demvoteshare_sq 5535.41 1060.8
##
## Multiple R-squared: 0.5626 , Adjusted R-squared: 0.5617
## F-statistic: 617.6 on 5 and 2386 DF, p-value: < 2.2e-16
summary(lm_3)
##
## Call:
## lm_robust(formula = democrat ~ lagdemocrat * demvoteshare_c +
## lagdemocrat * demvoteshare_sq, data = lmb_data, clusters = id)
##
## Standard error type: CR2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 0.4181 0.01316 31.7648 3.807e-145 0.3923
## lagdemocrat 0.1674 0.01955 8.5660 2.810e-17 0.1291
## demvoteshare_c 15.6990 0.22762 68.9697 1.503e-221 15.2515
## demvoteshare_sq 91.6069 10.89337 8.4094 4.351e-16 70.2044
## lagdemocrat:demvoteshare_c 0.1245 0.35711 0.3487 7.275e-01 -0.5768
## lagdemocrat:demvoteshare_sq -188.3286 16.35131 -11.5176 1.576e-28 -220.4247
## CI Upper DF
## (Intercept) 0.4439 822.7
## lagdemocrat 0.2058 1385.7
## demvoteshare_c 16.1465 392.7
## demvoteshare_sq 113.0094 499.1
## lagdemocrat:demvoteshare_c 0.8258 598.9
## lagdemocrat:demvoteshare_sq -156.2326 808.3
##
## Multiple R-squared: 0.7743 , Adjusted R-squared: 0.7738
## F-statistic: 6704 on 5 and 2386 DF, p-value: < 2.2e-16
#aggregating the data
categories <- lmb_data$lagdemvoteshare
demmeans <- split(lmb_data$score, cut(lmb_data$lagdemvoteshare, 100)) %>%
  lapply(mean) %>%
  unlist()
agg_lmb_data <- data.frame(score = demmeans, lagdemvoteshare = seq(0.01,1, by = 0.01))
#plotting
lmb_data <- lmb_data %>%
  mutate(gg_group = case_when(lagdemvoteshare > 0.5 ~ 1, TRUE ~ 0))
ggplot(lmb_data, aes(lagdemvoteshare, score)) +
  geom_point(aes(x = lagdemvoteshare, y = score), data = agg_lmb_data) +
  stat_smooth(aes(lagdemvoteshare, score, group = gg_group), method = "lm",
              formula = y ~ x + I(x^2)) +
  xlim(0,1) + ylim(0,100) +
  geom_vline(xintercept = 0.5)
## Warning: Removed 195 rows containing non-finite values (stat_smooth).
## Warning: Removed 39 rows containing missing values (geom_point).
ggplot(lmb_data, aes(lagdemvoteshare, score)) +
  geom_point(aes(x = lagdemvoteshare, y = score), data = agg_lmb_data) +
  stat_smooth(aes(lagdemvoteshare, score, group = gg_group), method = "loess") +
  xlim(0,1) + ylim(0,100) +
  geom_vline(xintercept = 0.5)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 195 rows containing non-finite values (stat_smooth).
## Removed 39 rows containing missing values (geom_point).
ggplot(lmb_data, aes(lagdemvoteshare, score)) +
  geom_point(aes(x = lagdemvoteshare, y = score), data = agg_lmb_data) +
  stat_smooth(aes(lagdemvoteshare, score, group = gg_group), method = "lm") +
  xlim(0,1) + ylim(0,100) +
  geom_vline(xintercept = 0.5)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 195 rows containing non-finite values (stat_smooth).
## Removed 39 rows containing missing values (geom_point).
smooth_dem0 <- lmb_data %>%
  filter(democrat == 0) %>%
  select(score, demvoteshare)
smooth_dem0 <- as_tibble(ksmooth(smooth_dem0$demvoteshare, smooth_dem0$score,
                                 kernel = "box", bandwidth = 0.1))
smooth_dem1 <- lmb_data %>%
  filter(democrat == 1) %>%
  select(score, demvoteshare) %>%
  na.omit()
smooth_dem1 <- as_tibble(ksmooth(smooth_dem1$demvoteshare, smooth_dem1$score,
                                 kernel = "box", bandwidth = 0.1))
ggplot() +
  geom_smooth(aes(x, y), data = smooth_dem0) +
  geom_smooth(aes(x, y), data = smooth_dem1) +
  geom_vline(xintercept = 0.5)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 3)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 3)
rdr <- rdrobust(y = lmb_data$score,
                x = lmb_data$demvoteshare, c = 0.5)
## [1] "Mass points detected in the running variable."
summary(rdr)
## Sharp RD estimates using local polynomial regression.
##
## Number of Obs. 2387
## BW type mserd
## Kernel Triangular
## VCE method NN
##
## Number of Obs. 1206 1181
## Eff. Number of Obs. 369 338
## Order est. (p) 1 1
## Order bias (q) 2 2
## BW est. (h) 0.015 0.015
## BW bias (b) 0.022 0.022
## rho (h/b) 0.678 0.678
## Unique Obs. 636 623
##
## =============================================================================
## Method Coef. Std. Err. z P>|z| [ 95% C.I. ]
## =============================================================================
## Conventional 43.536 3.119 13.961 0.000 [37.424 , 49.649]
## Robust - - 11.546 0.000 [35.879 , 50.550]
## =============================================================================
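The summary above reports a local linear fit (p = 1) with a triangular kernel. As a rough illustration of that idea only, here is a base R sketch on simulated data. It is not what rdrobust does internally: there is no data-driven bandwidth selection and no bias correction or robust inference. The bandwidth `h`, the data-generating process, and the true jump of 45 are assumptions picked by hand.

```r
# Simulated example; the data-generating process and bandwidth are assumptions.
set.seed(2024)
n <- 4000
x <- runif(n)                             # running variable
d <- as.integer(x >= 0.5)                 # treated at or above the cutoff
y <- 15 + 45 * d + 25 * (x - 0.5) + rnorm(n, sd = 10)   # assumed true jump of 45

h  <- 0.10                                # bandwidth picked by hand
xc <- x - 0.5                             # center at the cutoff
w  <- pmax(0, 1 - abs(xc) / h)            # triangular kernel weights, zero outside h

# Keep observations inside the bandwidth and fit a weighted linear model
# with separate slopes on each side of the cutoff
dat <- data.frame(y = y, d = d, xc = xc, w = w)[w > 0, ]
fit <- lm(y ~ d * xc, data = dat, weights = w)
jump <- coef(fit)[["d"]]
jump
```

The triangular weights put the most emphasis on observations closest to the cutoff, which is where the as-good-as-random argument is strongest.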
DCdensity(lmb_data$demvoteshare, cutpoint = 0.5)
## [1] 0.2666621
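As far as I understand the rdd package, DCdensity() returns the p-value from McCrary’s test for a jump in the density of the running variable at the cutoff, so a value of about 0.27 gives no evidence of sorting around 50%. The intuition can be sketched with a much cruder check on simulated data: count observations just below and just above the cutoff and ask whether the split is compatible with 50/50. This binomial check is a swapped-in simplification, not the McCrary procedure, and the window `h` and simulated distribution are assumptions.

```r
# Simulated running variable with no manipulation; values are assumptions.
set.seed(123)
share <- rbeta(5000, 5, 5)     # smooth density, symmetric around 0.5
h <- 0.02                      # assumed half-width of the comparison window

below <- sum(share >= 0.5 - h & share < 0.5)
above <- sum(share >= 0.5 & share < 0.5 + h)

# With a smooth density, an observation in the window is about equally
# likely to fall on either side of the cutoff
check <- binom.test(above, below + above, p = 0.5)
check$p.value
```

A very small p-value would suggest bunching on one side of the cutoff, which is the pattern one would expect if units could manipulate the running variable.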
density <- rddensity(lmb_data$demvoteshare, c = 0.5)
rdplotdensity(density, lmb_data$demvoteshare)
## $Estl
## Call: lpdensity
##
## Sample size 1206
## Polynomial order for point estimation (p=) 2
## Order of derivative estimated (v=) 1
## Polynomial order for confidence interval (q=) 3
## Kernel function triangular
## Scaling factor 0.505029337803856
## Bandwidth method user provided
##
## Use summary(...) to show estimates.
##
## $Estr
## Call: lpdensity
##
## Sample size 1181
## Polynomial order for point estimation (p=) 2
## Order of derivative estimated (v=) 1
## Polynomial order for confidence interval (q=) 3
## Kernel function triangular
## Scaling factor 0.49455155071249
## Bandwidth method user provided
##
## Use summary(...) to show estimates.
##
## $Estplot