Exploring Regression Discontinuity Designs

Based on Carpenter and Dobkin (2009) The Effect of Alcohol Consumption on Mortality: Regression Discontinuity Evidence from the Minimum Drinking Age AEJ: Applied Economics 1(1)164-82

a. Download and read the paper.

✅ ✅

b. Get the Carpenter and Dobkin data. … Make an over21 dummy to help with the following analysis.

# fetch dataset from the url and retrieve it from local storage
download.file("http://masteringmetrics.com/wp-content/uploads/2015/01/AEJfigs.dta", "drinking")
drinking <- haven::read_dta("drinking") 
colnames(drinking)
##  [1] "agecell"             "all"                 "allfitted"          
##  [4] "internal"            "internalfitted"      "external"           
##  [7] "externalfitted"      "alcohol"             "alcoholfitted"      
## [10] "homicide"            "homicidefitted"      "suicide"            
## [13] "suicidefitted"       "mva"                 "mvafitted"          
## [16] "drugs"               "drugsfitted"         "externalother"      
## [19] "externalotherfitted"
head(drinking)
## # A tibble: 6 x 19
##   agecell   all allfitted internal internalfitted external externalfitted
##     <dbl> <dbl>     <dbl>    <dbl>          <dbl>    <dbl>          <dbl>
## 1    19.1  92.8      91.7     16.6           16.7     76.2           75.0
## 2    19.2  95.1      91.9     18.3           16.9     76.8           75.0
## 3    19.2  92.1      92.0     18.9           17.1     73.2           75.0
## 4    19.3  88.4      92.2     16.1           17.3     72.3           74.9
## 5    19.4  88.7      92.3     17.4           17.4     71.3           74.9
## 6    19.5  90.2      92.5     17.9           17.6     72.3           74.9
## # … with 12 more variables: alcohol <dbl>, alcoholfitted <dbl>, homicide <dbl>,
## #   homicidefitted <dbl>, suicide <dbl>, suicidefitted <dbl>, mva <dbl>,
## #   mvafitted <dbl>, drugs <dbl>, drugsfitted <dbl>, externalother <dbl>,
## #   externalotherfitted <dbl>
# to create the 'over21' dummy (as a factor)
library(dplyr)
drinking <- drinking %>% mutate(over21 = as.factor(ifelse(agecell >= 21.0, 1, 0)))
table(drinking$over21)
## 
##  0  1 
## 25 25

c. Reproduce Figure 3 of the paper.

# easier to plot if we transform the table into long format
# but not like this... this is a crude workaround to 
# get around the problem with grouping. hence, (probably) the joins 
# between lines either side of the discontinuity 
library(tidyr)
drinkingPlot <- pivot_longer(drinking, c(2, 4, 6) , names_to = "fit", values_to = "fits")
drinkingPlot <- pivot_longer(drinkingPlot, 2:4 , names_to = "fitted", values_to = "fitteds")
colnames(drinkingPlot) 
##  [1] "agecell"             "alcohol"             "alcoholfitted"      
##  [4] "homicide"            "homicidefitted"      "suicide"            
##  [7] "suicidefitted"       "mva"                 "mvafitted"          
## [10] "drugs"               "drugsfitted"         "externalother"      
## [13] "externalotherfitted" "over21"              "fit"                
## [16] "fits"                "fitted"              "fitteds"
library(ggplot2)
ggplot(na.omit(drinkingPlot), aes(x = agecell))+ 
  geom_point(aes(y = fits, shape = fit)) +   # draw points based on non-fitted values 
  geom_line(aes(y = fitteds, linetype = fitted)) + # lines for fitted values
  scale_shape_manual(values = c(18, 17,0), names("")) + # adjust shapes 
  scale_linetype_manual(values= c( "dotted", "solid", "dashed"), names("")) + 
  labs(title = "Age Profile for Death Rates", x = "Age", y = "Deaths per 100,000 person-years") 

d. Run a simple RD regression (same slope, with a jump) for “all” deaths by age.

# simple Regression Discontinuity model with the trusty lm function
regDisc <- lm(all ~ agecell + over21,  data = na.omit(drinking))
library(stargazer)
stargazer(regDisc, 
                     title = "Simple RD regression estimates",
                     type = "html", keep.stat = c("n","adj.rsq"),
                     dep.var.caption= "", 
                     dep.var.labels = c("All Deaths"),
                     covariate.labels = c("Age", "Over 21"),
                     omit = "Constant")
Simple RD regression estimates
All Deaths
Age -0.975
(0.632)
Over 21 7.663***
(1.440)
Observations 48
Adjusted R2 0.577
Note: p<0.1; p<0.05; p<0.01

e. Plot the results

na.omit(drinking) %>% mutate(fit = predict(regDisc, na.omit(drinking)))  %>% 
  ggplot(aes(x = agecell, y = fit, color = over21)) + 
  geom_line()  + geom_point(aes(y = all)) +
  labs(title=" Simple Regression Discontinuity Plot", x = "Age", y = "Deaths", colour = "Over 21") 

f. Allow more flexibility to your RD

Run another RD where not only the intercept, but also the slope can differ between the under- and over-21 subsamples.

This is equivalent to adding an interaction term in the previous model.

regDiscFlex <- lm(all ~ agecell + over21 + agecell * over21,  data = drinking)
stargazer(regDiscFlex, 
                     title = "Flexible RD regression estimates",
                     type = "html", keep.stat = c("n","adj.rsq"),
                     dep.var.caption= "", 
                     dep.var.labels = c("All Deaths"),
                     covariate.labels = c("Age", "Over 21", "Interaction Term"),
                     omit = "Constant")
Flexible RD regression estimates
All Deaths
Age 0.827
(0.819)
Over 21 83.333***
(24.357)
Interaction Term -3.603***
(1.158)
Observations 48
Adjusted R2 0.645
Note: p<0.1; p<0.05; p<0.01

The coefficient estimate on the Over 21 dummy is much bigger. This may be because this design is less biased by accommodating for non-linearities in deaths over time.

g. Again, plot the data

# no confidence bands around the fits, had some issues with the
# geom_smooth "method" and "formula" parameters

na.omit(drinking) %>%   mutate(fit2 = predict(regDiscFlex, na.omit(drinking)))  %>% 
  ggplot(aes(x = agecell, y = fit2, color = over21)) + geom_point(mapping=(aes(y = all)))+
  geom_line() + 
  labs(title="Flexible Regression Discontinuity Plot", x = "Age", y = "Deaths", colour = "Over 21") 

f. Compare to pre-packaged RDD

library(rdd)
summary(RDestimate(all ~ agecell, data = na.omit(drinking), cutpoint =21))
## 
## Call:
## RDestimate(formula = all ~ agecell, data = na.omit(drinking), 
##     cutpoint = 21)
## 
## Type:
## sharp 
## 
## Estimates:
##            Bandwidth  Observations  Estimate  Std. Error  z value  Pr(>|z|) 
## LATE       1.6561     40            9.001     1.480       6.080    1.199e-09
## Half-BW    0.8281     20            9.579     1.914       5.004    5.609e-07
## Double-BW  3.3123     48            7.953     1.278       6.223    4.882e-10
##               
## LATE       ***
## Half-BW    ***
## Double-BW  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## F-statistics:
##            F      Num. DoF  Denom. DoF  p        
## LATE       33.08  3         36          3.799e-10
## Half-BW    29.05  3         16          2.078e-06
## Double-BW  32.54  3         44          6.129e-11

Coefficient estimate for Local Average Treatment Effect is now bigger with smaller standard errors. Can be verified in the wider gap between the lines at the point of the discontinuity in the plot below.

plot(RDestimate(all ~ agecell, data = na.omit(drinking), cutpoint =21))

g. Pre-packaged linear RDD

summary(RDestimate(all ~ agecell, data = na.omit(drinking), cutpoint=21, kernel = "rectangular", bw = 2))
## 
## Call:
## RDestimate(formula = all ~ agecell, data = na.omit(drinking), 
##     cutpoint = 21, bw = 2, kernel = "rectangular")
## 
## Type:
## sharp 
## 
## Estimates:
##            Bandwidth  Observations  Estimate  Std. Error  z value  Pr(>|z|) 
## LATE       2          48            7.663     1.273       6.017    1.776e-09
## Half-BW    1          24            9.753     1.902       5.128    2.929e-07
## Double-BW  4          48            7.663     1.273       6.017    1.776e-09
##               
## LATE       ***
## Half-BW    ***
## Double-BW  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## F-statistics:
##            F      Num. DoF  Denom. DoF  p        
## LATE       29.47  3         44          2.651e-10
## Half-BW    16.82  3         20          2.167e-05
## Double-BW  29.47  3         44          2.651e-10

Here, the LATE estimate is once again 7.663 as in our first specification but the standard error bands are now narrower. This smoothing is the result of our choice of bandwidth and kernel. Some discussion on this may be found in Lee and Lemieux (2010) Regression Discontinuity Designs in Economics. In short, the bandwidth parameter bw allows for adjusting the size of the “bins” of data either side of the region of discontinuity. Bigger bins mean lower noise but having narrower bins allows for a more credible comparison between observations “close enough” on both sides of the cutoff point. The choice of kernel characterizes weighting of the data on the two sides of the discontinuity, rectangular meaning no weights.

plot(RDestimate(all ~ agecell, data = na.omit(drinking), cutpoint=21, kernel = "rectangular", bw = 2))