a.Donwload and read the paper:

d.Simplest regression-discountinuity design

The coefficient of the over age 21 indicator implies the change in all causes of death at age 21. The estimate shows evidence of a discrete increase of 7.66 and it is statistically significant. This result provides evidence of a positive causal relationship between alcohol consumption and deaths.

Regression in level

## 
## Call:
## lm(formula = all ~ agecell + over21, data = HW10)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.0559 -1.8483  0.1149  1.4909  5.8043 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 112.3097    12.6681   8.866 1.96e-11 ***
## agecell      -0.9747     0.6325  -1.541     0.13    
## over21        7.6627     1.4403   5.320 3.15e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.493 on 45 degrees of freedom
## Multiple R-squared:  0.5946, Adjusted R-squared:  0.5765 
## F-statistic: 32.99 on 2 and 45 DF,  p-value: 1.508e-09

Regression in percentage

## 
## Call:
## lm(formula = lnall ~ agecell + over21, data = HW10)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.054506 -0.018848  0.001014  0.016217  0.058464 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.717290   0.132047  35.724  < 2e-16 ***
## agecell     -0.009350   0.006592  -1.418    0.163    
## over21       0.078428   0.015013   5.224 4.35e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02599 on 45 degrees of freedom
## Multiple R-squared:  0.5944, Adjusted R-squared:  0.5764 
## F-statistic: 32.97 on 2 and 45 DF,  p-value: 1.52e-09

f.Allow more flexibility to your RD

The results provide the evidence that model is not well fitted in the linear regression under age 21, while it can be better fitted in the linear regression. The statistical significance implies that the deaths has a negative causal relationship with age when the age is over 21.

Regression under subsamples
Dependent variable:
Deaths under Age 21 Deaths over Age 21
(1) (2)
Age 0.827 -2.776***
(0.857) (0.779)
Constant 76.251*** 159.585***
(17.153) (17.140)
Observations 24 24
R2 0.041 0.366
Adjusted R2 -0.003 0.337
Residual Std. Error (df = 22) 2.388 2.172
F Statistic (df = 1; 22) 0.932 12.692***
Note: p<0.1; p<0.05; p<0.01

h.Compare to pre-packaged RDD

The LATE is 9.001 and it is statistically significant. The result is very different with what we caculated in the simplest RDD regression.

## 
## Call:
## RDestimate(formula = all ~ agecell, data = HW10, cutpoint = 21)
## 
## Type:
## sharp 
## 
## Estimates:
##            Bandwidth  Observations  Estimate  Std. Error  z value  Pr(>|z|) 
## LATE       1.6561     40            9.001     1.480       6.080    1.199e-09
## Half-BW    0.8281     20            9.579     1.914       5.004    5.609e-07
## Double-BW  3.3123     48            7.953     1.278       6.223    4.882e-10
##               
## LATE       ***
## Half-BW    ***
## Double-BW  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## F-statistics:
##            F      Num. DoF  Denom. DoF  p        
## LATE       33.08  3         36          3.799e-10
## Half-BW    29.05  3         16          2.078e-06
## Double-BW  32.54  3         44          6.129e-11

i.Prepackaged linear RDD

When adding options for kernel and bandwidth, the estimate of LATE is more closed to the estimate of the simplest RDD regression. Because the kernel is set up “triangular” in part h and now it is “rectangular”. In this part, we set up the RDD with a different way to give weights, since triangular kernel function gives a higher weight on values closed to the mean while rectangular function gives the same weight to all values.

## 
## Call:
## RDestimate(formula = all ~ agecell, data = HW10, cutpoint = 21, 
##     bw = 2, kernel = "rectangular")
## 
## Type:
## sharp 
## 
## Estimates:
##            Bandwidth  Observations  Estimate  Std. Error  z value  Pr(>|z|) 
## LATE       2          48            7.663     1.273       6.017    1.776e-09
## Half-BW    1          24            9.753     1.902       5.128    2.929e-07
## Double-BW  4          48            7.663     1.273       6.017    1.776e-09
##               
## LATE       ***
## Half-BW    ***
## Double-BW  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## F-statistics:
##            F      Num. DoF  Denom. DoF  p        
## LATE       29.47  3         44          2.651e-10
## Half-BW    16.82  3         20          2.167e-05
## Double-BW  29.47  3         44          2.651e-10