Problem Set 4

For this problem set we will replicate a well-known paper that uses RDD: “The Effect of Alcohol Consumption on Mortality: Regression Discontinuity Evidence from the Minimum Drinking Age” by Christopher Carpenter and Carlos Dobkin. In this paper, the authors investigate whether alcohol consumption by young adults increases their mortality rate indirectly, by increasing the rate at which they are involved in motor vehicle accidents. Use the rdd_cd file available on Canvas. This dataset contains average mortality rates (per 100,000) by cause of death for age groups between 19 and 22 in the US.

Description of variables

agecell: age group
all: number of deaths (per 100,000) from all causes of death
internal: number of deaths (per 100,000) from all internal causes
external: number of deaths (per 100,000) from all external causes
alcohol: number of deaths (per 100,000) from alcohol-related causes
homicide: number of deaths (per 100,000) from homicide
suicide: number of deaths (per 100,000) from suicide
mva: number of deaths (per 100,000) from motor vehicle accidents
drugs: number of deaths (per 100,000) from drug-related causes

Question 1 Replicate rows 1, 2 and 3 of Table 4.1 (page 160) from the Mastering Metrics book. That is, find the coefficient of interest on D_a for all deaths, motor vehicle accident (MVA) deaths and suicides, running each regression both for ages 19-22 and for ages 20-21, once with only age as a control and once with age, age^2 and age^2 interacted with being older than 21. (Hint: you will have to create D_a.)
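The regression code is not shown in the text, so the following is only a minimal sketch of how the dummy and the two specifications might be set up. The helper `fit_rd`, the centring of age at 21, the window bounds, and the use of conventional OLS standard errors are all assumptions, and the quadratic model is assumed to interact both age and age^2 with the dummy. The data frame `toy` is simulated so the block runs on its own; it is not the rdd_cd data.

```r
# Sketch only: `fit_rd` and `toy` are illustrative names, not the original solution code.
fit_rd <- function(df, outcome, lo = 19, hi = 23) {
  # Keep the requested age window (19-22 taken as [19, 23); 20-21 as [20, 22)).
  d <- df[df$agecell >= lo & df$agecell < hi, ]
  d$age <- d$agecell - 21                 # centre age at the cutoff
  d$Da  <- as.numeric(d$agecell >= 21)    # treatment dummy: 1 if aged 21 or older

  # Linear specification: outcome ~ Da + age
  linear <- lm(reformulate(c("Da", "age"), response = outcome), data = d)
  # Quadratic specification with both age terms interacted with Da
  quadratic <- lm(
    reformulate(c("Da", "age", "I(age^2)", "Da:age", "Da:I(age^2)"),
                response = outcome),
    data = d
  )

  # Estimate and conventional OLS standard error on Da for each model
  rbind(
    linear    = summary(linear)$coefficients["Da", c("Estimate", "Std. Error")],
    quadratic = summary(quadratic)$coefficients["Da", c("Estimate", "Std. Error")]
  )
}

# Simulated stand-in for rdd_cd (NOT the real data): a true jump of 8 at age 21.
set.seed(1)
agecell <- seq(19 + 1/24, 23 - 1/24, by = 1/12)
toy <- data.frame(
  agecell = agecell,
  all = 95 - 1.5 * (agecell - 21) + 8 * (agecell >= 21) +
    rnorm(length(agecell), sd = 0.3)
)

res_19_22 <- fit_rd(toy, "all")                    # ages 19-22
res_20_21 <- fit_rd(toy, "all", lo = 20, hi = 22)  # ages 20-21
print(res_19_22)
print(res_20_21)
```

With the real data, the same call would be repeated for `"mva"` and `"suicide"`. Standard errors may differ from those in Table 4.1 if the book uses a different variance estimator.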

The coefficients of interest in the above models for ages 19-22 are as follows.

## [1] "All Deaths coefficient" "7.6627089248081"
## [1] "All Deaths Standard Error" "1.31870423659469"
## [1] "MVA coefficient"  "4.53403292645215"
## [1] "MVA Standard Error" "0.750604266756914"
## [1] "suicide coefficient" "1.79428904904453"
## [1] "Suicide Standard Error" "0.45421524685025"
## [1] "quadratic deaths coefficient" "9.54778854390339"
## [1] "All Deaths Quadratic Standard Error" "1.98527727579064"
## [1] "Quadratic MVA coefficient" "4.6628584640389"
## [1] "MVA Quadratic Standard Error" "1.15479948413707"
## [1] "Suicide Quadratic coefficient" "1.81433201213727"
## [1] "Suicide Quadratic Standard Error" "0.698860567143418"



The coefficients of interest in the above models for ages 20-21 are as follows.

## [1] "All Deaths coefficient" "9.75331060809553"
## [1] "All Deaths Standard Error" "1.93705179199469"
## [1] "MVA coefficient"  "4.75928351678797"
## [1] "MVA Standard Error" "1.12344652176188"
## [1] "suicide coefficient" "1.72442604970814"
## [1] "Suicide Standard Error" "0.709663961194574"
## [1] "quadratic deaths coefficient" "9.6110769907366"
## [1] "All Deaths Quadratic Standard Error" "2.89403192615969"
## [1] "Quadratic MVA coefficient" "5.89248895392317"
## [1] "MVA Quadratic Standard Error" "1.56127541137539"
## [1] "Suicide Quadratic coefficient" "1.29659855401229"
## [1] "Suicide Quadratic Standard Error" "1.12570283163984"




Question 2 Present all coefficients and standard errors from Question 1 in one table. You can choose any style you prefer, but if you are unsure, just replicate Table 4.1 from the MM book.
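The object `texttbl` passed to `kable` is not defined in the text. As a rough sketch, a table of that shape could be assembled as a data frame like the one below. The column names are hypothetical, and the values are simply the estimates reported under Question 1, typed in by hand. In practice they would more likely be pulled from the fitted model objects.

```r
# Sketch only: rebuilds a table shaped like `texttbl` from the reported values.
# Column names (stat, lin_19_22, ...) are hypothetical placeholders.
texttbl <- data.frame(
  stat = c("All Deaths coefficient", "All Deaths Standard Error",
           "MVA coefficient", "MVA Standard Error",
           "Suicide coefficient", "Suicide Standard Error"),
  lin_19_22  = c(7.662709, 1.318704, 4.534033, 0.7506043, 1.794289, 0.4542152),
  quad_19_22 = c(9.547789, 1.985277, 4.662858, 1.154799, 1.814332, 0.6988606),
  lin_20_21  = c(9.753311, 1.937052, 4.759284, 1.123447, 1.724426, 0.709664),
  quad_20_21 = c(9.611077, 2.894032, 5.892489, 1.561275, 1.296599, 1.125703)
)
print(texttbl)
```

The five columns line up with the five `col.names` supplied in the `kable` call.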

knitr::kable(texttbl,
      col.names = (c("Dependent Variable", "Age 19-22 Linear", "Age 19-22 Quadratic", "Age 20-21 Linear", "Age 20-21 Quadratic")))
| Dependent Variable        | Age 19-22 Linear | Age 19-22 Quadratic | Age 20-21 Linear | Age 20-21 Quadratic |
|:--------------------------|-----------------:|--------------------:|-----------------:|--------------------:|
| All Deaths coefficient    | 7.662709         | 9.547789            | 9.753311         | 9.611077            |
| All Deaths Standard Error | 1.318704         | 1.985277            | 1.937052         | 2.894032            |
| MVA coefficient           | 4.534033         | 4.662858            | 4.759284         | 5.892489            |
| MVA Standard Error        | 0.7506043        | 1.154799            | 1.123447         | 1.561275            |
| Suicide coefficient       | 1.794289         | 1.814332            | 1.724426         | 1.296599            |
| Suicide Standard Error    | 0.4542152        | 0.6988606           | 0.709664         | 1.125703            |



Question 3 Using rdd package and RDestimate command, plot individual graphs for all deaths, MVA and suicides (only for 19-22 age). You should have three separate graphs.

library(rdd)  # provides RDestimate() and its plot method

# The minimum legal drinking age is 21, so the cutoff is set at 21.
all_plot <- RDestimate(all ~ agecell, data = cd, cutpoint = 21)
plot(all_plot)
title(main = "Number of deaths regardless of cause", xlab = "Age group from 19-22", ylab = "Deaths per 100,000")

mva_plot <- RDestimate(mva ~ agecell, data = cd, cutpoint = 21)
plot(mva_plot)
title(main = "Number of deaths in motor vehicle accidents", xlab = "Age group from 19-22", ylab = "MVA deaths per 100,000")

suicide_plot <- RDestimate(suicide ~ agecell, data = cd, cutpoint = 21)
# Plot the suicide estimate itself rather than re-plotting all_plot.
plot(suicide_plot)
title(main = "Number of deaths from suicide", xlab = "Age group from 19-22", ylab = "Suicide deaths per 100,000")

Question 4 Recreate the figure from question 3 for MVA using ggplot. Briefly explain what the figure is showing in the context of the RDD model.

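Interpretation The figure plots the MVA death rate (per 100,000) against age, with a separate fitted line on each side of the age-21 cutoff. MVA deaths are higher among younger people and decline with age, but the fitted line jumps upward right at 21, the minimum legal drinking age. In the RDD framework, people just under and just over 21 should be similar in every respect except legal access to alcohol, so this jump (about 4.5 deaths per 100,000 in the linear 19-22 specification from Question 1) is interpreted as the effect of legal drinking on MVA mortality. The figure is therefore a clear indication that the drinking age affects MVA deaths.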
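The ggplot code behind the figure is not shown in the text. A minimal sketch that would produce a comparable figure is given below. The helper `plot_rd`, the separate linear fits on each side of 21, and the labels are assumptions, and `toy` is simulated data used only so the block runs on its own.

```r
library(ggplot2)

# Sketch only: `plot_rd` is a hypothetical helper; `df` needs columns agecell and mva.
plot_rd <- function(df, cutoff = 21) {
  df$over21 <- df$agecell >= cutoff   # side of the cutoff, used to fit separate lines
  ggplot(df, aes(x = agecell, y = mva)) +
    geom_point() +
    geom_smooth(aes(group = over21), method = "lm", se = FALSE) +
    geom_vline(xintercept = cutoff, linetype = "dashed") +
    labs(title = "Motor vehicle accident deaths by age",
         x = "Age", y = "MVA deaths per 100,000")
}

# Simulated stand-in for rdd_cd (NOT the real data), with a jump at age 21.
set.seed(1)
agecell <- seq(19 + 1/24, 23 - 1/24, by = 1/12)
toy <- data.frame(
  agecell = agecell,
  mva = 32 - 3 * (agecell - 21) + 4.5 * (agecell >= 21) +
    rnorm(length(agecell), sd = 0.5)
)

p <- plot_rd(toy)
print(p)
```

With the real data the call would be `plot_rd(cd)`. Grouping by `over21` is what lets the two fitted lines break at the cutoff instead of being forced into one continuous line.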

## `geom_smooth()` using formula 'y ~ x'