Synthetic Difference-in-Differences

Application

Loading the California smoking cessation program data

# Loading packages
knitr::opts_chunk$set(echo = TRUE, eval=TRUE, message=FALSE, warning=FALSE, fig.height=4)
necessaryPackages <- c("foreign","reshape","rvest","tidyverse","dplyr","stringr","ggplot2","stargazer","readr","haven","Synth","devtools","SCtools","augsynth","synthdid")
new.packages <- necessaryPackages[
              !(necessaryPackages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
lapply(necessaryPackages, require, character.only = TRUE)

## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE
## 
## [[6]]
## [1] TRUE
## 
## [[7]]
## [1] TRUE
## 
## [[8]]
## [1] TRUE
## 
## [[9]]
## [1] TRUE
## 
## [[10]]
## [1] TRUE
## 
## [[11]]
## [1] TRUE
## 
## [[12]]
## [1] TRUE
## 
## [[13]]
## [1] TRUE
## 
## [[14]]
## [1] TRUE
## 
## [[15]]
## [1] TRUE

if(!require(SCtools)) devtools::install_github("bcastanho/SCtools")
if(!require(augsynth)) devtools::install_github("ebenmichael/augsynth")
if(!require(synthdid)) devtools::install_github("synth-inference/synthdid")


# Importing the dataset
data("california_prop99")
  
# Describing the dataset and setting it up as a panel
applicationdata = panel.matrices(california_prop99)
summary(applicationdata)

##    Length Class  Mode   
## Y  1209   -none- numeric
## N0    1   -none- numeric
## T0    1   -none- numeric
## W  1209   -none- numeric

  str(applicationdata)

## List of 4
##  $ Y : num [1:39, 1:31] 89.8 100.3 124.8 120 155 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:39] "Alabama" "Arkansas" "Colorado" "Connecticut" ...
##   .. ..$ : chr [1:31] "1970" "1971" "1972" "1973" ...
##  $ N0: int 38
##  $ T0: num 19
##  $ W : int [1:39, 1:31] 0 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:39] "Alabama" "Arkansas" "Colorado" "Connecticut" ...
##   .. ..$ : chr [1:31] "1970" "1971" "1972" "1973" ...

The data source is Abadie, Diamond, and Hainmueller (2010), “Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program,” Journal of the American Statistical Association 105, no. 490: 493-505.

The dataset is a panel of 39 units (states) observed over 31 time periods (years 1970 through 2000). The treated unit is California, and the control units are the remaining 38 states. There are 19 pre-treatment periods (years 1970 through 1988) and 12 post-treatment periods (years 1989 through 2000).

Goal: Estimate the causal effect of the California smoking cessation program using the Synthetic Difference-in-Differences (SDID) estimator.
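To make the idea behind the estimator concrete, the sketch below computes a weighted double difference on a small toy panel in base R. It is an illustration only: the data, the helper `weighted_did()`, and the hand-picked weights are hypothetical, and the sketch does not reproduce how the synthdid package chooses its unit weights (omega) and time weights (lambda) by optimization. With uniform weights the calculation reduces to ordinary difference-in-differences.

```r
# Toy panel (hypothetical data): 4 control units + 1 treated unit,
# 3 pre-treatment + 2 post-treatment periods, additive unit and time effects.
N0 <- 4; N1 <- 1
T0 <- 3; T1 <- 2
unit_fx  <- c(10, 20, 30, 40, 25)
time_fx  <- c(0, 1, 2, 3, 4)
true_tau <- -15

Y <- outer(unit_fx, time_fx, "+")
treated_rows <- (N0 + 1):(N0 + N1)
post_cols    <- (T0 + 1):(T0 + T1)
Y[treated_rows, post_cols] <- Y[treated_rows, post_cols] + true_tau

# Weighted double difference:
# (treated average - omega-weighted controls) contrasted across
# (post average - lambda-weighted pre periods).
weighted_did <- function(Y, N0, T0,
                         omega  = rep(1 / N0, N0),
                         lambda = rep(1 / T0, T0)) {
  N1 <- nrow(Y) - N0
  T1 <- ncol(Y) - T0
  unit_contrast <- c(-omega,  rep(1 / N1, N1))
  time_contrast <- c(-lambda, rep(1 / T1, T1))
  drop(t(unit_contrast) %*% Y %*% time_contrast)
}

# Uniform weights: plain difference-in-differences.
tau_did <- weighted_did(Y, N0, T0)

# Hand-picked (not optimized) weights that each sum to one.
tau_weighted <- weighted_did(Y, N0, T0,
                             omega  = c(0.7, 0.3, 0, 0),
                             lambda = c(0, 0.2, 0.8))

print(c(did = tau_did, weighted = tau_weighted))
```

Because the toy outcomes are exactly additive in unit and time effects, any weights that sum to one recover the same effect here. The weights matter in real data, where control units and pre-treatment periods differ in how well they track the treated unit.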

# SDID-estimated average treatment effect for the treated
tau.hat = synthdid_estimate(applicationdata$Y, applicationdata$N0, applicationdata$T0)
se = sqrt(vcov(tau.hat, method='placebo'))
sprintf('SDID Estimate: %1.2f', tau.hat) # It matches the -15.6 in the second column and second row of Table 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)

## [1] "SDID Estimate: -15.60"

sprintf('SDID Standard error: %1.2f', se) # It is close to 8.4 in the second column and third row of Table 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)

## [1] "SDID Standard error: 8.47"

sprintf('95%% CI (%1.2f, %1.2f)', tau.hat - 1.96 * se, tau.hat + 1.96 * se)

## [1] "95% CI (-32.20, 0.99)"
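The interval arithmetic can be checked by hand from the rounded values printed above. This is a minimal sketch in base R that assumes a normal approximation and uses the rounded estimate and standard error, so the last digit may differ slightly from the output computed on unrounded values.

```r
# Rounded values copied from the printed output above
tau_hat <- -15.60
se      <- 8.47

# Normal-approximation 95% interval: estimate +/- z * standard error
z  <- qnorm(0.975)
ci <- tau_hat + c(-1, 1) * z * se

# Does the interval include zero?
contains_zero <- ci[1] < 0 && ci[2] > 0

print(round(ci, 2))
print(contains_zero)
```

The upper bound lies just above zero, so under this approximation the 95% interval does not exclude a zero effect.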

plot(tau.hat) # It matches the plot in the first row and third column of Figure 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)

# Control unit weights and contribution plot
top18unitwgt = synthdid_controls(tau.hat)[1:18, , drop=FALSE]
synthdid_units_plot(tau.hat, units = rownames(top18unitwgt)) # It matches the plot in the second row and third column of Figure 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)

In progress

Replicate Full Table 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)
Replicate Full Figure 1 in Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)
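As a possible starting point for the Table 1 replication, the sketch below computes the difference-in-differences and synthetic control estimates next to the SDID estimate. It assumes the installed synthdid package exports `did_estimate()` and `sc_estimate()` with the same `(Y, N0, T0)` arguments as `synthdid_estimate()`. The object names `setup` and `estimates` are illustrative, and the remaining columns of Table 1 are not covered.

```r
library(synthdid)

# Same data preparation as above
data("california_prop99")
setup <- panel.matrices(california_prop99)

# Three estimators on the same panel
tau.did  <- did_estimate(setup$Y, setup$N0, setup$T0)       # difference-in-differences
tau.sc   <- sc_estimate(setup$Y, setup$N0, setup$T0)        # synthetic control
tau.sdid <- synthdid_estimate(setup$Y, setup$N0, setup$T0)  # synthetic DID

estimates <- list(tau.did, tau.sc, tau.sdid)
names(estimates) <- c("Diff-in-Diff", "Synthetic Control", "Synthetic Diff-in-Diff")

# Point estimates side by side
print(unlist(estimates))
```

Standard errors for each estimate could then be obtained with `vcov(..., method='placebo')`, as done for the SDID estimate earlier in this section.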

Application: California smoking cessation program

Edouard Mensah

12/31/2021
