Causal Inference with Difference-in-Differences and Synthetic Control Methods

Abstract: This study estimates the effect of terrorism on GDP per capita in Spanish regions between 1955 and 1997, using the dataset from Abadie and Gardeazabal (2003). We use difference-in-differences and the synthetic control method to examine the causal impact and its sensitivity to external influences post-intervention. We find the impact of terrorist conflict on the Basque Country’s economy to be a reduction in GDP per capita of about 0.85 units using difference-in-differences with Cataluna as the control region, and of 0.57 units using the synthetic control method.

License: MIT License.

knitr::opts_chunk$set(echo = TRUE)
rm(list=ls())

Workstation Setup

library(SCtools)
## Loading required package: future
library(Synth)
## ##
## ## Synth Package: Implements Synthetic Control Methods.
## ## See http://www.mit.edu/~jhainm/software.htm for additional information.
library(rvest)
## Loading required package: xml2
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0     ✓ purrr   0.3.4
## ✓ tibble  3.0.1     ✓ dplyr   0.8.5
## ✓ tidyr   1.0.3     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter()         masks stats::filter()
## x readr::guess_encoding() masks rvest::guess_encoding()
## x dplyr::lag()            masks stats::lag()
## x purrr::pluck()          masks rvest::pluck()
library(dplyr)
library(ggplot2)
library(viridis)
## Loading required package: viridisLite
library(hrbrthemes)
## NOTE: Either Arial Narrow or Roboto Condensed fonts are required to use these themes.
##       Please use hrbrthemes::import_roboto_condensed() to install Roboto Condensed and
##       if Arial Narrow is not on your system, please see https://bit.ly/arialnarrow
library(skimr)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(ggthemes)
library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer

Section 1: Get Basque Dataset

Load Basque Dataset from the Package

data(basque)

Quick Look at the Dataset

basque%>%
  glimpse()
## Rows: 774
## Columns: 17
## $ regionno              <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ regionname            <chr> "Spain (Espana)", "Spain (Espana)", "Spain (Esp…
## $ year                  <dbl> 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962,…
## $ gdpcap                <dbl> 2.354542, 2.480149, 2.603613, 2.637104, 2.66988…
## $ sec.agriculture       <dbl> NA, NA, NA, NA, NA, NA, 19.54, NA, 19.05, NA, 1…
## $ sec.energy            <dbl> NA, NA, NA, NA, NA, NA, 4.71, NA, 4.31, NA, 4.3…
## $ sec.industry          <dbl> NA, NA, NA, NA, NA, NA, 26.42, NA, 26.05, NA, 2…
## $ sec.construction      <dbl> NA, NA, NA, NA, NA, NA, 6.27, NA, 6.83, NA, 7.6…
## $ sec.services.venta    <dbl> NA, NA, NA, NA, NA, NA, 36.62, NA, 38.00, NA, 3…
## $ sec.services.nonventa <dbl> NA, NA, NA, NA, NA, NA, 6.44, NA, 5.77, NA, 6.4…
## $ school.illit          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 2863.278, 2…
## $ school.prim           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 18679.10, 1…
## $ school.med            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1064.246, 1…
## $ school.high           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 359.7457, 3…
## $ school.post.high      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 212.1434, 2…
## $ popdens               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ invest                <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 18.36018, 2…

Overview of the Dataset (details in the EDA section)

summary(basque)
##     regionno     regionname             year          gdpcap      
##  Min.   : 1.0   Length:774         Min.   :1955   Min.   : 1.243  
##  1st Qu.: 5.0   Class :character   1st Qu.:1965   1st Qu.: 3.693  
##  Median : 9.5   Mode  :character   Median :1976   Median : 5.336  
##  Mean   : 9.5                      Mean   :1976   Mean   : 5.395  
##  3rd Qu.:14.0                      3rd Qu.:1987   3rd Qu.: 6.869  
##  Max.   :18.0                      Max.   :1997   Max.   :12.350  
##                                                                   
##  sec.agriculture   sec.energy      sec.industry   sec.construction
##  Min.   : 1.32   Min.   : 1.600   Min.   : 9.56   Min.   : 4.340  
##  1st Qu.:13.54   1st Qu.: 2.697   1st Qu.:17.81   1st Qu.: 6.240  
##  Median :19.24   Median : 3.675   Median :23.14   Median : 7.135  
##  Mean   :20.27   Mean   : 5.189   Mean   :23.92   Mean   : 7.212  
##  3rd Qu.:27.48   3rd Qu.: 6.080   3rd Qu.:27.48   3rd Qu.: 8.178  
##  Max.   :46.50   Max.   :21.360   Max.   :46.22   Max.   :11.280  
##  NA's   :684     NA's   :684      NA's   :684     NA's   :684     
##  sec.services.venta sec.services.nonventa  school.illit       school.prim     
##  Min.   :26.23      Min.   : 3.430        Min.   :   8.098   Min.   :  151.3  
##  1st Qu.:31.25      1st Qu.: 5.497        1st Qu.:  40.291   1st Qu.:  432.3  
##  Median :34.75      Median : 6.680        Median : 116.232   Median :  852.1  
##  Mean   :36.49      Mean   : 6.935        Mean   : 308.051   Mean   : 2118.5  
##  3rd Qu.:39.19      3rd Qu.: 7.928        3rd Qu.: 252.270   3rd Qu.: 1763.3  
##  Max.   :58.21      Max.   :13.110        Max.   :2863.278   Max.   :19459.6  
##  NA's   :684        NA's   :684           NA's   :666        NA's   :666      
##    school.med       school.high      school.post.high     popdens      
##  Min.   :   8.61   Min.   :  3.063   Min.   :  1.660   Min.   : 22.38  
##  1st Qu.:  26.51   1st Qu.:  9.132   1st Qu.:  4.407   1st Qu.: 44.77  
##  Median :  47.75   Median : 16.696   Median :  7.713   Median : 80.38  
##  Mean   : 145.61   Mean   : 45.944   Mean   : 25.458   Mean   :105.77  
##  3rd Qu.: 119.04   3rd Qu.: 38.758   3rd Qu.: 19.059   3rd Qu.:122.57  
##  Max.   :1696.15   Max.   :474.941   Max.   :252.250   Max.   :442.45  
##  NA's   :666       NA's   :666       NA's   :666       NA's   :756     
##      invest      
##  Min.   : 9.332  
##  1st Qu.:18.742  
##  Median :21.351  
##  Mean   :21.396  
##  3rd Qu.:23.751  
##  Max.   :39.410  
##  NA's   :198

Section 2: Clean Data from Basque Dataset

This study uses the “Synth” package. Load the basque dataset from the package:

data(basque)

Data Format

  1. A panel data frame made up of 18 units: 1 treated (no. 17, the Basque Country) and 16 control regions (no. 2-16, 18).
  2. Region no. 1 is the average for the whole country of Spain.
  3. 1 outcome variable (gdpcap).
  4. 13 predictor variables (6 sectoral production shares, 5 educational attainment categories, population density, and the investment rate).
  5. Region numbers and names are stored in regionno and regionname.
  6. 43 annual time periods (1955-1997), giving 18 x 43 = 774 rows.
  7. All columns have self-explanatory names.
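
The panel dimensions can be checked with a little arithmetic. The sketch below uses only numbers reported in this notebook (18 units, years 1955 to 1997, 774 rows in the glimpse output); the variable names are illustrative, not part of the dataset.

```r
# Years are counted inclusively, so 1955-1997 spans 43 annual periods
n_years <- length(1955:1997)

# 18 units: the national average, the Basque Country, and 16 control regions
n_units <- 18

# A balanced panel has one row per unit and year
n_rows <- n_units * n_years

print(c(years = n_years, units = n_units, rows = n_rows))
```

The product matches the 774 rows reported by glimpse(), which is what a balanced panel would give.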

Clean data

Check for Missing Values

str(basque)
## 'data.frame':    774 obs. of  17 variables:
##  $ regionno             : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ regionname           : chr  "Spain (Espana)" "Spain (Espana)" "Spain (Espana)" "Spain (Espana)" ...
##  $ year                 : num  1955 1956 1957 1958 1959 ...
##  $ gdpcap               : num  2.35 2.48 2.6 2.64 2.67 ...
##  $ sec.agriculture      : num  NA NA NA NA NA ...
##  $ sec.energy           : num  NA NA NA NA NA ...
##  $ sec.industry         : num  NA NA NA NA NA ...
##  $ sec.construction     : num  NA NA NA NA NA ...
##  $ sec.services.venta   : num  NA NA NA NA NA ...
##  $ sec.services.nonventa: num  NA NA NA NA NA ...
##  $ school.illit         : num  NA NA NA NA NA ...
##  $ school.prim          : num  NA NA NA NA NA ...
##  $ school.med           : num  NA NA NA NA NA ...
##  $ school.high          : num  NA NA NA NA NA ...
##  $ school.post.high     : num  NA NA NA NA NA ...
##  $ popdens              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ invest               : num  NA NA NA NA NA ...
##  - attr(*, "datalabel")= chr ""
##  - attr(*, "time.stamp")= chr "23 Jan 2007 10:13"
##  - attr(*, "formats")= chr [1:17] "%8.0g" "%28s" "%9.0g" "%9.0g" ...
##  - attr(*, "types")= int [1:17] 254 28 254 254 254 254 254 254 254 254 ...
##  - attr(*, "val.labels")= chr [1:17] "" "" "" "" ...
##  - attr(*, "var.labels")= chr [1:17] "" "" "" "" ...
##  - attr(*, "version")= int -8
##  - attr(*, "label.table")=List of 17
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : NULL
# which(is.na(basque))
sum(is.na(basque))
## [1] 8388

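There are 8,388 missing values in the dataset, all of them in the predictor columns. The columns regionno, regionname, year, and gdpcap are complete.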
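
A single total hides where the gaps are. The following is a minimal sketch of a per-column count on a small made-up data frame (the values in `toy` are invented for illustration, not taken from basque); the same two calls could be applied to basque directly.

```r
# Toy data frame standing in for basque (values are made up)
toy <- data.frame(
  year    = c(1955, 1956, 1957, 1958),
  gdpcap  = c(2.35, 2.48, 2.60, 2.64),
  invest  = c(NA, NA, 18.4, 19.1),
  popdens = c(NA, NA, NA, 44.8)
)

# Number of missing values per column, largest first
na_per_column <- sort(colSums(is.na(toy)), decreasing = TRUE)
print(na_per_column)

# Share of missing cells in each column
na_share <- colMeans(is.na(toy))
print(round(na_share, 2))
```

Columns with a high share of missing values are the ones that need care when they are averaged over specific years later on.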


Section 3: EDA

Study Data: General Idea

summary(basque$year)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1955    1965    1976    1976    1987    1997

Use skim to Understand the Dataset

basque%>% 
  skim() %>% 
  kable()
| skim_type | skim_variable | n_missing | complete_rate | character.min | character.max | character.empty | character.n_unique | character.whitespace | numeric.mean | numeric.sd | numeric.p0 | numeric.p25 | numeric.p50 | numeric.p75 | numeric.p100 | numeric.hist |
|:---|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|:---|
| character | regionname | 0 | 1.0000000 | 6 | 28 | 0 | 18 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| numeric | regionno | 0 | 1.0000000 | NA | NA | NA | NA | NA | 9.500000 | 5.191482 | 1.000000 | 5.000000 | 9.500000 | 14.000000 | 18.00000 | ▇▆▇▆▇ |
| numeric | year | 0 | 1.0000000 | NA | NA | NA | NA | NA | 1976.000000 | 12.417698 | 1955.000000 | 1965.000000 | 1976.000000 | 1987.000000 | 1997.00000 | ▇▇▇▇▇ |
| numeric | gdpcap | 0 | 1.0000000 | NA | NA | NA | NA | NA | 5.394987 | 2.242637 | 1.243431 | 3.693034 | 5.335690 | 6.869091 | 12.35004 | ▅▇▇▂▁ |
| numeric | sec.agriculture | 684 | 0.1162791 | NA | NA | NA | NA | NA | 20.268445 | 10.376150 | 1.320000 | 13.542499 | 19.240000 | 27.482500 | 46.50000 | ▅▇▇▅▂ |
| numeric | sec.energy | 684 | 0.1162791 | NA | NA | NA | NA | NA | 5.188556 | 4.035458 | 1.600000 | 2.697500 | 3.675000 | 6.080000 | 21.36000 | ▇▂▁▁▁ |
| numeric | sec.industry | 684 | 0.1162791 | NA | NA | NA | NA | NA | 23.915333 | 9.281848 | 9.560000 | 17.809999 | 23.135000 | 27.480000 | 46.22000 | ▅▇▆▂▂ |
| numeric | sec.construction | 684 | 0.1162791 | NA | NA | NA | NA | NA | 7.211556 | 1.361570 | 4.340000 | 6.240000 | 7.135000 | 8.177500 | 11.28000 | ▃▇▆▅▁ |
| numeric | sec.services.venta | 684 | 0.1162791 | NA | NA | NA | NA | NA | 36.485222 | 7.261439 | 26.230000 | 31.247500 | 34.750000 | 39.192500 | 58.21000 | ▇▇▃▁▂ |
| numeric | sec.services.nonventa | 684 | 0.1162791 | NA | NA | NA | NA | NA | 6.934556 | 1.978371 | 3.430000 | 5.497500 | 6.680000 | 7.927500 | 13.11000 | ▃▇▃▂▁ |
| numeric | school.illit | 666 | 0.1395349 | NA | NA | NA | NA | NA | 308.051261 | 630.842242 | 8.097660 | 40.290497 | 116.232128 | 252.270393 | 2863.27832 | ▇▁▁▁▁ |
| numeric | school.prim | 666 | 0.1395349 | NA | NA | NA | NA | NA | 2118.524959 | 4216.779749 | 151.320892 | 432.293228 | 852.125793 | 1763.311645 | 19459.55859 | ▇▁▁▁▁ |
| numeric | school.med | 666 | 0.1395349 | NA | NA | NA | NA | NA | 145.613775 | 297.452345 | 8.609827 | 26.513481 | 47.752241 | 119.039047 | 1696.14685 | ▇▁▁▁▁ |
| numeric | school.high | 666 | 0.1395349 | NA | NA | NA | NA | NA | 45.943488 | 92.106997 | 3.063398 | 9.131781 | 16.696211 | 38.758285 | 474.94116 | ▇▁▁▁▁ |
| numeric | school.post.high | 666 | 0.1395349 | NA | NA | NA | NA | NA | 25.458404 | 51.579575 | 1.660274 | 4.407411 | 7.712549 | 19.058688 | 252.25000 | ▇▁▁▁▁ |
| numeric | popdens | 756 | 0.0232558 | NA | NA | NA | NA | NA | 105.769444 | 101.518005 | 22.379999 | 44.772499 | 80.375000 | 122.567497 | 442.45001 | ▇▂▁▁▁ |
| numeric | invest | 198 | 0.7441860 | NA | NA | NA | NA | NA | 21.395883 | 4.111414 | 9.331671 | 18.742375 | 21.350711 | 23.750510 | 39.40980 | ▁▇▇▁▁ |

Create a data frame containing only the year (1955-1997), the region name (the 17 Spanish regions plus the national average, no. 1), and GDP per capita.

Unused Variables

Delete_Varibles_Table <- c("sec.agriculture", "sec.energy" , "sec.industry" , "sec.construction" , 
                     "sec.services.venta" , "sec.services.nonventa", "school.illit", "school.prim", 
                     "school.med", "school.high", "school.post.high", "popdens","invest","regionno")

Data Frame for the Plot

Basque_Table <- basque[,!(names(basque) %in% Delete_Varibles_Table)]

Plot

Basque_Table %>%
  ggplot(aes(x = year, y = gdpcap, group = regionname, color = regionname)) +
  geom_line() +
  scale_color_viridis(discrete = TRUE) +
  ggtitle("GDP per Capita, 1955-1997, for 17 Spanish Regions and the National Average") +
  xlab("YEAR") + ylab("GDPCAP") +
  theme_ipsum() +
  # theme() goes after theme_ipsum(): a complete theme would otherwise override these settings
  theme(
    legend.position = "none",
    plot.title = element_text(size = 14)
  )

Section 4: Difference in Differences of Basque Dataset

Create a data frame containing only the year (1955-1997), the region number and name (the 17 Spanish regions plus the national average, no. 1), GDP per capita, and the investment rate.

Unused Variables

Delete_Varibles <- c("sec.agriculture", "sec.energy" , "sec.industry" , "sec.construction" , 
                     "sec.services.venta" , "sec.services.nonventa", "school.illit", "school.prim", 
                     "school.med", "school.high", "school.post.high", "popdens")

Data Frame for DiD

Basque_DD <- basque[,!(names(basque) %in% Delete_Varibles)]

Clean the Dataset

basq_Clean <- Basque_DD %>%
  mutate(post = ifelse(year > 1975, 1, 0),
         treat = ifelse(regionname == "Basque Country (Pais Vasco)", 1, 0),
         regionname = as.factor(regionname)) %>%
  filter(regionno != 1)

First Differences

basq_first_differences <- basq_Clean %>%
  filter(treat == 1)
ggplot(basq_first_differences, aes(x=year, y=gdpcap)) +
  geom_line(color = "blue") +  theme_economist() +
  geom_vline(xintercept=1975, color = "steelblue", linetype = "dashed") +
  labs(title="GDP Trend from 1955–1997 for Basque", 
       y="GDP per capita",x="Years", color = "Region") +
  annotate("text", x = 1970, y = 9, label = "Pre-period", size  =5, color = "Red") +
  annotate("text", x = 1980, y = 9, label = "Post-period", size  =5, color = "Red") 

Calculating First Differences

f_did <- lm(data = basq_first_differences, gdpcap ~ post)
stargazer(f_did, type="text")
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                               gdpcap           
## -----------------------------------------------
## post                         2.484***          
##                               (0.352)          
##                                                
## Constant                     5.382***          
##                               (0.252)          
##                                                
## -----------------------------------------------
## Observations                    43             
## R2                             0.549           
## Adjusted R2                    0.538           
## Residual Std. Error       1.153 (df = 41)      
## F Statistic           49.921*** (df = 1; 41)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
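
In a regression of the outcome on a single 0/1 indicator, the constant is the mean of the group coded 0 and the slope is the difference between the two group means. Read that way, the table above says the Basque Country's mean GDP per capita was 5.382 up to 1975 and 2.484 higher afterwards. The sketch below checks the identity on made-up numbers (the `toy` values are invented, not taken from basque). A before/after difference like this also absorbs general growth over time, which is why a control region is brought in next.

```r
# Made-up series for one region; post marks years after 1975
toy <- data.frame(
  year   = 1970:1981,
  gdpcap = c(5.0, 5.2, 5.5, 5.9, 6.1, 6.3, 6.2, 6.4, 6.5, 6.6, 6.8, 7.0)
)
toy$post <- ifelse(toy$year > 1975, 1, 0)

fit <- lm(gdpcap ~ post, data = toy)

pre_mean  <- mean(toy$gdpcap[toy$post == 0])
post_mean <- mean(toy$gdpcap[toy$post == 1])

# Intercept equals the pre-period mean; the post coefficient equals
# the post-period mean minus the pre-period mean
print(coef(fit))
print(c(pre_mean = pre_mean, difference = post_mean - pre_mean))
```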

DiD Method

basq_synth <- basq_Clean %>%
  rename(Y = gdpcap) %>%
  mutate(regionname = as.character(regionname))
ggplot(basq_synth, aes(x=year,y=Y,group=regionno)) +
  geom_line(aes(color=as.factor(treat), size=as.factor(treat))) + 
  geom_vline(xintercept=1975,linetype="dashed", color = "steelblue") + theme_classic() + 
  labs(title="GDP Trend from 1955 to 1997 for All Regions", 
       y="GDP per Capita",x="Years", color = "Treatment group") +
  scale_color_manual(labels = c("Control", "Treated"), values = c("Red", "Blue")) +
  scale_size_manual(values = c(0.5,1), guide = 'none') +
  annotate("text", x = 1970, y = 11, label = "Pre-period", size  =5, color = "Red") +
  annotate("text", x = 1980, y = 11, label = "Post-period", size  =5, color = "Red") +
  theme_economist() 

Cataluna Region as Control Region

Selection <- basq_Clean %>%
  filter(post == 0) %>%
  left_join(dplyr::select(basq_Clean[basq_Clean$post==0 & basq_Clean$treat == 1, ], gdpcap, year),
            by = c("year"= 'year')) %>%
  mutate(perc_diff = (gdpcap.y - gdpcap.x) / gdpcap.y) %>%
  group_by(regionname) %>%
  summarise(gdp_var = abs(var(perc_diff))) %>%
  arrange(gdp_var)
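
The Selection table ranks regions by the variance of their pre-period percentage gap to the Basque Country: a region whose gap stays nearly constant tracks the treated region closely, which is a rough screen for parallel trends. The sketch below reproduces the idea in base R on made-up numbers (the regions "A" and "B" and all values are hypothetical). Note that in the code above the treated region is not filtered out, so it would appear in Selection with a variance of zero.

```r
# Made-up pre-period GDP per capita for a treated region and two candidates
pre <- data.frame(
  year   = rep(1970:1974, times = 3),
  region = rep(c("Treated", "A", "B"), each = 5),
  gdpcap = c(5.0, 5.2, 5.4, 5.6, 5.8,      # Treated
             4.5, 4.68, 4.86, 5.04, 5.22,  # A: always 10% below Treated
             4.0, 4.6, 4.4, 5.3, 4.9)      # B: erratic gap
)

# Attach the treated region's value to every candidate row of the same year
treated <- pre[pre$region == "Treated", c("year", "gdpcap")]
names(treated)[2] <- "gdpcap_treated"
merged <- merge(pre[pre$region != "Treated", ], treated, by = "year")

# Percentage gap to the treated region, then its variance per candidate
merged$perc_diff <- (merged$gdpcap_treated - merged$gdpcap) / merged$gdpcap_treated
gap_var <- sort(tapply(merged$perc_diff, merged$region, var))
print(gap_var)

# The candidate with the most stable gap comes first
best <- names(gap_var)[1]
print(best)
```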

Validating the Parallel Trends Assumption

did_data <- basq_Clean %>%
  filter(regionname %in% c("Basque Country (Pais Vasco)", "Cataluna"))
ggplot(did_data, aes(x=year, y=gdpcap, group = regionname)) +
  geom_line(aes(color = regionname)) + 
  theme_economist() + 
  geom_vline(xintercept=1975, color = "steelblue", linetype = "dashed") +
  labs(title="GDP Trend from 1955-1997 for Different Regions", 
       y="GDP per Capita",x="Years", color = "Region") +
  scale_color_manual(labels = c("Basque (Treated)", "Cataluna (Control)"), values = c("Blue", "Red"))+
  annotate("text", x = 1970, y = 11, label = "Pre-period", size  =5, color = "Red") +
  annotate("text", x = 1980, y = 11, label = "Post-period", size  =5, color = "Red")

Calculating DiD

did <- lm(data = did_data, gdpcap ~ treat*post)
stargazer(did, type="text")
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                               gdpcap           
## -----------------------------------------------
## treat                          0.139           
##                               (0.376)          
##                                                
## post                         3.339***          
##                               (0.371)          
##                                                
## treat:post                    -0.855           
##                               (0.525)          
##                                                
## Constant                     5.244***          
##                               (0.266)          
##                                                
## -----------------------------------------------
## Observations                    86             
## R2                             0.607           
## Adjusted R2                    0.593           
## Residual Std. Error       1.218 (df = 82)      
## F Statistic           42.279*** (df = 3; 82)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
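
The four coefficients in the table above determine the four group means, and the treat:post coefficient is the difference of the two before/after changes. The sketch below works through that arithmetic with the reported numbers (variable names are illustrative). It also divides the estimate by its standard error: the ratio is about -1.63, consistent with the table showing no significance stars for treat:post.

```r
# Coefficients copied from the regression table above
b0      <- 5.244   # Constant: control region, pre-period
b_treat <- 0.139   # treat
b_post  <- 3.339   # post
b_int   <- -0.855  # treat:post

# Implied group means
ctrl_pre  <- b0
ctrl_post <- b0 + b_post
trt_pre   <- b0 + b_treat
trt_post  <- b0 + b_treat + b_post + b_int

# Difference-in-differences: treated change minus control change
did_est <- (trt_post - trt_pre) - (ctrl_post - ctrl_pre)
print(did_est)

# The treated change alone matches the first-differences estimate of 2.484
print(trt_post - trt_pre)

# Estimate relative to its reported standard error
t_stat <- b_int / 0.525
print(round(t_stat, 2))
```

The treated pre-period mean implied here (5.383) agrees, up to rounding, with the constant of 5.382 in the first-differences regression.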

Synthetic Control Methods

Code from Synth: An R Package for Synthetic Control Methods in Comparative Case Studies - https://cran.r-project.org/web/packages/Synth/Synth.pdf

dataprep.out <- dataprep(
  foo = basque,
  predictors = c("school.illit", "school.prim", "school.med",
                 "school.high", "school.post.high", "invest"),
  predictors.op = "mean",
  time.predictors.prior = 1964:1969,
  special.predictors = list(
    list("gdpcap", 1960:1969 ,"mean"),
    list("sec.agriculture", seq(1961, 1969, 2), "mean"),
    list("sec.energy", seq(1961, 1969, 2), "mean"),
    list("sec.industry", seq(1961, 1969, 2), "mean"),
    list("sec.construction", seq(1961, 1969, 2), "mean"),
    list("sec.services.venta", seq(1961, 1969, 2), "mean"),
    list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
    list("popdens",               1969,               "mean")),
  dependent = "gdpcap",
  unit.variable = "regionno",
  unit.names.variable = "regionname",
  time.variable = "year",
  treatment.identifier = 17,
  controls.identifier = c(2:16, 18),
  time.optimize.ssr = 1960:1969,
  time.plot = 1955:1997)

synth.out = synth(dataprep.out)
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.008864606 
## 
## solution.v:
##  0.02773094 1.194e-07 1.60609e-05 0.0007163836 1.486e-07 0.002423908 0.0587055 0.2651997 0.02851006 0.291276 0.007994382 0.004053188 0.009398579 0.303975 
## 
## solution.w:
##  2.53e-08 4.63e-08 6.44e-08 2.81e-08 3.37e-08 4.844e-07 4.2e-08 4.69e-08 0.8508145 9.75e-08 3.2e-08 5.54e-08 0.1491843 4.86e-08 9.89e-08 1.162e-07
path.plot(dataprep.res = dataprep.out, synth.res = synth.out,Xlab="Year",Ylab="GDP Per Capita")
abline(v=1975,lty=2,col="steelblue")
title("Actual vs Synthetic GDP for Basque")
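
The synthetic Basque Country is a weighted average of the control regions, with the weights given by solution.w. The sketch below copies the printed weights and, assuming they appear in the order of controls.identifier, finds which region numbers carry non-negligible weight. The small `Y0` matrix is made up purely to show how the weights combine donor outcomes. The region names behind the numbers can be read from the table returned by Synth's synth.tab().

```r
# solution.w as printed above, assumed to follow the order of controls.identifier
w <- c(2.53e-08, 4.63e-08, 6.44e-08, 2.81e-08, 3.37e-08, 4.844e-07,
       4.2e-08, 4.69e-08, 0.8508145, 9.75e-08, 3.2e-08, 5.54e-08,
       0.1491843, 4.86e-08, 9.89e-08, 1.162e-07)
controls <- c(2:16, 18)

# Region numbers with non-negligible weight
donors <- controls[w > 0.01]
print(donors)

# The weights sum to one (up to numerical precision)
print(round(sum(w), 4))

# Made-up donor outcomes: rows are years, columns are the two main donors
Y0 <- cbind(c(5.0, 5.5, 6.0), c(6.0, 6.4, 7.0))

# The synthetic series is the weighted average across donors
synthetic <- Y0 %*% w[w > 0.01]
print(synthetic)
```

Almost all of the weight falls on two regions (no. 10 and no. 14); the other fourteen contribute essentially nothing.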

Placebo Tests

placebo <- generate.placebos(dataprep.out = dataprep.out,synth.out = synth.out, strategy = "multiprocess")
## Warning: [ONE-TIME WARNING] Forked processing ('multicore') is disabled
## in future (>= 1.13.0) when running R from RStudio, because it is
## considered unstable. Because of this, plan("multicore") will fall
## back to plan("sequential"), and plan("multiprocess") will fall back to
## plan("multisession") - not plan("multicore") as in the past. For more details,
## how to control forked processing or not, and how to silence this warning in
## future R sessions, see ?future::supportsMulticore
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.001123037 
## 
## solution.v:
##  0.241915 0.004561014 0.0005982158 0.0004798842 0.001241421 0.01513177 0.3234859 0.01630921 0.02052098 0.1785447 0.00422904 0.006081587 0.09511713 0.09178419 
## 
## solution.w:
##  3.1e-09 1.9e-09 3e-09 5.1e-09 1.4e-09 3.6e-09 0.4744106 6.396e-07 0.1323778 0 0.3932067 4.3205e-06 3.9e-09 2.1e-09 2.1e-09 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.0005042449 
## 
## solution.v:
##  0.007951046 0.005383677 0.00339887 0.01137706 0.00460769 0.001229192 0.00942928 0.3711191 0.07144081 0.2706073 0.0009782796 0.2245375 0.01746217 0.0004780441 
## 
## solution.w:
##  0.01895674 0.1598039 0.04920978 0.1000861 0.01720838 0.09270212 0.002315982 0.0114126 0.007351221 0.06490908 0.008642634 0.03648222 0.01175011 0.4191682 8.657e-07 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 9.150192e-05 
## 
## solution.v:
##  0.007306654 0.03508239 0.02403886 0.04107291 0.02845283 0.01600231 0.03689203 0.2648573 0.08663512 0.003754926 0.144038 3.1e-09 0.2267502 0.08511643 
## 
## solution.w:
##  1.3e-09 0.1811191 8.428e-07 0.5491962 6.242e-07 5.4e-09 2.11e-08 0.2696795 4.32e-08 8e-10 1.24e-08 0 3.5521e-06 2.87e-08 2.73e-08 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.121049 
## 
## solution.v:
##  0.002617738 0.0002791556 2.57468e-05 0.000137047 0.00097715 0.008678213 0.5666581 0.05034707 0.2251532 9.92001e-05 0.01279458 0.02488051 0.07979401 0.0275582 
## 
## solution.w:
##  4.14567e-05 0.0002089316 1.5421e-06 0.004351484 0.0003378521 4.2789e-06 3.2254e-06 0.5733009 0.0004280817 0.0004464912 5.8471e-06 0.2508225 3.2848e-06 0.1700441 7.3e-09 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.001322934 
## 
## solution.v:
##  8.488e-07 0.01904002 0.02959735 0.03582989 0.0346101 0.1146028 0.06955046 0.01739945 0.1487795 0.1122417 0.0009692388 0.000475205 0.4166301 0.0002732522 
## 
## solution.w:
##  2.127e-07 2.18e-08 4.37e-08 0.1713031 1.071e-07 1.498e-07 4.6e-09 1.01e-08 2.41e-08 0.2497729 4.43e-08 2.86e-07 0.5789231 4.2e-09 2.8e-09 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 6.71562e-05 
## 
## solution.v:
##  0.09936656 0.03044294 0.02972727 0.05003989 0.01908395 0.07927748 0.1458313 0.2134485 0.007984265 0.03630765 0.08577594 0.194891 0.005290142 0.002533157 
## 
## solution.w:
##  7.2e-09 1.373e-07 0.2463714 0.03016985 8.263e-07 3.86e-08 6.49e-08 2.64e-08 0.5327529 5.04e-08 4.73e-08 3.2e-09 1.3424e-06 0.1076988 0.08300449 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.0003285848 
## 
## solution.v:
##  0.02044489 0.04385717 0.06860424 7.70025e-05 9.93798e-05 0.0007090437 0.02125673 0.0003438331 0.2376439 0.001942309 0.08550463 0.03828685 0.431263 0.049967 
## 
## solution.w:
##  0.2405035 0.02499768 0.06589436 7.7222e-06 4.1037e-06 1.86488e-05 5.34252e-05 5.9741e-05 1.20499e-05 0.007073272 0.2371783 0.003946094 0.4199905 0.0001347198 0.0001258654 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.004925438 
## 
## solution.v:
##  0.0640751 0.08790118 0.001980309 0.0003909784 0.001042542 0.009779076 0.06046502 0.3617145 0.1614419 0.004088867 0.002717818 0.2082107 0.002001196 0.03419076 
## 
## solution.w:
##  0.001196807 2.3356e-06 6.8792e-05 2.04e-07 3.601e-07 1.7593e-06 0.3517978 1.1187e-06 1.5301e-06 0.6414476 9.8923e-06 1.533e-07 3.784e-06 6.1336e-06 0.005461752 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.01361662 
## 
## solution.v:
##  0.001433952 0.03871014 0.1084723 0.06228383 0.009103274 0.0398204 0.04808129 0.4397184 0.04134378 0.1485366 0.005719523 0.04843611 0.00131105 0.007029396 
## 
## solution.w:
##  8.635e-07 2.1e-09 1.1e-09 1e-10 2e-10 0.3980367 1.09e-08 2.6e-09 1.07918e-05 7e-10 6.2e-09 0.6019516 7e-10 3.2e-09 2.9e-09 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.001117842 
## 
## solution.v:
##  0.0248724 5.94756e-05 0.1038749 0.001693213 0.0546291 0.1669822 0.2045931 0.0009081238 0.06845943 0.01931348 0.01832864 0.01195384 3.0517e-06 0.324329 
## 
## solution.w:
##  0.194387 7.6484e-06 1.5136e-06 0.0004885163 1.77094e-05 0.4662803 5.0585e-06 0.1050862 0.1032154 4.6847e-06 3.644e-07 0.1251891 2.04468e-05 1.11538e-05 0.005284934 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.1146391 
## 
## solution.v:
##  0.03571533 0.06494282 0.09362483 0.07491658 0.07469396 1.37939e-05 0.04935978 0.349576 0.1030824 0.08247467 0.001718186 0.0001076685 0.006958453 0.06281548 
## 
## solution.w:
##  1e-09 6.3e-09 9e-10 6.83e-08 1.731e-07 4e-09 1.5e-09 0.9999997 6e-10 2.5e-09 4.6e-09 3.1e-09 5.5e-09 1.53e-08 4.3e-08 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.0004963764 
## 
## solution.v:
##  0.0009361261 0.1333922 0.02525684 0.1227094 0.05640474 0.0004846876 0.419331 9.00574e-05 0.01196674 0.05946839 0.1680857 1.42442e-05 0.001858875 1.0247e-06 
## 
## solution.w:
##  0.286836 0.003897055 0.05920034 0.001340042 0.002269029 0.007676064 0.09404751 0.4512099 0.005251879 0.007917988 0.07104263 0.000415688 0.002069254 0.003351204 0.003475416 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.7209181 
## 
## solution.v:
##  0.05343497 0.0121308 0.08151981 0.05309881 0.06651345 0.07808911 0.261331 0.09139274 0.03624163 0.03912748 0.01813165 0.08133689 0.02373037 0.1039213 
## 
## solution.w:
##  5.929e-07 1.123e-07 1.222e-07 0.04087338 1.0595e-06 2.69e-08 1.823e-07 7.34e-08 0.9591236 1.817e-07 8.57e-08 1.869e-07 2.062e-07 1.018e-07 1.011e-07 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.001775702 
## 
## solution.v:
##  0.0001311986 0.002637929 0.1641209 0.08811159 0.004142852 0.07747617 0.3947256 0.00124808 0.04052377 0.0001841404 0.06422426 0.05199957 0.07180447 0.03866944 
## 
## solution.w:
##  1.9427e-06 1.1112e-06 0.02391781 1.25e-07 0.3687398 3.1e-09 0.01449668 0.3780162 4.957e-07 5.086e-07 0 1.8853e-06 2.81679e-05 2.478e-07 0.214795 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.0002705289 
## 
## solution.v:
##  0.01972369 0.0005686467 0.001710033 8.41935e-05 5.4099e-05 0.1506841 0.2549567 0.2465513 0.1368061 0.03261514 0.002974493 0.1309928 0.000170366 0.02210827 
## 
## solution.w:
##  0.0003172171 0.05824001 6.6413e-05 0.01134615 9.8315e-05 0.09664233 0.0001627458 9.64939e-05 0.1327023 0.0001892526 0.004009706 0.0002083849 0.001109842 0.0001280464 0.6946828 
## 
## 
## X1, X0, Z1, Z0 all come directly from dataprep object.
## 
## 
## **************** 
##  searching for synthetic control unit  
##  
## 
## **************** 
## **************** 
## **************** 
## 
## MSPE (LOSS V): 0.0004131752 
## 
## solution.v:
##  0.1276 0.2059267 0.02060637 0.1085751 0.01427578 0.002102533 0.0835427 0.009201901 0.00282947 0.3837883 0.0110102 6.9e-07 0.02223868 0.008301534 
## 
## solution.w:
##  1.38e-08 2.58e-08 1.9e-08 0 6.7e-09 1.46788e-05 2.98e-08 1.0763e-06 8.4e-09 3.43e-08 2.49e-08 4.17e-08 8.5e-09 0.1709315 0.8290525
plot_placebos(placebo)

Post/Pre MSPE Test

mspe_plot(placebo)
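
The post/pre MSPE ratio compares how far a unit drifts from its synthetic control after treatment with how well the two matched before it. A large ratio for the treated unit, relative to the placebo units, suggests the post-treatment gap is not just poor fit. The sketch below computes the ratio for one unit on made-up series; `mspe_ratio` is a hypothetical helper written for illustration, not a function from SCtools, and the treatment year is an assumption of the example.

```r
# Hypothetical helper: mean squared gap after treatment divided by
# mean squared gap before treatment
mspe_ratio <- function(actual, synthetic, year, treatment_year) {
  gap2 <- (actual - synthetic)^2
  mean(gap2[year >= treatment_year]) / mean(gap2[year < treatment_year])
}

# Made-up series: close fit before 1975, widening gap afterwards
year      <- 1970:1979
actual    <- c(5.0, 5.2, 5.4, 5.6, 5.8, 5.7, 5.8, 5.9, 6.0, 6.1)
synthetic <- c(5.1, 5.1, 5.5, 5.5, 5.9, 6.1, 6.3, 6.5, 6.7, 6.9)

ratio <- mspe_ratio(actual, synthetic, year, treatment_year = 1975)
print(ratio)
```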

Calculating the root mean squared error between actual and synthetic GDP per capita

round(sqrt(mean((dataprep.out$Y1plot - (dataprep.out$Y0plot %*% synth.out$solution.w))^2)), 2)
## [1] 0.57
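
Two properties of this number are worth keeping in mind. It is computed over every year in time.plot (1955-1997), so pre-treatment years are included, and a root mean squared error is never negative, so it carries no direction. The sketch below contrasts it with a signed mean gap restricted to post-treatment years, using made-up series; the cutoff of 1975 mirrors the post indicator used earlier and is an assumption of the example.

```r
# Made-up series: identical before treatment, actual falls behind afterwards
year      <- 1973:1978
actual    <- c(5.0, 5.2, 5.4, 5.3, 5.4, 5.5)
synthetic <- c(5.0, 5.2, 5.4, 5.8, 6.0, 6.2)

gap <- actual - synthetic

# RMSE over all years: always non-negative, diluted by the pre-period fit
rmse_all <- sqrt(mean(gap^2))
print(round(rmse_all, 2))

# Signed mean gap over post-treatment years only
mean_gap_post <- mean(gap[year > 1975])
print(round(mean_gap_post, 2))
```

Here the all-years RMSE is about 0.43 while the post-period mean gap is -0.6, so the two summaries can differ noticeably in size as well as in sign.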

Comparison of results from difference-in-differences and synthetic control methods

labels <- c("DiD","Synthetic Control")
Changes <- c(-0.85, -0.57)
Changes_df <- data.frame(labels,Changes)
names(Changes_df) <- c("Method", "Change in GDP per Capita")
kable(Changes_df)

| Method            | Change in GDP per Capita |
|:------------------|-------------------------:|
| DiD               |                    -0.85 |
| Synthetic Control |                    -0.57 |

References

Alberto Abadie, Alexis Diamond, Jens Hainmueller (2011). Synth: An R Package for Synthetic Control Methods in Comparative Case Studies. Journal of Statistical Software, 42(13), 1-17. URL http://www.jstatsoft.org/v42/i13/.

Bruno Castanho Silva and Michael DeWitt (2019). SCtools: Extensions for Synthetic Controls Analysis. R package version 0.3.0. https://CRAN.R-project.org/package=SCtools

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Hadley Wickham (2019). rvest: Easily Harvest (Scrape) Web Pages. R package version 0.3.5. https://CRAN.R-project.org/package=rvest

Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2020). dplyr: A Grammar of Data Manipulation. R package version 0.8.4. https://CRAN.R-project.org/package=dplyr

Hao Zhu (2019). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.1.0. https://CRAN.R-project.org/package=kableExtra

Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2.1. https://CRAN.R-project.org/package=stargazer

Jeffrey B. Arnold (2019). ggthemes: Extra Themes, Scales and Geoms for ‘ggplot2’. R package version 4.2.0. https://CRAN.R-project.org/package=ggthemes

Lüdecke D (2018). “sjmisc: Data and Variable Transformation Functions.” Journal of Open Source Software, 3(26), 754. doi: 10.21105/joss.00754 (URL: https://doi.org/10.21105/joss.00754).

Lüdecke D (2020). sjlabelled: Labelled Data Utility Functions (Version 1.1.3). doi: 10.5281/zenodo.1249215 (URL: https://doi.org/10.5281/zenodo.1249215), <URL: https://CRAN.R-project.org/package=sjlabelled>.

Lüdecke D (2020). sjPlot: Data Visualization for Statistics in Social Science. doi: 10.5281/zenodo.1308157 (URL: https://doi.org/10.5281/zenodo.1308157), R package version 2.8.3, <URL: https://CRAN.R-project.org/package=sjPlot>.

Simon Garnier (2018). viridis: Default Color Maps from ‘matplotlib’. R package version 0.5.1. https://CRAN.R-project.org/package=viridis

Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686