## Descriptive Statistics
## data
## N: 731
##
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     5.40   1976.00
##         Std.Dev     2.26     12.42
##             Min     1.24   1955.00
##          Median     5.28   1976.00
##             Max    12.35   1997.00
##         N.Valid   731.00    731.00
##       Pct.Valid   100.00    100.00
The median real per capita GDP is 5.28: in half of the observations (region-years), real per capita GDP was at or below $5.28 thousand.
## Table with Mean, SD, Min, Median, Max, N (the "common" stats; IQR is not among them)
table2 = descr(data, stats = "common")
## Descriptive Statistics
## data
## Group: CE = controlAfter
## N: 400
##
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     6.64   1985.00
##         Std.Dev     1.85      7.22
##             Min     3.05   1973.00
##          Median     6.33   1985.00
##             Max    12.35   1997.00
##         N.Valid   400.00    400.00
##       Pct.Valid   100.00    100.00
##
## Group: CE = controlBefore
## N: 288
##
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     3.48   1963.50
##         Std.Dev     1.33      5.20
##             Min     1.24   1955.00
##          Median     3.32   1963.50
##             Max     7.57   1972.00
##         N.Valid   288.00    288.00
##       Pct.Valid   100.00    100.00
##
## Group: CE = treatedAfter
## N: 25
##
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     7.77   1985.00
##         Std.Dev     1.15      7.36
##             Min     6.50   1973.00
##          Median     7.33   1985.00
##             Max    10.17   1997.00
##         N.Valid    25.00     25.00
##       Pct.Valid   100.00    100.00
##
## Group: CE = treatedBefore
## N: 18
##
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     5.10   1963.50
##         Std.Dev     0.91      5.34
##             Min     3.85   1955.00
##          Median     5.27   1963.50
##             Max     6.56   1972.00
##         N.Valid    18.00     18.00
##       Pct.Valid   100.00    100.00
## Subset
treatedBefore = subset(data, year<1973 & region == "Basque Country")
treatedAfter = subset(data, year>=1973 & region == "Basque Country")
controlBefore = subset(data, year<1973 & region != "Basque Country")
controlAfter = subset(data, year>=1973 & region != "Basque Country")
## Nested if else
data$CE = ifelse(data$year < 1973 & data$region == "Basque Country", "treatedBefore",
          ifelse(data$year >= 1973 & data$region == "Basque Country", "treatedAfter",
          ifelse(data$year < 1973 & data$region != "Basque Country", "controlBefore",
                 "controlAfter")))
table3 = stby(
data = data,
INDICES = data$CE, # by Causal Effect
FUN = descr,
stats = "common"
)
## The mean difference is -1.132917 thousand dollars of real per capita GDP.
From 1973 to 1997, real per capita GDP in the non-Basque regions was on average about $1.13 thousand lower than in the Basque Country (the difference is computed as control minus treated, hence the negative sign). Taken at face value, this comparison seems to suggest that terrorism helped the Basque Country economy.
library(dplyr)    # %>% and filter()
library(rosetta)  # meanDiff()
# Observations from 1973 onward only: control vs. treated
after = data %>% filter(CE == "controlAfter" | CE == "treatedAfter")
diff = meanDiff(after$gdpcap ~ after$CE)
meand = diff$meanDiff
cat("The mean difference is", meand, "thousand dollars of real per capita GDP.")
Possible confounding variables not accounted for include total population and education levels. The comparison also ignores the gap that existed before terrorism began: if the Basque Country's real per capita GDP was considerably higher than that of the other Spanish regions before 1973 but only slightly higher afterwards, the conclusion would reverse, and one would read the result as terrorism having hurt the economy.
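As a rough check on this point, the sketch below copies the four rounded group means of `gdpcap` from the `stby()` table above into a named vector and compares the gap between the Basque Country and the other regions in each period. The object names `means`, `gap_before` and `gap_after` are illustrative and not part of the original analysis.

```r
# Rounded group means of gdpcap (thousand dollars), copied from the stby() table
means <- c(treatedBefore = 5.10, treatedAfter = 7.77,
           controlBefore = 3.48, controlAfter = 6.64)

# Basque Country minus the other regions, within each period
gap_before <- means[["treatedBefore"]] - means[["controlBefore"]]
gap_after  <- means[["treatedAfter"]]  - means[["controlAfter"]]

cat("Gap before 1973: ", round(gap_before, 2), "\n")
cat("Gap from 1973 on:", round(gap_after, 2), "\n")
```

With these rounded means the gap is about 1.62 before 1973 and about 1.13 afterwards, so the Basque Country was already ahead before the terrorism period and its lead narrowed rather than widened.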
## The mean difference is 2.678146 thousand dollars of real per capita GDP.
Comparing the pre-terrorism period (before 1973) with the terrorism period (1973 to 1997), average real per capita GDP in the Basque Country rose by about $2.68 thousand. Again, taken at face value, terrorism seems to have aided this region's economy.
control = data %>% filter(CE == "controlAfter" | CE == "controlBefore")
treatment = data %>% filter(CE == "treatedAfter" | CE == "treatedBefore")
library(rosetta)  # meanDiff()
# Basque Country only: after minus before
before.after = meanDiff(treatment$gdpcap ~ treatment$CE)
before.after = before.after$meanDiff
cat("The mean difference is", before.after, "thousand dollars of real per capita GDP.")
Possible confounding variables not accounted for include total population and education levels, both of which could be changing over the period. If the share of people completing higher levels of education in the Basque Country was increasing, it would make sense for its economy to grow over the same period as a result of educational factors rather than ETA terrorism.
##
## Call:
## lm(formula = gdpcap ~ Post + region + did, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.8599 -0.9246 -0.1304  0.8418  3.2757
##
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)
## (Intercept)                   2.03498    0.17160  11.859  < 2e-16 ***
## Post                          3.16131    0.08342  37.898  < 2e-16 ***
## regionAragon                  1.73745    0.23279   7.464 2.46e-13 ***
## regionBaleares                3.96887    0.23279  17.049  < 2e-16 ***
## regionBasque Country          3.06133    0.30688   9.976  < 2e-16 ***
## regionCanarias                1.03578    0.23279   4.449 9.99e-06 ***
## regionCantabria               1.52703    0.23279   6.560 1.04e-10 ***
## regionCastilla Y Leon         0.65334    0.23279   2.807  0.00514 **
## regionCastilla-La Mancha      0.06569    0.23279   0.282  0.77787
## regionCataluna                3.07905    0.23279  13.227  < 2e-16 ***
## regionComunidad Valenciana    1.68169    0.23279   7.224 1.30e-12 ***
## regionExtremadura            -0.57207    0.23279  -2.457  0.01423 *
## regionGalicia                 0.31306    0.23279   1.345  0.17911
## regionMadrid                  3.55027    0.23279  15.251  < 2e-16 ***
## regionMurcia                  0.58475    0.23279   2.512  0.01223 *
## regionNavarra                 2.20325    0.23279   9.464  < 2e-16 ***
## regionPrincipado De Asturias  1.29162    0.23279   5.548 4.07e-08 ***
## regionRioja                   2.00426    0.23279   8.610  < 2e-16 ***
## did                          -0.48316    0.34394  -1.405  0.16052
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.079 on 712 degrees of freedom
## Multiple R-squared: 0.7783, Adjusted R-squared: 0.7727
## F-statistic: 138.9 on 18 and 712 DF, p-value: < 2.2e-16
## The treatment effect is -0.4832 thousand dollars of real per capita GDP.
These results show that, between the pre-terrorism and terrorism periods, real per capita GDP in the Basque Country grew by about $0.48 thousand less than in the other Spanish regions. The negative sign is consistent with ETA terrorism having hurt the economy of the Basque region, but the estimate is not statistically significant at conventional levels (p = 0.16), so this regression alone cannot rule out a zero effect.
df = data
df$Basque = ifelse(df$region == "Basque Country", 1, 0)  # treated-region indicator
df$Post = ifelse(df$year >= 1973, 1, 0)                  # terrorism-period indicator
df$did = ifelse(df$Basque == 1 & df$Post == 1, 1, 0)     # treated region in the post period
# The region dummies absorb the Basque main effect, so Basque is not entered separately
m7 = lm(gdpcap ~ Post + region + did, df)
treatment.affect = round(m7[["coefficients"]][["did"]], 4)
summary(m7)
cat("The treatment effect is", treatment.affect, "thousand dollars of real per capita GDP.")
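Because every region is observed in every year, the `did` coefficient should coincide with the simple two-by-two difference of group means. As a sanity check using only numbers reported above, take the before/after difference for the Basque Country (2.678146) and the `Post` coefficient (3.16131), which matches the change in the control-group means, 6.64 − 3.48 ≈ 3.16. The symbol $\bar{Y}$ for a group mean of `gdpcap` is notation assumed here, not taken from the original analysis.

```latex
\begin{aligned}
\hat{\delta}_{\text{DiD}}
  &= \left(\bar{Y}_{\text{treated, after}} - \bar{Y}_{\text{treated, before}}\right)
   - \left(\bar{Y}_{\text{control, after}} - \bar{Y}_{\text{control, before}}\right) \\
  &\approx 2.678 - 3.161 = -0.483
\end{aligned}
```

This agrees with the regression estimate of -0.4832, so the model is doing nothing more exotic than subtracting the control regions' growth from the Basque Country's growth.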
Difference-in-differences (DiD) is well suited to this case because it compares pre- and post-intervention outcomes over time for a treated group and a control group. It does not require the two groups to have the same level of the outcome: by differencing out time-invariant unobserved differences between the groups, as well as changes over time that are common to both, it removes bias from the post-intervention comparison, which is especially useful when randomisation is not possible.
The key assumption of the DiD model is parallel trends: in the absence of treatment, the outcome in the treated group would have followed the same trend as in the control group.
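Parallel trends cannot be tested directly, but one common informal check is to compare the pre-treatment trends of the two groups. The sketch below is not part of the original analysis: `pre_trend_gap()` is a hypothetical helper that assumes the columns `gdpcap`, `year` and `region` used above, and the `toy` data frame is made up purely to exercise the function.

```r
# Hypothetical helper: difference in linear pre-period trends (treated minus control).
# Assumes columns gdpcap, year and region, as in the data used above.
pre_trend_gap <- function(data, treated = "Basque Country", cutoff = 1973) {
  pre <- data[data$year < cutoff, ]
  pre$Basque <- as.integer(pre$region == treated)
  # The year:Basque interaction is the extra yearly growth of the treated region
  fit <- lm(gdpcap ~ year * Basque, data = pre)
  coef(fit)[["year:Basque"]]
}

# Toy data (invented, not real GDP figures): same slope in every region,
# with the treated region starting at a higher level
toy <- expand.grid(year = 1955:1972,
                   region = c("Basque Country", "Other A", "Other B"),
                   stringsAsFactors = FALSE)
toy$gdpcap <- 1 + 0.1 * (toy$year - 1955) +
  ifelse(toy$region == "Basque Country", 2, 0)

slope_gap <- pre_trend_gap(toy)  # essentially zero: the toy trends are parallel
```

On the real data the call would be `pre_trend_gap(data)`. A value close to zero would be consistent with parallel pre-trends, although it says nothing definitive about what would have happened after 1973 without treatment.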
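Of the three comparisons, DiD seems to be the best approach, as it accounts for both pre-existing differences between regions and changes over time that are common to all regions.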