Question 1:

Answer

## Descriptive Statistics  
## data  
## N: 731  
## 
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     5.40   1976.00
##         Std.Dev     2.26     12.42
##             Min     1.24   1955.00
##          Median     5.28   1976.00
##             Max    12.35   1997.00
##         N.Valid   731.00    731.00
##       Pct.Valid   100.00    100.00

The median real per capita GDP is 5.28: in half of the region-year observations, real per capita GDP is below $5.28 thousand.

Code

## Table with Mean, SD, Min, Median, Max, N valid, Pct valid
library(summarytools)  # provides descr() and stby()
table2 = descr(data, stats = "common")
table2

Question 2:

Answer

## Descriptive Statistics  
## data  
## Group: CE = controlAfter  
## N: 400  
## 
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     6.64   1985.00
##         Std.Dev     1.85      7.22
##             Min     3.05   1973.00
##          Median     6.33   1985.00
##             Max    12.35   1997.00
##         N.Valid   400.00    400.00
##       Pct.Valid   100.00    100.00
## 
## Group: CE = controlBefore  
## N: 288  
## 
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     3.48   1963.50
##         Std.Dev     1.33      5.20
##             Min     1.24   1955.00
##          Median     3.32   1963.50
##             Max     7.57   1972.00
##         N.Valid   288.00    288.00
##       Pct.Valid   100.00    100.00
## 
## Group: CE = treatedAfter  
## N: 25  
## 
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     7.77   1985.00
##         Std.Dev     1.15      7.36
##             Min     6.50   1973.00
##          Median     7.33   1985.00
##             Max    10.17   1997.00
##         N.Valid    25.00     25.00
##       Pct.Valid   100.00    100.00
## 
## Group: CE = treatedBefore  
## N: 18  
## 
##                   gdpcap      year
## --------------- -------- ---------
##            Mean     5.10   1963.50
##         Std.Dev     0.91      5.34
##             Min     3.85   1955.00
##          Median     5.27   1963.50
##             Max     6.56   1972.00
##         N.Valid    18.00     18.00
##       Pct.Valid   100.00    100.00

Code

## Subset
treatedBefore = subset(data, year<1973 & region == "Basque Country")
treatedAfter = subset(data, year>=1973 & region == "Basque Country")
controlBefore = subset(data, year<1973 & region != "Basque Country")
controlAfter = subset(data, year>=1973 & region != "Basque Country")

## Nested if else
data$CE = ifelse(data$year<1973 & data$region == "Basque Country", "treatedBefore",
               ifelse(data$year>=1973 & data$region == "Basque Country", "treatedAfter",
                      ifelse(data$year<1973 & data$region != "Basque Country", "controlBefore", "controlAfter"
                             )))


table3 = stby(
  data = data,
  INDICES = data$CE, # by Causal Effect
  FUN = descr,
  stats = "common"
)

Question 3:

Answer

## The mean difference is -1.132917 thousand dollars of real per capita GDP.

Between 1973 and 1997, real per capita GDP in the Basque Country was on average about $1.13 thousand higher than in the other regions (the difference is negative because it is computed as control minus treated). Taken at face value, this comparison suggests that terrorism helped the Basque Country economy.

Code

library(dplyr)    # provides %>% and filter()
library(rosetta)  # used for meanDiff()
## Keep only the observations from 1973 onwards
after = data %>% filter(CE == "controlAfter" | CE == "treatedAfter")

diff = meanDiff(after$gdpcap ~ after$CE)

meand = (diff$meanDiff)

cat("The mean difference is", meand, "thousand dollars of real per capita GDP.")

Question 4:

Some possible confounding variables not accounted for are total population and levels of education. The comparison also ignores where the regions started: if the Basque Country economy was considerably stronger than the rest of Spain before terrorism but only slightly stronger afterwards, the analysis would read differently. In fact, one would conclude that terrorism hurt the economy.
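
The group means reported under Question 2 give a rough sense of whether this is the case. A minimal sketch in R, using only those four rounded means (the variable names are illustrative and do not appear in the original code):

```r
# Group means of gdpcap copied from the Question 2 tables (rounded, $ thousand)
mean_treated_before <- 5.10
mean_treated_after  <- 7.77
mean_control_before <- 3.48
mean_control_after  <- 6.64

# Gap between the Basque Country and the other regions in each period
gap_before <- mean_treated_before - mean_control_before
gap_after  <- mean_treated_after  - mean_control_after

cat("Gap before 1973: ", gap_before, "\n")
cat("Gap from 1973 on:", gap_after, "\n")
```

With these rounded means the Basque lead is about 1.62 before 1973 and about 1.13 afterwards, so most of the gap measured in Question 3 already existed before the terrorism period.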

Question 5:

Answer

## The mean difference is 2.678146 thousand dollars of real per capita GDP.

Comparing the pre-terrorism period (before 1973) with the period during terrorism (1973-1997), average real per capita GDP in the Basque Country rose by about $2.68 thousand. Again, taken at face value, terrorism seems to have helped this region's economy.

Code

library(dplyr)    # provides %>% and filter()
library(rosetta)  # used for meanDiff()
control = data %>% filter(CE == "controlAfter" | CE == "controlBefore")
treatment = data %>% filter(CE == "treatedAfter" | CE == "treatedBefore")
## Before/after difference within the Basque Country only
before.after = meanDiff(treatment$gdpcap ~ treatment$CE)
before.after = before.after$meanDiff

cat("The mean difference is", before.after, "thousand dollars of real per capita GDP.")

Question 6:

Some possible confounding variables not accounted for are total population and levels of education, which could be changing over the period. If the share of people completing higher levels of education in the Basque Country was increasing, it would make sense for its economy to grow during this period as well. That growth would be a result of educational factors, not ETA terrorism.
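
One quick way to probe the before-after comparison is to repeat it for the regions that did not experience terrorism. The sketch below uses only the rounded group means from the Question 2 tables; the variable names are illustrative.

```r
# Before/after change in mean gdpcap, from the rounded Question 2 means ($ thousand)
change_treated <- 7.77 - 5.10   # Basque Country
change_control <- 6.64 - 3.48   # all other regions

cat("Change in the Basque Country:", change_treated, "\n")
cat("Change in other regions:     ", change_control, "\n")
```

The other regions grew by about 3.16 against about 2.67 for the Basque Country, which suggests that much of the before-after increase reflects growth common to all of Spain rather than anything specific to the Basque Country.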

Question 7:

Answer

## 
## Call:
## lm(formula = gdpcap ~ Post + region + did, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8599 -0.9246 -0.1304  0.8418  3.2757 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   2.03498    0.17160  11.859  < 2e-16 ***
## Post                          3.16131    0.08342  37.898  < 2e-16 ***
## regionAragon                  1.73745    0.23279   7.464 2.46e-13 ***
## regionBaleares                3.96887    0.23279  17.049  < 2e-16 ***
## regionBasque Country          3.06133    0.30688   9.976  < 2e-16 ***
## regionCanarias                1.03578    0.23279   4.449 9.99e-06 ***
## regionCantabria               1.52703    0.23279   6.560 1.04e-10 ***
## regionCastilla Y Leon         0.65334    0.23279   2.807  0.00514 ** 
## regionCastilla-La Mancha      0.06569    0.23279   0.282  0.77787    
## regionCataluna                3.07905    0.23279  13.227  < 2e-16 ***
## regionComunidad Valenciana    1.68169    0.23279   7.224 1.30e-12 ***
## regionExtremadura            -0.57207    0.23279  -2.457  0.01423 *  
## regionGalicia                 0.31306    0.23279   1.345  0.17911    
## regionMadrid                  3.55027    0.23279  15.251  < 2e-16 ***
## regionMurcia                  0.58475    0.23279   2.512  0.01223 *  
## regionNavarra                 2.20325    0.23279   9.464  < 2e-16 ***
## regionPrincipado De Asturias  1.29162    0.23279   5.548 4.07e-08 ***
## regionRioja                   2.00426    0.23279   8.610  < 2e-16 ***
## did                          -0.48316    0.34394  -1.405  0.16052    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.079 on 712 degrees of freedom
## Multiple R-squared:  0.7783, Adjusted R-squared:  0.7727 
## F-statistic: 138.9 on 18 and 712 DF,  p-value: < 2.2e-16
## The treatment effect is -0.4832 thousand dollars of real per capita GDP.

The difference-in-differences estimate shows that, between the pre-terrorism period and the terrorism period, real per capita GDP in the Basque Country grew about $0.48 thousand less than in the other Spanish regions. The negative sign suggests that ETA terrorism hurt the economy of the Basque region, although with a p-value of 0.16 the estimate is not statistically significant at conventional levels.

Code

df = data
## Treatment-group, post-period and interaction dummies
df$Basque = ifelse(df$region == "Basque Country", 1, 0)
df$Post = ifelse(df$year >= 1973, 1, 0)
df$did = ifelse(df$Basque == 1 & df$Post == 1, 1, 0)

## Region dummies absorb the Basque indicator; the coefficient on did is the DiD estimate
m7 = lm(gdpcap ~ Post + region + did, df)
treatment.effect = round(m7[["coefficients"]][["did"]], 4)

summary(m7)

cat("The treatment effect is", treatment.effect, "thousand dollars of real per capita GDP.")
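
As a cross-check on the regression, the difference-in-differences estimate can be approximated from the four group means in Question 2. The sketch below fits the same kind of interaction model to those four rounded means only. The data frame `cells` and its columns are illustrative, and a single `Basque` dummy stands in for the region dummies used in the full model.

```r
# One row per group, with gdpcap set to the rounded Question 2 group mean
cells <- data.frame(
  Basque = c(1, 1, 0, 0),
  Post   = c(0, 1, 0, 1),
  gdpcap = c(5.10, 7.77, 3.48, 6.64)
)
cells$did <- cells$Basque * cells$Post

# Saturated 2x2 model: the coefficient on did is the difference in differences
m_cells <- lm(gdpcap ~ Post + Basque + did, data = cells)
did_lm  <- coef(m_cells)[["did"]]

# The same quantity computed by hand
did_by_hand <- (7.77 - 5.10) - (6.64 - 3.48)

cat("DiD from lm:", round(did_lm, 2), "\n")
cat("DiD by hand:", round(did_by_hand, 2), "\n")
```

Both come out at about -0.49, in line with the -0.4832 from the full regression; the small gap presumably comes from rounding in the reported means.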

Question 8:

Difference-in-differences suits this case because it compares outcomes before and after an intervention for both a treated group and a control group. It does not require the two groups to have the same outcome level in the absence of treatment: by differencing out unobserved differences between the groups that are constant over time, it removes that source of bias from the post-intervention comparison, which is especially useful when randomisation is not possible.

The key assumption of the D-i-D model is that of parallel trends. This means that, in the absence of treatment, the treated and control groups would have followed the same trend over time.
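
The assumption cannot be tested directly, but a common informal check is to compare the two groups' slopes in the pre-treatment period. The sketch below shows the idea on a hypothetical toy panel; the data frame `toy` and its values are invented for illustration and are not the assignment data.

```r
# Hypothetical pre-treatment panel, invented for illustration only
toy <- data.frame(
  t      = rep(0:5, times = 2),              # years since the start of the sample
  Basque = rep(c(1, 0), each = 6),           # 1 = treated region, 0 = control
  gdpcap = c(3.8, 4.0, 4.2, 4.4, 4.6, 4.8,   # treated: +0.2 per year
             2.0, 2.2, 2.4, 2.6, 2.8, 3.0)   # control: +0.2 per year
)

# The t:Basque coefficient is the difference in pre-treatment slopes
pre_trend <- lm(gdpcap ~ t * Basque, data = toy)
slope_gap <- coef(pre_trend)[["t:Basque"]]

cat("Difference in pre-treatment slopes:", round(slope_gap, 3), "\n")
```

Here both groups grow at the same rate, so the slope difference is essentially zero even though their levels differ. A slope difference far from zero in real pre-treatment data would cast doubt on the parallel trends assumption.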

Question 9:

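Of the three approaches, DiD seems to be the best, as it accounts for the most potential confounders: differences between regions that are fixed over time and changes over time that are common to all regions.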