[Components of Democracy: Does Democracy Cause Growth]

Study Preregistration form: [https://rpubs.com/SMI200381941/1050769]

Information about this replication project

Workspace setup

YAML settings

output:
  html_document:
    code_download: true
    toc: true
    toc_depth: 2
    toc_float:
      collapsed: false
      smooth_scroll: true

Global settings of R chunks

# Global options
opts_chunk$set(echo=TRUE,
               cache=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE)

Libraries

# All used libraries
library("rmarkdown")
library("knitr")
library("readr")
library("dplyr")
library("ggplot2")
library("summarytools")
library("plm")
library("sjPlot")

Versions of used packages

$rmarkdown
[1] '2.21'

$knitr
[1] '1.42'

My environment

[1] "R version 4.3.0 (2023-04-21)"

1. Introduction

The paper ‘Democracy Does Cause Growth’ (Acemoglu et al., 2014) investigates whether democracy causes economic growth, measured through GDP per capita, across countries worldwide. It uses panel data to build cross-country regression models and graphs that trace the effects of democracy over time. Its headline finding is that GDP per capita is roughly 20% higher 25 years after a democratic transition. These findings are significant and follow a long stream of economic literature. Minier (1998), for example, argues that rising levels of democracy are associated with higher growth rates, while falling levels of democracy are associated with relatively lower growth, with the effect also depending on income and literacy levels; this reinforces the claim made by Acemoglu et al. Minier's approach differs from that of the paper we replicate, which simply codes each country as democratic or non-democratic in each year, depending on its Freedom House democracy index score. Much of the literature falls into two categories. One analyses democracy as a single entity, controlling for economic, demographic, social, cultural and geographical variables. The other analyses these variables as channels, asking how they affect or are affected by democracy and how this carries through to GDP.

Tavares and Wacziarg (2001) argue that democracy raises growth by increasing human capital and reducing income inequality, while it lowers growth by raising government consumption and slowing physical capital accumulation. However, like earlier literature (Barro, 1996), they reach the opposite conclusion to our paper: overall, once these factors are controlled for, democracy has a negative effect on growth. Our selected paper builds on these earlier articles, arguing that their estimates are biased by unaccounted-for country-level characteristics, because they do not include country fixed effects (Acemoglu et al., 2014). It does a thorough job of controlling for potential biases, both at the basic snapshot level and by considering gradual lags and time effects that would affect GDP through anticipation.

A significant gap in the literature is that democracy is simplified: most studies use either a democracy dummy or a continuous scale variable. The paper tries to resolve the measurement error this may introduce by combining indices, but it only registers democracy once it has reached a substantial level, even though democracy can be present in many complex ways. As a result, important aspects of democracy may already be thriving before a major transition is recorded.

In Freese and Peterson’s (2017) table of replication types, this project is a repeatability replication: it applies the same main methods and model to separate components of democracy, with our own additional exploratory analyses, all on a different dataset. It therefore assesses not only the significance and magnitude of each component over time, but also whether their wider cumulative effect fits the results of Acemoglu et al. (2014). Analysing the components of democracy against GDP with the same controls is intended to extend the field.

The extension may yield uneven results, with some indices highly significant and others not. This must be tested further: as noted above, each index can contribute to a cumulative effect and may act on other channels that produce lagged GDP growth. Such unobserved effects will require reanalysis and extensive exploratory testing so that they are accounted for rather than simply written off.

2. Data and methods

2.1. Data

The original study used panel data assembled by combining several datasets, a choice explored and justified with reference to earlier studies. The combined data file, ‘DDCGdata_final.dta’, is openly available in the paper's supplementary materials (Acemoglu et al., 2019). It covers 175 countries from 1960 to 2010, with one observation per country per year, giving 9,384 observations and 1,117 variables. The main variable is a dummy that codes each country-year as democratic or not, based on established indices: Freedom House (Freedom House, 2020) and Polity IV (Center for Systemic Peace, 2022), with secondary indices of democracy from CGV (Cheibub, Gandhi and Vreeland, 2009) and BMR (Boix, Miller and Rosato, 2012). Figure 1 of the original paper presents a regression of GDP per capita on this dummy, controlling for an array of variables such as investment, trade, primary and secondary school enrolment, infant mortality, and financial flows.

My replication data come from the large-scale V-Dem project (V-dem.net, 2023), covering 1960 to 2020. I merged these with selected variables from the World Bank's World Development Indicators (World Bank, 2023) that correspond to variables in the original study, so that the same methods as in the original can be applied.

Loading Datasets:

V_DEM_Full <- readRDS("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/V-Dem-CY-Full+Others-v11.1.rds")

WB_data_variables <- read_csv("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/WB_data_variables.csv")

Recoding variables:

# Renaming the World Bank columns to match the V-Dem naming scheme
WB_data_variables <- WB_data_variables %>%
  rename(country_name = `Country Name`,
         year = `Time`,
         country_text_id = `Country Code`,
         tax_revenue = `Tax revenue (% of GDP) [GC.TAX.TOTL.GD.ZS]`,
         net_investment = `Net investment in nonfinancial assets (% of GDP) [GC.NFN.TOTL.GD.ZS]`,
         trade = `Trade (% of GDP) [NE.TRD.GNFS.ZS]`)

Preparing and merging the data:

str(WB_data_variables)

datafile <- WB_data_variables

# Removing rows with a missing country name or year
# (both conditions are applied together, so neither filter overwrites the other)
WB_data_variables <- datafile[!is.na(datafile$country_name) & !is.na(datafile$year), ]

# Removing years below 1960
bottomthreshold <- 1960
V_DEM_Full <- V_DEM_Full[V_DEM_Full$year >= bottomthreshold, ]

# Filtering out countries that aren't in both datasets
countryID <- unique(V_DEM_Full$country_text_id)
Filtering <- WB_data_variables[WB_data_variables$country_text_id %in% countryID, ]

# Merging them together
indicator_dataset <- inner_join(V_DEM_Full, Filtering, by = c("year", "country_text_id"))
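
Because inner_join() keeps only rows whose keys appear in both tables, it can be worth checking how many rows each source loses in the merge. The following is a minimal sketch on toy data, not the project data: vdem_toy, wb_toy and their values are hypothetical stand-ins, and only the key names year and country_text_id are taken from the code above.

```r
library(dplyr)

# Toy stand-ins for the two sources (hypothetical values, not project data)
vdem_toy <- tibble(
  country_text_id = c("AAA", "AAA", "BBB", "CCC"),
  year            = c(1960, 1961, 1960, 1960),
  v2x_polyarchy   = c(0.20, 0.25, 0.70, 0.40)
)
wb_toy <- tibble(
  country_text_id = c("AAA", "AAA", "BBB", "DDD"),
  year            = c(1960, 1961, 1961, 1960),
  trade           = c(35.1, 36.4, 80.2, 55.0)
)

keys <- c("year", "country_text_id")

# Rows that survive the merge: the key is present in both tables
merged_toy <- inner_join(vdem_toy, wb_toy, by = keys)

# Rows that the inner join drops from each side
dropped_vdem <- anti_join(vdem_toy, wb_toy, by = keys)
dropped_wb   <- anti_join(wb_toy, vdem_toy, by = keys)

nrow(merged_toy)    # country-years present in both sources
nrow(dropped_vdem)  # V-Dem country-years with no World Bank match
nrow(dropped_wb)    # World Bank country-years with no V-Dem match
```

Running the same two anti_join() calls on V_DEM_Full and Filtering would show which country-years are lost before the models are fitted.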

Subsetting the data:

condensed_data <- subset(indicator_dataset, select = c(country_name.x, country_text_id, country_id,
                                                       year, historical_date, histname,codingstart,
                                                       codingend,v2x_polyarchy, v2x_libdem, v2x_partipdem,
                                                       v2x_delibdem,v2x_egaldem,e_migdppc,tax_revenue,
                                                       net_investment, trade,e_cow_imports,
                                                       e_cow_exports,e_total_fuel_income_pc,e_miinflat,v2peprisch,
                                                       v2pesecsch,e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                                       v2cafres,e_fh_rol,e_civil_war))

Creating a Statistics Summary

First, collating the yearly averages and standard deviations of the main variables

# Creating a dataset of averages and standard deviations for our key variables
summarised_stats1 <- condensed_data %>%
  group_by(year) %>%
  summarize(mean_GDP = mean(e_migdppc, na.rm = TRUE),
            sd_GDP = sd(e_migdppc, na.rm = TRUE),
            mean_polyarchy = mean(v2x_polyarchy, na.rm = TRUE),
            sd_polyarchy = sd(v2x_polyarchy, na.rm = TRUE),
            mean_libdem = mean(v2x_libdem, na.rm = TRUE),
            sd_libdem = sd(v2x_libdem, na.rm = TRUE),
            mean_partip = mean(v2x_partipdem, na.rm = TRUE),
            sd_partip = sd(v2x_partipdem, na.rm = TRUE),
            mean_delibdem = mean(v2x_delibdem, na.rm = TRUE),
            sd_delibdem = sd(v2x_delibdem, na.rm = TRUE),
            mean_egaldem = mean(v2x_egaldem, na.rm = TRUE),
            sd_egaldem = sd(v2x_egaldem, na.rm = TRUE))

A visualisation of the mean GDP time series alongside the averages of our key components of democracy, from 1960 to 2020

# A summary plot of average GDP, and average indicators on a time series
ggplot(summarised_stats1, aes(x = year, y = mean_GDP)) +
  geom_point() +
  geom_smooth(aes(ymin = mean_GDP - sd_GDP, ymax = mean_GDP + sd_GDP), fill = "lightblue", alpha = 0.2) +
  theme_minimal() +
  scale_y_continuous(sec.axis = sec_axis(~ . / 20000, name = "Level of Democracy Indicators")) +
  geom_smooth(data = summarised_stats1, aes(y = mean_polyarchy * 20000), color = "red", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_libdem * 20000), color = "green", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_partip * 20000), color = "yellow", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_delibdem * 20000), color = "purple", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_egaldem * 20000), color = "orange", size = 1) +
  labs(title = "Figure 1. Average trend of GDP and democracy indicators over time " , x = "Years 1960 to 2020", y = "Mean GDP")
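
The plot above rescales the democracy indicators by a hard-coded factor of 20000 so that they share an axis with GDP, and sec_axis() divides by the same factor to label the right-hand axis. As a hedged sketch, the factor could instead be derived from the data. The data frame toy, the column mean_dem and the name k below are illustrative assumptions, not part of the project code.

```r
library(ggplot2)

# Toy yearly averages (hypothetical values, not project data)
toy <- data.frame(
  year     = 2000:2004,
  mean_GDP = c(10000, 12000, 13000, 15000, 16000),
  mean_dem = c(0.40, 0.50, 0.60, 0.70, 0.80)
)

# Scale factor that maps the largest indicator value onto the largest GDP value
k <- max(toy$mean_GDP, na.rm = TRUE) / max(toy$mean_dem, na.rm = TRUE)

p <- ggplot(toy, aes(x = year, y = mean_GDP)) +
  geom_line() +
  # Indicator is multiplied by k to be drawn on the GDP scale
  geom_line(aes(y = mean_dem * k), colour = "red") +
  # The secondary axis undoes the multiplication, so it reads in index units
  scale_y_continuous(sec.axis = sec_axis(~ . / k, name = "Democracy index (0 to 1)")) +
  labs(x = "Year", y = "Mean GDP per capita")

p
```

With a data-driven factor, the two axes stay consistent if the GDP range changes, for example after a different merge or year filter.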

Exploring the key variables’ statistics

# Variables for our summary table
tabledata <- subset(condensed_data, select = c(e_migdppc,v2x_polyarchy,v2x_libdem,v2x_partipdem, v2x_delibdem,v2x_egaldem,tax_revenue,net_investment, trade,e_total_fuel_income_pc,e_miinflat,v2peprisch,v2pesecsch, e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,v2cafres))
# Renaming variables in our tabledata, so they can be labelled more clearly and therefore presentable

tabledata <- tabledata %>%
  rename(
    "Secondary school enrollment" = v2pesecsch,
    "Primary school enrollment" = v2peprisch,
    "Freedom of Research/Teach" = v2cafres,
    "Rule of law" = e_wbgi_rle,
    "Political stability" = e_wbgi_pve,
    "Natural resource produc per capita" = e_total_fuel_income_pc,
    "Infant mortality rate" = e_peinfmor,
    "Population" = e_mipopula,
    "Inflation" = e_miinflat,
    "GDP per capita" = e_migdppc,
    "GDP Growth" = e_migdpgrolns)
# Converting to numerical
tabledata$tax_revenue <- as.numeric(tabledata$tax_revenue)
tabledata$net_investment <- as.numeric(tabledata$net_investment)
tabledata$trade <- as.numeric(tabledata$trade)
# Creating and printing the summary table
summary_table <- descr(tabledata, stats = c("mean", "sd", "min", "max", "Q1", "Q3"))
print(summary_table)
Descriptive Statistics  
tabledata  
N: 9941  

                Freedom of Research/Teach   GDP Growth   GDP per capita   Infant mortality rate
------------- --------------------------- ------------ ---------------- -----------------------
         Mean                        0.48         0.02         10676.77                   58.30
      Std.Dev                        1.55         0.06         13250.78                   49.54
          Min                       -3.48        -0.69             0.00                    1.50
          Max                        3.31         0.84        156299.00                  277.00
           Q1                       -0.76         0.00          1972.38                   16.90
           Q3                        1.74         0.05         14060.74                   90.00

Table: Table continues below

 

                Inflation   Natural resource produc per capita   net_investment
------------- ----------- ------------------------------------ ----------------
         Mean       45.05                               772.33             3.11
      Std.Dev      553.92                              3868.63             3.95
          Min      -21.68                                 0.00            -7.98
          Max    24410.98                             81161.85            60.76
           Q1        2.76                                 0.00             1.17
           Q3       12.83                               177.99             3.60

Table: Table continues below

 

                Political stability   Population   Primary school enrollment   Rule of law
------------- --------------------- ------------ --------------------------- -------------
         Mean                 -0.17     27335.58                       80.68         -0.14
      Std.Dev                  0.98    101357.13                       22.03          1.00
          Min                 -3.32        41.66                        4.28         -2.61
          Max                  1.76   1262645.00                      100.00          2.10
           Q1                 -0.79      1963.41                       68.95         -0.87
           Q3                  0.60     16278.40                       97.51          0.52

Table: Table continues below

 

                Secondary school enrollment   tax_revenue    trade   v2x_delibdem   v2x_egaldem
------------- ----------------------------- ------------- -------- -------------- -------------
         Mean                         46.14         16.93    74.73           0.32          0.33
      Std.Dev                         29.89          7.89    51.25           0.27          0.25
          Min                          0.07          0.00     0.02           0.00          0.01
          Max                        100.00        147.66   442.62           0.89          0.88
           Q1                         19.01         11.68    42.11           0.08          0.13
           Q3                         73.12         21.64    93.19           0.56          0.50

Table: Table continues below

 

                v2x_libdem   v2x_partipdem   v2x_polyarchy
------------- ------------ --------------- ---------------
         Mean         0.33            0.26            0.42
      Std.Dev         0.27            0.21            0.29
          Min         0.00            0.01            0.01
          Max         0.89            0.80            0.92
           Q1         0.09            0.08            0.17
           Q3         0.56            0.43            0.69

2.2. Methods

The original paper, ‘Democracy Does Cause Growth’, first estimated a linear panel model with autoregressive dynamics, meaning that lagged values of GDP per capita enter the model as predictors of its current value. This model, presented in Figure 1 on page 3 of the paper, relates the single democracy dummy to GDP per capita with fixed effects, controlling for both economic and country-level effects.

This isolates the effect of democracy, so that other factors or channels do not contribute to its estimated significance. The authors present their results through line plots and, most extensively, through coefficient tables, which are their main means of conveying the models and their interpretation. My replication follows the same approach to presenting results, building the same model from the alternative data sources described earlier: the V-Dem data files and the World Bank. The paper provides a rough list of the variables it controls for, and we have found similar variables in order to follow its model specification.
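
For reference, a linear panel model with autoregressive dynamics and fixed effects, as described above, can be written out in general form. This is a sketch with notation chosen here for illustration, not a transcription of the original paper's equation: $y_{ct}$ is (log) GDP per capita for country $c$ in year $t$, $D_{ct}$ is the democracy dummy, $p$ is the number of lags, $\alpha_c$ and $\delta_t$ are country and year fixed effects, and $\varepsilon_{ct}$ is the error term.

```latex
y_{ct} = \beta D_{ct} + \sum_{j=1}^{p} \gamma_j \, y_{c,t-j} + \alpha_c + \delta_t + \varepsilon_{ct}
```

Under this form, $\beta$ captures the immediate association between democracy and GDP per capita, while the lag coefficients $\gamma_j$ govern how that association accumulates over time; if democracy is sustained, the implied long-run effect is $\beta / (1 - \sum_{j=1}^{p} \gamma_j)$.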

I have also included some additional testing, namely a correlation matrix, because the models are built up with additional covariates. The original study did not do this, as it did not split the key independent variable into separate parts to analyse individually; its single democracy measure therefore cannot overlap with other democracy measures in its effect.

3. Results

# Prepping the panel data
paneldata <- pdata.frame(condensed_data, index = c("country_name.x", "year"))
# Creating our first GDP model
gdpmodel1 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem  + v2x_egaldem, data = paneldata, model = "random")
# Creating our 2nd GDP model
gdpmodel2 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem 
                 + v2x_egaldem + e_mipopula + e_peinfmor + v2pesecsch
                 + v2peprisch + v2cafres + e_wbgi_rle + e_wbgi_pve + e_total_fuel_income_pc
                 + e_miinflat, data = paneldata, model = "random")
# Creating a table to show our summary of both models side by side

tab_model(gdpmodel2,gdpmodel1,pred.labels = c("Intercept", "Polyarchy", "Libdem","Partip","Delibdem","Egaldem","Population",
                                              "Infant mortality rate","Secondary school enrollment", "Primary school enrollment",
                                              "Freedom of Research/Teach","Rule of law","Political Stability",
                                              "Natural resource produc per capita","Inflation"),dv.labels = c("GDP Model 2 with controls","GDP Model 1"),
          string.est = "Coefficient",
          string.ci = "Conf.Int (95%)",
          string.p = "P-Value")

                                    GDP Model 2 with controls                     GDP Model 1
Predictors                          Coefficient  Conf.Int (95%)         P-Value   Coefficient  Conf.Int (95%)         P-Value
----------------------------------- -----------  ---------------------- -------   -----------  ---------------------- -------
Intercept                               5547.40  -641.45 – 11736.25       0.079       3511.37  2191.60 – 4831.13       <0.001
Polyarchy                             -18507.41  -30798.99 – -6215.83     0.003     -21058.55  -25691.37 – -16425.73   <0.001
Libdem                                 -2336.46  -19309.27 – 14636.36     0.787     -12281.77  -18562.36 – -6001.17    <0.001
Partip                                 19031.34  3755.72 – 34306.96       0.015      10450.91  4567.26 – 16334.56       0.001
Delibdem                                5334.71  -6346.88 – 17016.31      0.369       5623.08  674.51 – 10571.66        0.026
Egaldem                                15222.36  49.87 – 30394.85         0.049      46741.77  41373.16 – 52110.38     <0.001
Population                                -0.00  -0.01 – 0.00             0.555
Infant mortality rate                    -28.52  -72.73 – 15.68           0.205
Secondary school enrollment               73.43  29.62 – 117.24           0.001
Primary school enrollment                -15.71  -64.48 – 33.07           0.527
Freedom of Research/Teach               -450.55  -1329.18 – 428.07        0.314
Rule of law                             4011.28  2396.35 – 5626.20       <0.001
Political Stability                      100.32  -821.27 – 1021.92        0.830
Natural resource produc per capita         1.00  0.60 – 1.39             <0.001
Inflation                                 -1.12  -8.19 – 5.95             0.756
Observations                        283                                           8806
R2 / R2 adjusted                    0.620 / 0.600                                 0.117 / 0.117

The original study (Acemoglu et al., 2014) posits that democracy has a positive impact on GDP per capita. In our replication, we split democracy into five key components. In the initial GDP Model 1, without controls, all five democracy variables are significant, each with a p-value below 0.05. Three of them (Partip, Delibdem and Egaldem) have a large positive association with GDP per capita, while Polyarchy and Libdem have negative coefficients, as shown above. This contrasts with ‘GDP Model 2 with controls’. Once control variables similar to those in the original study are introduced, the model explains far more of the variation, with an R-squared of 0.620 (adjusted 0.600) against 0.117 for GDP Model 1. Note, however, that Model 2 is estimated on only 283 observations, compared with 8,806 for Model 1. We also find that two of the democracy indicators, Libdem and Delibdem, are no longer significant. The controls detract from their explanatory power, because variation that would otherwise have been attributed to the indicators is now explained by the control variables: natural resources, secondary school enrolment and the rule of law are all significant controls, with p-values below 0.05. The coefficients, which convey the magnitude of each variable, tell the same story: Partip and Egaldem keep a large, significant positive effect on GDP per capita, Libdem and Delibdem do not, and Polyarchy has a significant negative effect. This divide between negative and positive effects may support the argument of Acemoglu et al. (2014) that the overall negative effect reported by Barro (1996) may be due to measurement error in democracy introducing bias.
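
The two models above are fitted with model = "random", whereas the original study is described earlier as using fixed effects and lagged GDP. As a rough sketch only, a specification closer to that description could be fitted in plm as shown below. The data are simulated, the names toy, ptoy, dem, gdp and fe_model are hypothetical, and this is not the model reported in the table.

```r
library(plm)

set.seed(1)

# Simulated balanced panel: 4 countries x 10 years (hypothetical data)
toy <- expand.grid(year = 2000:2009, country = c("A", "B", "C", "D"))
country_effect <- c(A = 0, B = 300, C = 600, D = 900)
toy$dem <- runif(nrow(toy))
toy$gdp <- 1000 +
  unname(country_effect[as.character(toy$country)]) +
  500 * toy$dem +
  rnorm(nrow(toy), sd = 50)

# Declare the panel structure: individual index first, then time index
ptoy <- pdata.frame(toy, index = c("country", "year"))

# Two-way fixed effects (country and year) with one lag of the outcome;
# lag() is applied within each country because gdp is a panel series
fe_model <- plm(gdp ~ dem + lag(gdp, 1),
                data = ptoy, model = "within", effect = "twoways")

summary(fe_model)
```

Whether fixed or random effects is the better choice for the replication data is a separate question that this sketch does not settle.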

cor_matrix <- cor(summarised_stats1)
print(cor_matrix)
                     year mean_GDP sd_GDP mean_polyarchy sd_polyarchy
year            1.0000000       NA     NA      0.9465476   -0.1590039
mean_GDP               NA        1     NA             NA           NA
sd_GDP                 NA       NA      1             NA           NA
mean_polyarchy  0.9465476       NA     NA      1.0000000   -0.2645956
sd_polyarchy   -0.1590039       NA     NA     -0.2645956    1.0000000
mean_libdem     0.9456670       NA     NA      0.9997558   -0.2496638
sd_libdem       0.5890581       NA     NA      0.4835250    0.6866777
mean_partip     0.9568560       NA     NA      0.9988278   -0.2365987
sd_partip       0.5636391       NA     NA      0.5082732    0.6856397
mean_delibdem   0.9441315       NA     NA      0.9983368   -0.2238458
sd_delibdem     0.4358199       NA     NA      0.3196771    0.8101748
mean_egaldem    0.9605432       NA     NA      0.9915081   -0.1488847
sd_egaldem      0.6540827       NA     NA      0.5656349    0.6236122
               mean_libdem sd_libdem mean_partip sd_partip mean_delibdem
year             0.9456670 0.5890581   0.9568560 0.5636391     0.9441315
mean_GDP                NA        NA          NA        NA            NA
sd_GDP                  NA        NA          NA        NA            NA
mean_polyarchy   0.9997558 0.4835250   0.9988278 0.5082732     0.9983368
sd_polyarchy    -0.2496638 0.6866777  -0.2365987 0.6856397    -0.2238458
mean_libdem      1.0000000 0.4953030   0.9988519 0.5214927     0.9989628
sd_libdem        0.4953030 1.0000000   0.5146005 0.9776622     0.5183852
mean_partip      0.9988519 0.5146005   1.0000000 0.5337261     0.9986340
sd_partip        0.5214927 0.9776622   0.5337261 1.0000000     0.5452453
mean_delibdem    0.9989628 0.5183852   0.9986340 0.5452453     1.0000000
sd_delibdem      0.3330110 0.9768306   0.3510800 0.9644126     0.3571519
mean_egaldem     0.9928429 0.5902876   0.9955469 0.6059740     0.9952056
sd_egaldem       0.5770666 0.9902062   0.5934413 0.9785067     0.5986428
               sd_delibdem mean_egaldem sd_egaldem
year             0.4358199    0.9605432  0.6540827
mean_GDP                NA           NA         NA
sd_GDP                  NA           NA         NA
mean_polyarchy   0.3196771    0.9915081  0.5656349
sd_polyarchy     0.8101748   -0.1488847  0.6236122
mean_libdem      0.3330110    0.9928429  0.5770666
sd_libdem        0.9768306    0.5902876  0.9902062
mean_partip      0.3510800    0.9955469  0.5934413
sd_partip        0.9644126    0.6059740  0.9785067
mean_delibdem    0.3571519    0.9952056  0.5986428
sd_delibdem      1.0000000    0.4338498  0.9554695
mean_egaldem     0.4338498    1.0000000  0.6639576
sd_egaldem       0.9554695    0.6639576  1.0000000

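We explore this further with a correlation matrix of the yearly averages. Unsurprisingly, the mean democracy indicators have correlation coefficients at or very close to 1 with one another, which indicates extremely high multicollinearity between the components of democracy. The GDP entries are NA because, with its default settings, cor() returns NA for any pair in which a column contains a missing value.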
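
The matrix above is computed on yearly averages and leaves the GDP entries as NA. As a hedged illustration, the sketch below shows how the use argument of cor() changes this behaviour. The data frame toy is simulated; it borrows three column names from the project for readability, but its values are hypothetical.

```r
set.seed(42)

# Simulated country-year style data (hypothetical values)
n <- 200
base <- runif(n)
toy <- data.frame(
  v2x_polyarchy = base,
  v2x_libdem    = pmin(pmax(base + rnorm(n, sd = 0.05), 0), 1),
  e_migdppc     = 1000 + 20000 * base + rnorm(n, sd = 3000)
)

# Introduce some missing GDP values, as happens in the real data
toy$e_migdppc[1:10] <- NA

# Default: any pair involving a column with missing values becomes NA
cor_default <- cor(toy)

# Pairwise: each correlation uses the rows where both columns are observed
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

print(round(cor_default, 3))
print(round(cor_pairwise, 3))
```

Applying the pairwise option to the indicator columns of condensed_data, rather than to the yearly averages, would measure how closely the components move together across individual country-years.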

4. Conclusions

In conclusion, it can be argued that no single component of democracy stands out as significant in increasing growth, measured through GDP per capita. The components appear to be highly intertwined, which is consistent with my hypothesis that the effects of these indices are cumulative and work together to produce an overall effect on growth. The original paper is therefore justified in using a simple coding of democratic or not, as the larger effects on growth occur only when most components of democracy are present to a certain degree. The significance of our key indicators also fell when we added the covariate channels through which democracy tends to increase GDP. This suggests that most of the components' influence on growth runs through, and requires, these channels rather than stemming from democracy alone: the channels themselves need to be highly significant for the full influence of democracy to be felt, as argued by Tavares and Wacziarg (2001).

Our results are similar to those of the original paper: our model follows a similar timeframe and uses broadly similar variables, several of which overlap with the original. Using roughly the same sample of countries also helps reduce measurement error on that front. Overall, this replication provides further evidence in support of the claims made in the original study, ‘Democracy Does Cause Growth’ (Acemoglu et al., 2014).

References

Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2014). Democracy Does Cause Growth. SSRN Electronic Journal, 127(1). doi:https://doi.org/10.2139/ssrn.2411791.

Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2019). Democracy Does Cause Growth. Journal of Political Economy, [online] 127(1). doi:https://doi.org/10.1086/700936.

Barro, R.J. (1996). Democracy and growth. Journal of Economic Growth, 1(1), pp.1–27. doi:https://doi.org/10.1007/bf00163340.

Boix, C., Miller, M. and Rosato, S. (2012). A Complete Data Set of Political Regimes, 1800–2007. Comparative Political Studies, 46(12), pp.1523–1554. doi:https://doi.org/10.1177/0010414012463905.

Center for Systemic Peace (2022). INSCR Data Page. [online] www.systemicpeace.org. Available at: https://www.systemicpeace.org/inscrdata.html.

Cheibub, J.A., Gandhi, J. and Vreeland, J.R. (2009). Democracy and dictatorship revisited. Public Choice, 143(1-2), pp.67–101. doi:https://doi.org/10.1007/s11127-009-9491-2.

Freedom House (2020). Freedom in the World. [online] freedomhouse.org. Available at: https://freedomhouse.org/report/freedom-world.

GovData360. (n.d.). GovData360: Revised Combined Polity Score. [online] Available at: https://govdata360.worldbank.org/indicators/h6906d31b?country=BRA&indicator=27470&viz=line_chart&years=1800 [Accessed 5 Apr. 2023].

Minier, J.A. (1998). Democracy and Growth: Alternative Approaches. Journal of Economic Growth, 3(3), pp.241–266. doi:https://doi.org/10.1023/a:1009714821770.

Tavares, J. and Wacziarg, R. (2001). How democracy affects growth. European Economic Review, 45(8), pp.1341–1378. doi:https://doi.org/10.1016/s0014-2921(00)00093-3.

V-dem.net. (2023). Home | V-Dem. [online] Available at: https://doi.org/10.23696/vdemds23.

World Bank (2023). World Bank Open Data | Data. [online] Worldbank.org. Available at: https://databank.worldbank.org/source/world-development-indicators.

Freese, J. and Peterson, D. (2017). Replication in social science. Annual Review of Sociology, 43, pp.147–165. doi: 10.1146.

Appendix

Appendix 1. My environment (full information)

# Detailed information about my environment
sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] knitr_1.42     rmarkdown_2.21

loaded via a namespace (and not attached):
 [1] digest_0.6.31    R6_2.5.1         codetools_0.2-19 fastmap_1.1.1   
 [5] xfun_0.39        cachem_1.0.8     htmltools_0.5.5  cli_3.6.1       
 [9] sass_0.4.6       jquerylib_0.1.4  compiler_4.3.0   rstudioapi_0.14 
[13] tools_4.3.0      evaluate_0.21    bslib_0.4.2      yaml_2.3.7      
[17] rlang_1.1.1      jsonlite_1.8.4  

Appendix 2. Entire R code used in the project

# Opening key libraries first
library(rmarkdown)
library(knitr)
library(readr)
library(dplyr)
library(ggplot2)
library(summarytools)
library(plm)
library(sjPlot)

### NEED TO INSTALL xQuartz from their website... https://www.xquartz.org
# Global options
opts_chunk$set(echo=TRUE,
               cache=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE)
# All used libraries
library("rmarkdown")
library("knitr")
library("readr")
library("dplyr")
library("ggplot2")
library("summarytools")
library("plm")
library("sjPlot")
# Versions of used packages
packages <- c("rmarkdown", "knitr")
names(packages) <- packages
lapply(packages, packageVersion)
# What is my R version?
version[['version.string']]

V_DEM_Full <- readRDS("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/V-Dem-CY-Full+Others-v11.1.rds")

WB_data_variables <- read_csv("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/WB_data_variables.csv")


# Renaming the World Bank columns to match the V-Dem naming scheme
WB_data_variables <- WB_data_variables %>%
  rename(country_name = `Country Name`,
         year = `Time`,
         country_text_id = `Country Code`,
         tax_revenue = `Tax revenue (% of GDP) [GC.TAX.TOTL.GD.ZS]`,
         net_investment = `Net investment in nonfinancial assets (% of GDP) [GC.NFN.TOTL.GD.ZS]`,
         trade = `Trade (% of GDP) [NE.TRD.GNFS.ZS]`)

str(WB_data_variables)

datafile <- WB_data_variables

# Removing rows with a missing country name or year
# (both conditions are applied together, so neither filter overwrites the other)
WB_data_variables <- datafile[!is.na(datafile$country_name) & !is.na(datafile$year), ]

# Removing years below 1960
bottomthreshold <- 1960
V_DEM_Full <- V_DEM_Full[V_DEM_Full$year >= bottomthreshold, ]

# Filtering out countries that aren't in both datasets
countryID <- unique(V_DEM_Full$country_text_id)
Filtering <- WB_data_variables[WB_data_variables$country_text_id %in% countryID, ]

# Merging them together
indicator_dataset <- inner_join(V_DEM_Full, Filtering, by = c("year", "country_text_id"))


condensed_data <- subset(indicator_dataset, select = c(country_name.x, country_text_id, country_id,
                                                       year, historical_date, histname,codingstart,
                                                       codingend,v2x_polyarchy, v2x_libdem, v2x_partipdem,
                                                       v2x_delibdem,v2x_egaldem,e_migdppc,tax_revenue,
                                                       net_investment, trade,e_cow_imports,
                                                       e_cow_exports,e_total_fuel_income_pc,e_miinflat,v2peprisch,
                                                       v2pesecsch,e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                                       v2cafres,e_fh_rol,e_civil_war))



# Creating a dataset of averages and standard deviations for our key variables
summarised_stats1 <- condensed_data %>%
  group_by(year) %>%
  summarize(mean_GDP = mean(e_migdppc, na.rm = TRUE),
            sd_GDP = sd(e_migdppc, na.rm = TRUE),
            mean_polyarchy = mean(v2x_polyarchy, na.rm = TRUE),
            sd_polyarchy = sd(v2x_polyarchy, na.rm = TRUE),
            mean_libdem = mean(v2x_libdem, na.rm = TRUE),
            sd_libdem = sd(v2x_libdem, na.rm = TRUE),
            mean_partip = mean(v2x_partipdem, na.rm = TRUE),
            sd_partip = sd(v2x_partipdem, na.rm = TRUE),
            mean_delibdem = mean(v2x_delibdem, na.rm = TRUE),
            sd_delibdem = sd(v2x_delibdem, na.rm = TRUE),
            mean_egaldem = mean(v2x_egaldem, na.rm = TRUE),
            sd_egaldem = sd(v2x_egaldem, na.rm = TRUE))


# A summary plot of average GDP, and average indicators on a time series
ggplot(summarised_stats1, aes(x = year, y = mean_GDP)) +
  geom_point() +
  geom_smooth(aes(ymin = mean_GDP - sd_GDP, ymax = mean_GDP + sd_GDP), fill = "lightblue", alpha = 0.2) +
  theme_minimal() +
  scale_y_continuous(sec.axis = sec_axis(~ . / 20000, name = "Level of Democracy Indicators")) +
  geom_smooth(data = summarised_stats1, aes(y = mean_polyarchy * 20000), color = "red", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_libdem * 20000), color = "green", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_partip * 20000), color = "yellow", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_delibdem * 20000), color = "purple", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_egaldem * 20000), color = "orange", size = 1) +
  labs(title = "Figure 1. Average trend of GDP and democracy indicators over time " , x = "Years 1960 to 2020", y = "Mean GDP")



# Variables for our summary table
tabledata <- subset(condensed_data, select = c(e_migdppc,v2x_polyarchy,v2x_libdem,v2x_partipdem, v2x_delibdem,v2x_egaldem,tax_revenue,net_investment, trade,e_total_fuel_income_pc,e_miinflat,v2peprisch,v2pesecsch, e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,v2cafres))


# Renaming variables in tabledata so they are labelled clearly in the summary table

tabledata <- tabledata %>%
  rename(
    "Secondary school enrollment" = v2pesecsch,
    "Primary school enrollment" = v2peprisch,
    "Freedom of Research/Teach" = v2cafres,
    "Rule of law" = e_wbgi_rle,
    "Political stability" = e_wbgi_pve,
    "Natural resource production per capita" = e_total_fuel_income_pc,
    "Infant mortality rate" = e_peinfmor,
    "Population" = e_mipopula,
    "Inflation" = e_miinflat,
    "GDP per capita" = e_migdppc,
    "GDP Growth" = e_migdpgrolns)

# Converting to numerical
tabledata$tax_revenue <- as.numeric(tabledata$tax_revenue)
tabledata$net_investment <- as.numeric(tabledata$net_investment)
tabledata$trade <- as.numeric(tabledata$trade)

# Creating and printing the summary table
summary_table <- descr(tabledata, stats = c("mean", "sd", "min", "max", "Q1", "Q3"))
print(summary_table)

# Prepping the panel data
paneldata <- pdata.frame(condensed_data, index = c("country_name.x", "year"))
# Creating our first GDP model
gdpmodel1 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem  + v2x_egaldem, data = paneldata, model = "random")

# Creating our 2nd GDP model
gdpmodel2 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem 
                 + v2x_egaldem + e_mipopula + e_peinfmor + v2pesecsch
                 + v2peprisch + v2cafres + e_wbgi_rle + e_wbgi_pve + e_total_fuel_income_pc
                 + e_miinflat, data = paneldata, model = "random")

# Creating a table to show our summary of both models side by side

tab_model(gdpmodel2, gdpmodel1,
          pred.labels = c("Intercept", "Polyarchy", "Libdem", "Partip", "Delibdem", "Egaldem", "Population",
                          "Infant mortality rate", "Secondary school enrollment", "Primary school enrollment",
                          "Freedom of Research/Teach", "Rule of law", "Political Stability",
                          "Natural resource production per capita", "Inflation"),
          dv.labels = c("GDP Model 2 with controls", "GDP Model 1"),
          string.est = "Coefficient",
          string.ci = "Conf.Int (95%)",
          string.p = "P-Value")
cor_matrix <- cor(summarised_stats1)
print(cor_matrix)
# Detailed information about my environment
sessionInfo()
V_DEM_Full <- readRDS("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/V-Dem-CY-Full+Others-v11.1.rds")

WB_data_variables <- read_csv("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/WB_data_variables.csv")

WB_data_variables <- WB_data_variables %>%
  rename(country_name = `Country Name`)

WB_data_variables <- WB_data_variables %>%
  rename(year = `Time`)

WB_data_variables <- WB_data_variables %>%
  rename(country_text_id = `Country Code`)

WB_data_variables <- WB_data_variables %>%
  rename(tax_revenue = `Tax revenue (% of GDP) [GC.TAX.TOTL.GD.ZS]`)

WB_data_variables <- WB_data_variables %>%
  rename(net_investment = `Net investment in nonfinancial assets (% of GDP) [GC.NFN.TOTL.GD.ZS]`)

WB_data_variables <- WB_data_variables %>%
  rename(trade = `Trade (% of GDP) [NE.TRD.GNFS.ZS]`)



str(WB_data_variables)

datafile <- WB_data_variables

# Removing rows with a missing country name or year in a single step
WB_data_variables <- datafile[!is.na(datafile$country_name) & !is.na(datafile$year), ]

bottomthreshold <- 1960
V_DEM_Full <- V_DEM_Full[V_DEM_Full$year >= bottomthreshold, ]

countryID <- unique(V_DEM_Full$country_text_id)
Filtering <- WB_data_variables[WB_data_variables$country_text_id %in% countryID, ]

indicator_dataset <- inner_join(V_DEM_Full, Filtering, by = c("year", "country_text_id"))

condensed_data <- subset(indicator_dataset, select = c(country_name.x, country_text_id, country_id,
                                                       year, historical_date, histname,codingstart,
                                                       codingend,v2x_polyarchy, v2x_libdem, v2x_partipdem,
                                                       v2x_delibdem,v2x_egaldem,e_migdppc,tax_revenue,
                                                       net_investment, trade,e_cow_imports,
                                                       e_cow_exports,e_total_fuel_income_pc,e_miinflat,v2peprisch,
                                                       v2pesecsch,e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                                       v2cafres,e_fh_rol,e_civil_war))


summarised_stats1 <- condensed_data %>%
  group_by(year) %>%
  summarize(mean_GDP = mean(e_migdppc, na.rm = TRUE),
            sd_GDP = sd(e_migdppc, na.rm = TRUE),
            mean_polyarchy = mean(v2x_polyarchy, na.rm = TRUE),
            sd_polyarchy = sd(v2x_polyarchy, na.rm = TRUE),
            mean_libdem = mean(v2x_libdem, na.rm = TRUE),
            sd_libdem = sd(v2x_libdem, na.rm = TRUE),
            mean_partip = mean(v2x_partipdem, na.rm = TRUE),
            sd_partip = sd(v2x_partipdem, na.rm = TRUE),
            mean_delibdem = mean(v2x_delibdem, na.rm = TRUE),
            sd_delibdem = sd(v2x_delibdem, na.rm = TRUE),
            mean_egaldem = mean(v2x_egaldem, na.rm = TRUE),
            sd_egaldem = sd(v2x_egaldem, na.rm = TRUE))

ggplot(summarised_stats1, aes(x = year, y = mean_GDP)) +
  geom_point() +
  geom_smooth(aes(ymin = mean_GDP - sd_GDP, ymax = mean_GDP + sd_GDP), fill = "lightblue", alpha = 0.2) +
  theme_minimal() +
  scale_y_continuous(sec.axis = sec_axis(~ . / 20000, name = "Level of Democracy Indicators")) +
  geom_smooth(data = summarised_stats1, aes(y = mean_polyarchy * 20000), color = "red", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_libdem * 20000), color = "green", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_partip * 20000), color = "yellow", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_delibdem * 20000), color = "purple", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_egaldem * 20000), color = "orange", size = 1) +
  labs(title = "Average trend of GDP and democracy indicators over time " , x = "Years 1960 to 2020", y = "Mean GDP")



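### NEED TO INSTALL XQuartz from their website first: https://www.xquartz.org
# Otherwise the package below won't work

# Install summarytools only if it is not already available, then load it
if (!requireNamespace("summarytools", quietly = TRUE)) {
  install.packages("summarytools")
}
library(summarytools)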

tabledata <- subset(condensed_data, select = c(e_migdppc,v2x_polyarchy,v2x_libdem,v2x_partipdem,
                                               v2x_delibdem,v2x_egaldem,tax_revenue,net_investment,
                                               trade,e_total_fuel_income_pc,e_miinflat,v2peprisch,v2pesecsch,
                                               e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                               v2cafres))

tabledata <- tabledata %>%
  rename(
    "Secondary school enrollment" = v2pesecsch,
    "Primary school enrollment" = v2peprisch,
    "Freedom of Research/Teach" = v2cafres,
    "Rule of law" = e_wbgi_rle,
    "Political stability" = e_wbgi_pve,
    "Natural resource production per capita" = e_total_fuel_income_pc,
    "Infant mortality rate" = e_peinfmor,
    "Population" = e_mipopula,
    "Inflation" = e_miinflat,
    "GDP per capita" = e_migdppc,
    "GDP Growth" = e_migdpgrolns)


tabledata$tax_revenue <- as.numeric(tabledata$tax_revenue)
tabledata$net_investment <- as.numeric(tabledata$net_investment)
tabledata$trade <- as.numeric(tabledata$trade)


# Generate summary statistics table
summary_table <- descr(tabledata, stats = c("mean", "sd", "min", "max", "Q1", "Q3"))

# Print the summary statistics table
print(summary_table)

# Prepping the panel data
paneldata <- pdata.frame(condensed_data, index = c("country_name.x", "year"))



# Creating our first GDP model
gdpmodel1 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem  + v2x_egaldem, data = paneldata, model = "random")




# Creating our 2nd GDP model
gdpmodel2 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem 
                 + v2x_egaldem + e_mipopula + e_peinfmor + v2pesecsch
                 + v2peprisch + v2cafres + e_wbgi_rle + e_wbgi_pve + e_total_fuel_income_pc
                 + e_miinflat, data = paneldata, model = "random")




# Creating a table to show our summary of both models side by side

tab_model(gdpmodel2, gdpmodel1,
          pred.labels = c("Intercept", "Polyarchy", "Libdem", "Partip", "Delibdem", "Egaldem", "Population",
                          "Infant mortality rate", "Secondary school enrollment", "Primary school enrollment",
                          "Freedom of Research/Teach", "Rule of law", "Political Stability",
                          "Natural resource production per capita", "Inflation"),
          dv.labels = c("GDP Model 2 with controls", "GDP Model 1"),
          string.est = "Coefficient",
          string.ci = "Conf.Int (95%)",
          string.p = "P-Value")

cor_matrix <- cor(summarised_stats1)
print(cor_matrix)
---
title: "SMI205 Replication Project (2023)"
author: '200381941'
date: "22/05/2023"
output:
  html_document:
    code_download: yes
    toc: yes
    toc_depth: 2
    toc_float:
      collapsed: no
      smooth_scroll: yes
  word_document:
    toc: yes
    toc_depth: '2'
---

```{r start, include=FALSE}
# Opening key libraries first
library(rmarkdown)
library(knitr)
library(readr)
library(dplyr)
library(ggplot2)
library(summarytools)
library(plm)
library(sjPlot)

### NEED TO INSTALL xQuartz from their website... https://www.xquartz.org
```

# [Components of Democracy: Does Democracy Cause Growth]

### Rpubs link: [https://rpubs.com/SMI200381941/1050771]

### Study Preregistration form: [https://rpubs.com/SMI200381941/1050769]

## Information about this replication project

-   Replication project based on the paper: Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2014). Democracy Does Cause Growth. SSRN Electronic Journal, 127(1). <https://doi.org/10.2139/ssrn.2411791>. <https://www.journals.uchicago.edu/doi/10.1086/700936#tb4>
-   Replication method: own replication following the methods section of the paper

## Workspace setup {.tabset .tabset-pills}

### YAML settings

output: </br>   html_document: </br>    code_download: true </br>     toc: true </br>     toc_depth: 2 </br>     toc_float: </br>      collapsed: false </br>      smooth_scroll: true </br>

### Global settings of R chunks

```{r setup, include=TRUE}
# Global options
opts_chunk$set(echo=TRUE,
	             cache=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE)
```

### Libraries

```{r libraries, include=TRUE}
# All used libraries
library("rmarkdown")
library("knitr")
library("readr")
library("dplyr")
library("ggplot2")
library("summarytools")
library("plm")
library("sjPlot")
```

### Versions of used packages

```{r versions, echo=FALSE}
# Versions of used packages
packages <- c("rmarkdown", "knitr")
names(packages) <- packages
lapply(packages, packageVersion)
```

### My environment

```{r myR, echo=FALSE}
# What is my R version?
version[['version.string']]
```

## 1. Introduction


The paper 'Democracy Does Cause Growth' (Acemoglu et al., 2014) examines whether democracy causes economic growth worldwide, with growth measured through GDP per capita. It uses panel data to build cross-country regression models and graphs that trace the effects of democracy over time. Its overarching finding is that GDP per capita is around 20% higher 25 years after a democratic transition. These findings are significant and follow a long stream of economic literature. Minier (1998), for example, argues that countries whose level of democracy increases grow faster than comparable countries, whereas countries whose democracy decreases grow relatively more slowly, an effect shaped in conjunction by income and literacy levels. This reinforces the claim of Acemoglu et al. Minier's approach differs from that of the paper we replicate, which codes each country-year simply as democratic or non-democratic depending on the Freedom House democracy index score. Much of the literature falls into two categories: studies that analyse democracy as a single entity while controlling for economic, demographic, social, cultural and geographical variables, and studies that treat these variables as channels, asking how they affect or are affected by democracy and how this extends to GDP.

Tavares and Wacziarg (2001) argue that democracy increases growth by raising human capital and lowering income inequality, whilst it reduces growth by raising government consumption and slowing physical capital accumulation. However, like earlier literature (Barro, 1996), they reach the opposite conclusion to our paper: overall, once these factors are controlled for, democracy has a negative effect on growth. Our selected paper builds on these past articles, arguing that they are biased because they leave country-level characteristics unaccounted for by not including country fixed effects (Acemoglu et al., 2014). It does a great job of controlling for potential biases, both at the basic snapshot level and by considering gradual lags and time effects that would affect GDP through anticipation.

A significant gap in the literature is that democracy is simplified: most studies use either a democracy dummy variable or a continuous scale variable. The paper tries to resolve the possible measurement error this poses by combining indices, but it only acknowledges democracy once it has reached a significant level, even though democracy is present in a complexity of ways. This allows prevalent aspects of democracy to be thriving before a major transition is recorded.

In Freese and Peterson's (2017) table of replication types, this project is a repeatability replication: it applies the same main methods and model to separate democracy factors, with our own additional exploratory analyses, all on a different data set. It therefore assesses not only the significance and magnitude of each factor individually over time, but also whether their wider cumulative effect fits the results of Acemoglu et al. (2014). Analysing factors of democracy against GDP with the same controls furthers the topic field.

The extension may yield an unbalanced pattern of significance, with some indices being highly significant whilst others are not. However, this must be tested further. As noted above, each index can contribute to a cumulative effect and may act on other channels that result in lagged GDP growth. Such unobserved effects will require reanalysis and extensive exploratory testing to ensure they are accounted for rather than simply written off.

## 2. Data and methods

### 2.1. Data

The original study used panel data compiled by combining multiple datasets, a choice explored and justified with reference to earlier studies. The combined data file, 'DDCGdata_final.dta', is openly available in the supplementary materials of the paper (Acemoglu et al., 2019). It contains 175 countries from 1960 to 2010, with one observation per country per year, totalling 9384 observations and 1117 variables. The main dummy variable is a dichotomous measure, democratic or not, based on 3 reliable indexes: Freedom House (Freedom House, 2020), Polity IV (Center for Systemic Peace, 2022), and secondary indexes of democracy from CGV (Cheibub, Gandhi and Vreeland, 2009) and BMR (Boix, Miller and Rosato, 2012). Figure 1 shows a regression of GDP per capita on this variable, controlling for an array of x variables such as investment, trade, secondary and primary school enrolment, infant mortality, and financial flows.
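
As an illustration of what a dichotomous coding involves, the sketch below turns two index scores into a single democracy dummy. It is a minimal sketch on made-up numbers: the column names `index_a` and `index_b`, the 0 to 1 scale and the `cutoff` of 0.5 are all assumptions chosen for illustration, and this is not the coding rule used by Acemoglu et al.

```r
# Hypothetical index scores on a 0 to 1 scale (made-up values, not real data)
toy_scores <- data.frame(
  country_text_id = c("AAA", "BBB", "CCC", "DDD"),
  index_a = c(0.82, 0.35, 0.61, 0.10),
  index_b = c(0.75, 0.55, 0.40, 0.20)
)

# Assumed threshold: a country counts as democratic only if BOTH indices exceed it
cutoff <- 0.5

toy_scores$democracy_dummy <- as.integer(
  toy_scores$index_a > cutoff & toy_scores$index_b > cutoff
)

toy_scores
```

In this toy example a country that clears the cutoff on only one index is coded 0, so any partial presence of democracy is lost in the binary measure. That loss of detail is what analysing separate indices is meant to address.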

My replication data is the large-scale V-Dem project data (V-dem.net, 2023), restricted to 1960-2020. I merge it with selected variables from the World Bank's World Development Indicators (World Bank, 2023) that correspond to the original study's variables, in order to apply the same methods presented in the original.



**Loading Datasets:**

```{r results='hide'}

V_DEM_Full <- readRDS("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/V-Dem-CY-Full+Others-v11.1.rds")

WB_data_variables <- read_csv("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/WB_data_variables.csv")

```

**Recoding variables:**

```{r results='hide'}

# Renaming the World Bank columns in a single call
WB_data_variables <- WB_data_variables %>%
  rename(
    country_name = `Country Name`,
    year = `Time`,
    country_text_id = `Country Code`,
    tax_revenue = `Tax revenue (% of GDP) [GC.TAX.TOTL.GD.ZS]`,
    net_investment = `Net investment in nonfinancial assets (% of GDP) [GC.NFN.TOTL.GD.ZS]`,
    trade = `Trade (% of GDP) [NE.TRD.GNFS.ZS]`
  )
```

**Preparing and merging the data:**

```{r results='hide'}

str(WB_data_variables)

datafile <- WB_data_variables

# Removing rows with a missing country name or year
# (both conditions are applied together so the second filter does not undo the first)
WB_data_variables <- datafile[!is.na(datafile$country_name) & !is.na(datafile$year), ]

# Removing years below 1960
bottomthreshold <- 1960
V_DEM_Full <- V_DEM_Full[V_DEM_Full$year >= bottomthreshold, ]

# Filtering out countries that aren't in both datasets
countryID <- unique(V_DEM_Full$country_text_id)
Filtering <- WB_data_variables[WB_data_variables$country_text_id %in% countryID, ]

# Merging them together
indicator_dataset <- inner_join(V_DEM_Full, Filtering, by = c("year", "country_text_id"))

```
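
The preparation and merge steps can be illustrated on toy data. This is a sketch rather than the project data: `vdem_toy` and `wb_toy` are hypothetical stand-ins with made-up values. It shows rows with a missing key being dropped, the `inner_join()` on year and country code, and an `anti_join()` check (an addition that is not part of the original workflow) listing country-years that have no World Bank match.

```r
library(dplyr)

# Toy stand-ins for the two sources (hypothetical values, not real data)
vdem_toy <- tibble(
  country_text_id = c("AAA", "AAA", "BBB", "CCC"),
  year            = c(1960, 1961, 1960, 1960),
  v2x_polyarchy   = c(0.20, 0.25, 0.80, 0.50)
)
wb_toy <- tibble(
  country_text_id = c("AAA", "AAA", "BBB", NA),
  year            = c(1960, 1961, NA, 1960),
  trade           = c(30, 32, 55, 40)
)

# Drop rows missing either key; both conditions apply to the same data frame
wb_clean <- wb_toy %>%
  filter(!is.na(country_text_id), !is.na(year))

# Keep only country-years present in both sources
merged_toy <- inner_join(vdem_toy, wb_clean, by = c("year", "country_text_id"))

# Country-years in the V-Dem stand-in with no World Bank match
unmatched_toy <- anti_join(vdem_toy, wb_clean, by = c("year", "country_text_id"))

merged_toy
unmatched_toy
```

Inspecting the unmatched rows is one way to see how many country-years an inner join silently discards before any model is fitted.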

**Subsetting the data:**

```{r results='hide'}

condensed_data <- subset(indicator_dataset, select = c(country_name.x, country_text_id, country_id,
                                                       year, historical_date, histname,codingstart,
                                                       codingend,v2x_polyarchy, v2x_libdem, v2x_partipdem,
                                                       v2x_delibdem,v2x_egaldem,e_migdppc,tax_revenue,
                                                       net_investment, trade,e_cow_imports,
                                                       e_cow_exports,e_total_fuel_income_pc,e_miinflat,v2peprisch,
                                                       v2pesecsch,e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                                       v2cafres,e_fh_rol,e_civil_war))


```

**Creating a statistics summary:**

First, we collate yearly averages and standard deviations of the main variables.

```{r results='hide'}

# Creating a dataset of yearly averages and standard deviations for our key variables
summarised_stats1 <- condensed_data %>%
  group_by(year) %>%
  summarize(mean_GDP = mean(e_migdppc, na.rm = TRUE),
            sd_GDP = sd(e_migdppc, na.rm = TRUE),
            mean_polyarchy = mean(v2x_polyarchy, na.rm = TRUE),
            sd_polyarchy = sd(v2x_polyarchy, na.rm = TRUE),
            mean_libdem = mean(v2x_libdem, na.rm = TRUE),
            sd_libdem = sd(v2x_libdem, na.rm = TRUE),
            mean_partip = mean(v2x_partipdem, na.rm = TRUE),
            sd_partip = sd(v2x_partipdem, na.rm = TRUE),
            mean_delibdem = mean(v2x_delibdem, na.rm = TRUE),
            sd_delibdem = sd(v2x_delibdem, na.rm = TRUE),
            mean_egaldem = mean(v2x_egaldem, na.rm = TRUE),
            sd_egaldem = sd(v2x_egaldem, na.rm = TRUE))

```

**A visualisation of mean GDP time series, with the average of our key factors of democracy from 1960 to 2020**

```{r}

# A summary plot of average GDP, and average indicators on a time series
ggplot(summarised_stats1, aes(x = year, y = mean_GDP)) +
  geom_point() +
  geom_smooth(aes(ymin = mean_GDP - sd_GDP, ymax = mean_GDP + sd_GDP), fill = "lightblue", alpha = 0.2) +
  theme_minimal() +
  scale_y_continuous(sec.axis = sec_axis(~ . / 20000, name = "Level of Democracy Indicators")) +
  geom_smooth(data = summarised_stats1, aes(y = mean_polyarchy * 20000), color = "red", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_libdem * 20000), color = "green", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_partip * 20000), color = "yellow", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_delibdem * 20000), color = "purple", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_egaldem * 20000), color = "orange", size = 1) +
  labs(title = "Figure 1. Average trend of GDP and democracy indicators over time " , x = "Years 1960 to 2020", y = "Mean GDP")


```

**Exploring the key variables' statistics**

```{r results='hide'}

# Variables for our summary table
tabledata <- subset(condensed_data, select = c(e_migdppc,v2x_polyarchy,v2x_libdem,v2x_partipdem, v2x_delibdem,v2x_egaldem,tax_revenue,net_investment, trade,e_total_fuel_income_pc,e_miinflat,v2peprisch,v2pesecsch, e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,v2cafres))

```

```{r results='hide'}

# Renaming variables in tabledata so they are labelled clearly in the summary table

tabledata <- tabledata %>%
  rename(
    "Secondary school enrollment" = v2pesecsch,
    "Primary school enrollment" = v2peprisch,
    "Freedom of Research/Teach" = v2cafres,
    "Rule of law" = e_wbgi_rle,
    "Political stability" = e_wbgi_pve,
    "Natural resource production per capita" = e_total_fuel_income_pc,
    "Infant mortality rate" = e_peinfmor,
    "Population" = e_mipopula,
    "Inflation" = e_miinflat,
    "GDP per capita" = e_migdppc,
    "GDP Growth" = e_migdpgrolns)
```

```{r results='hide'}

# Converting to numerical
tabledata$tax_revenue <- as.numeric(tabledata$tax_revenue)
tabledata$net_investment <- as.numeric(tabledata$net_investment)
tabledata$trade <- as.numeric(tabledata$trade)
```

```{r}

# Creating and printing the summary table
summary_table <- descr(tabledata, stats = c("mean", "sd", "min", "max", "Q1", "Q3"))
print(summary_table)
```

### 2.2. Methods

The original paper, 'Democracy Does Cause Growth', estimates a linear panel model with autoregressive dynamics; that is, past values of GDP per capita enter the model as predictors of its current value. This model, portrayed in Figure 1 on page 3 of the paper, maps the relationship between the single democracy dummy variable and GDP per capita with fixed effects, controlling for both economic and country effects.

This isolates the significance of democracy, without allowing other factors or channels to contribute to its significance. The authors display this information through line plots and, most extensively, through coefficient tables, which are their main way of conveying models and interpretations. My replication follows the same approach to conveying results, building the same kind of model from the alternative data sources referenced earlier, the V-Dem data files and the World Bank. The paper provides a rough list of the variables controlled for, and we have found similar variables in order to follow its model specification.

I have also included some additional testing, namely a correlation matrix, because the models build on the original by adding covariates. The original study did not do this: it did not split the key independent x variable into separate parts for individual analysis, so the components of democracy could not overlap in effect.
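
To make the idea of a linear panel model with autoregressive dynamics and fixed effects concrete, here is a minimal sketch using `plm` on simulated data. Everything in it is an assumption for illustration: the variable names `log_gdp` and `democracy`, the single lag, and the simulated coefficients. It is not the specification estimated by Acemoglu et al., and it differs from the models fitted in Section 3, which use `model = "random"` without lagged GDP terms.

```r
library(plm)

set.seed(1)
n_country <- 20
n_year <- 15

# Simulated country-year panel (hypothetical names and values, not the project data)
toy <- expand.grid(year = 1960:(1960 + n_year - 1),
                   country = paste0("C", seq_len(n_country)),
                   stringsAsFactors = FALSE)
toy$democracy <- runif(nrow(toy))
country_effect <- rnorm(n_country)

# Build log GDP so that it depends on its own previous value (AR(1) dynamics)
toy$log_gdp <- NA_real_
for (i in seq_len(n_country)) {
  rows <- which(toy$country == paste0("C", i))
  y_prev <- 8
  for (r in rows) {
    y_prev <- 0.6 * y_prev + 0.5 * toy$democracy[r] +
      country_effect[i] + rnorm(1, sd = 0.1)
    toy$log_gdp[r] <- y_prev
  }
}

# Declare the panel structure so that lag() works within each country
toy_panel <- pdata.frame(toy, index = c("country", "year"))

# Country and year fixed effects, plus one lag of the dependent variable
fe_lag <- plm(log_gdp ~ lag(log_gdp, 1) + democracy,
              data = toy_panel, model = "within", effect = "twoways")
summary(fe_lag)
```

The sketch loads only `plm` on purpose. If `dplyr` is also attached, `lag` may resolve to `dplyr::lag`, which is not panel-aware, so it is worth checking which function is actually being called before interpreting a lagged term.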

## 3. Results

```{r}

# Prepping the panel data
paneldata <- pdata.frame(condensed_data, index = c("country_name.x", "year"))
```

```{r}
# Creating our first GDP model
gdpmodel1 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem  + v2x_egaldem, data = paneldata, model = "random")

```

```{r}
# Creating our 2nd GDP model
gdpmodel2 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem 
                 + v2x_egaldem + e_mipopula + e_peinfmor + v2pesecsch
                 + v2peprisch + v2cafres + e_wbgi_rle + e_wbgi_pve + e_total_fuel_income_pc
                 + e_miinflat, data = paneldata, model = "random")
```

```{r}

# Creating a table to show our summary of both models side by side

tab_model(gdpmodel2, gdpmodel1,
          pred.labels = c("Intercept", "Polyarchy", "Libdem", "Partip", "Delibdem", "Egaldem", "Population",
                          "Infant mortality rate", "Secondary school enrollment", "Primary school enrollment",
                          "Freedom of Research/Teach", "Rule of law", "Political Stability",
                          "Natural resource production per capita", "Inflation"),
          dv.labels = c("GDP Model 2 with controls", "GDP Model 1"),
          string.est = "Coefficient",
          string.ci = "Conf.Int (95%)",
          string.p = "P-Value")
```


The original study (Acemoglu et al., 2014) posits that democracy has a positive impact on GDP per capita. Our replication splits democracy into 5 key factors. In the initial GDP Model 1, without controls, all of our independent x variables of democracy are significant, each with a p-value below 0.05. The majority, all except Polyarchy and Libdem, have a major positive impact on GDP per capita, as conveyed by the coefficients displayed above.

This contrasts with 'GDP Model 2 with controls'. Introducing control variables similar to those used in the original study gives a much better fit, with an adjusted R-squared of 0.620 compared with 0.117 for GDP Model 1, but many of the democracy indicators are no longer significant. The controls detract from their explanatory power, because variation that would otherwise have been attributed to the indicators is now explained by the control variables. Natural resources, the rate of secondary school enrolment and the degree of rule of law in a country are all significant controls, with p-values below 0.05. The coefficients, which convey the magnitude of each variable, tell the same story: Partip and Egaldem have a large positive effect on GDP per capita, whilst the rest do not, and Polyarchy has a significant negative effect. This divide between negative and positive effects may support the argument of Acemoglu et al. (2014) that the overall negative effect reported by Barro (1996) may be due to measurement error in democracy introducing bias.

```{r}
cor_matrix <- cor(summarised_stats1)
print(cor_matrix)
```

We explore this further with a correlation matrix computed on the yearly averages. Most of the democracy indicators have correlation coefficients at or near 1, which unsurprisingly indicates extremely high multicollinearity between the factors of democracy.
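
A correlation check of this kind can also be run on the indicators alone, at the level of individual observations rather than yearly averages. The sketch below does this on simulated data: the three columns borrow V-Dem index names for readability, but their values are generated from a shared random component and are not real scores. The 0.9 cutoff for flagging a pair is an arbitrary assumption.

```r
set.seed(42)
n <- 200

# Simulated indicators that share a common component (made-up values)
base <- runif(n)
toy_idx <- data.frame(
  v2x_polyarchy = base + rnorm(n, sd = 0.05),
  v2x_libdem    = base + rnorm(n, sd = 0.05),
  v2x_partipdem = base + rnorm(n, sd = 0.05)
)

# Introduce a couple of missing values, as real data would have
toy_idx$v2x_libdem[c(3, 17)] <- NA

# Pairwise-complete correlations, so a missing value does not turn an entry into NA
cor_toy <- cor(toy_idx, use = "pairwise.complete.obs")
round(cor_toy, 2)

# List each pair of indicators whose absolute correlation exceeds 0.9
high_pairs <- which(abs(cor_toy) > 0.9 & upper.tri(cor_toy), arr.ind = TRUE)
data.frame(var1 = rownames(cor_toy)[high_pairs[, "row"]],
           var2 = colnames(cor_toy)[high_pairs[, "col"]])
```

Restricting the matrix to the indicator columns keeps `year` and the standard deviation columns out of the comparison, which makes the multicollinearity between the democracy factors easier to read.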

## 4. Conclusions

In conclusion, it can be argued that no single factor of democracy stands out as significant in increasing growth, measured through GDP per capita. Their significance seems to be highly intertwined, which is consistent with one possible outcome of my hypothesis: the effects of these indices are cumulative and work as one to create an overall effect on growth. The paper is therefore within its rights to use a simple coding of democratic or not, as the larger effects on growth only occur when most of the factors of democracy are present to a certain degree. The significance of our key indicators also fell once we added the covariate channels through which democracy tends to increase GDP. This suggests that most of our factors' influence on growth flows through and requires these channels, rather than stemming from democracy alone, so these additional channels need to be highly significant for the full influence of democracy to be felt, as argued by Tavares and Wacziarg (2001).

Our results are similar to those of the original paper: our model covers a similar timeframe and uses broadly similar variables, a good few of which carry over from the original. Having roughly the same sample of countries also helps reduce measurement error on that front. Overall, this replication provides further evidence backing up the claims made in the original study, 'Democracy Does Cause Growth' (Acemoglu et al., 2014).

## References

Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2014). Democracy Does Cause Growth. SSRN Electronic Journal, 127(1). doi:https://doi.org/10.2139/ssrn.2411791.

Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2019). Democracy Does Cause Growth. Journal of Political Economy, [online] 127(1). doi:https://doi.org/10.1086/700936.

Barro, R.J. (1996). Democracy and growth. Journal of Economic Growth, 1(1), pp.1–27. doi:https://doi.org/10.1007/bf00163340.

Boix, C., Miller, M. and Rosato, S. (2012). A Complete Data Set of Political Regimes, 1800–2007. Comparative Political Studies, 46(12), pp.1523–1554. doi:https://doi.org/10.1177/0010414012463905.

Center for Systemic Peace (2022). INSCR Data Page. [online] www.systemicpeace.org. Available at: https://www.systemicpeace.org/inscrdata.html.

Cheibub, J.A., Gandhi, J. and Vreeland, J.R. (2009). Democracy and dictatorship revisited. Public Choice, 143(1-2), pp.67–101. doi:https://doi.org/10.1007/s11127-009-9491-2.

Freedom House (2020). Freedom in the World. [online] freedomhouse.org. Available at: https://freedomhouse.org/report/freedom-world.

GovData360. (n.d.). GovData360: Revised Combined Polity Score. [online] Available at: https://govdata360.worldbank.org/indicators/h6906d31b?country=BRA&indicator=27470&viz=line_chart&years=1800 [Accessed 5 Apr. 2023].

Minier, J.A. (1998). Democracy and Growth: Alternative Approaches. Journal of Economic Growth, 3(3), pp.241–266. doi:https://doi.org/10.1023/a:1009714821770.

Tavares, J. and Wacziarg, R. (2001). How democracy affects growth. European Economic Review, 45(8), pp.1341–1378. doi:https://doi.org/10.1016/s0014-2921(00)00093-3.

V-dem.net. (2023). Home | V-Dem. [online] Available at: https://doi.org/10.23696/vdemds23.

World Bank (2023). World Bank Open Data | Data. [online] Worldbank.org. Available at: https://databank.worldbank.org/source/world-development-indicators.

Freese, J. and Peterson, D. (2017). Replication in social science. Annual Review of Sociology, 43, pp.147–165. doi:https://doi.org/10.1146/annurev-soc-060116-053450.

## Appendix

### Appendix 1. My environment (full information)

```{r session}
# Detailed information about my environment
sessionInfo()
```

### Appendix 2. Entire R code used in the project

```{r ref.label=knitr::all_labels(), echo=TRUE, eval=FALSE}
V_DEM_Full <- readRDS("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/V-Dem-CY-Full+Others-v11.1.rds")

WB_data_variables <- read_csv("/Users/nestor/REPLICATION2PROJ/SMI205_Preregistration_form-main/WB_data_variables.csv")

# Rename the World Bank columns to match the V-Dem naming scheme
WB_data_variables <- WB_data_variables %>%
  rename(
    country_name = `Country Name`,
    year = `Time`,
    country_text_id = `Country Code`,
    tax_revenue = `Tax revenue (% of GDP) [GC.TAX.TOTL.GD.ZS]`,
    net_investment = `Net investment in nonfinancial assets (% of GDP) [GC.NFN.TOTL.GD.ZS]`,
    trade = `Trade (% of GDP) [NE.TRD.GNFS.ZS]`
  )



str(WB_data_variables)

datafile <- WB_data_variables

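# Drop rows with a missing country name or a missing year.
# Both conditions are applied together so the second filter does not undo the first.
WB_data_variables <- datafile[!is.na(datafile$country_name) & !is.na(datafile$year), ]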

bottomthreshold <- 1960
V_DEM_Full <- V_DEM_Full[V_DEM_Full$year >= bottomthreshold, ]

countryID <- unique(V_DEM_Full$country_text_id)
Filtering <- WB_data_variables[WB_data_variables$country_text_id %in% countryID, ]

indicator_dataset <- inner_join(V_DEM_Full, Filtering, by = c("year", "country_text_id"))

condensed_data <- subset(indicator_dataset, select = c(country_name.x, country_text_id, country_id,
                                                       year, historical_date, histname,codingstart,
                                                       codingend,v2x_polyarchy, v2x_libdem, v2x_partipdem,
                                                       v2x_delibdem,v2x_egaldem,e_migdppc,tax_revenue,
                                                       net_investment, trade,e_cow_imports,
                                                       e_cow_exports,e_total_fuel_income_pc,e_miinflat,v2peprisch,
                                                       v2pesecsch,e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                                       v2cafres,e_fh_rol,e_civil_war))


summarised_stats1 <- condensed_data %>%
  group_by(year) %>%
  summarize(mean_GDP = mean(e_migdppc, na.rm = TRUE),
            sd_GDP = sd(e_migdppc, na.rm = TRUE),
            mean_polyarchy = mean(v2x_polyarchy, na.rm = TRUE),
            sd_polyarchy = sd(v2x_polyarchy, na.rm = TRUE),
            mean_libdem = mean(v2x_libdem, na.rm = TRUE),
            sd_libdem = sd(v2x_libdem, na.rm = TRUE),
            mean_partip = mean(v2x_partipdem, na.rm = TRUE),
            sd_partip = sd(v2x_partipdem, na.rm = TRUE),
            mean_delibdem = mean(v2x_delibdem, na.rm = TRUE),
            sd_delibdem = sd(v2x_delibdem, na.rm = TRUE),
            mean_egaldem = mean(v2x_egaldem, na.rm = TRUE),
            sd_egaldem = sd(v2x_egaldem, na.rm = TRUE))

ggplot(summarised_stats1, aes(x = year, y = mean_GDP)) +
  geom_point() +
  geom_smooth(aes(ymin = mean_GDP - sd_GDP, ymax = mean_GDP + sd_GDP), fill = "lightblue", alpha = 0.2) +
  theme_minimal() +
  scale_y_continuous(sec.axis = sec_axis(~ . / 20000, name = "Level of Democracy Indicators")) +
  geom_smooth(data = summarised_stats1, aes(y = mean_polyarchy * 20000), color = "red", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_libdem * 20000), color = "green", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_partip * 20000), color = "yellow", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_delibdem * 20000), color = "purple", size = 1) +
  geom_smooth(data = summarised_stats1, aes(y = mean_egaldem * 20000), color = "orange", size = 1) +
  labs(title = "Average trend of GDP and democracy indicators over time", x = "Years 1960 to 2020", y = "Mean GDP")



# summarytools needs XQuartz on macOS: install it from https://www.xquartz.org,
# otherwise the package below won't work.
# Install the package once, outside the script:
# install.packages("summarytools")
library(summarytools)

tabledata <- subset(condensed_data, select = c(e_migdppc,v2x_polyarchy,v2x_libdem,v2x_partipdem,
                                               v2x_delibdem,v2x_egaldem,tax_revenue,net_investment,
                                               trade,e_total_fuel_income_pc,e_miinflat,v2peprisch,v2pesecsch,
                                               e_peinfmor,e_mipopula,e_wbgi_rle,e_wbgi_pve,e_migdpgrolns,
                                               v2cafres))

# Give the variables readable names for the summary table
tabledata <- tabledata %>%
  rename(
    "Secondary school enrollment" = v2pesecsch,
    "Primary school enrollment" = v2peprisch,
    "Freedom of Research/Teach" = v2cafres,
    "Rule of law" = e_wbgi_rle,
    "Political stability" = e_wbgi_pve,
    "Natural resource production per capita" = e_total_fuel_income_pc,
    "Infant mortality rate" = e_peinfmor,
    "Population" = e_mipopula,
    "Inflation" = e_miinflat,
    "GDP per capita" = e_migdppc,
    "GDP Growth" = e_migdpgrolns)


tabledata$tax_revenue <- as.numeric(tabledata$tax_revenue)
tabledata$net_investment <- as.numeric(tabledata$net_investment)
tabledata$trade <- as.numeric(tabledata$trade)


# Generate summary statistics table
summary_table <- descr(tabledata, stats = c("mean", "sd", "min", "max", "Q1", "Q3"))

# Print the summary statistics table
print(summary_table)

# Prepping the panel data
paneldata <- pdata.frame(condensed_data, index = c("country_name.x", "year"))



# Creating our first GDP model
gdpmodel1 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem  + v2x_egaldem, data = paneldata, model = "random")




# Creating our 2nd GDP model
gdpmodel2 <- plm(e_migdppc ~ v2x_polyarchy + v2x_libdem + v2x_partipdem + v2x_delibdem 
                 + v2x_egaldem + e_mipopula + e_peinfmor + v2pesecsch
                 + v2peprisch + v2cafres + e_wbgi_rle + e_wbgi_pve + e_total_fuel_income_pc
                 + e_miinflat, data = paneldata, model = "random")




# Creating a table to show our summary of both models side by side

tab_model(gdpmodel2,gdpmodel1,pred.labels = c("Intercept", "Polyarchy", "Libdem","Partip","Delibdem","Egaldem","Population",
                                              "Infant mortality rate","Secondary school enrollment", "Primary school enrollment",
                                              "Freedom of Research/Teach","Rule of law","Political stability",
                                              "Natural resource production per capita","Inflation"),dv.labels = c("GDP Model 2 with controls","GDP Model 1"),
          string.est = "Coefficient",
          string.ci = "Conf.Int (95%)",
          string.p = "P-Value")

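# Correlations between the yearly means; pairwise deletion keeps years with
# missing means from turning entire rows of the matrix into NA
cor_matrix <- cor(summarised_stats1, use = "pairwise.complete.obs")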
print(cor_matrix)
```
