Is High Debt Associated with Low Growth?

This is an illustrative example of data retrieval and manipulation using R. Let’s first set the working directory and load the relevant packages.

setwd("C:/Users/dvorakt/Google Drive/reproducibility")

library(WDI)
library(dplyr)
library(ggplot2)
library(stargazer)

We are going to download the data from the World Bank’s World Development Indicators (WDI). There is an R package called WDI that accesses the internet and retrieves the series listed in the indicator argument. The series names can be found in the World Bank’s online indicator catalogue.

wdi <- WDI(country = "all", start = 1960, end = 2015, extra = TRUE,
           indicator = c("NY.GDP.MKTP.KD.ZG", "GC.DOD.TOTL.GD.ZS", "NY.GDP.PCAP.KD"))

Let’s do some basic data manipulation.

#give the variables more recognizable names
wdi <- rename(wdi, gdppc = NY.GDP.PCAP.KD, debttogdp = GC.DOD.TOTL.GD.ZS, gdpgrowth = NY.GDP.MKTP.KD.ZG)

#delete the 'Aggregates' so that we only have countries
wdi <- wdi[wdi$region != "Aggregates",]

#keep only the variables we're going to use
wdi <- select(wdi, debttogdp, gdpgrowth, gdppc, year, country)

#keep only observations for which we have no missing values
wdi <- wdi[!is.na(wdi$debttogdp), ]
wdi <- wdi[!is.na(wdi$gdpgrowth), ]
wdi <- wdi[!is.na(wdi$gdppc), ]

#create a log of GDP per capita in case we need it later in the analysis
wdi$loggdppc <- log(wdi$gdppc)

#create debt categories
wdi$debtcat <- ifelse(wdi$debttogdp <= 30, "0-30%",
                       ifelse(wdi$debttogdp <= 60, "30-60%",
                              ifelse(wdi$debttogdp <= 90 , "60-90%", "Above 90%")))

#plot mean growth by debt category (fun replaces fun.y, which is deprecated since ggplot2 3.3.0)
ggplot(wdi, aes(x = factor(debtcat), y = gdpgrowth)) + stat_summary(fun = mean, geom = "bar")
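
The debt categories above are built with nested ifelse() calls. As a small sketch on made-up values (the vector debt below is hypothetical, not taken from the WDI data), base R's cut() gives the same bins when the intervals are closed on the right, and it returns an ordered set of factor levels:

```r
# Hypothetical debt-to-GDP values (percent), including the boundary cases
debt <- c(0.6, 30, 30.1, 60, 90, 244.4)

# Same rule as in the text: nested ifelse() calls
nested <- ifelse(debt <= 30, "0-30%",
                 ifelse(debt <= 60, "30-60%",
                        ifelse(debt <= 90, "60-90%", "Above 90%")))

# Alternative: cut() with right-closed intervals (-Inf,30], (30,60], (60,90], (90,Inf]
binned <- cut(debt,
              breaks = c(-Inf, 30, 60, 90, Inf),
              labels = c("0-30%", "30-60%", "60-90%", "Above 90%"),
              right = TRUE)

# Check that both approaches assign the same category to every value
print(all(nested == as.character(binned)))
```

Either approach works here; cut() mainly saves typing when there are many bins.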


Let’s create a dataset that looks at the debt-to-GDP ratio and SUBSEQUENT growth over the next five years.

wdi <- arrange(wdi, country , year) #sort by country and year
#give each year within a country a number starting with 1
wdi <- wdi %>% group_by(country) %>% mutate(countryyear = row_number()) 
#create an indicator that marks each five-year period
wdi$fivey <- ceiling(wdi$countryyear/5) 
#create the number of years in each five-year period
wdi <- wdi %>% group_by(country, fivey) %>% mutate(nyearsin5y = n()) 
#drop five-year periods that don't have five years
wdi <- filter(wdi, nyearsin5y == 5) 
#keep only the first year of each of the first three five-year periods
wdi <- filter(wdi, countryyear == 1 | countryyear == 6 | countryyear == 11) 
#wdi needs to be a data frame for stargazer to work
wdi <- data.frame(wdi)
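
To see what the five-year indicator does, here is a minimal sketch on a toy data frame. The object toy and its 12 rows are made up for illustration, and ave() stands in for the group_by() and n() step so that the example runs in base R:

```r
# Toy example: one hypothetical country with 12 non-missing observations
toy <- data.frame(countryyear = 1:12)

# Same rule as above: rows 1-5 -> period 1, rows 6-10 -> period 2, rows 11-12 -> period 3
toy$fivey <- ceiling(toy$countryyear / 5)

# Number of observations in each period (base-R counterpart of group_by + n())
toy$nyearsin5y <- ave(toy$countryyear, toy$fivey, FUN = length)

# Keep complete periods only, then the first row of each period
kept <- toy[toy$nyearsin5y == 5 & toy$countryyear %in% c(1, 6, 11), ]
print(kept)
```

In this toy case rows 1 and 6 survive, and the incomplete third period is dropped. Note that countryyear counts rows within a country after missing values were removed, so a period covers five observations, which are not necessarily five consecutive calendar years.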

Let’s produce a descriptive statistics table:

stargazer(wdi[c("gdpgrowth", "debttogdp", "gdppc")], type = "text" , digits=1)
## 
## ==============================================
## Statistic  N    Mean   St. Dev.  Min    Max   
## ----------------------------------------------
## gdpgrowth 173   3.7      4.2    -9.6    12.3  
## debttogdp 173   50.1     39.0    0.6   244.4  
## gdppc     173 14,332.4 19,033.8 182.9 99,626.1
## ----------------------------------------------

Let’s estimate some regressions.

r1 <- lm(gdpgrowth ~ debttogdp, data = wdi)
r2 <- lm(gdpgrowth ~ debttogdp + gdppc, data = wdi)
r3 <- lm(gdpgrowth ~ debttogdp + loggdppc, data = wdi)

And show the results in a nice table:

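stargazer(r1, r2, r3, type = "html")

                                        Dependent variable: gdpgrowth
                      (1)                     (2)                     (3)
---------------------------------------------------------------------------------------
debttogdp             -0.016**                -0.018**                -0.018**
                      (0.008)                 (0.008)                 (0.008)
gdppc                                         -0.00002
                                              (0.00002)
loggdppc                                                              -0.212
                                                                      (0.227)
Constant              4.560***                4.887***                6.493***
                      (0.519)                 (0.623)                 (2.137)
---------------------------------------------------------------------------------------
Observations          173                     173                     173
R2                    0.023                   0.028                   0.028
Adjusted R2           0.017                   0.016                   0.016
Residual Std. Error   4.181 (df = 171)        4.182 (df = 170)        4.183 (df = 170)
F Statistic           3.956** (df = 1; 171)   2.427* (df = 2; 170)    2.411* (df = 2; 170)
---------------------------------------------------------------------------------------
Note: *p<0.1; **p<0.05; ***p<0.01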
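
To read the size of the debt coefficient, the sketch below plugs the rounded estimates from column (1) into the fitted line. The debt levels 30, 60 and 90 are chosen only for illustration, and the numbers describe an association in this sample, not a causal effect:

```r
# Rounded coefficients copied from column (1) of the table
intercept <- 4.560
slope     <- -0.016

# Illustrative debt-to-GDP ratios (percent)
debt <- c(30, 60, 90)

# Fitted growth (percent per year) implied by the simple regression
fitted_growth <- intercept + slope * debt
print(data.frame(debttogdp = debt, fitted_growth = fitted_growth))
```

This gives fitted growth of 4.08, 3.60 and 3.12 percent, so a debt ratio 10 percentage points higher goes with growth about 0.16 percentage points lower. With an R2 of 0.023, debt accounts for only a small share of the variation in growth.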