Is High Debt Associated with Low Growth?

This is an illustrative example of data retrieval and manipulation using R. Let’s first set the working directory and load the relevant packages.

setwd("C:/Users/dvorakt/Google Drive/reproducibility")

library(WDI)
library(dplyr)
library(ggplot2)
library(stargazer)

We are going to download the data from the World Bank’s World Development Indicators (WDI). There is an R package called WDI that accesses the internet and retrieves the series listed in the indicator argument. The names of the series can be found here.

wdi <- WDI(country = "all", start = 1960, end = 2015, extra = TRUE,
           indicator = c("NY.GDP.MKTP.KD.ZG", "GC.DOD.TOTL.GD.ZS", "NY.GDP.PCAP.KD"))
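
If the indicator codes are not known in advance, the WDI package also provides a search helper, WDIsearch(), which matches a pattern against the names in the indicator list that ships with the package. The following is a minimal sketch, assuming the WDI package is installed; the search string and the object name hits are illustrative choices, not part of the original analysis.

```r
library(WDI)

# Search the indicator names for a pattern.
# When no cache is supplied, WDIsearch() searches the list bundled with the package.
hits <- WDIsearch(string = "GDP growth", field = "name")

# Each row pairs an indicator code with its descriptive name.
head(hits)
```

The codes returned in the indicator column are the values that can be passed to the indicator argument of WDI().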

Let’s do some basic data manipulation.

#give the variables more recognizable names
wdi <- rename(wdi, gdppc = NY.GDP.PCAP.KD, debttogdp = GC.DOD.TOTL.GD.ZS, gdpgrowth = NY.GDP.MKTP.KD.ZG)

#delete the 'Aggregates' so that we only have countries
wdi <- wdi[wdi$region != "Aggregates",]

#keep only the variables we're going to use
wdi <- select(wdi, debttogdp, gdpgrowth, gdppc, year, country)

#keep only observations for which we have no missing values
wdi <- wdi[!is.na(wdi$debttogdp), ]
wdi <- wdi[!is.na(wdi$gdpgrowth), ]
wdi <- wdi[!is.na(wdi$gdppc), ]

#create a log of GDP per capita in case we need it later in the analysis
wdi$loggdppc <- log(wdi$gdppc)

#create debt categories
wdi$debtcat <- ifelse(wdi$debttogdp <= 30, "0-30%",
                       ifelse(wdi$debttogdp <= 60, "30-60%",
                              ifelse(wdi$debttogdp <= 90 , "60-90%", "Above 90%")))

#plot growth against debt categories
ggplot(wdi, aes(x = factor(debtcat), y = gdpgrowth)) + stat_summary(fun = mean, geom = "bar")
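
The categorization and the bar heights can also be reproduced in base R, which makes it easier to check what the plot shows. The sketch below is one possible alternative, not the method used above: it assumes a small made-up data frame called toy, uses cut() in place of the nested ifelse() calls, and computes the per-category means with tapply().

```r
# Toy data; the numbers are made up for illustration.
toy <- data.frame(
  debttogdp = c(12.5, 30, 45.2, 60, 75, 90, 130.4),
  gdpgrowth = c(4, 6, 3, 1, 2.5, 1.5, 2)
)

# Right-closed intervals (-Inf,30], (30,60], (60,90], (90,Inf) reproduce the
# <= comparisons in the nested ifelse() calls.
toy$debtcat <- cut(toy$debttogdp,
                   breaks = c(-Inf, 30, 60, 90, Inf),
                   labels = c("0-30%", "30-60%", "60-90%", "Above 90%"),
                   right = TRUE)

# Mean growth within each category: the quantity the bar chart displays.
means <- tapply(toy$gdpgrowth, toy$debtcat, mean)
means
```

Because cut() returns a factor whose levels follow the order of the breaks, the categories keep their intended order even if the labels would not sort correctly as plain strings.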

Let’s create a dataset that looks at the debt-to-GDP ratio and SUBSEQUENT growth over the next five years.

wdi <- arrange(wdi, country , year) #sort by country and year
#give each year within a country a number starting with 1
wdi <- wdi %>% group_by(country) %>% mutate(countryyear = row_number()) 
#create an indicator that marks each five-year period
wdi$fivey <- ceiling(wdi$countryyear/5) 
#create the number of years in each five-year period
wdi <- wdi %>% group_by(country, fivey) %>% mutate(nyearsin5y = n()) 
#drop five-year periods that don't have five years
wdi <- filter(wdi, nyearsin5y == 5) 
#keep only the first year of each of the first three five-year periods
wdi <- filter(wdi, countryyear == 1 | countryyear == 6 | countryyear == 11) 
#wdi needs to be a data frame for stargazer to work
wdi <- data.frame(wdi)
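
To see how the period indicator works, here is a minimal base-R sketch on made-up data. The steps above keep the first-year value of gdpgrowth for each period; the sketch additionally shows one way growth averaged over each five-year window could be attached to that first-year row. The data frame toy, the column avggrowth5y, and all numbers are hypothetical and not part of the original analysis.

```r
# Toy panel: one country with 10 consecutive observations (made-up growth rates).
toy <- data.frame(
  country     = "A",
  countryyear = 1:10,
  gdpgrowth   = c(2, 4, 6, 8, 10, 1, 1, 1, 1, 1)
)

# Observations 1-5 fall in period 1, observations 6-10 in period 2.
toy$fivey <- ceiling(toy$countryyear / 5)

# Hypothetical extra column: mean growth within each country-period.
toy$avggrowth5y <- ave(toy$gdpgrowth, toy$country, toy$fivey, FUN = mean)

# Keep the first observation of each five-year period.
first <- toy[toy$countryyear %% 5 == 1, ]
first[, c("countryyear", "fivey", "gdpgrowth", "avggrowth5y")]
```

In this toy example the first-year growth rate (2 in period 1) differs noticeably from the window average (6), so the choice between the two measures can matter for the regressions that follow.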

Let’s produce a descriptive statistics table:

stargazer(wdi[c("gdpgrowth", "debttogdp", "gdppc")], type = "text" , digits=1)

## 
## ==============================================
## Statistic  N    Mean   St. Dev.  Min    Max   
## ----------------------------------------------
## gdpgrowth 173   3.7      4.2    -9.6    12.3  
## debttogdp 173   50.1     39.0    0.6   244.4  
## gdppc     173 14,332.4 19,033.8 182.9 99,626.1
## ----------------------------------------------

Let’s estimate some regressions.

r1 <- lm(gdpgrowth ~ debttogdp, data = wdi)
r2 <- lm(gdpgrowth ~ debttogdp + gdppc, data = wdi)
r3 <- lm(gdpgrowth ~ debttogdp + loggdppc, data = wdi)

And show the results in a nice table:

stargazer(r1, r2, r3, type = "html")


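                     Dependent variable: gdpgrowth
                     (1)                     (2)                     (3)
------------------------------------------------------------------------------------------
debttogdp            -0.016**                -0.018**                -0.018**
                     (0.008)                 (0.008)                 (0.008)
gdppc                                        -0.00002
                                             (0.00002)
loggdppc                                                             -0.212
                                                                     (0.227)
Constant             4.560***                4.887***                6.493***
                     (0.519)                 (0.623)                 (2.137)
------------------------------------------------------------------------------------------
Observations         173                     173                     173
R²                   0.023                   0.028                   0.028
Adjusted R²          0.017                   0.016                   0.016
Residual Std. Error  4.181 (df = 171)        4.182 (df = 170)        4.183 (df = 170)
F Statistic          3.956** (df = 1; 171)   2.427* (df = 2; 170)    2.411* (df = 2; 170)
------------------------------------------------------------------------------------------
Note: *p<0.1; **p<0.05; ***p<0.01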
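
To read the debttogdp coefficient in column (1): both variables are measured in percent (debt as a share of GDP, annual GDP growth), so the coefficient is the change in predicted growth, in percentage points, for a one-point rise in the debt ratio. Here is a short sketch of that arithmetic using the rounded estimates from the table. The debt levels of 30 and 90 percent are arbitrary illustration values.

```r
# Rounded estimates from column (1) of the regression table.
intercept <- 4.560
slope     <- -0.016

# Predicted annual growth (percent) at two illustrative debt-to-GDP levels.
debt <- c(30, 90)
predicted <- intercept + slope * debt
predicted

# Difference in predicted growth between the two levels, in percentage points.
diff(predicted)
```

Because the table reports an association, this difference should not be read as a causal effect. The R² of 0.023 also indicates that debt alone accounts for little of the variation in growth.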