As the title indicates, this analysis centers on the economic effects that different governmental policies have on countries around the world. This project will address whether countries with less invasive governments have more productive economies, and will compare the rankings given to each country by The Heritage Foundation. The variables included in this data collection are Gross Domestic Product (GDP), Property Rights, Tax Burden, Business Freedom, Financial Freedom, World Ranking, and many others. This information will reveal any correlation between government policies and the well-being of a country’s economy. This is especially important in today’s world, with increasing regulations and burdensome political structures.
The original data set used in this study is collected by The Heritage Foundation, an influential research organization that gathers annual information on countries all over the world. The particular set examined in this analysis is an exhaustive collection of economic and governmental variables gathered for the 2017 edition of the index. To access this data set, please click here.
To discover trends in this data, we will look at the highest ranked and lowest ranked countries and identify what drives the difference between them. To visualize the information most effectively, we will use scatterplots showing the relationship between two (or more) variables.
By observing the contrasting elements of government tactics and the economy’s response to these policies, we can learn which type of political atmosphere is most conducive to a thriving economy. “Incentives matter”: understanding whom we elect and their role in creating an environment where wealth creation is encouraged is central to being an informed, constructive citizen.
To evaluate the Heritage Foundation Excel data, you will need to load the following packages, which are used to access and manipulate the dataset:
library(readxl)    # Import Excel file to R
library(dplyr)     # Manipulate data
library(tidyverse) # Tidy up data
library(ggplot2)   # Visualize data
library(DT)        # Print tables
library(magrittr)  # Piping
As previously mentioned, for this project I am examining data from the second half of 2015 through the first half of 2016 collected by The Heritage Foundation, titled “2017 Index Data.” This dataset measures the Economic Freedom of 186 countries, which is defined as “the fundamental right of every human to control his or her own labor and property.” Measures of Economic Freedom are based on 12 factors, grouped into four broad categories: Rule of Law, Government Size, Regulatory Efficiency, and Open Markets. The Heritage Foundation argues that Economic Freedom brings prosperity; we will analyze this argument by examining correlations among the different scores and graphing the data collected by the Heritage Foundation.
By exporting the data to an Excel spreadsheet, we can locate the document for future reference. Once we have loaded “readxl,” we can start loading and tidying the data by viewing the sheets in the spreadsheet. I have labeled my original exported Excel dataset “RData.xlsx.” This collection originally consisted of 34 variables (with missing values recorded as N/A), but we will limit our analysis to the most critical measurements and remove duplicate data.
#################
##Loading Data##
################
##Lists all Sheets in Excel Spreadsheet##
excel_sheets("RData.xlsx")
##Reads in Dataset and Assigns Name##
heritage_data <- read_excel("RData.xlsx", sheet = "Sheet1")
#################
##Cleaning Data##
#################
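##Assigns Numeric and Character Variables##
char <- c(2:4,25)
num <- c(1,5:24,26:34)
##lapply keeps one list element per column, so the assignment works column by column##
heritage_data[char] <- lapply(heritage_data[char], as.character)
##as.numeric turns non-numeric text such as "N/A" into NA (with a warning)##
heritage_data[num] <- lapply(heritage_data[num], as.numeric)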
##Replaces N/A with 0##
heritage_data[is.na(heritage_data)] <- 0
##Eliminates Unnecessary Variables##
clean_heritage <- heritage_data[c(-1,-3,-6,-25)]
##Creates Datatable for Cleaned Data##
datatable(clean_heritage, caption = 'Table 1: Clean Economic Data')
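One caveat about the zero-fill step above: once a missing value is stored as 0, every later average and plot treats it as a real score. The sketch below uses made-up numbers (not Heritage Foundation data) to show how the mean of a score column shifts when a missing value is overwritten with 0 instead of being left as NA and skipped with `na.rm = TRUE`. It is only an illustration of the trade-off, not a change to the cleaning step.

```r
# Hypothetical scores for four countries; the last one is missing.
scores <- c(60, 70, 80, NA)

# Mean when the missing value stays NA and is skipped
mean_skip_na <- mean(scores, na.rm = TRUE)

# Mean when the missing value is overwritten with 0, as in the cleaning step
zero_filled <- scores
zero_filled[is.na(zero_filled)] <- 0
mean_zero_fill <- mean(zero_filled)

print(c(skip_na = mean_skip_na, zero_fill = mean_zero_fill))
```

If the zeros are kept, it is worth remembering that any country with a missing score will sit at 0 on the axes of the plots that follow.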
The purpose of this data exploration is to see how different variables affect a country’s economic well-being. We will first examine basic correlations, then begin to dig deeper into the data.
clean_heritage%>%
ggplot(aes(x=`2017 Score`, y=`GDP (Billions, PPP)`, color=Region)) +
geom_point() +
ggtitle("Figure 1: Economic Freedom and GDP",
subtitle = "By Region")
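A scatterplot shows the shape of a relationship, but a correlation coefficient puts a number on it. The sketch below uses base R's `cor()` on a small made-up data frame; the column names `score` and `gdp` and all values are hypothetical stand-ins for `2017 Score` and `GDP (Billions, PPP)`. Because GDP is dominated by a handful of very large economies, the rank-based Spearman coefficient is computed alongside the usual Pearson coefficient.

```r
# Hypothetical values standing in for `2017 Score` and `GDP (Billions, PPP)`
toy <- data.frame(
  score = c(50, 55, 60, 65, 70, 75),
  gdp   = c(20, 35, 30, 400, 90, 1600)
)

# Pearson measures linear association and is sensitive to very large GDPs
pearson <- cor(toy$score, toy$gdp, method = "pearson", use = "complete.obs")

# Spearman works on ranks, so a few huge economies carry less weight
spearman <- cor(toy$score, toy$gdp, method = "spearman", use = "complete.obs")

print(round(c(pearson = pearson, spearman = spearman), 2))
```

The same two calls could be pointed at the corresponding columns of `clean_heritage`; the argument `use = "complete.obs"` drops rows with missing values in either column.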
Since the first graph, Figure 1, shows the distribution of GDP by Economic Freedom Score for the different regions, we will now look at the “four pillars of economic freedom”: Rule of Law, Government Size, Regulatory Efficiency, and Open Markets. This will help us determine whether one pillar is more closely associated with GDP than the others.
rule_of_law <- clean_heritage[c(1,5:7,23)]
rule_of_law$mean <- rowMeans(subset(rule_of_law, select = c(2:4)), na.rm = TRUE)
rule_of_law%>%
ggplot(aes(x=mean, y=`GDP (Billions, PPP)`)) +
geom_point() +
geom_smooth() +
ggtitle("Rule of Law and GDP",
subtitle = "average of property rights, government integrity, judicial effectiveness")
government_size <- clean_heritage[c(1,8:10,23)]
government_size$mean <- rowMeans(subset(government_size, select = c(2:4)), na.rm = TRUE)
government_size%>%
ggplot(aes(x=mean, y=`GDP (Billions, PPP)`)) +
geom_point() +
geom_smooth() +
ggtitle("Government Size and GDP",
subtitle = "average of government spending, tax burden, fiscal health")
regulatory_efficiency <- clean_heritage[c(1,11:13,23)]
regulatory_efficiency$mean <- rowMeans(subset(regulatory_efficiency, select = c(2:4)), na.rm = TRUE)
regulatory_efficiency%>%
ggplot(aes(x=mean, y=`GDP (Billions, PPP)`)) +
geom_point() +
geom_smooth() +
ggtitle("Regulatory Efficiency and GDP",
subtitle = "average of business freedom, labor freedom, monetary freedom")
open_markets <- clean_heritage[c(1,14:16,23)]
open_markets$mean <- rowMeans(subset(open_markets, select = c(2:4)), na.rm = TRUE)
open_markets%>%
ggplot(aes(x=mean, y=`GDP (Billions, PPP)`)) +
geom_point() +
geom_smooth() +
ggtitle("Open Markets and GDP",
subtitle = "average of trade freedom, investment freedom, financial freedom")
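The four blocks above repeat the same two steps: pick three component columns, then average them row by row. As a sketch, that shared step could be wrapped in a small helper. The function name `pillar_mean`, the toy data frame, and its column names are all hypothetical and only stand in for `clean_heritage`.

```r
# Hypothetical helper: average the component columns of one pillar, row by row.
pillar_mean <- function(data, cols) {
  rowMeans(data[, cols, drop = FALSE], na.rm = TRUE)
}

# Toy stand-in for clean_heritage with made-up scores
toy <- data.frame(
  country = c("A", "B", "C"),
  property_rights = c(80, 50, 30),
  government_integrity = c(70, 40, NA),
  judicial_effectiveness = c(90, 60, 30)
)

# Columns 2 to 4 hold the three Rule of Law components in this toy table
toy$rule_of_law <- pillar_mean(toy, 2:4)
print(toy[, c("country", "rule_of_law")])
```

With a helper like this, each pillar would need only its column positions (for example `5:7` or `8:10` in `clean_heritage`), which makes it harder for the four blocks to drift apart.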
These four graphs depict the effects that certain classes of variables have on a country’s GDP. Because the ranking of countries is determined by the average of these twelve variables (grouped into four pillars), these latter graphs are very similar to the first, which shows each region’s scores and GDP. Within this group of four, all show slight variation with a few outliers, but Regulatory Efficiency seems to have the largest abnormality, pulling the smoothed line upward.
Lastly (for now), we will look at the spread of GDP by region. This will give us insight into the most prosperous countries, whose political systems we can then study further.
clean_heritage%>%
ggplot(aes(Region, `GDP (Billions, PPP)`)) + geom_boxplot() + geom_jitter()
The above figure agrees with our original scatterplot, displaying Asia-Pacific and the Americas as having the largest GDPs. From this we can also infer that the regions with the highest GDPs have the most spread in their data. The Americas and Europe look a lot alike, but the Americas have a couple of large outliers. This would lead me to conclude that the outliers have better political systems in place, and that we should replicate this type of behavior.
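To put numbers beside the boxplot, one option is to compare the mean and the median GDP within each region. The sketch below uses base R's `tapply()` on made-up values (the `region` and `gdp` columns are hypothetical, not Heritage Foundation data) to show how a single very large economy pulls a region's mean far above its median.

```r
# Hypothetical GDP values for two regions; region X contains one large outlier
toy <- data.frame(
  region = c("X", "X", "X", "Y", "Y", "Y"),
  gdp    = c(10, 20, 900, 15, 25, 35)
)

# Median is robust to the outlier; mean is not
medians <- tapply(toy$gdp, toy$region, median)
means   <- tapply(toy$gdp, toy$region, mean)

print(rbind(median = medians, mean = means))
```

A large gap between mean and median in a region is a sign that a few outliers, rather than the typical country, are driving that region's apparent prosperity.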
We will now take a look at the highest ranking countries in the “Americas”:
rank_data <- clean_heritage[c(1,2,3,23)]
rank_data%>%
filter(Region == 'Americas', `World Rank`<50)
## # A tibble: 7 x 4
##   `Country Name`   Region `World Rank` `GDP (Billions, PPP)`
##            <chr>    <chr>        <dbl>                 <dbl>
## 1         Canada Americas            7                1631.9
## 2          Chile Americas           10                 422.4
## 3       Colombia Americas           37                 667.4
## 4        Jamaica Americas           41                  24.6
## 5           Peru Americas           43                 389.1
## 6  United States Americas           17               17947.0
## 7        Uruguay Americas           38                  73.5
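The filtered table above is listed alphabetically, which makes the ranking harder to read. As a sketch, the same seven rows can be sorted by rank with base R's `order()`. The values below are copied from the output above; the shortened column names (`country`, `world_rank`, `gdp_billions_ppp`) are assumptions made for this example.

```r
# The seven rows printed above, with simplified (hypothetical) column names
americas_top <- data.frame(
  country = c("Canada", "Chile", "Colombia", "Jamaica", "Peru",
              "United States", "Uruguay"),
  world_rank = c(7, 10, 37, 41, 43, 17, 38),
  gdp_billions_ppp = c(1631.9, 422.4, 667.4, 24.6, 389.1, 17947.0, 73.5)
)

# Sort ascending by rank so the freest economies come first
by_rank <- americas_top[order(americas_top$world_rank), ]
print(by_rank, row.names = FALSE)
```

Sorted this way, the table shows that the largest economy in the group (the United States) is not the highest ranked, and that Chile ranks near the top with a much smaller GDP.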