Introduction

The Gapminder data set consists of 1,704 observations covering 142 countries from Africa, the Americas, Asia, Europe, and Oceania. Life expectancy, population, and gross domestic product (GDP) per capita have been recorded from 1952 to 2007, in 5-year intervals. The breadth of the Gapminder data set provides numerous avenues of investigation. For example, with six variables in the data set, selecting only two still leaves 15 possible pairings to consider.

The current data investigation focuses on economic outcomes within the Americas. GDP per capita was therefore selected as the primary dependent variable. GDP per capita is the total economic income of a country divided by its mid-year population. Because it is scaled by population, it is comparable across year, continent, and country.

To that end, three plots have been created to investigate the relationship between GDP per capita and year (1952 to 2007). The first compares the Americas' GDP per capita to that of the four other continents. The second considers the variability of GDP per capita growth from 1997 to 2007 between the different regions of the Americas. Finally, the third plot breaks down the region with the most variable GDP per capita growth, identifying which countries experienced high growth and which experienced low growth.

For each plot, the corresponding data transformation and plot code is given prior to the plot. The following packages were used to complete this data investigation:

library(gapminder)
#Data set
library(ggplot2)
library(scales)
library(dplyr)
library(tidyverse)

Analysis

First considered were the global trends in GDP per capita over time. Specifically, how much does the Americas' GDP per capita increase over time (1952 - 2007), and how does it compare to that of other continents?

To answer this, a line graph was created. Year (1952 - 2007) was mapped to the x-axis and mean GDP per capita (on a log 2 scale) to the y-axis. Each continent is represented as a separate line and is identifiable by its point symbol.

# Mean GDP per capita by continent and year, drawn on a log2 y-axis
Global_GDP <- gapminder %>%
  ggplot(aes(x = year, y = gdpPercap, shape = continent)) +
  stat_summary(fun = mean, geom = "point", size = 2, color = "#F8766D") +
  stat_summary(fun = mean, geom = "line", size = 1, color = "#F8766D", alpha = .6) +
  scale_x_continuous(breaks = seq(1952, 2007, 5)) +
  theme_bw() +
  scale_y_continuous(
    trans = log2_trans(),
    breaks = trans_breaks("log2", function(x) 2^x),
    labels = trans_format("log2", math_format(2^.x))
  ) +
  labs(
    y = "Gross Domestic Product per capita (Log 2)",
    x = "Year",
    shape = "Continent"  # continent is mapped to shape, so the legend title belongs to shape
  ) +
  guides(shape = guide_legend(reverse = TRUE))

Plot 1: Global Growth of Gross Domestic Product Per Capita (1952-2007)

Plot 1 illustrates that the Americas' GDP per capita does increase between 1952 and 2007, from a continental mean of 4079 to 11003. Moreover, compared to Africa, Asia, Europe, and Oceania, the Americas' GDP per capita remains the third highest, positioning it in the middle of the five continents. Between 1952 and 2007, Oceania's GDP per capita is consistently the highest and Africa's is consistently the lowest.

It is important to consider, however, the spread of the data that makes up the Americas' mean GDP per capita. While the Americas' mean GDP per capita increases as a whole between 1952 and 2007, there is large variation between the minimum and maximum GDP per capita in each year. Moreover, Table 1 shows that the minimum GDP per capita in 2007 is less than the minimum GDP per capita in 1952.
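The ranking read off Plot 1 can also be checked numerically. The following is a minimal sketch, not part of the original analysis, that computes the same continental means the plot draws; the object name continent_means is introduced here for illustration only.

```r
library(dplyr)
library(gapminder)

# Mean GDP per capita for every continent-year combination
# (the same statistic that stat_summary(fun = mean) draws in Plot 1).
continent_means <- gapminder %>%
  group_by(continent, year) %>%
  summarize(mean_gdp = mean(gdpPercap), .groups = "drop")

# Rank the continents within the first and last survey years.
continent_means %>%
  filter(year %in% c(1952, 2007)) %>%
  arrange(year, desc(mean_gdp)) %>%
  print()
```

Sorting within each year makes the relative position of the Americas easy to compare against the plot.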

Table 1

gapminder %>%
  filter(continent == "Americas") %>%
  group_by(year)%>%
  summarize(
    n = n(),
    mean = mean(gdpPercap), 
    min = min(gdpPercap), 
    max = max(gdpPercap))
## # A tibble: 12 × 5
##     year     n   mean   min    max
##    <int> <int>  <dbl> <dbl>  <dbl>
##  1  1952    25  4079. 1398. 13990.
##  2  1957    25  4616. 1544. 14847.
##  3  1962    25  4902. 1662. 16173.
##  4  1967    25  5668. 1452. 19530.
##  5  1972    25  6491. 1654. 21806.
##  6  1977    25  7352. 1874. 24073.
##  7  1982    25  7507. 2011. 25010.
##  8  1987    25  7793. 1823. 29884.
##  9  1992    25  8045. 1456. 32004.
## 10  1997    25  8889. 1342. 35767.
## 11  2002    25  9288. 1270. 39097.
## 12  2007    25 11003. 1202. 42952.

This variation in the data is to be expected given the geographic scope of the Americas. North America is more developed than Central America, South America, and the Caribbean, and has a much higher GDP per capita. Graphing the variability of GDP per capita within the Americas would therefore provide a clearer picture of the differences between regions. The measure chosen was GDP per capita growth, as a percentage, over the last 10 years of the data (1997 - 2007).

To develop this graph, a new data set was created that only included countries of the Americas.

Americas <- gapminder %>% 
  filter(continent == "Americas" )

A new Region variable was created that classified each country as belonging to one of North America, Central America, South America, or the Caribbean.
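The code for this classification step is not shown. One way it could be done is sketched below with dplyr's case_when(). The country-to-region assignment is an assumption made for illustration (for example, Mexico is placed in North America here), and the lookup vectors north, central and caribbean are hypothetical names, not taken from the original analysis.

```r
library(dplyr)
library(gapminder)

# Hypothetical region lookup; the assignment of countries is an assumption.
north     <- c("Canada", "Mexico", "United States")
central   <- c("Costa Rica", "El Salvador", "Guatemala",
               "Honduras", "Nicaragua", "Panama")
caribbean <- c("Cuba", "Dominican Republic", "Haiti", "Jamaica",
               "Puerto Rico", "Trinidad and Tobago")

Americas <- gapminder %>%
  filter(continent == "Americas") %>%
  mutate(Region = case_when(
    country %in% north     ~ "North America",
    country %in% central   ~ "Central America",
    country %in% caribbean ~ "Caribbean",
    TRUE                   ~ "South America"  # every remaining country
  ))

# Check how many countries fall into each region.
Americas %>%
  distinct(country, Region) %>%
  count(Region) %>%
  print()
```

Listing the three smaller regions explicitly and letting South America be the fallback keeps the lookup short, at the cost of silently absorbing any misspelled country name into South America.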

Refining the data set further, the data were filtered to the relevant years (1997 and 2007) and reduced to GDP per capita. The data were then transformed from long format into wide format, so that each country had only one row. Next, a new variable for GDP per capita growth was created, defined as the percentage change between 1997 and 2007. This allowed a comparison to be made between regions while taking wealth discrepancies into account.

Americas_GDP <- Americas %>% 
  filter(year == 1997| year==2007)%>% 
  select(-lifeExp, -pop, -continent)

Americas_GDP <-spread(Americas_GDP, year, gdpPercap)

Americas_GDP$GDP_Percent <- ((Americas_GDP$`2007`-Americas_GDP$`1997`)/Americas_GDP$`1997`)*100
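The percentage change is computed as (value in 2007 - value in 1997) / value in 1997 * 100, so a hypothetical country moving from 1,000 to 1,200 would score 20. As an alternative sketch, not the author's code, the same reshaping can be written as one pipeline with tidyr's pivot_wider(), which supersedes spread(). The object name growth is introduced here for illustration, and the Region column is left out so that the example runs on its own.

```r
library(dplyr)
library(tidyr)
library(gapminder)

# Illustrative alternative to the spread() approach above.
growth <- gapminder %>%
  filter(continent == "Americas", year %in% c(1997, 2007)) %>%
  select(country, year, gdpPercap) %>%
  # One row per country, with one column per year ("1997" and "2007")
  pivot_wider(names_from = year, values_from = gdpPercap) %>%
  # Percentage change relative to the 1997 value
  mutate(GDP_Percent = (`2007` - `1997`) / `1997` * 100)

print(growth)
```

Dividing by the 1997 value expresses growth relative to each country's own starting point, which is what makes rich and poor countries comparable on one axis.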

Finally, a box plot was constructed with Region along the x-axis and percent increase of GDP per capita along the y-axis.

Americas_GDP_Plot <-
  Americas_GDP %>%
  # Order regions by median growth; the reordered factor must be the one mapped to x
  mutate(Region = fct_reorder(Region, GDP_Percent)) %>%
  ggplot(aes(y = GDP_Percent, x = Region)) +
  geom_boxplot(fill = "gray", outlier.shape = NA, alpha = 0.3) +
  geom_jitter(color = "#F8766D", fill = "#F8766D", alpha = 0.3, width = .13) +
  scale_y_continuous(name = "Percent Increase of GDP Per Capita", limits = c(-20, 120), breaks = seq(-20, 120, 20)) +
  labs(x = "Region") +
  theme_bw()

Plot 2: Regions of the Americas GDP Per Capita Percent Increase from 1997-2007

Plot 2 shows that the Caribbean region had the highest increase in GDP per capita, with a median of 40% between 1997 and 2007. South America, North America, and Central America all grouped around a median increase of 20%. Importantly, Plot 2 also shows that the Caribbean region has the highest variability in GDP per capita percent increase, with one country doubling its GDP per capita over the 10-year period while another fell by 10%. The third and final plot identifies which countries within the Caribbean region experienced high growth and which experienced low growth.

A lollipop chart was created to compare GDP per capita percent increase between the different Caribbean countries. Percent increase was mapped onto the x-axis and country was mapped onto the y-axis.

Caribbean <- Americas_GDP %>%
  filter(Region == "Caribbean") %>%
  # Sort countries by growth so the lollipops appear in order
  mutate(country = fct_reorder(country, GDP_Percent)) %>%
  ggplot(aes(x = GDP_Percent, y = country)) +
  geom_segment(aes(y = country, yend = country, x = 0, xend = GDP_Percent), color = "gray", size = 1) +
  geom_vline(xintercept = 0) +
  geom_point(size = 2, color = "#F8766D") +
  # Apply the complete theme first; a complete theme added later would overwrite the tweaks below
  theme_bw() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.border = element_blank(),
    axis.ticks.x = element_blank()
  ) +
  ylab("Country") +
  scale_x_continuous(name = "GDP Per Capita Percent Change", limits = c(-20, 120), breaks = seq(-20, 120, 20))

Plot 3: Caribbean Countries GDP Per Capita Percent Variation from 1997-2007

As shown in Plot 3, Trinidad and Tobago experienced large growth in GDP per capita, with an increase of over 100% between 1997 and 2007. This raises an interesting question as to why Trinidad and Tobago experienced such a high degree of growth. Was this the effect of trade agreements, development projects, or new government initiatives? Haiti, on the other hand, suffered a 10% loss in GDP per capita between 1997 and 2007. This too raises questions about possible political or geographic events that may have affected the country's economic growth.

Conclusion

All three plots showcase the variability in GDP per capita, specifically within the Americas, which contain diverse geographical, political, and developmental characteristics. They illustrate the economic discrepancies not only between the regions of the Americas but also within them. Furthermore, they allowed clear economic anomalies to be identified, which present further avenues of investigation for other domains in the social sciences.