The relationship between GDP and emissions per capita across countries has been widely documented. However, I wanted to illustrate it in a way that captures the time dimension, so that I could try to associate certain historical events with changes in the behaviour of both variables. This also allowed me to explore the visualization options that R has to offer. After some research I found the gganimate package, which offers many functions for animating graphs.
First, I set up my working directory and loaded the libraries that I was going to use:
setwd("C:\\Users\\Lara\\Documents\\CCMF\\2. Quantitative Methods\\R\\E3")
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(readxl)
library(tidyr)
library(gganimate)
Here, I load the data into R as a dataframe. For this visualization exercise I used the data on CO2 emissions, population and GDP per country from the Our World in Data website. The database that I used can be found at the following link: https://ourworldindata.org/co2-and-other-greenhouse-gas-emissions
df_emissions=read_excel("C:\\Users\\Lara\\Documents\\CCMF\\2. Quantitative Methods\\R\\E3\\data\\owid-co2-data.xlsx",col_names=TRUE)
Then, I selected the variables from the dataframe that I was going to use:
df_emissions=df_emissions%>%select(country,year,co2, gdp, population)
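Next, I added per-capita versions of GDP and CO2 emissions (gdp_capita and co2_capita), computing them from the totals because the original dataset had many missing values. The factor of 1,000,000 converts co2, which the dataset reports in million tonnes, into tonnes.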
df_emissions=df_emissions%>%mutate(gdp_capita=gdp/population,co2_capita=co2*1000000/population)
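To make the unit conversion concrete, here is a minimal sketch of the same calculation on made-up numbers. The data frame `toy` and its values are hypothetical, and the sketch assumes, as in the real dataset, that `co2` is in million tonnes and `population` is a head count.

```r
library(dplyr)

# Hypothetical toy data (not taken from the real dataset)
toy <- data.frame(
  country    = c("A", "B"),
  co2        = c(10, 500),      # assumed unit: million tonnes of CO2
  gdp        = c(2e10, 4e12),   # total GDP
  population = c(2e6, 1e8)      # number of people
)

toy_capita <- toy %>%
  mutate(
    gdp_capita = gdp / population,
    # million tonnes -> tonnes, then divide by the number of people
    co2_capita = co2 * 1000000 / population
  )

print(toy_capita)
```

Both toy countries end up with the same tonnes of CO2 per person even though their totals differ by a factor of 50, which is why the per-capita version is the one worth plotting.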
Since there were too many missing values before 1900, I filtered the dataframe to keep only the years after 1900.
df_emissions=df_emissions%>%filter(year>1900)
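One way to check where the gaps are before settling on a cutoff is to compare the share of missing values on each side of it. The sketch below does this on made-up data; `toy` and `missing_share` are illustrative names, and the numbers are not from the real dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for df_emissions
toy <- data.frame(
  year = c(1890, 1895, 1900, 1905, 1950, 2000),
  gdp  = c(NA, NA, 1e9, NA, 3e9, 9e9)
)

missing_share <- toy %>%
  # Label each row by which side of the cutoff it falls on
  mutate(period = ifelse(year > 1900, "after 1900", "1900 or earlier")) %>%
  group_by(period) %>%
  # mean() of a logical vector gives the fraction of TRUE values
  summarise(share_missing_gdp = mean(is.na(gdp)))

print(missing_share)
```

Running the same summary on the real dataframe (before the filter is applied) would show whether 1900 is a sensible place to cut.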
In this piece of code I filtered my initial dataframe to include only the countries in which I was interested. I attempted to include all of them but the graph looked too crowded and the code took too much time to run.
df_emissions_countries=df_emissions%>%filter(country %in% c("China", "India", "United States", "United Kingdom", "Brazil",
  "France", "Germany", "Japan", "Nigeria", "Canada",
  "South Korea", "Poland", "Russia", "South Africa", "Australia"))
Here I created the scatter plot using the ggplot function. ggplot allows one to build up the elements of the graph as if they were layers.
In the first part of the code I created a scatter plot and specified that I wanted to plot gdp_capita on the x-axis and tons of CO2 per capita on the y-axis. I also indicated that I wanted the size of each bubble to be proportional to the population of each country, and the colour to be different for each country.
In the second part of the code I included the commands necessary to animate the scatter plot that I created in the first part. In this case, transition_time() moves through the data year by year, {frame_time} in the title displays the year of the current frame, and ease_aes('linear') makes the values change at a constant rate between years.
Graph= ggplot(df_emissions_countries, aes(gdp_capita, co2_capita, size = population/1000000, color = country)) +
#This layer specifies that I want the graph to be a scatter plot-type graph
geom_point() +
#This layer allows me to specify the characteristics of the labels of the observations that I'm including in the plot. In this case I'm saying that I want the observations to be labelled with the names of the countries and I'm specifying that I want the font to be a certain size(4), colour black, and to be located slightly above each bubble (nudge_y).
geom_text(label=df_emissions_countries$country, size=4, nudge_y=1.5, color= "black")+
#This layer allows me to change the colour scale of the bubbles
scale_color_viridis_d()+
#This layer changes the x-axis to a log scale. The y-axis stays linear: a scale_y_log10() layer would be replaced by the scale_y_continuous() layer below, since ggplot keeps only one scale per axis.
scale_x_log10()+
#Before including this I had the problem that the y-axis was in scientific notation, and I felt that it would be easier to interpret without it. This command formats the y-axis labels in plain notation.
scale_y_continuous(labels = function(x) format(x, scientific = FALSE))+
scale_size(range = c(1, 40), name="Population [M]")+
theme_gray()+
# Animation commands using gganimate functions:
labs(title = 'Year:{frame_time}', x = ' GDP per capita', y = ' ton CO2 per capita') +
transition_time(as.integer(year)) +
ease_aes('linear')
animate(Graph, duration = 25, fps = 20, width = 1000, height = 500)
anim_save("C:\\Users\\Lara\\Documents\\CCMF\\2. Quantitative Methods\\R\\E3\\co2vsgdp.gif")
Discussion
The graph shows that, in general, GDP per capita and CO2 emissions per capita are coupled. For example, in years with events that negatively impacted GDP there is also a diagonal drop towards the origin of the graph, that is, a reduction of both GDP per capita and CO2 per capita. This can be seen in the case of the United States during the Great Depression (from 1929) or in European countries during both World Wars. Another example is the behaviour of Russia immediately after the fall of the Soviet Union in 1991, when it fell back to 1960s levels in terms of both CO2 emissions and GDP per capita. This trend of coupled growth and decline seems to hold until the 2000s, when there is an almost vertical drop for some countries such as the US, Australia and Canada: their CO2 emissions per capita fall without a matching fall in GDP per capita. This coincides with a sustained rise in both GDP and emissions in countries such as China and India. One reason for this could be that these high-income countries outsourced part of their emission-intensive activities to, for example, China, while starting to rely more on less emission-intensive activities to sustain their economic growth.