1 Introduction

This visualisation illustrates the annual amounts of carbon emissions released. For a high-level overview, we first look at the global trend in annual carbon emissions over the years, which gives us insight into how quickly emission levels are rising.

However, annual carbon emissions per country, region or continent alone are not insightful enough. We also want to visualise carbon emissions per capita, to better understand the emission habits of each country's citizens and to identify countries whose citizens tend to emit more. The following visualisations look at both timeseries trends and geographical distribution.

2 Major Data and Design Challenges

2.1 Not enough data from a single dataset

In order to create a comprehensive visualisation with a story to tell, it is important to have sufficient data to work with. Data from a single CSV file was not enough, so several other datasets had to be downloaded. With multiple datasets, the relevant ones have to be merged before the respective visualisations can be generated.

2.2 No geospatial data

The datasets that I found contained no geospatial information, only information relating to each country's carbon emissions. In order to use a choropleth map to illustrate the distribution of annual carbon emissions per capita, another geospatial dataset with the polygon shapefiles of each country needs to be added. The shapefiles are downloaded from ArcGIS Hub and the longitude/latitude coordinates follow the WGS84 coordinate system. Using the geospatial dataset and the carbon emissions dataset, a new dataframe can be generated by joining the two datasets on a unique variable (e.g. Country Name).

2.3 Data needs to be cleaned

The dataset on annual carbon emissions by region includes both regions (e.g. Asia, Africa, Northern America) and individual countries. As such, we need to manually extract the rows that are relevant for the visualisation.
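As a rough sketch of how regions and countries might be told apart, the snippet below assumes that aggregate rows carry an empty `Code` field, as the `Africa` rows in the printed output of this report do. The toy data frame and the names `toy`, `is_aggregate`, `regions` and `countries` are illustrative only. Some aggregates may carry a code of their own, so the result should still be checked by eye.

```r
# Toy data in the same shape as the emissions dataset (illustrative rows only)
toy <- data.frame(
  Entity = c("Africa", "Albania", "Asia", "Angola"),
  Code   = c("", "ALB", "", "AGO"),
  Year   = c(2019, 2019, 2019, 2019),
  stringsAsFactors = FALSE
)

# Assumption: aggregate rows (regions, continents) have an empty or missing Code
is_aggregate <- is.na(toy$Code) | toy$Code == ""

regions   <- toy[is_aggregate, ]
countries <- toy[!is_aggregate, ]

regions$Entity
countries$Entity
```

This only narrows down the candidates; the final list of entities is still chosen by hand.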

2.4 Difficulty in presenting data in a readable format

The datasets provide exact values of annual carbon emissions, both per capita and as totals by region. The regional totals are aggregated over a whole year and are extremely large, which makes them difficult to read and compare. As such, the number format will be customised to make these values easier to read, by converting them to billions and rounding them to 2 decimal places.

3 Proposed Sketched Design

Three visualisations are proposed to gain insights on the annual carbon emissions released worldwide.

Stacked Area Chart

In a stacked area chart, the areas are stacked on top of one another to illustrate both the trend of each individual series and the overall trend across all series. As we want to view the trend of Annual Carbon Emissions by Region, the stacked area chart can show the change in annual carbon emissions for each region over the years, while the top edge of the stack shows the total carbon emissions worldwide in each year.
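To make the stacking idea concrete, here is a minimal sketch with made-up numbers (not taken from the dataset). The names `toy` and `stack_top` are hypothetical. It shows that the height of the top of the stack in a given year is simply the sum over all regions for that year.

```r
# Illustrative values only (not taken from the dataset)
toy <- data.frame(
  Year      = c(2000, 2000, 2001, 2001),
  Entity    = c("Region A", "Region B", "Region A", "Region B"),
  Emissions = c(2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Top edge of the stacked area in each year = sum of all regions in that year
stack_top <- tapply(toy$Emissions, toy$Year, sum)
stack_top
```

Each band's thickness is the region's own value, so a band that widens over time indicates growth in that region even if its position on the y-axis is pushed up by the bands below it.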

Choropleth Map

A choropleth map is used to visualise the geographic distribution of carbon emissions per capita worldwide. This allows us to compare the amount of carbon emissions released per capita among countries, and identify countries which are the top contributors (apart from the known ones such as China, India and United States).

Timeseries Line Graph

After identifying the countries with the highest carbon emissions per capita, we will plot a timeseries line graph of annual carbon emissions per capita for these countries. This can help us gain insights on the growth of emissions per capita.

4 Data Visualisation

4.1 Preparation of Data Visualisation

4.1.1 Loading packages

# Packages used in this report
packages <- c('tidyverse', 'ggplot2', 'plotly', 'dplyr', 'ggiraph', 'tmap', 'sf', 'quantmod', 'ggthemes')

for (p in packages) {
  # Install the package only if it is not already available
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

4.1.2 Datasets

Two datasets have been used for this visualisation.

The first dataset is Annual Total Carbon Emissions Released by World Region from ourworldindata.org. It is called annual-co-emissions-by-region.csv.
The second dataset is Annual Carbon Emissions Released per capita from ourworldindata.org. It is called co-emissions-per-capita.csv.

Both files are in CSV format. It is also important to note that in both datasets, emissions of carbon dioxide (CO₂) are measured in tonnes per year, and that the emissions are from fossil fuels and cement production only; land use change is not included.

4.1.3 Reading the dataset

Read the dataset using the read.csv() function. read.csv() already returns a dataframe, so the data.frame() call below simply keeps it as one. Use the head() function to obtain a snippet of the information the dataset provides, by looking at the column names and the first few rows.

emissions_total <- read.csv("C:/Users/user/Documents/R Shiny/Assignment 5/annual-co-emissions-by-region.csv")
emissions_total <- data.frame(emissions_total)
head(emissions_total)

##        Entity Code Year Annual.CO2.emissions
## 1 Afghanistan  AFG 1750                    0
## 2 Afghanistan  AFG 1751                    0
## 3 Afghanistan  AFG 1752                    0
## 4 Afghanistan  AFG 1753                    0
## 5 Afghanistan  AFG 1754                    0
## 6 Afghanistan  AFG 1755                    0

emissions_per_capita <- read.csv("C:/Users/user/Documents/R Shiny/Assignment 5/co-emissions-per-capita.csv")
emissions_per_capita <- data.frame(emissions_per_capita)
head(emissions_per_capita)

##        Entity Code Year Per.capita.CO2.emissions
## 1 Afghanistan  AFG 1949                 0.001912
## 2 Afghanistan  AFG 1950                 0.010871
## 3 Afghanistan  AFG 1951                 0.011684
## 4 Afghanistan  AFG 1952                 0.011542
## 5 Afghanistan  AFG 1953                 0.013216
## 6 Afghanistan  AFG 1954                 0.013036
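The head() output shows column names and values but not the class of each column. One way to inspect the classes is sketched below on a small stand-in data frame with the same columns as emissions_total; the object names `toy` and `col_classes` are illustrative.

```r
# Stand-in for emissions_total with the same column layout
toy <- data.frame(
  Entity = c("Afghanistan", "Afghanistan"),
  Code   = c("AFG", "AFG"),
  Year   = c(1750L, 1751L),
  Annual.CO2.emissions = c(0, 0),
  stringsAsFactors = FALSE
)

# Class of every column as a named character vector
col_classes <- sapply(toy, class)
col_classes

# Compact overview: dimensions, classes and the first values of each column
str(toy)
```

Running the same two calls on the real dataframes would confirm, for example, whether Year was read in as a number or as text.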

4.2 Data Cleaning

The dataset for Annual Total Carbon Emissions Released by World Region includes data for both regions and individual countries. As such, we need to extract the relevant rows. Given that we want to visualise the amount of carbon emissions by region and by top contributors, we will extract the 6 continents (data for Antarctica was not given), namely Asia, Africa, South America, North America, Europe and Oceania. We will also extract China, India, the United States and the EU-27, as the dataset separates these out as the top contributors of carbon emissions.

This step is carried out using the filter() function.

emissions_by_region <- filter(emissions_total, 
                              (Entity=="Asia (excl. China & India)") | 
                              (Entity=="China") | 
                              (Entity=="India") |
                              (Entity=="Africa") |
                              (Entity=="South America") |
                              (Entity=="North America (excl. USA)")| 
                              (Entity=="United States") | 
                              (Entity=="Europe (excl. EU-27)") |
                              (Entity=="EU-27") | 
                              (Entity=="Oceania")
                        )

head(emissions_by_region)

##   Entity Code Year Annual.CO2.emissions
## 1 Africa      1750                    0
## 2 Africa      1751                    0
## 3 Africa      1752                    0
## 4 Africa      1753                    0
## 5 Africa      1754                    0
## 6 Africa      1755                    0
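A more compact way to express the same selection is sketched below with `%in%`, together with a quick check for entity names that match nothing in the data (which would reveal a misspelt name). The toy data frame and the names `keep`, `toy`, `subset_rows` and `missing_names` are illustrative.

```r
# Entities to keep (same names as in the filter above)
keep <- c("Asia (excl. China & India)", "China", "India", "Africa",
          "South America", "North America (excl. USA)", "United States",
          "Europe (excl. EU-27)", "EU-27", "Oceania")

# Toy stand-in for emissions_total
toy <- data.frame(
  Entity = c("Africa", "Afghanistan", "China", "World"),
  Year   = c(2019, 2019, 2019, 2019),
  stringsAsFactors = FALSE
)

# Keep only rows whose Entity is in the list
subset_rows <- toy[toy$Entity %in% keep, ]
subset_rows$Entity

# Names in the list that never appear in the data (possible typos)
missing_names <- setdiff(keep, unique(toy$Entity))
missing_names
```

On the real dataset, an empty `missing_names` would suggest that every requested entity was found.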

4.3 Visualisation 1: Timeseries Stacked Area Chart for Annual CO₂ emissions

4.3.1 Static Stacked Area Chart

Use the geom_area() function to generate the stacked area chart.
Entity is passed into the fill argument to split the area of the chart by region.
The scale_y_continuous() and scale_x_continuous() functions are used to manually set the tick positions on the y and x axes.

area_chart <- ggplot(emissions_by_region, aes(x=Year, y=Annual.CO2.emissions, fill=Entity)) +
  geom_area() +
  theme_minimal() +
  # Show tick labels in billions so that they match the axis title
  scale_y_continuous(breaks=seq(0,40000000000,5000000000),
                     labels=function(v) v / 1e9) +
  scale_x_continuous(breaks=seq(1750,2019,10)) +
  ggtitle("Annual Carbon Emissions by Region (1750-2019)") +
  labs(x="Year", y="Amount of Carbon Emissions (in billions)") +
  theme(axis.text.x = element_text(angle = 45)) 

area_chart

4.3.2 Creating a function to convert CO₂ emission values to billions

From the original dataframe, we can see that the Annual CO₂ emission values are extremely large and are difficult to read and thus compare among regions/countries. As such, we can create a function called round_billion(), which converts the values to billions, and up to 2 decimal places.

round_billion <- function(x){
  # Convert to billions, round to 2 decimal places and return the result as text
  paste0(round(x/1e9, 2))
}
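A short usage sketch follows. The function is repeated so the snippet runs on its own, and the input values are illustrative rather than read from the dataset. The `formatC()` line is an optional variant, not part of the original function, that keeps trailing zeros.

```r
# Same conversion as round_billion() above, repeated so this snippet is self-contained
round_billion <- function(x) {
  paste0(round(x / 1e9, 2))
}

# Illustrative inputs in tonnes
round_billion(c(10174681000, 5284697000, 7450000000))

# round() drops trailing zeros, so 7.4 billion prints as "7.4"
round_billion(7400000000)

# Optional variant that always shows two decimal places
formatC(7400000000 / 1e9, format = "f", digits = 2)
```

Because the result is text, it is suited to tooltips and labels but not to further arithmetic.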

4.3.3 Interactive Stacked Area Chart

In order to add interactivity to the chart, we will use the ggplotly() function from the plotly package which was loaded earlier on. The function incorporates tooltips, which allow for easier reading of the values for each continent/country for a specific year. We can also zoom in to specific windows by drawing a box on the plot.

We can also customise the text that appears on the tooltip by formatting text within the geom_area() function. Here, we apply the round_billion() function to format the display of the annual CO₂ emissions.
The text is then passed into tooltip when plotting the chart using ggplotly().

area_chart <- ggplot(emissions_by_region, aes(x=Year, y=Annual.CO2.emissions, fill=Entity)) +
  # group=Entity keeps one area per entity even though the tooltip text differs on every row
  geom_area(aes(group=Entity,
                text = str_glue(
                      "{Entity}, {Year}
                      Annual CO2 emissions (in billions):{round_billion(Annual.CO2.emissions)}
                      ")
                )) +
  theme_minimal() +
  # Show tick labels in billions so that they match the axis title
  scale_y_continuous(breaks=seq(0,40000000000,5000000000),
                     labels=function(v) v / 1e9) +
  scale_x_continuous(breaks=seq(1750,2020,10)) +
  ggtitle("Annual Carbon Emissions by Region (1750-2019)") +
  labs(x="Year", y="Amount of Carbon Emissions (in billions)") +
  theme(axis.text.x = element_text(angle = 45)) 

area_chart <- ggplotly(area_chart, height=500, width = 800, tooltip = "text")
area_chart

# area_chart_slider <- area_chart %>% animation_slider(currentvalue = list(prefix = "Year" ))

5 Visualisation 2: Choropleth Map

5.1 Getting World Map data

5.1.1 Reading the geospatial shapefile

As mentioned previously, the per capita emissions dataset has no geospatial information for plotting the boundaries of each country. We therefore downloaded a shapefile and read it with the st_read() function, where dsn is the folder that contains the shapefile and layer is the name of the shapefile itself.

world_map <- st_read(dsn = "Longitude_Graticules_and_World_Countries_Boundaries-shp", 
                layer = "99bfd9e7-bb42-4728-87b5-07f8c8ac631c2020328-1-1vef4ev.lu5nk")

world_map

5.1.2 Merging geospatial and carbon emissions information

Since the choropleth map is unable to show timeseries trends of the geographic distribution of carbon emissions per capita, we will focus on the most recent year of data when analysing the annual amount of carbon emissions per capita. As such, we first need to filter the data and extract only the rows for the year 2019. This is done using the filter() function.

# Year is stored as a number, so compare against the number 2019
y2019 <- emissions_per_capita %>%
  filter(Year == 2019)
y2019 <- data.frame(y2019)
head(y2019)

##        Entity Code Year Per.capita.CO2.emissions
## 1 Afghanistan  AFG 2019                 0.281803
## 2      Africa      2019                 1.121465
## 3     Albania  ALB 2019                 1.936486
## 4     Algeria  DZA 2019                 3.988271
## 5     Andorra  AND 2019                 6.110325
## 6      Angola  AGO 2019                 1.194668

After getting the geospatial information for each country, we now need to merge it with the carbon emissions data. Use the left_join() function to join the country polygons to their respective carbon emissions data, matching CNTRY_NAME to Entity; both should hold unique values that identify each country. Countries whose names differ between the two datasets will not be matched and will have missing emissions values.

choropleth <- left_join(world_map, y2019, by=c("CNTRY_NAME"="Entity"))
choropleth <- st_as_sf(choropleth)
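Since the join relies on exact name matches, it can help to list names that appear in one dataset but not the other. The sketch below uses made-up names; on the real data, `map_names` would stand in for world_map$CNTRY_NAME and `data_names` for y2019$Entity. All object names here are hypothetical.

```r
# Made-up stand-ins for world_map$CNTRY_NAME and y2019$Entity
map_names  <- c("Alpha", "Beta Republic", "Gamma")
data_names <- c("Alpha", "Beta", "Gamma")

# Polygons that would receive no emissions value after the left join
unmatched_in_map <- setdiff(map_names, data_names)
unmatched_in_map

# Emissions rows that would never be drawn on the map
unmatched_in_data <- setdiff(data_names, map_names)
unmatched_in_data
```

Any names reported here could be harmonised by hand before joining, if a complete map matters for the analysis.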

5.2 Plotting the choropleth map

Set tmap_mode() to "view" to make the map interactive.
Set tm_fill() such that the choropleth map is based on carbon emissions per capita, using the field Per.capita.CO2.emissions.
The style is set to the jenks (natural breaks) classification method because it places the class boundaries so that values within each class are as similar as possible.
id is used to customise the hover label, and is set to the country name in this case.

tmap_mode("view")

map <- tm_view(view.legend.position = c("left", "bottom")) +
tm_shape(choropleth) + 
  tm_fill("Per.capita.CO2.emissions", 
          style = "jenks", 
          palette = hcl.colors(5, palette = "Sunset"),
          title = "Carbon Emissions per capita (tonnes)",
          id = "CNTRY_NAME",
          popup.vars=c("Annual Emissions per capita "="Per.capita.CO2.emissions")) +
  tm_borders(alpha = 0.5) + 
  # Legend text sizes belong in tm_layout(), not inside the position vector
  tm_layout(title = "Distribution of Carbon Emissions per capita (2019)",
            legend.title.size = .5,
            legend.text.size = .5)

map

6 Visualisation 3: Top 10 contributors in 2019

6.1 Preparing the new dataframe

6.1.1 Extracting the top 10 countries with the highest amount of carbon emissions released per capita

Given that we can now see the geographical distribution of the annual amounts of carbon emissions released per capita, it would be even more insightful to look at the trends of the top 10 countries whose citizens contribute the most to global carbon emissions. As such, we first extract the top 10 countries based on the latest data, 2019.

We can make use of the base R rank() function together with desc() from the dplyr library to get the top 10 countries based on the field Per.capita.CO2.emissions.
This ranks the rows by carbon emissions per capita in decreasing order and keeps the rows in the dataframe y2019 that are ranked 1 to 10. The rows themselves are not reordered, so the result stays in the original alphabetical order.

top10 <- y2019 %>%
  filter(rank(desc(Per.capita.CO2.emissions))<=10)

top10

##                       Entity Code Year Per.capita.CO2.emissions
## 1                    Bahrain  BHR 2019                 20.93500
## 2                     Brunei  BRN 2019                 20.99004
## 3                     Kuwait  KWT 2019                 25.56027
## 4                   Mongolia  MNG 2019                 20.31433
## 5              New Caledonia  NCL 2019                 29.86432
## 6                      Qatar  QAT 2019                 38.61042
## 7               Saudi Arabia  SAU 2019                 16.98765
## 8  Sint Maarten (Dutch part)  SXM 2019                 17.93352
## 9        Trinidad and Tobago  TTO 2019                 27.14257
## 10      United Arab Emirates  ARE 2019                 19.51520
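If the rows should also appear from highest to lowest, they can be sorted explicitly. The sketch below uses a few rows copied from the outputs in this report and base R's order(); the names `y2019_toy`, `top3_sorted` and `top3_ranked` are illustrative, and a top 3 is used instead of a top 10 to keep the example short.

```r
# A few 2019 rows copied from the outputs shown in this report
y2019_toy <- data.frame(
  Entity = c("Afghanistan", "Albania", "Bahrain", "Kuwait",
             "New Caledonia", "Qatar", "Trinidad and Tobago"),
  Per.capita.CO2.emissions = c(0.281803, 1.936486, 20.93500, 25.56027,
                               29.86432, 38.61042, 27.14257),
  stringsAsFactors = FALSE
)

# Sort by emissions per capita (largest first), then take the first 3 rows
sorted      <- y2019_toy[order(-y2019_toy$Per.capita.CO2.emissions), ]
top3_sorted <- head(sorted, 3)
top3_sorted$Entity

# Rank-based filter for comparison: same rows, but in the original row order
top3_ranked <- y2019_toy[rank(-y2019_toy$Per.capita.CO2.emissions) <= 3, ]
top3_ranked$Entity
```

Both approaches select the same entities; only the row order differs.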

6.2 Extracting data related to these top 10 countries from the original dataset (data over the years)

To extract the names of the top 10 countries, we can use the as.vector() function.

top10 <- as.vector(top10$Entity)
top10

##  [1] "Bahrain"                   "Brunei"                   
##  [3] "Kuwait"                    "Mongolia"                 
##  [5] "New Caledonia"             "Qatar"                    
##  [7] "Saudi Arabia"              "Sint Maarten (Dutch part)"
##  [9] "Trinidad and Tobago"       "United Arab Emirates"

With the names of the top 10 countries, we can then use the filter() function once again to obtain all rows of the original dataframe emissions_per_capita where the Entity field matches any of the top 10 countries.

From the new dataframe top10_data we can see that the entities are limited to the top 10 countries.

top10_data <- emissions_per_capita %>%
  filter(Entity %in% top10)

head(top10_data)

##    Entity Code Year Per.capita.CO2.emissions
## 1 Bahrain  BHR 1933                 0.111459
## 2 Bahrain  BHR 1934                 1.214683
## 3 Bahrain  BHR 1935                 5.287684
## 4 Bahrain  BHR 1936                19.220320
## 5 Bahrain  BHR 1937                31.784303
## 6 Bahrain  BHR 1938                33.652417

6.3 Plotting the timeseries line graph

The timeseries line graph can show us the trend in the amount of carbon emissions released per capita by each of the top 10 countries over the years.

Similar to the interactive stacked area chart, we can apply the same steps, except that we use the geom_line() function now, to plot a line graph instead.

line_graph <- ggplot(top10_data, aes(x=Year, y=Per.capita.CO2.emissions)) +
  # group=Entity draws one line per country; per capita values are in tonnes, not billions
  geom_line(aes(group=Entity,
                text = str_glue(
                      "{Entity}, {Year}
                      Annual CO2 emissions per capita (tonnes):{Per.capita.CO2.emissions}
                      "), colour=Entity
                )) +
  theme_minimal() +
  scale_y_continuous(breaks=seq(0,800,100)) +
  scale_x_continuous(breaks=seq(1750,2020,10)) +
  ggtitle("Annual Carbon Emissions per capita for top 10 countries (1750-2019)") +
  labs(x="Year", y="Amount of Carbon Emissions per capita (tonnes)") +
  theme(axis.text.x = element_text(angle = 45)) 

line_graph <- ggplotly(line_graph, height=500, width = 800, tooltip = "text")
line_graph

7 Insights

From the stacked area chart, we can see that overall carbon emissions have increased globally, at an increasing and almost exponential rate. Breaking it down by entity, we can tell that China, although a single country rather than a region, contributed the most to carbon emissions in 2019, as seen from its band being the thickest in the chart as of 2019. This is confirmed by the tooltip, which tells us that annual carbon emissions released by China alone were 10.17 billion tonnes. It is followed by Asia (which excludes China and India) with 7.45 billion tonnes of carbon emitted in 2019, then the United States with 5.28 billion tonnes. South America, Oceania and the rest of North America excluding the US contributed the least carbon emissions among all the entities in 2019.
From the stacked area chart, we can also tell that growth in carbon emissions began latest for China compared with the other regions and large contributors: the area for China only started increasing significantly from 1950 onwards. This is unlike entities such as the United States or the EU-27, which are developed economies whose industries were established well before those of China, which was still a developing country as of 2019.
From the choropleth map, we can see that countries in the Northern Hemisphere tend to have higher carbon emissions per capita, while countries in the Southern Hemisphere tend to have lower carbon emissions per capita. We also note that even though China was highlighted as the country with the highest annual carbon emissions, it is not one of the top contributing countries when we look at carbon emissions per capita instead. This means that to obtain a more accurate understanding of carbon emission habits, carbon emissions per capita should be used instead.
Lastly, we can look at the trend of carbon emissions per capita for the top 10 countries. We can see that Sint Maarten (Dutch part) had the highest carbon emissions per capita from 1950 to 1958, before other countries within the top 10 overtook it. As of 2019, Qatar is the country with the highest carbon emissions per capita, at 38.61 tonnes per person. Carbon emissions per capita for these top 10 countries appear to have been consistently high over the past decade, up to 2019.

7.1 References

https://ourworldindata.org/co2-and-other-greenhouse-gas-emissions
https://ourworldindata.org/per-capita-co2
https://www.displayr.com/how-to-make-an-area-chart-in-r/
https://plotly.com/ggplot2/axis-text/
https://www.r-graph-gallery.com/271-ggplot2-animated-gif-chart-with-gganimate.html

Annual amounts of CO₂ Emissions Released

IS428 Assignment 5

Ow Ling Jia

6 April 2021