Introduction

This dataset contains sustainable energy indicators for 176 countries and regions from 2000 to 2020. It covers, among other things, access to electricity, renewable energy capacity, energy intensity (energy use per unit of GDP at purchasing power parity), and financial flows (aid from developed countries for clean energy projects). This project explores the evolving relationship between renewable energy adoption (measured by the percentage of renewable energy in a country’s final energy consumption) and total CO2 emissions (reported in kilotons by country), focusing on the four highest-emitting countries. The data and background information were primarily derived from the scientific online publication Our World in Data, with additional references from the World Bank and the International Energy Agency.

Load tidyverse and read in the data

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
energy <- read_csv("global-data-on-sustainable-energy (1).csv")
## Rows: 3649 Columns: 21
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): Entity
## dbl (19): Year, Access to electricity (% of population), Access to clean fue...
## num  (1): Density\n(P/Km2)
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Clean variable names (lowercase + replace spaces with underscores)

names(energy) <- tolower(names(energy))
names(energy) <- gsub(" ", "_", names(energy))
head(energy)
## # A tibble: 6 × 21
##   entity       year access_to_electricity_(%_of_populat…¹ access_to_clean_fuel…²
##   <chr>       <dbl>                                 <dbl>                  <dbl>
## 1 Afghanistan  2000                                  1.61                    6.2
## 2 Afghanistan  2001                                  4.07                    7.2
## 3 Afghanistan  2002                                  9.41                    8.2
## 4 Afghanistan  2003                                 14.7                     9.5
## 5 Afghanistan  2004                                 20.1                    10.9
## 6 Afghanistan  2005                                 25.4                    12.2
## # ℹ abbreviated names: ¹​`access_to_electricity_(%_of_population)`,
## #   ²​access_to_clean_fuels_for_cooking
## # ℹ 17 more variables:
## #   `renewable-electricity-generating-capacity-per-capita` <dbl>,
## #   `financial_flows_to_developing_countries_(us_$)` <dbl>,
## #   `renewable_energy_share_in_the_total_final_energy_consumption_(%)` <dbl>,
## #   `electricity_from_fossil_fuels_(twh)` <dbl>, …
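The two calls above only lowercase the names and replace spaces, so parentheses, hyphens, percent signs, and the embedded newline in the density column survive and force backtick quoting later. A more thorough cleaner could look like the sketch below. The helper name `clean_names_sketch` is hypothetical, and the raw names are assumptions reconstructed from the column specification and output shown above. Applying it would change the column names that the later code refers to.

```r
# Hypothetical helper (not part of the original analysis): normalise column
# names to lowercase snake_case using base R only.
clean_names_sketch <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)  # collapse each run of non-alphanumerics
  gsub("^_+|_+$", "", x)           # trim leading and trailing underscores
}

# Assumed raw names, reconstructed from the output above.
raw_names <- c(
  "Entity",
  "Access to electricity (% of population)",
  "Renewable-electricity-generating-capacity-per-capita",
  "Density\n(P/Km2)"
)

cleaned <- clean_names_sketch(raw_names)
print(cleaned)
```

With names like these, columns can be referenced without backticks, for example `access_to_electricity_of_population`.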

Arrange countries/regions in descending order of mean total CO2 emissions (in kilotons), dropping missing values from the calculation

energy |>
  group_by(entity) |>
  summarise(mean_co2 = mean(value_co2_emissions_kt_by_country, na.rm = TRUE)) |>
  arrange(desc(mean_co2))
## # A tibble: 176 × 2
##    entity         mean_co2
##    <chr>             <dbl>
##  1 China          7636642.
##  2 United States  5329539.
##  3 India          1633979.
##  4 Japan          1183734.
##  5 Germany         773645.
##  6 Canada          547645.
##  7 United Kingdom  470604.
##  8 Mexico          444619.
##  9 Indonesia       420334.
## 10 Saudi Arabia    416248.
## # ℹ 166 more rows

Filter to the top 4 countries and create a new variable showing total CO2 emissions in billions of metric tons (the source column is in kilotons, so dividing by 10^6 gives billions of metric tons)

# value_co2_emissions_kt_by_country is in kilotons; dividing by 10^6 gives billions of metric tons
co2_leaders <- energy |>
  filter(entity %in% c("China", "United States", "India", "Japan")) |>
  mutate(co2_emissions = value_co2_emissions_kt_by_country / 10^6) |>
  select(entity, year, `renewable_energy_share_in_the_total_final_energy_consumption_(%)`, co2_emissions)
co2_leaders
## # A tibble: 84 × 4
##    entity  year renewable_energy_share_in_the_total_final_energy…¹ co2_emissions
##    <chr>  <dbl>                                              <dbl>         <dbl>
##  1 China   2000                                               29.6          3.35
##  2 China   2001                                               28.4          3.53
##  3 China   2002                                               27            3.81
##  4 China   2003                                               23.9          4.42
##  5 China   2004                                               20.2          5.12
##  6 China   2005                                               17.4          5.82
##  7 China   2006                                               16.4          6.44
##  8 China   2007                                               14.9          6.99
##  9 China   2008                                               14.1          7.20
## 10 China   2009                                               13.4          7.72
## # ℹ 74 more rows
## # ℹ abbreviated name:
## #   ¹​`renewable_energy_share_in_the_total_final_energy_consumption_(%)`
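As a quick check on the units of `co2_emissions`, the conversion can be written out explicitly. This is a minimal sketch that assumes the source column really is in kilotons, as its name suggests; the helper `kt_to_gt` is hypothetical.

```r
# Assumption: the source column is in kilotons (kt), as its name suggests.
# 1 kt = 1e3 metric tons; 1 billion metric tons (Gt) = 1e9 metric tons.
kt_to_gt <- function(kt) kt * 1e3 / 1e9  # equivalent to kt / 1e6

# Mean value for China taken from the ranking table above (in kt).
china_mean_kt <- 7636642
china_mean_gt <- kt_to_gt(china_mean_kt)
print(china_mean_gt)
```

So a value of about 7.6 in `co2_emissions` corresponds to roughly 7.6 billion metric tons of CO2 for the whole country, not a per-person figure.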

Create a line chart illustrating the percentage share of renewable energy in the total final energy consumption over time in the 4 leading carbon-intensive countries

p1 <- co2_leaders |> ggplot(aes(x = year, y = `renewable_energy_share_in_the_total_final_energy_consumption_(%)`, color = entity)) +
  geom_point() +
  geom_line() +
  labs(
    x = "Year", 
    y = "Renewable Energy Share (%)",
    title = "Renewable Energy Adoption in Leading Carbon-Intensive Countries (2000-2019)"
  ) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1", name = "Country")
p1
## Warning: Removed 4 rows containing missing values (`geom_point()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).

Create a line chart illustrating those countries’ total CO2 emissions over the years

p2 <- co2_leaders |> ggplot(aes(x = year, y = co2_emissions, color = entity)) +
  geom_point() +
  geom_line() +
  labs(
    x = "Year",
    y = "CO2 Emissions\n(billions of metric tons)",
    title = "Total Carbon Dioxide Emissions in Leading Carbon-Intensive Countries (2000-2019)"
  ) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1", name = "Country")
p2
## Warning: Removed 4 rows containing missing values (`geom_point()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).

Install and load “gridExtra” to use the grid.arrange() function to show both plots on the same page

#install.packages("gridExtra")
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
grid.arrange(p1, p2)
## Warning: Removed 4 rows containing missing values (`geom_point()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).
## Warning: Removed 4 rows containing missing values (`geom_point()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).
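An alternative to stacking two separate plots with grid.arrange() is to reshape the data to long form and let ggplot2 facet by indicator, which keeps a single shared legend and x-axis. The sketch below uses three toy rows copied from the China output above rather than the real `co2_leaders` tibble, and the short column name `renewable_share` is a hypothetical stand-in for the long renewable share column.

```r
library(ggplot2)
library(tidyr)

# Toy rows copied from the China output above; the real co2_leaders tibble
# would be used instead. `renewable_share` is a hypothetical short name.
toy <- data.frame(
  entity = "China",
  year = c(2000, 2001, 2002),
  renewable_share = c(29.6, 28.4, 27),
  co2_emissions = c(3.35, 3.53, 3.81)
)

# One row per country, year, and indicator.
long <- pivot_longer(
  toy,
  cols = c(renewable_share, co2_emissions),
  names_to = "indicator",
  values_to = "value"
)

# One panel per indicator, stacked vertically, each with its own y scale.
p <- ggplot(long, aes(x = year, y = value, color = entity)) +
  geom_point() +
  geom_line() +
  facet_wrap(~ indicator, ncol = 1, scales = "free_y") +
  theme_minimal()
```

Printing `p` renders both panels on one page. The free y scales matter here because the two indicators are measured in different units.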

Conclusion

In this project, the data was cleaned by lowercasing all variable names and replacing spaces with underscores, and missing values were dropped from calculations. The first visualization depicts yearly changes in the renewable energy share of total final energy consumption in the four countries with the highest mean total CO2 emissions over the period covered by the data. The second visualization illustrates yearly changes in total CO2 emissions in those countries. In China and India, the renewable energy share fell over the years while CO2 emissions rose, though not necessarily at the same rate. By contrast, Japan and the US lowered their CO2 emissions as they increased their renewable energy share, which is consistent with the idea that adopting clean energy sources can help reduce carbon footprints and mitigate climate change. One thing I wish I could have done for this project was create a heatmap showing all the countries, color-coded by renewable electricity generating capacity or energy consumption per person.
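The inverse pattern described for China can also be checked numerically rather than only by eye. The sketch below computes a Pearson correlation on the first ten China rows printed in the `co2_leaders` output earlier. It is an illustration on a small subset, not a result for the full period, and the variable name `r_china` is hypothetical.

```r
# First ten China rows (2000-2009) from the co2_leaders output above.
renewable_share <- c(29.6, 28.4, 27, 23.9, 20.2, 17.4, 16.4, 14.9, 14.1, 13.4)
co2_emissions   <- c(3.35, 3.53, 3.81, 4.42, 5.12, 5.82, 6.44, 6.99, 7.20, 7.72)

# cor() computes the Pearson correlation by default.
r_china <- cor(renewable_share, co2_emissions)
print(r_china)  # strongly negative for these ten rows
```

The same calculation could be repeated per country on the full tibble with group_by() and summarise(), passing `use = "complete.obs"` to cor() so the rows with missing values are skipped.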