This project examines the relationship between greenhouse gas (GHG) emissions and gross national income (GNI) across countries. The goal is to find out how a country’s income relates to its greenhouse gas emissions per capita. The dataset contains the following variables:
Country code - This is a short identification code for different countries around the world.
Country - This shows the names of the countries used in the dataset.
Income group - This shows the World Bank income classification of each country: low income, lower middle income, upper middle income, or high income.
Gross national income per capita - This is a country’s total income divided by its population, in US dollars.
Greenhouse gas per capita - This shows the average amount of greenhouse gases emitted by each person in a specific country.
Change - This shows the percentage change in GHG emissions.
Population - This shows the population of each country.
Source of the dataset: World Bank, https://databank.worldbank.org/metadataglossary/world-development-indicators/series/EN.ATM.GHGT.KT.CE
Load libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Load the dataset
data <- read_csv("greenhousegas_gni2018.csv")
Rows: 179 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): Country Code, Country Name, IncomeGroup
dbl (4): GNI Per Capita (USD), GHG Per Capita, change, population
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(data)
# A tibble: 6 × 7
`Country Code` `Country Name` IncomeGroup `GNI Per Capita (USD)`
<chr> <chr> <chr> <dbl>
1 AFG Afghanistan Low income 550
2 ALB Albania Upper middle income 4860
3 DZA Algeria Upper middle income 4060
4 AGO Angola Lower middle income 3370
5 ARG Argentina Upper middle income 12370
6 ARM Armenia Upper middle income 4230
# ℹ 3 more variables: `GHG Per Capita` <dbl>, change <dbl>, population <dbl>
## Log Transformation for GNI Per Capita
data$log_GNI <- log(data$`GNI Per Capita (USD)`)
Data Visualization
ggplot(data, aes(x = IncomeGroup, y = `GHG Per Capita`, fill = IncomeGroup)) +
  geom_bar(stat = "summary", fun = "mean") +
  labs(
    title = "Average Greenhouse Gas Emissions Per Capita by Income Group",
    x = "Income Group",
    y = "Average GHG Emissions Per Capita (Metric Tons)",
    caption = "Source: World Bank Dataset"
  ) +
  scale_fill_manual(values = c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728")) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5), legend.position = "bottom")
Essay
In conducting this project, the income group variable was converted to a factor so that the groups appear in a meaningful order. A log transformation was applied to GNI Per Capita to compress its wide range of values, which makes income easier to interpret when plotted. The visualization shows the average greenhouse gas emissions per capita for countries grouped by income level. The project produced both obvious and less obvious insights.
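As a minimal sketch of the factor conversion described above, the snippet below orders income groups from lowest to highest. The toy vector is hypothetical, and the level labels are an assumption: they follow the group names visible in the dataset preview, with "High income" assumed to be the fourth label.

```r
# Hypothetical toy values standing in for the IncomeGroup column
income_group <- c("Upper middle income", "Low income", "High income",
                  "Lower middle income", "Low income")

# Assumed level order, from lowest to highest income
income_levels <- c("Low income", "Lower middle income",
                   "Upper middle income", "High income")

# factor() with explicit levels replaces the default alphabetical order
income_factor <- factor(income_group, levels = income_levels)

print(levels(income_factor))
print(table(income_factor))
```

Applied to the real data, this would likely be `data$IncomeGroup <- factor(data$IncomeGroup, levels = income_levels)` before the call to `ggplot()`. Any label in the column that does not match one of the assumed levels exactly would become `NA`, so the labels are worth checking with `unique(data$IncomeGroup)` first.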
The visualization shows that high-income countries have the highest average GHG emissions per capita, followed by upper-middle-income countries. Low-income countries have the lowest emissions, which is expected given their lower economic activity. More surprising is that some upper-middle-income countries (e.g., China) have emissions comparable to those of high-income nations, which suggests that industrialization plays a significant role beyond income level alone. The visualization also relies on limited information: it does not account for population size, so a simple average treats small and large countries equally and could skew perceptions. A future improvement could be a population-weighted average.
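The population-weighted average suggested above could be sketched as follows. This is an illustration under stated assumptions, not part of the original analysis: the toy rows and their values are invented, and only the column names (`IncomeGroup`, `GHG Per Capita`, `population`) mirror the dataset.

```r
library(dplyr)

# Hypothetical toy rows; column names mirror the dataset, values are invented
toy <- data.frame(
  IncomeGroup = c("High income", "High income", "Low income", "Low income"),
  `GHG Per Capita` = c(10, 20, 1, 3),
  population = c(1e6, 3e6, 2e6, 2e6),
  check.names = FALSE
)

# Compare the simple mean with a mean weighted by population
ghg_summary <- toy %>%
  group_by(IncomeGroup) %>%
  summarise(
    simple_mean = mean(`GHG Per Capita`),
    weighted_mean = weighted.mean(`GHG Per Capita`, w = population),
    .groups = "drop"
  )

print(ghg_summary)
```

In the toy high-income group, the more populous country pulls the weighted mean toward its own value, while the two equally sized low-income countries leave the mean unchanged. With the real data, rows with missing emissions or population values would need to be handled before weighting.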
I ran into some difficulties over the course of this project. Initially, the wide range of GNI Per Capita values made it difficult to visualize trends; applying a log transformation resolved this. Including population size in the visualization would have provided deeper insights, but it was omitted due to complexity.
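To illustrate why the log transformation helps with a wide range of values, the short sketch below uses the six GNI per capita figures shown in the `head(data)` output. The variable names are hypothetical and the full dataset covers a much wider range than these six rows.

```r
# GNI per capita values (USD) from the first rows shown by head(data)
gni <- c(550, 4860, 4060, 3370, 12370, 4230)

# Natural logarithm, as used for the log_GNI column
log_gni <- log(gni)

# On the raw scale, the largest value is many times the smallest
raw_ratio <- max(gni) / min(gni)

# On the log scale, the same gap becomes a difference of a few units
log_spread <- max(log_gni) - min(log_gni)

print(raw_ratio)
print(log_spread)
```

The spread on the log scale equals the logarithm of the raw ratio, so multiplicative gaps between countries become additive ones. This keeps low-income countries from being squeezed together at one end of an axis.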
In conclusion, I think this project was partially successful. It explored the relationship between income groups and GHG emissions per capita, and the findings highlight the disparity in emissions across income groups, with high-income countries emitting significantly more per capita. The dataset was effectively cleaned and visualized, though including additional variables such as population could have enhanced the overall output. Future work could explore the role of industrialization in greenhouse gas emissions around the world.