Global Warming Contributions - Final Project

Source: Pixabay.com

Introduction

The dataset I chose for this project covers global warming contributions by gas and source, pulled from ourworldindata.org. The data was prepared by standardizing country names and world region definitions, converting units, calculating derived indicators such as per capita measures, and adding or adapting metadata such as the name or description given to an indicator. It records the change in global mean surface temperature caused by the emissions of three gases: carbon dioxide, methane, and nitrous oxide. It also groups the data by source: fossil fuels / industry or agriculture / land.

For my project, I focused on the categorical variables Entity and Year and the quantitative variables ‘Change in global mean surface temperature caused by CO₂ emissions from fossil fuels and industry’ and ‘Change in global mean surface temperature caused by CO₂ emissions from agriculture and land use’. I first cleaned the Entity variable by keeping the ten most visited countries. I then filtered the years to every 20 years starting in 1840; the data shown below begins in 1851, so 1860 is the earliest year that appears in the filtered output. I wanted to include the 1800s because the Industrial Revolution took hold during that century, increasing the use of fossil fuels. I renamed the quantitative variables to shorter names so that they would be easier to code with. Lastly, I mutated the quantitative variables and multiplied them by 100 so that the small values would show up better in my visualizations. I chose this topic for my final project because global warming is a major and growing issue, and I wanted to visualize the trends behind it.

Background - More about Global Warming

Global warming can be defined as the long-term rise in the planet’s overall temperature. As the human population increases and the world continues to industrialize, it has become evident that the burning of fossil fuels such as coal, oil, and natural gas has caused the global surface temperature to increase rapidly. This leads to climate change, in which weather patterns are affected. It can also raise sea levels as ice and glaciers melt (Global Warming).

Work Cited: Global Warming. education.nationalgeographic.org/resource/global-warming.

Loading Libraries

library(tidyverse) #setting libraries
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(highcharter)
## Warning: package 'highcharter' was built under R version 4.3.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
setwd("C:/Users/asman/Documents/data110")
globalwarming <- read_csv("globalwarming.csv") #Dataset
## Rows: 41280 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Entity, Code
## dbl (7): Year, Change in global mean surface temperature caused by nitrous o...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(globalwarming)
## # A tibble: 6 × 9
##   Entity      Code   Year Change in global mean surface…¹ Change in global mea…²
##   <chr>       <chr> <dbl>                           <dbl>                  <dbl>
## 1 Afghanistan AFG    1851                   0.00000000262           0.0000000999
## 2 Afghanistan AFG    1852                   0.00000000529           0.000000202 
## 3 Afghanistan AFG    1853                   0.00000000800           0.000000306 
## 4 Afghanistan AFG    1854                   0.0000000107            0.000000411 
## 5 Afghanistan AFG    1855                   0.0000000135            0.000000518 
## 6 Afghanistan AFG    1856                   0.0000000163            0.000000627 
## # ℹ abbreviated names:
## #   ¹​`Change in global mean surface temperature caused by nitrous oxide emissions from fossil fuels and industry`,
## #   ²​`Change in global mean surface temperature caused by nitrous oxide emissions from agriculture and land use`
## # ℹ 4 more variables:
## #   `Change in global mean surface temperature caused by methane emissions from fossil fuels and industry` <dbl>,
## #   `Change in global mean surface temperature caused by methane emissions from agriculture and land use` <dbl>,
## #   `Change in global mean surface temperature caused by CO₂ emissions from fossil fuels and industry` <dbl>, …

Cleaning Up Dataset

globalwarming1 <- globalwarming |>
  # Keep the ten most visited countries
  filter(Entity %in% c("France", "Spain", "United States", "China", "Italy", "Brazil", "United Kingdom", "Mexico", "Germany", "Canada")) |>
  # Keep every 20th year; Year is numeric, so compare against numbers rather than strings
  filter(Year %in% seq(1840, 2020, by = 20)) |>
  # Shorten the column names (n20 = nitrous oxide, ch4 = methane, c02 = carbon dioxide)
  rename(
    n20fossilfuels_industry = `Change in global mean surface temperature caused by nitrous oxide emissions from fossil fuels and industry`,
    n20agr_land = `Change in global mean surface temperature caused by nitrous oxide emissions from agriculture and land use`,
    ch4fossilfuels_industry = `Change in global mean surface temperature caused by methane emissions from fossil fuels and industry`,
    ch4agr_land = `Change in global mean surface temperature caused by methane emissions from agriculture and land use`,
    c02fossilfuels_industry = `Change in global mean surface temperature caused by CO₂ emissions from fossil fuels and industry`,
    c02agr_land = `Change in global mean surface temperature caused by CO₂ emissions from agriculture and land use`
  )
head(globalwarming1)
## # A tibble: 6 × 9
##   Entity Code   Year n20fossilfuels_industry n20agr_land ch4fossilfuels_industry
##   <chr>  <chr> <dbl>                   <dbl>       <dbl>                   <dbl>
## 1 Brazil BRA    1860              0.00000308  0.00000233               0.0000195
## 2 Brazil BRA    1880              0.0000102   0.00000804               0.0000625
## 3 Brazil BRA    1900              0.0000193   0.0000206                0.000132 
## 4 Brazil BRA    1920              0.0000313   0.0000660                0.000290 
## 5 Brazil BRA    1940              0.0000468   0.000120                 0.000582 
## 6 Brazil BRA    1960              0.0000770   0.000310                 0.00116  
## # ℹ 3 more variables: ch4agr_land <dbl>, c02fossilfuels_industry <dbl>,
## #   c02agr_land <dbl>
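
The filtered output above starts at 1860 even though 1840 was requested. A minimal base R sketch of why that can happen, assuming each country's series covers 1851 to 2020 (the start year comes from the first rows printed earlier; the end year and the `years_in_data` vector are assumptions for illustration):

```r
# Assumed range of years available for one country (hypothetical)
years_in_data <- 1851:2020
requested <- seq(1840, 2020, by = 20)

# %in% coerces types, so numeric years also match character years
c(1851, 1860) %in% c("1840", "1860")

# Requested years that the data does not contain are dropped silently by filter()
missing_years <- setdiff(requested, years_in_data)
missing_years
```

Checking `setdiff()` like this before filtering is one way to notice that a requested year has no rows.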

More Cleaning

globalwarming2 <- globalwarming1 |>
  # Multiply all six temperature-change columns by 100 so the small values plot more legibly
  mutate(across(n20fossilfuels_industry:c02agr_land, \(x) x * 100))
head(globalwarming2)
## # A tibble: 6 × 9
##   Entity Code   Year n20fossilfuels_industry n20agr_land ch4fossilfuels_industry
##   <chr>  <chr> <dbl>                   <dbl>       <dbl>                   <dbl>
## 1 Brazil BRA    1860                0.000308    0.000233                 0.00195
## 2 Brazil BRA    1880                0.00102     0.000804                 0.00625
## 3 Brazil BRA    1900                0.00193     0.00206                  0.0132 
## 4 Brazil BRA    1920                0.00313     0.00660                  0.0290 
## 5 Brazil BRA    1940                0.00468     0.0120                   0.0582 
## 6 Brazil BRA    1960                0.00770     0.0310                   0.116  
## # ℹ 3 more variables: ch4agr_land <dbl>, c02fossilfuels_industry <dbl>,
## #   c02agr_land <dbl>

Linear Regression Analysis

linearmodel <- lm(c02agr_land ~  c02fossilfuels_industry, data = globalwarming2) #equation
summary(linearmodel)
## 
## Call:
## lm(formula = c02agr_land ~ c02fossilfuels_industry, data = globalwarming2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5067 -0.4092 -0.2910  0.0801  3.9902 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              0.31778    0.11343   2.802  0.00625 ** 
## c02fossilfuels_industry  0.26695    0.03633   7.347 9.77e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9878 on 88 degrees of freedom
## Multiple R-squared:  0.3802, Adjusted R-squared:  0.3732 
## F-statistic: 53.98 on 1 and 88 DF,  p-value: 9.766e-11

The model has the equation: c02agr_land = 0.27(c02fossilfuels_industry) + 0.32

The p-value for c02fossilfuels_industry (9.77e-11) is marked with three asterisks, meaning it is below 0.001, which suggests that it is a statistically significant predictor of c02agr_land. However, the adjusted R-squared value shows that only about 37% of the variation in c02agr_land is explained by the model. In other words, about 63% of the variation in the data is not explained by this model.
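
As a rough check on these numbers, the sketch below redoes the arithmetic by hand in base R. The coefficients, standard error, and R-squared are copied from the summary(linearmodel) output above; the predictor value of 2 is a hypothetical input chosen only for illustration.

```r
# Values copied from the summary(linearmodel) output
intercept <- 0.31778
slope     <- 0.26695
slope_se  <- 0.03633
r2        <- 0.3802
n         <- 88 + 2   # residual degrees of freedom + 2 estimated coefficients

# Predicted c02agr_land when c02fossilfuels_industry is 2 (hypothetical input)
pred <- intercept + slope * 2

# t value = estimate / standard error
t_value <- slope / slope_se

# Adjusted R-squared for a model with one predictor
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - 1 - 1)

round(c(prediction = pred, t_value = t_value, adj_r2 = adj_r2), 4)
```

The recomputed t value and adjusted R-squared should agree with the printed summary up to rounding of the copied inputs.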

Linear Regression Plot

linearplot <- ggplot(globalwarming2, aes(x = c02fossilfuels_industry, y = c02agr_land)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "#344E41") +  #linear method
  labs(x = "CO2 in Fossil Fuels & Industry",
       y = "CO2 in Agriculture & Land",
       title = "Global Surface Temperature: CO2 in Fossil Fuels vs CO2 in Agriculture & Land")+  # Axis labels and title
  theme_classic() +
  theme(panel.background = element_rect(fill = "#A3B18A")) #background color
linearplot
## `geom_smooth()` using formula = 'y ~ x'

Creating Simple Plots

simpleplot1 <- ggplot(globalwarming2, aes(x = c02fossilfuels_industry, y = c02agr_land, color = Entity)) +
  geom_point(size = 3) + #Bigger point size; a constant size goes outside aes() so it adds no legend
  scale_color_manual(values =  c("#A26360", "#E8B298", "#EDCC8B", "#BDE1B3", "#31572c", "#8DD6E2" , "#9194E2","#C6A0c4", "#370c11","#67657f"))+ #adding color
  labs(x = "CO2 in Fossil Fuels & Industry",
       y = "CO2 in Agriculture & Land",
       title = "CO2 in Fossil Fuels vs CO2 in Agriculture & Land by Entity")+  # Axis labels and title
  theme_test() +
  theme(panel.background = element_rect(fill = "#ecf1e6")) #background color
simpleplot1

This visualization colors the points by Entity and shows that the country with the highest changes in global surface temperature caused by CO₂ is the United States, and the second highest is China.

Second Simple Plot

For my next two plots, I wanted to take a closer look at the differences between fossil fuels / industry and agriculture / land grouped by Entity.

simpleplot2 <- ggplot(globalwarming2, aes(x = Entity, y = c02fossilfuels_industry)) +
  geom_boxplot(fill = "darkolivegreen", color = "darkseagreen") +
  coord_flip()+ #Flipping the axes
  labs(x = "Entity",
       y = "CO2 in Fossil Fuels",
       title = "CO2 in Fossil Fuels by Entity")+  # Axis labels and title
  theme_test() +
  theme(panel.background = element_rect(fill = "#ecf1e6"))
simpleplot2

From this visualization we can conclude that the United States, Germany, the United Kingdom, and China account for the largest temperature changes caused by CO₂ from fossil fuels.

Last Simple Plot

simpleplot3 <- ggplot(globalwarming2, aes(x = Entity, y = c02agr_land)) +
  geom_boxplot(fill = "darkolivegreen", color = "darkseagreen") +
  coord_flip() + #Flipping the axes 
  labs(x = "Entity",
       y = "CO2 in Agriculture & Land",
       title = "CO2 in Agriculture & Land by Entity")+  # Axis labels and title
  theme_test() +
  theme(panel.background = element_rect(fill = "#ecf1e6"))
simpleplot3

From this visualization, we can see that Brazil and Canada show a very different picture for agriculture / land than they did for fossil fuels. The United States and China remain at a high level.
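
One way to put a number on that difference is to compare per-country medians of the two sources. The sketch below uses base R with made-up toy values (not taken from the dataset); only the column names mirror globalwarming2, and the `toy`, `medians`, and `gap` names are hypothetical.

```r
# Hypothetical toy values, NOT from the dataset
toy <- data.frame(
  Entity = rep(c("Brazil", "Germany"), each = 3),
  c02fossilfuels_industry = c(0.1, 0.2, 0.3, 1.0, 2.0, 3.0),
  c02agr_land = c(1.0, 2.0, 3.0, 0.2, 0.4, 0.6)
)

# Median of each source per country
medians <- aggregate(
  cbind(c02fossilfuels_industry, c02agr_land) ~ Entity,
  data = toy, FUN = median
)

# Positive gap: agriculture / land dominates; negative gap: fossil fuels dominate
medians$gap <- medians$c02agr_land - medians$c02fossilfuels_industry
medians[order(-medians$gap), ]
```

Running the same steps on globalwarming2 instead of `toy` would rank the ten countries by how far the two sources diverge.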

Final Visualizations

cols <- c("#31572c", "#a6b196", "#4f772d", "#90a955", "#ecf39e", "#d4f3b7", "#eaeeea", "#505c45", "#96d031", "#132a13") # colors
highchart() |>
  hc_add_series(
    data = globalwarming2,
    type = "streamgraph", # creating a stream graph
    hcaes(x = Year, y = c02fossilfuels_industry, group = Entity) # grouping by country
  ) |>
  hc_chart(backgroundColor = "#d0cdc9") |> # background color
  hc_xAxis(title = list(text = "Year")) |>
  hc_yAxis(title = list(text = "CO2 of Fossil Fuels and Industry")) |>
  hc_title(text = "Changes in Global Surface Temperature caused by CO2 of Fossil Fuels & Industry") |>
  hc_colors(cols)
highchart() |>
  hc_add_series(
    data = globalwarming2,
    type = "streamgraph",
    hcaes(x = Year, y = c02agr_land, group = Entity)
  ) |>
  hc_chart(backgroundColor = "#d0cdc9") |>
  hc_colors(cols) |>
  hc_xAxis(title = list(text = "Year")) |>
  hc_yAxis(title = list(text = "CO2 of Agriculture and Land")) |>
  hc_title(text = "Changes in Global Surface Temperature caused by CO2 of Agriculture and Land") |>
  hc_caption(text = "Source: Our World in Data")

Conclusion

Overall, these visualizations show that the changes in global mean surface temperature caused by CO₂ have increased rapidly over time. In the fossil fuels and industry visualization, the increase only becomes visible around 1880, which is consistent with the spread of the Industrial Revolution during the 1800s. Even though I selected the most visited countries, the countries with larger populations, such as China and the United States, show the largest CO₂-driven temperature changes. It is also evident that Brazil has an extremely low contribution from CO₂ of fossil fuels and industry but a really high contribution from CO₂ of agriculture and land. This could be explained by the large forests and tropical land that exist in Brazil, where land-use change such as clearing forest releases CO₂.

For this project, I wish I could have faceted the two visualizations so that they would appear next to each other, but I couldn’t figure out how, especially with highcharter. I attempted it with ggplot but didn’t have enough time to work on it further.
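
For the side-by-side view, one possible ggplot approach (a sketch, not what this project actually used) is to stack the two CO₂ columns into long format with tidyr's pivot_longer() and then draw one panel per source with facet_wrap(). The toy values below are made up for illustration, and the names `toy`, `toy_long`, `source`, and `temp_change` are hypothetical; only the two CO₂ column names mirror globalwarming2.

```r
library(tidyr)
library(ggplot2)

# Hypothetical toy values, NOT from the dataset
toy <- data.frame(
  Entity = rep(c("Brazil", "United States"), each = 2),
  Year = rep(c(2000, 2020), times = 2),
  c02fossilfuels_industry = c(0.2, 0.3, 8.0, 10.0),
  c02agr_land = c(2.0, 2.5, 3.0, 3.2)
)

# Stack the two CO2 columns into one value column plus a "source" label
toy_long <- pivot_longer(
  toy,
  cols = c(c02fossilfuels_industry, c02agr_land),
  names_to = "source",
  values_to = "temp_change"
)

# One panel per source, sharing the same axes
facetplot <- ggplot(toy_long, aes(x = Year, y = temp_change, color = Entity)) +
  geom_line() +
  facet_wrap(~ source) +
  labs(x = "Year", y = "Temperature change from CO2 (x100)")
facetplot
```

Swapping `toy` for globalwarming2 should give the two sources next to each other on a shared y-axis, which makes the Brazil versus United States contrast easier to read.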