This project focuses on two questions: How did the onset of industrialization affect the rate of CO2 emissions, and how does a nation’s income level affect its level of CO2 emissions?
#Loading required libraries
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ readr 1.3.1
## ✔ tibble 2.1.3 ✔ purrr 0.3.2
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ ggplot2 3.2.1 ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(ggplot2)
library(gganimate)
#Read Data
#co2_emissions <- read.csv("co2_global_emissions.csv")
#co2_emissions_regions <- read.csv("co2_global_emissions_regions.csv")
# The aforementioned datasets were merged using OpenRefine
co2_emissions_merged <- read.csv("co2_global_emissions_merged_clean.csv")
#view(co2_emissions_merged)
#Read a separate dataset to show carbon dioxide emissions in relation to the evolution of industrialization.
co2_cumulative_emissions <- read.csv("co2_cumulative_emissions.csv")
# Clean and explore the data variables
head(co2_emissions_merged)
## Country Country_Code Income_Group Region Year
## 1 Aruba ABW High income Latin America & Caribbean 1986
## 2 Aruba ABW High income Latin America & Caribbean 1987
## 3 Aruba ABW High income Latin America & Caribbean 1988
## 4 Aruba ABW High income Latin America & Caribbean 1989
## 5 Aruba ABW High income Latin America & Caribbean 1990
## 6 Aruba ABW High income Latin America & Caribbean 1991
## Emission_per_Capita
## 1 2.868319
## 2 7.235198
## 3 10.026179
## 4 10.634733
## 5 26.374503
## 6 26.046130
str(co2_emissions_merged)
## 'data.frame': 12261 obs. of 6 variables:
## $ Country : Factor w/ 264 levels "Afghanistan",..: 11 11 11 11 11 11 11 11 11 11 ...
## $ Country_Code : Factor w/ 264 levels "ABW","AFG","AGO",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Income_Group : Factor w/ 5 levels "","High income",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Region : Factor w/ 8 levels "","East Asia & Pacific",..: 4 4 4 4 4 4 4 4 4 4 ...
## $ Year : int 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 ...
## $ Emission_per_Capita: num 2.87 7.24 10.03 10.63 26.37 ...
summary(co2_emissions_merged)
## Country Country_Code Income_Group
## Afghanistan : 55 AFG : 55 :2400
## Albania : 55 AGO : 55 High income :3355
## Algeria : 55 ALB : 55 Low income :1540
## Angola : 55 ARB : 55 Lower middle income:2280
## Antigua and Barbuda: 55 ARE : 55 Upper middle income:2686
## Arab World : 55 ARG : 55
## (Other) :11931 (Other):11931
## Region Year Emission_per_Capita
## Sub-Saharan Africa :2460 Min. :1960 Min. : -0.0201
## :2400 1st Qu.:1975 1st Qu.: 0.4419
## Europe & Central Asia :2049 Median :1989 Median : 1.6688
## Latin America & Caribbean :1986 Mean :1988 Mean : 4.2151
## East Asia & Pacific :1684 3rd Qu.:2002 3rd Qu.: 5.7911
## Middle East & North Africa:1110 Max. :2014 Max. :100.6977
## (Other) : 572 NA's :12 NA's :12
#Remove missing data
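# Note: the result is stored under a different name (co2_co2_emissions_merged),
# so co2_emissions_merged itself is unchanged and still contains the 12 rows with NAs
co2_co2_emissions_merged <- na.omit(co2_emissions_merged)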
dim(co2_emissions_merged)
## [1] 12261 6
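The summary above also lists 2400 rows with a blank Income_Group and Region. An empty string is not NA, so na.omit() keeps those rows. The following is a minimal sketch of how one might count and handle both kinds of gaps; toy_emissions and its values are made up for illustration and are not taken from the dataset.

```r
# Hypothetical toy data mimicking the structure of co2_emissions_merged
toy_emissions <- data.frame(
  Country = c("Aruba", "Arab World", "Albania", "Angola"),
  Income_Group = c("High income", "", "Upper middle income", "Lower middle income"),
  Emission_per_Capita = c(2.87, 1.50, NA, 0.44),
  stringsAsFactors = FALSE
)

# Count true NA values in the numeric column
na_count <- sum(is.na(toy_emissions$Emission_per_Capita))

# Count blank strings in Income_Group; these are not NA, so na.omit() keeps them
blank_count <- sum(toy_emissions$Income_Group == "")

# Recode blanks as NA so that one na.omit() call drops both kinds of gaps,
# and assign the result back to the same object
toy_clean <- toy_emissions
toy_clean$Income_Group[toy_clean$Income_Group == ""] <- NA
toy_clean <- na.omit(toy_clean)

print(na_count)
print(blank_count)
print(nrow(toy_clean))
```

Whether the blank rows should be dropped depends on the question: for a comparison by income group they carry no group label, while for a plain emissions-over-time view they may still be usable.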
# Clean and explore the data variable for co2_cumulative_emissions
head(co2_cumulative_emissions)
## Country Country_Code Year Cumulative_CO2_emissions_tonnes
## 1 Afghanistan AFG 1751 0
## 2 Afghanistan AFG 1752 0
## 3 Afghanistan AFG 1753 0
## 4 Afghanistan AFG 1754 0
## 5 Afghanistan AFG 1755 0
## 6 Afghanistan AFG 1756 0
str(co2_cumulative_emissions)
## 'data.frame': 61677 obs. of 4 variables:
## $ Country : Factor w/ 231 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Country_Code : Factor w/ 223 levels "","ABW","AFG",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ Year : int 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 ...
## $ Cumulative_CO2_emissions_tonnes: num 0 0 0 0 0 0 0 0 0 0 ...
summary(co2_cumulative_emissions)
## Country Country_Code Year
## Afghanistan : 267 : 2403 Min. :1751
## Africa : 267 ABW : 267 1st Qu.:1817
## Albania : 267 AFG : 267 Median :1884
## Algeria : 267 AGO : 267 Mean :1884
## Americas (other): 267 AIA : 267 3rd Qu.:1951
## Andorra : 267 ALB : 267 Max. :2017
## (Other) :60075 (Other):57939
## Cumulative_CO2_emissions_tonnes
## Min. :0.000e+00
## 1st Qu.:0.000e+00
## Median :0.000e+00
## Mean :2.386e+09
## 3rd Qu.:4.305e+06
## Max. :1.575e+12
##
#Remove missing data
co2_cumulative_emissions <- na.omit(co2_cumulative_emissions)
dim(co2_cumulative_emissions)
## [1] 61677 4
#Bar Plot for Cumulative CO2 Emission
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
co2_emission_bar <- ggplot(co2_cumulative_emissions,
                           aes(x = Year, y = Cumulative_CO2_emissions_tonnes)) +
  xlab("Year") +
  ylab("CO2 Emissions") +
  theme_minimal(base_size = 14) +
  geom_bar(position = "dodge", stat = "identity", color = "red") +
  xlim(1850, 2017) +
  ggtitle("CO2 Emission (1850 - 2016)")
ggplotly(co2_emission_bar)
#Regression Diagnostics
fit <- lm(Emission_per_Capita ~ Year, data = co2_emissions_merged)
summary(fit)
##
## Call:
## lm(formula = Emission_per_Capita ~ Year, data = co2_emissions_merged)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.059 -3.524 -2.394 1.406 97.154
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -64.722252 7.798821 -8.299 <2e-16 ***
## Year 0.034670 0.003922 8.840 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.892 on 12247 degrees of freedom
## (12 observations deleted due to missingness)
## Multiple R-squared: 0.00634, Adjusted R-squared: 0.006259
## F-statistic: 78.14 on 1 and 12247 DF, p-value: < 2.2e-16
fit
##
## Call:
## lm(formula = Emission_per_Capita ~ Year, data = co2_emissions_merged)
##
## Coefficients:
## (Intercept) Year
## -64.72225 0.03467
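To make the fitted coefficients concrete, the sketch below plugs the estimates reported above into the regression equation. The helper name predict_emission is hypothetical; the coefficient values are copied from the summary(fit) output.

```r
# Coefficients copied from the summary(fit) output above
intercept <- -64.722252
slope_per_year <- 0.034670

# Hypothetical helper: fitted per-capita emission for a given year
predict_emission <- function(year) {
  intercept + slope_per_year * year
}

fitted_1960 <- predict_emission(1960)
fitted_2014 <- predict_emission(2014)

# Fitted change across the 54 years covered by the data
fitted_change <- fitted_2014 - fitted_1960

print(round(c(fitted_1960, fitted_2014, fitted_change), 3))
```

The fitted line rises by about 1.87 units of Emission_per_Capita between 1960 and 2014. The Multiple R-squared of 0.00634 indicates that Year alone accounts for well under 1% of the variation, so the trend is statistically significant but explains very little of the spread between countries.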
#Draw linear regression for Emission over the years.
co2_regression <- ggplot(co2_emissions_merged, aes(x = Year, y = Emission_per_Capita)) +
  geom_point(aes(color = Income_Group), na.rm = TRUE) +
  geom_smooth(method = "lm", se = TRUE, color = "red")
co2_regression
## Warning: Removed 12 rows containing non-finite values (stat_smooth).
The first Industrial Revolution, which emerged in the mid-1700s, made the United Kingdom the pioneering carbon dioxide emitter. Colonization and slavery produced raw materials such as cotton, which gave rise to the cotton industry. Within Britain, resources such as coal and iron propelled the emergence of factories. Britain, North America, Europe, and Japan were the major players in the second Industrial Revolution, which began in the mid-1800s and stayed in full gear into the 20th century, driven by World Wars I and II.
co2_cumulative_emissions <- na.omit(co2_cumulative_emissions)
co2_cumulative_emissions %>%
  # Keep only the five countries of interest
  filter(Country %in% c("China", "Germany", "Japan",
                        "United Kingdom", "United States")) %>%
  ggplot(aes(x = Year, y = Cumulative_CO2_emissions_tonnes,
             color = Country)) +
  geom_line(size = .75, na.rm = TRUE) +
  # Reveal each line progressively along the Year axis
  transition_reveal(Year, keep_last = TRUE) +
  xlab("Year") +
  ylab("Cumulative CO2 (in tonnes)") +
  xlim(1800, 2017) +
  ggtitle("Cumulative CO2 Emissions")
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
Instead of a ggplot, I used Tableau to display the change in CO2 emission per capita from 1960 to 2014.
Per capita CO2 emissions
The key drawback of measuring the total national emissions is that it takes no account of the nation’s population size. China is currently the world’s largest emitter, but since it also has the largest population, all being equal we would expect this to be the case. To make a fair comparison of contributions, we have to therefore compare emissions in terms of CO2 emitted per person. Source: Our World in Data
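The per-person comparison described above amounts to dividing total emissions by population. A minimal sketch follows, using two made-up countries; the names, totals, and populations are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical values for illustration only; not taken from the dataset
toy_totals <- data.frame(
  Country = c("Country A", "Country B"),
  Total_CO2_tonnes = c(1.0e10, 5.0e8),
  Population = c(1.4e9, 2.5e7),
  stringsAsFactors = FALSE
)

# Per-capita emission = total emission divided by population
toy_totals$CO2_per_Capita <- toy_totals$Total_CO2_tonnes / toy_totals$Population

# Rank by per-capita emission, largest first
toy_ranked <- toy_totals[order(-toy_totals$CO2_per_Capita), ]
print(toy_ranked)
```

In this toy case Country A emits 20 times more in total, yet Country B ranks first once population is taken into account.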
A packed bubble visualization was added to show the relationship between income groups and the rate of CO2 emission. The visualization symbolizes a black hole: the low-income group is at high risk from the effects of CO2 emissions contributed by the high-income group.
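The income-group comparison can also be checked numerically by averaging per-capita emissions within each group. The sketch below does not reproduce the Tableau visualization; it uses a small made-up data frame (toy_income is hypothetical) with the same column names as co2_emissions_merged.

```r
# Hypothetical toy rows with the same column names as co2_emissions_merged
toy_income <- data.frame(
  Income_Group = c("High income", "High income", "Low income", "Low income"),
  Emission_per_Capita = c(10.0, 12.0, 0.2, 0.4),
  stringsAsFactors = FALSE
)

# Mean per-capita emission within each income group
group_means <- aggregate(Emission_per_Capita ~ Income_Group,
                         data = toy_income, FUN = mean)
print(group_means)
```

Applied to the real data, the same call would take co2_emissions_merged as the data argument, ideally after excluding rows with a blank Income_Group.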
#Bibliography
- CO₂ and Greenhouse Gas Emissions, by Hannah Ritchie and Max Roser. https://ourworldindata.org/co2-and-other-greenhouse-gas-emissions
- Causes of the First Industrial Revolution, by Patricia Chappine. https://study.com/academy/lesson/causes-of-the-first-industrial-revolution.html
- The Industrial Revolution in the United States. http://www.loc.gov/teachers/classroommaterials/primarysourcesets/industrial-revolution/pdf/teacher_guide.pdf