The data set we’re working with today contains global temperature anomalies and carbon emissions from 1751 to 2018. The analysis we will perform today will help us answer this question: are carbon emissions increasing or decreasing over time? To do this, we selected the 10-year span from 2004 to 2014.
Packages Installation
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
First, let’s load the remaining libraries and our data set.
library(dslabs) # provides the temp_carbon data set
library(ggthemes) # a package of new themes
library(ggrepel) # will help us add text labels to points
data("temp_carbon")
Perform the necessary cleaning
To perform our analysis, we first need to clean our data set and remove all NA values. We will also filter the data set to keep only the period from 2004 to 2014.
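The cleaning code itself is not shown here, so the following is only a minimal sketch of one way to do it. It assumes `temp_carbon` comes from the dslabs package and that the cleaned result is stored as `sample001`, the name used by the plotting code. Dropping every row that has an NA in any column is also an assumption.

```r
library(dplyr)
library(tidyr)
library(dslabs)

data("temp_carbon")

# Assumption: a row is discarded if any of its columns is NA.
# Assumption: the result is named sample001 to match the plot code.
sample001 <- temp_carbon %>%
  drop_na() %>%                        # remove rows containing NA values
  filter(year >= 2004, year <= 2014)   # keep only the 2004 to 2014 period

sample001
```

Calling `drop_na()` with specific column names, for example `drop_na(carbon_emissions, temp_anomaly)`, would be a less strict alternative if the other columns are not needed.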
Now let’s create a scatter plot to analyze how carbon emissions evolved over the years, using our sample from 2004 to 2014. We first define the graph: the x and y axes, a legend, and a descriptive title. We then add points to see whether there is a relationship between the year and carbon emissions, since our goal is to find out whether emissions increased or decreased over the period. A dashed linear trend line summarizes the overall direction. Finally, we map the temperature anomaly to the color of the points and to their text labels, so that we can also observe how the temperature anomaly relates to carbon emissions.
ggplot(sample001, aes(x = year, y = carbon_emissions, color = temp_anomaly, label = temp_anomaly)) +
  labs(x = "Year",
       y = "Carbon emissions in T/co2",
       caption = "Source: DS Labs Database",
       title = "COMPARISON OF THE CARBON EMISSIONS OVER THE YEAR ") +
  theme_minimal(base_size = 12, base_family = "serif") +
  geom_point(size = 5, alpha = 0.5) +
  geom_smooth(method = lm, se = FALSE, color = "maroon", lty = 2, linewidth = 0.6) +
  geom_text_repel(nudge_x = 0.9)
`geom_smooth()` using formula = 'y ~ x'
Warning: The following aesthetics were dropped during statistical transformation: label.
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
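The warning above appears because the `label` aesthetic is mapped in `ggplot()`, so every layer inherits it, including `geom_smooth()`, whose statistical transformation cannot carry it through. One way to avoid the warning, shown here only as a sketch, is to map `label` inside `geom_text_repel()` alone. The construction of `sample001` below is an assumption (the cleaning code is not shown in the original), and the axis label is simplified.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggrepel)
library(dslabs)

data("temp_carbon")

# Assumed reconstruction of the cleaned 2004 to 2014 sample.
sample001 <- temp_carbon %>%
  drop_na() %>%
  filter(year >= 2004, year <= 2014)

p <- ggplot(sample001, aes(x = year, y = carbon_emissions, color = temp_anomaly)) +
  geom_point(size = 5, alpha = 0.5) +
  # geom_smooth() no longer inherits a label mapping, so nothing is dropped
  geom_smooth(method = lm, se = FALSE, color = "maroon", lty = 2, linewidth = 0.6) +
  # label is mapped only in the layer that actually draws text
  geom_text_repel(aes(label = temp_anomaly), nudge_x = 0.9) +
  labs(x = "Year", y = "Carbon emissions", caption = "Source: DS Labs Database") +
  theme_minimal(base_size = 12, base_family = "serif")

p
```

The plot itself should look the same as before; only the place where `label` is mapped changes.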
As we can see from our scatter plot, from 2004 to 2014 carbon emissions increased.
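To back up the visual impression with a number, we can fit the same linear model that `geom_smooth(method = lm)` draws and look at the sign of its slope. This is a rough sketch rather than a formal test; `sample001` is rebuilt here under the assumption that it is the NA-free 2004 to 2014 subset of `temp_carbon` from dslabs, and `fit` and `slope` are names introduced only for this illustration.

```r
library(dplyr)
library(tidyr)
library(dslabs)

data("temp_carbon")

# Assumed reconstruction of the cleaned 2004 to 2014 sample.
sample001 <- temp_carbon %>%
  drop_na() %>%
  filter(year >= 2004, year <= 2014)

# Linear trend of carbon emissions against year.
fit <- lm(carbon_emissions ~ year, data = sample001)

# The coefficient on year is the average change in emissions per year;
# a positive value indicates an increasing trend over the period.
slope <- coef(fit)[["year"]]
slope
slope > 0
```

A positive slope agrees with the upward trend line in the scatter plot.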