Climate scientists have measured the concentration of carbon dioxide
(CO2) in the Earth’s atmosphere dating back thousands of years. In this
project, we will investigate two datasets containing information about
the carbon dioxide levels in the atmosphere. We will specifically look
at the increase in carbon dioxide levels over the past hundred years
relative to the variability in levels of CO2 in the atmosphere over
eight millennia.
These data are derived from ice cores. Over time, air gets trapped in
the ice of Antarctica. Scientists can take samples of that ice and
measure how much carbon dioxide the trapped air contains. The deeper you
go into the ice, the farther back in time you can analyze!
1. The first dataset comes from the World Data Center for
Paleoclimatology, Boulder and the NOAA Paleoclimatology Program. It
describes carbon dioxide levels going back thousands of years “Before
Present” (BP), that is, before January 1, 1950.
2. The second dataset covers carbon dioxide levels from year zero
through 2014. This dataset was compiled by the
Institute for Atmospheric and Climate Science (IAC) at Eidgenössische
Technische Hochschule in Zürich, Switzerland.
In order to understand the information in these datasets, it’s
important to understand two key facts about the data:
Carbon dioxide level is measured in parts per million by volume, or
CO2 ppmv. This number describes the number of carbon dioxide molecules
per one million gas molecules in our atmosphere.
The second metric describes years before present, which is “a time
scale used mainly in … scientific disciplines to specify when events
occurred in the past… standard practice is to use 1 January 1950 as the
commencement date of the age scale, reflecting the origin of practical
radiocarbon dating in the 1950s. The abbreviation “BP” has alternatively
been interpreted as “Before Physics” that is, before nuclear weapons
testing artificially altered the proportion of the carbon isotopes in
the atmosphere, making dating after that time likely to be unreliable.”
This means that saying “the year 20 BP” would be the equivalent of
saying “The year 1930.”
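The BP convention reduces to a subtraction from 1950. Here is a minimal sketch in base R. The helper name bp_to_calendar_year is hypothetical and is not part of either dataset or of any package:

```r
# Hypothetical helper: convert "years Before Present" to a calendar year.
# Assumes the standard BP reference year of 1950 described above.
bp_to_calendar_year <- function(age_bp, reference_year = 1950) {
  reference_year - age_bp
}

bp_to_calendar_year(20)    # 1930
bp_to_calendar_year(136)   # 1814
bp_to_calendar_year(8000)  # -6050 (negative values fall before year 0 on this simple scale)
```

The function is vectorized, so it should also work on a whole column of BP ages at once.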
1. Carbon Dioxide over Time Line Graph
# load libraries and data
library(readr)
library(dplyr)
library(ggplot2)
1. Let’s first explore the dataset containing the data from the World
Data Center for Paleoclimatology, Boulder and NOAA Paleoclimatology
Program. Import the “carbon_dioxide_levels.csv” and save it to a new
variable named noaa_data.
noaa_data <- read_csv("carbon_dioxide_levels.csv")
noaa_data
2. Inspect the head of the data frame. What are the names of the two
columns? What types of values are in each?
head(noaa_data)
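head() shows the first rows. To answer the questions about column names and types directly, names() and sapply() can also help. The sketch below runs on a small stand-in data frame whose values are made up for illustration. Only the column names Age_yrBP and CO2_ppmv come from this project:

```r
# Stand-in for noaa_data: illustrative values only, not the real measurements.
toy_noaa <- data.frame(
  Age_yrBP = c(137, 268, 279),
  CO2_ppmv = c(280.4, 274.9, 277.9)
)

names(toy_noaa)          # "Age_yrBP" "CO2_ppmv"
sapply(toy_noaa, class)  # both columns are "numeric"
```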
3. Let’s visualize this data. First, create a new variable named
noaa_viz that is equal to a new ggplot() object and assign noaa_data as
its data argument. Be sure to state the name of the variable after you
define it so that it is rendered to the R notebook.
options(scipen=10000) #removes scientific notation
#Create NOAA Visualization here:
noaa_viz <- ggplot(noaa_data)
noaa_viz

4. Define your scales by creating an aesthetic mapping that maps
Age_yrBP on the x-axis and CO2_ppmv on the y-axis as part of the
canvas.
noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv))
noaa_viz

5. In climate science, it’s common to create line graphs to best
portray the fluctuations in the levels of carbon dioxide. Add a
geom_line() layer to the noaa_viz plot.
noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) + geom_line()
noaa_viz

6. Let’s add context to the plot and improve its legibility. Title the
plot “Carbon Dioxide Levels From 8,000 to 136 Years BP” and add a
subtitle that cites the data “From World Data Center for
Paleoclimatology and NOAA Paleoclimatology Program”.
noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) + geom_line() + labs(title = "Carbon Dioxide Levels From 8,000 to 136 Years BP", subtitle = "From World Data Center for Paleoclimatology and NOAA Paleoclimatology Program")
noaa_viz

7. Tweak the axis labels so they are more descriptive than the column
headers. The x-axis should read “Years Before Today (0=1950)” and the
y-axis should read “Carbon Dioxide Level (Parts Per Million)”.
noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) + geom_line() + labs(title = "Carbon Dioxide Levels From 8,000 to 136 Years BP", subtitle = "From World Data Center for Paleoclimatology and NOAA Paleoclimatology Program", x = "Years Before Today (0=1950)", y = "Carbon Dioxide Level (Parts Per Million)")
noaa_viz

8. Currently, the order of the years is counterintuitive. Because the
x-axis counts years before 1950, the most recent dates sit closest to 0,
so we want the years on the x-axis arranged in descending order.
Add this to noaa_viz:
noaa_viz + scale_x_reverse(limits = c(800000, 0))

# use the full argument name (limits) rather than the abbreviation lim
noaa_viz <- noaa_viz + scale_x_reverse(limits = c(800000, 0))
noaa_viz
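
Reversing the axis is one way to put older dates on the left. An alternative, sketched here under the assumption that BP ages convert to calendar years as 1950 minus the age, is to derive a calendar-year column with dplyr so the axis runs in the usual direction. The toy values and the column name calendar_year are illustrative assumptions, not part of the dataset:

```r
library(dplyr)

# Illustrative stand-in for noaa_data (values are made up).
toy_noaa <- tibble(
  Age_yrBP = c(137, 1000, 8000),
  CO2_ppmv = c(280, 279, 260)
)

# Derive a calendar year so that larger values mean more recent dates.
toy_noaa_years <- toy_noaa %>%
  mutate(calendar_year = 1950 - Age_yrBP) %>%
  arrange(calendar_year)

toy_noaa_years$calendar_year  # -6050   950  1813
```

Mapping calendar_year to the x-axis would then need no scale_x_reverse() call, at the cost of no longer matching the BP labels used in the plot title.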

2. Carbon Dioxide Levels in the Last Two Millennia
9. In the second code block, let’s explore the second dataset
containing the data for the last 2014 years. Import the “yearly_co2.csv”
file and save it to a new variable named iac_data.
#Create IAC Visualization
iac_data <- read_csv("yearly_co2.csv")
iac_data
10. Inspect the head. What are the names of the four columns? What
types of values are in each? Note that data_mean_global is an
equivalent metric to CO2_ppmv. We will not be using the other two
columns in this project. What’s different about the year column in this
dataset?
head(iac_data)
11. Again, let’s create a new ggplot() object with iac_data as its
data argument and save it to a new variable named iac_viz. Be sure to
state the name of the variable after you define it so that it is
rendered to the R notebook.
iac_viz <- ggplot(iac_data)
iac_viz

12. Define your scales by creating an aesthetic mapping that maps year
on the x-axis and data_mean_global on the y-axis as part of the
canvas.
Note: The dataset column headers are different from the ones in the
previous data frame. Years are chronological, starting from 0 up to
2014, not in terms of BP. The data_mean_global column references the
same metric as CO2_ppmv: the average carbon dioxide level, in parts per
million, in the Earth’s atmosphere.
iac_viz <- ggplot(iac_data, aes(x = year, y = data_mean_global))
iac_viz

13. A line graph also makes sense for these data. Let’s explore how
much carbon dioxide was in the atmosphere over the past two millennia
by adding a geom_line() layer to the iac_viz plot.
iac_viz <- iac_viz + geom_line()
iac_viz

14. This plot still needs labels for context. Title the plot “Carbon
Dioxide Levels over Time” and add a subtitle that cites the data “From
Institute for Atmospheric and Climate Science (IAC).”
iac_viz <- iac_viz + labs(title = "Carbon Dioxide Levels over Time", subtitle = "From Institute for Atmospheric and Climate Science (IAC).")
iac_viz

15. Tweak the axis labels so they are more descriptive than the column
headers. The x-axis should read “Year” and the y-axis should read
“Carbon Dioxide Level (Parts Per Million)”.
iac_viz <- iac_viz + labs(x = "Year", y = "Carbon Dioxide Level (Parts Per Million)")
iac_viz

16. Let’s highlight the rise in carbon dioxide levels by adding a
horizontal line that represents the maximum level in the first chart
spanning over 8,000 years of carbon dioxide data. On a new line of code
in the block, create a new variable named millennia_max and retrieve the
maximum value of the CO2_ppmv column in the noaa_data. Print the value
so you can see what it is.
millennia_max <- max(noaa_data$CO2_ppmv, na.rm = TRUE)
millennia_max
[1] 298.6
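The na.rm = TRUE argument matters whenever a column contains missing values. Whether the real CO2_ppmv column has any is not stated here, so the sketch below uses a made-up vector to show the difference, and also uses which.max() to locate the peak:

```r
# Illustrative vector with a missing measurement (values are made up).
co2 <- c(280.4, NA, 298.6, 274.9)

max(co2)                # NA: a single missing value makes the result missing
max(co2, na.rm = TRUE)  # 298.6: missing values are dropped first
which.max(co2)          # 3: position of the maximum (missing values are skipped)
```

On the real data, indexing the Age_yrBP column with the result of which.max() would give the age at which the peak occurred.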
17. Now that we have the maximum value in noaa_data, let’s draw it on
our iac_data plot. ggplot2 has a geom called geom_hline() that plots a
horizontal line. Add a geom_hline() layer to iac_viz with a yintercept
value of millennia_max.
iac_viz <- iac_viz + geom_hline(yintercept = millennia_max)
iac_viz

18. Give the horizontal line an aesthetic mapping that includes a
linetype argument, so that the legend can display information about
what the line represents. Assign the value of the linetype argument as
“Historical CO2 Peak before 1950”.
What do you notice has happened in the last 100 years relative to the
last 8 millennia?
iac_viz <- iac_viz + geom_hline(aes(yintercept = millennia_max, linetype = "Historical CO2 Peak before 1950"))
iac_viz
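
One way to put a number on the question in step 18 is to filter for the years whose level exceeds the historical peak. The sketch below uses a made-up stand-in for iac_data. Only the column names and the value 298.6 come from this project, and the names toy_iac and above_peak are hypothetical:

```r
library(dplyr)

# Illustrative stand-in for iac_data (values are made up).
toy_iac <- tibble(
  year = c(1000, 1900, 1950, 2014),
  data_mean_global = c(279.0, 295.7, 312.8, 397.5)
)
millennia_max <- 298.6  # value printed in step 16

# Keep only the years whose CO2 level exceeds the historical peak.
above_peak <- toy_iac %>%
  filter(data_mean_global > millennia_max)

min(above_peak$year)  # first year in the toy data above the peak: 1950
```

Running the same filter on the real iac_data would show when the measured levels first rose above the line drawn on the plot.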

19. Add color: map the line’s color aesthetic to the same label so that it also appears in the legend.
iac_viz <- iac_viz + geom_hline(aes(yintercept = millennia_max, linetype = "Historical CO2 Peak before 1950", color = "Historical CO2 Peak before 1950"))
iac_viz

20. Change color: use scale_color_manual() to draw the line for that label in blue.
iac_viz <- iac_viz + scale_color_manual(values = c("Historical CO2 Peak before 1950" = "blue"))
iac_viz
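
Because steps 17 through 19 each add another geom_hline() layer to the same object, the finished iac_viz holds three overlapping horizontal lines. As a sketch of an alternative, the whole plot can be built in one expression with a single reference line. The data values below are made up, and the names toy_iac, peak_label, and final_viz are hypothetical:

```r
library(ggplot2)

# Illustrative stand-in for iac_data (values are made up).
toy_iac <- data.frame(
  year = c(0, 1000, 1900, 2014),
  data_mean_global = c(277.5, 279.0, 295.7, 397.5)
)
millennia_max <- 298.6  # value printed in step 16
peak_label <- "Historical CO2 Peak before 1950"

final_viz <- ggplot(toy_iac, aes(x = year, y = data_mean_global)) +
  geom_line() +
  # One reference line carrying both the linetype and the color mapping.
  geom_hline(aes(yintercept = millennia_max, linetype = peak_label, color = peak_label)) +
  scale_color_manual(values = setNames("blue", peak_label)) +
  labs(
    title = "Carbon Dioxide Levels over Time",
    subtitle = "From Institute for Atmospheric and Climate Science (IAC).",
    x = "Year",
    y = "Carbon Dioxide Level (Parts Per Million)"
  )

length(final_viz$layers)  # 2: the data line and a single horizontal reference line
```

Printing final_viz renders the plot. The step-by-step version above is easier to follow while learning, whereas this form avoids redundant layers.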

