Climate scientists have measured the concentration of carbon dioxide (CO2) in the Earth’s atmosphere dating back thousands of years. In this project, we will investigate two datasets containing information about the carbon dioxide levels in the atmosphere. We will specifically look at the increase in carbon dioxide levels over the past hundred years relative to the variability in levels of CO2 in the atmosphere over eight millennia.

These data are obtained by analyzing ice cores. Over time, gas gets trapped in the ice of Antarctica. Scientists can take samples of that ice and measure how much carbon dioxide is in it. The deeper you go into the ice, the farther back in time you can analyze!

1. The first dataset comes from the World Data Center for Paleoclimatology, Boulder and the NOAA Paleoclimatology Program. It describes carbon dioxide levels going back thousands of years “Before Present” (BP), that is, before January 1, 1950.

2. The second dataset covers carbon dioxide levels from year zero up to 2014. This dataset was compiled by the Institute for Atmospheric and Climate Science (IAC) at Eidgenössische Technische Hochschule (ETH) in Zürich, Switzerland.

In order to understand the information in these datasets, it’s important to understand two key facts about the data:

The first metric is the carbon dioxide level, measured in parts per million by volume (CO2 ppmv). This number describes the number of carbon dioxide molecules per one million gas molecules in our atmosphere.

The second metric describes years before present, which is “a time scale used mainly in … scientific disciplines to specify when events occurred in the past… standard practice is to use 1 January 1950 as the commencement date of the age scale, reflecting the origin of practical radiocarbon dating in the 1950s. The abbreviation “BP” has alternatively been interpreted as “Before Physics” that is, before nuclear weapons testing artificially altered the proportion of the carbon isotopes in the atmosphere, making dating after that time likely to be unreliable.” This means that saying “the year 20 BP” would be the equivalent of saying “The year 1930.”
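Both units come down to simple arithmetic. The sketch below is only an illustration: the helper names bp_to_calendar_year() and ppmv_to_percent() are hypothetical and are not part of either dataset or of any package.

```r
# Hypothetical helper: convert an age in years BP to a calendar year.
# By convention, 0 BP corresponds to the year 1950.
bp_to_calendar_year <- function(age_bp) {
  1950 - age_bp
}

# Hypothetical helper: express a ppmv concentration as a percentage
# of all gas molecules (1 ppmv = 1 molecule per 1,000,000).
ppmv_to_percent <- function(ppmv) {
  ppmv / 1e6 * 100
}

bp_to_calendar_year(20)  # 1930, matching the "20 BP" example in the text
ppmv_to_percent(300)     # 0.03, i.e. 0.03% of the atmosphere
```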

1. Carbon Dioxide over Time Line Graph

# load libraries and data
library(readr)
library(dplyr)
library(ggplot2)

1. Let’s first explore the dataset containing the data from the World Data Center for Paleoclimatology, Boulder and NOAA Paleoclimatology Program. Import the “carbon_dioxide_levels.csv” file and save it to a new variable named noaa_data.

noaa_data <- read_csv("carbon_dioxide_levels.csv")
noaa_data

2. Inspect the head of the data frame. What are the names of the two columns? What types of values are in each?

head(noaa_data)
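If head() alone does not make the column types obvious, glimpse() from dplyr is one alternative. The sketch below uses a tiny made-up data frame (toy_noaa is a hypothetical name and the numbers are not real measurements) so that it runs without the CSV file.

```r
library(dplyr)

# Made-up stand-in rows so the example runs without the CSV file;
# these are NOT real measurements.
toy_noaa <- tibble(
  Age_yrBP = c(100, 200, 300),
  CO2_ppmv = c(280, 275, 278)
)

glimpse(toy_noaa)        # one line per column: name, type, first values
sapply(toy_noaa, class)  # named character vector of column classes
```

The same two calls should work on noaa_data once it has been imported.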

3. Let’s visualize this data. First, create a new variable named noaa_viz that is equal to a new ggplot() object and assign noaa_data as its data argument. Be sure to state the name of the variable after you define it so that it is rendered to the R notebook.

options(scipen = 10000) # avoid scientific notation in printed numbers and axis labels
# Create NOAA Visualization here:
noaa_viz <- ggplot(noaa_data)
noaa_viz

4. Define your scales by creating an aesthetic mapping that maps Age_yrBP on the x-axis and CO2_ppmv on the y-axis as part of the canvas.

noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv))
noaa_viz

5. In climate science, it’s common to create line graphs to best portray the fluctuations in the levels of carbon dioxide. Add a geom_line() layer to the noaa_viz plot.


noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) + geom_line()
noaa_viz

6. Let’s add context to the plot and improve its legibility. Title the plot “Carbon Dioxide Levels From 8,000 to 136 Years BP” and add a subtitle that cites the data “From World Data Center for Paleoclimatology and NOAA Paleoclimatology Program”.

noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) +
  geom_line() +
  labs(
    title = "Carbon Dioxide Levels From 8,000 to 136 Years BP",
    subtitle = "From World Data Center for Paleoclimatology and NOAA Paleoclimatology Program"
  )

noaa_viz

7. Tweak the axis labels so they are more descriptive than the column headers. The x-axis should read “Years Before Today (0=1950)” and the y-axis should read “Carbon Dioxide Level (Parts Per Million)”.

noaa_viz <- ggplot(noaa_data, aes(x = Age_yrBP, y = CO2_ppmv)) + geom_line() + labs(title =  "Carbon Dioxide Levels From 8,000 to 136 Years BP", subtitle =  "From World Data Center for Paleoclimatology and NOAA Paleoclimatology Program", x = "Years Before Today (0=1950)", y = "Carbon Dioxide Level (Parts Per Million)")

noaa_viz

8. Currently, the order of the years is counterintuitive. Ages are counted backward from 1950 (0 BP), so the most recent date is the one closest to 0. We therefore want the years on the x-axis arranged in descending order. Add this to noaa_viz:

noaa_viz + scale_x_reverse(limits = c(800000, 0))

noaa_viz <- noaa_viz + scale_x_reverse(limits = c(800000, 0))

noaa_viz
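To see the effect of scale_x_reverse() in isolation, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical and are not real measurements; the limits are given with the larger number first, as in the step above.

```r
library(ggplot2)

# Made-up values for illustration only; these are NOT real measurements.
toy <- data.frame(
  Age_yrBP = c(0, 2000, 4000, 6000, 8000),
  CO2_ppmv = c(285, 278, 272, 265, 260)
)

toy_viz <- ggplot(toy, aes(x = Age_yrBP, y = CO2_ppmv)) +
  geom_line() +
  # Largest age on the left, 0 BP (1950) on the right
  scale_x_reverse(limits = c(8000, 0))

toy_viz
```

Any rows outside the stated limits would be dropped from the plot with a warning, so the limits should cover the full range of ages you want to show.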

2. Carbon Dioxide Levels in the Last Two Millennia

9. In the second code block, let’s explore the second dataset containing the data for the last 2014 years. Import the “yearly_co2.csv” file and save it to a new variable named iac_data.

# Create IAC Visualization
iac_data <- read_csv("yearly_co2.csv")
iac_data

10. Inspect the head. What are the names of the four columns? What types of values are in each? Note that data_mean_global is equivalent to the CO2_ppmv metric. We will not be using the other two columns in this project. What’s different about the year column in this dataset?

head(iac_data)

11. Again, let’s create a new ggplot() object and assign iac_data as its data argument. Save it to a new variable named iac_viz. Be sure to state the name of the variable after you define it so that it is rendered to the R notebook.

iac_viz <- ggplot(iac_data)
iac_viz

12. Define your scales by creating an aesthetic mapping that maps year on the x-axis and data_mean_global on the y-axis as part of the canvas.

Note: The column headers in this dataset differ from those in the previous data frame. Years run chronologically from 0 up to 2014 rather than in terms of BP. The data_mean_global column records the same metric as CO2_ppmv: the average carbon dioxide level, in parts per million, in the Earth’s atmosphere.

iac_viz <- ggplot(iac_data, aes(x = year, y = data_mean_global))
iac_viz

13. A line graph also makes sense for these data. Let’s explore how much carbon dioxide was in the atmosphere over the past two millennia by adding a geom_line() layer to the iac_viz plot.

iac_viz <- iac_viz + geom_line()
iac_viz

14. This plot still needs labels for context. Title the plot “Carbon Dioxide Levels over Time” and add a subtitle that cites the data “From Institute for Atmospheric and Climate Science (IAC).”

iac_viz <- iac_viz + labs(title = "Carbon Dioxide Levels over Time", subtitle = "From Institute for Atmospheric and Climate Science (IAC).")
iac_viz

15. Tweak the axis labels so they are more descriptive than the column headers. The x-axis should read “Year” and the y-axis should read “Carbon Dioxide Level (Parts Per Million)”.

iac_viz <- iac_viz + labs(x = "Year", y = "Carbon Dioxide Level (Parts Per Million)")
iac_viz

16. Let’s highlight the rise in carbon dioxide levels by adding a horizontal line that represents the maximum level in the first chart, which spans over 8,000 years of carbon dioxide data. On a new line of code in the block, create a new variable named millennia_max and retrieve the maximum value of the CO2_ppmv column in noaa_data. Print the value so you can see what it is.

millennia_max <- max(noaa_data$CO2_ppmv, na.rm = TRUE)

millennia_max
[1] 298.6
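The na.rm = TRUE argument matters whenever a column contains missing values. The sketch below shows the difference on two made-up vectors (co2 and age are hypothetical names and the numbers are not real measurements), and uses which.max() to find where the peak occurs.

```r
# Made-up values for illustration only; these are NOT real measurements.
co2 <- c(270.1, NA, 285.3, 279.8)
age <- c(4000, 3000, 2000, 1000)

max(co2)                # NA: one missing value makes the whole result NA
max(co2, na.rm = TRUE)  # 285.3: missing values are dropped first
age[which.max(co2)]     # 2000: the age at which the peak occurs
```

The same which.max() pattern could be applied to noaa_data to look up the Age_yrBP value that goes with millennia_max.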

17. Now that we have the maximum value in noaa_data, let’s map it on our iac_data plot. ggplot2 has a geom called geom_hline() that plots a horizontal line. Add a geom_hline() layer to iac_viz with a yintercept value of millennia_max.

# Preview only: the result is not saved, so later steps do not stack duplicate lines
iac_viz + geom_hline(yintercept = millennia_max)

18. Add one more argument to the horizontal line’s aesthetic mapping so that the legend can display information about what the line represents. Assign the value of the linetype argument as “Historical CO2 Peak before 1950”.

What do you notice has happened in the last 100 years relative to the last 8 millennia?

# Preview only: the result is not saved, so the next step does not stack a duplicate line
iac_viz + geom_hline(aes(yintercept = millennia_max, linetype = "Historical CO2 Peak before 1950"))
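The question in step 18 can also be checked numerically with dplyr. The sketch below runs on a made-up data frame (toy_iac and toy_max are hypothetical names, and the yearly values are not real measurements); only the 298.6 threshold comes from the output printed earlier.

```r
library(dplyr)

# Made-up values for illustration only; these are NOT real measurements.
toy_iac <- tibble(
  year = c(1800, 1850, 1900, 1950, 2000),
  data_mean_global = c(280, 285, 295, 310, 370)
)
toy_max <- 298.6  # the millennia_max value printed earlier

# Keep only the years whose CO2 level exceeds the historical peak
above_peak <- toy_iac %>%
  filter(data_mean_global > toy_max)

above_peak            # rows above the historical peak
min(above_peak$year)  # first year in the toy data above the peak
```

Replacing toy_iac with iac_data and toy_max with millennia_max should give the corresponding answer for the real data.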

19. Add color. Map the color aesthetic of the horizontal line to the same label used for linetype.

iac_viz <- iac_viz + geom_hline(aes(yintercept = millennia_max, linetype = "Historical CO2 Peak before 1950", color = "Historical CO2 Peak before 1950"))  

iac_viz

20. Change the color. Use scale_color_manual() to draw the horizontal line in blue.

iac_viz <- iac_viz  + scale_color_manual(values = c("Historical CO2 Peak before 1950" = "blue")) 
iac_viz
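For reference, the steps in this section can also be written as one chained expression. The sketch below is one possible arrangement, not the only one. It runs on a made-up data frame (toy_iac and toy_max are hypothetical names, and the yearly values are not real measurements) and adds the horizontal line as a single layer.

```r
library(ggplot2)

# Made-up values for illustration only; these are NOT real measurements.
toy_iac <- data.frame(
  year = c(0, 500, 1000, 1500, 2000),
  data_mean_global = c(280, 280, 280, 280, 370)
)
toy_max <- 298.6  # the millennia_max value printed earlier

toy_viz <- ggplot(toy_iac, aes(x = year, y = data_mean_global)) +
  geom_line() +
  # One reference-line layer that carries both linetype and color
  geom_hline(aes(
    yintercept = toy_max,
    linetype = "Historical CO2 Peak before 1950",
    color = "Historical CO2 Peak before 1950"
  )) +
  scale_color_manual(values = c("Historical CO2 Peak before 1950" = "blue")) +
  labs(
    title = "Carbon Dioxide Levels over Time",
    subtitle = "From Institute for Atmospheric and Climate Science (IAC).",
    x = "Year",
    y = "Carbon Dioxide Level (Parts Per Million)"
  )

toy_viz
```

Building the plot in one expression makes it easier to see that each layer is added exactly once.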

