Carbon Dioxide Concentration

Carbon dioxide, or CO2 for short, is one of three main greenhouse gases on Earth, along with methane and water vapor. Monitoring the concentration of these gases in the atmosphere is an extremely important matter in the 21st century because of the influence they have on temperature and weather patterns that affect water and food supply.

The Mauna Loa Observatory

Mauna Loa Observatory on the Big Island in Hawaii

Mauna Loa Observatory on the Big Island in Hawaii

The Mauna Loa Observatory is an atmospheric baseline station in Hawaii. Mauna Loa has the oldest continuous record of atmospheric carbon dioxide measurements.

NOAA Earth System Research Laboratory

Atmospheric carbon dioxide concentration data are monitored and published by the National Oceanic and Atmospheric Administration’s Earth System Research Laboratory.

Weekly aggregates of CO2 measurements at the Mauna Loa site, from 1974 to the present time, are released in text format via FTP: ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_weekly_mlo.txt

...
#      Start of week      CO2 molfrac           (-999.99 = no data)  increase
# (yr, mon, day, decimal)    (ppm)  #days       1 yr ago  10 yr ago  since 1800
  1974   5  19  1974.3795    333.29  7          -999.99   -999.99     50.31
  1974   5  26  1974.3986    332.94  6          -999.99   -999.99     50.06
  1974   6   2  1974.4178    332.16  5          -999.99   -999.99     49.43
...

Setting up R and RStudio

With the recent rise of open source (as in free!) data analysis software, such as R and R Studio, it’s now easy for anyone with access to a computer to examine environmental data for themselves and come to their own conclusions.

  1. Download and install the R interpreter from https://cran.r-project.org
  2. Download and install R Studio from https://www.rstudio.com/products/rstudio/download/

Importing the Data From the NOAA FTP Site into R

Lets use read.table to load the data from ESRL into a data frame in your R session and use head to look at the first few lines.

mauna_loa_weekly <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_weekly_mlo.txt')
head(mauna_loa_weekly)
##     V1 V2 V3       V4     V5 V6      V7      V8    V9
## 1 1974  5 19 1974.380 333.29  7 -999.99 -999.99 50.31
## 2 1974  5 26 1974.399 332.94  6 -999.99 -999.99 50.06
## 3 1974  6  2 1974.418 332.16  5 -999.99 -999.99 49.43
## 4 1974  6  9 1974.437 332.16  7 -999.99 -999.99 49.63
## 5 1974  6 16 1974.456 332.27  7 -999.99 -999.99 49.99
## 6 1974  6 23 1974.475 331.71  7 -999.99 -999.99 49.73

Preparing the Data

We can filter out the decimal years and historical comparisons from the table and keep the year, month, day and carbon concentration observed. We only need the first, second, third, and fifth columns from the original table.

mauna_loa_weekly <- mauna_loa_weekly[, c(1, 2, 3, 5)]
head(mauna_loa_weekly)
##     V1 V2 V3     V5
## 1 1974  5 19 333.29
## 2 1974  5 26 332.94
## 3 1974  6  2 332.16
## 4 1974  6  9 332.16
## 5 1974  6 16 332.27
## 6 1974  6 23 331.71

Now lets put names on the columns.

names(mauna_loa_weekly) <- c('year', 'month', 'day', 'co2ppm')
head(mauna_loa_weekly)
##   year month day co2ppm
## 1 1974     5  19 333.29
## 2 1974     5  26 332.94
## 3 1974     6   2 332.16
## 4 1974     6   9 332.16
## 5 1974     6  16 332.27
## 6 1974     6  23 331.71

In order to analyze this data over time, we should convert the year, month, and day columns into a data type that R understands as dates.

mauna_loa_weekly$date <- as.Date(paste(mauna_loa_weekly$year, mauna_loa_weekly$month, mauna_loa_weekly$day, sep = '-'), format = '%Y-%m-%d')
mauna_loa_weekly <- mauna_loa_weekly[, c('date', 'co2ppm')]
head(mauna_loa_weekly)
##         date co2ppm
## 1 1974-05-19 333.29
## 2 1974-05-26 332.94
## 3 1974-06-02 332.16
## 4 1974-06-09 332.16
## 5 1974-06-16 332.27
## 6 1974-06-23 331.71

Lets take a look at a summary of the data.

summary(mauna_loa_weekly)
##       date                co2ppm       
##  Min.   :1974-05-19   Min.   :-1000.0  
##  1st Qu.:1985-01-09   1st Qu.:  345.2  
##  Median :1995-09-03   Median :  360.8  
##  Mean   :1995-09-03   Mean   :  353.2  
##  3rd Qu.:2006-04-26   3rd Qu.:  381.4  
##  Max.   :2016-12-18   Max.   :  408.7

Something looks weird with carbon concentration data. Why is the minimum -1000?

Notice the original file says (-999.99 = no data). The value -999.99 is being used to mean that data isn’t available. We don’t want that value in our calculations, so in R, we use a special value called NA instead.

mauna_loa_weekly[mauna_loa_weekly$co2ppm == -999.99, ]$co2ppm = NA

Examining the Data

Lets summarize again.

summary(mauna_loa_weekly)
##       date                co2ppm     
##  Min.   :1974-05-19   Min.   :326.9  
##  1st Qu.:1985-01-09   1st Qu.:345.6  
##  Median :1995-09-03   Median :361.0  
##  Mean   :1995-09-03   Mean   :363.6  
##  3rd Qu.:2006-04-26   3rd Qu.:381.5  
##  Max.   :2016-12-18   Max.   :408.7  
##                       NA's   :17

Now the values make sense, and the invalid values are counted as NA. The lowest observed CO2 concentration in the Mauna Loa record is 326.9 ppm, and the highest is 408.7 ppm. That ppm unit stands for parts per million. It’s a convenient way of expressing very small ratios, equivalent to 0.0001%.

Now we can plot carbon dioxide concentration over time using the date column as the x-axis to look at the data visually. The plot function in R is a fast way to examine the relationship between variables.

plot(
  mauna_loa_weekly$date,
  mauna_loa_weekly$co2ppm,
  type = 'l',
  xlab = 'Date',
  ylab = 'CO2 Concentration PPM',
  main = 'Mauna Loa Weekly Carbon Dioxide Concentration'
)

Quantifying the Trend

There’s very clearly a linear trend from the 326 ppm minimum in 1974 to the 408 ppm maximum in 2016. Let’s quantify that trend with a linear regression using lm with CO2 concentration as the dependant variable and date as the independant variable.

trend <- lm(mauna_loa_weekly$co2ppm ~ mauna_loa_weekly$date)
summary(trend)
## 
## Call:
## lm(formula = mauna_loa_weekly$co2ppm ~ mauna_loa_weekly$date)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3620 -2.0666  0.0539  2.0808  9.7247 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           3.193e+02  1.400e-01  2281.0   <2e-16 ***
## mauna_loa_weekly$date 4.713e-03  1.344e-05   350.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.829 on 2204 degrees of freedom
##   (17 observations deleted due to missingness)
## Multiple R-squared:  0.9824, Adjusted R-squared:  0.9824 
## F-statistic: 1.23e+05 on 1 and 2204 DF,  p-value: < 2.2e-16

Notice the p-value <2e-16 *** for date. This means that the chance of this pattern occurring randomly is essentially zero. Also notice the R-squared value of 0.9824. This means that time explains 98% of the change of the CO2 concentration. Now that we’ve demonstrated this linear relationship mathematically, let’s show it on the plot using abline.

plot(
  mauna_loa_weekly$date,
  mauna_loa_weekly$co2ppm,
  type = 'l',
  xlab = 'Date',
  ylab = 'CO2 Concentration PPM',
  main = 'Mauna Loa Weekly Carbon Dioxide Concentration'
)
abline(trend, col = 'dark blue')

We have some initial results that are worth sharing, so let’s make them more presentable. The ggplot2 library makes good looking plots in R. Let’s install ggplot2.

if(!require(lazyeval)) {
    install.packages('lazyeval')
    library(lazyeval)
}

if(!require(ggplot2)) {
    install.packages('ggplot2')
    library(ggplot2)
}

Now we can make a nicer looking version of our plot using ggplot.

ggplot(data = mauna_loa_weekly, aes(date, co2ppm)) +
  geom_line() +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Weekly Carbon Dioxide Concentration') +
  stat_smooth(method = lm, color = 'dark blue')

Examining Seasonality

So we can see the long term trend in CO2 concentration is going up in almost a straight line over time, and dipped just below this trend in the 90s. But why is this variable going up and down along that line in such a regular fashion? It looks like it cycles each year, so it should be seasonal variation. Let’s take a closer look at a couple years.

To make it easier to deal with dates, we can use the lubridate package, and to make it easier to subset the data using these dates, we can use the dplyr package.

if(!require(lubridate)) {
    install.packages('lubridate')
    library(lubridate)
}

if(!require(dplyr)) {
    install.packages('dplyr')
    library(dplyr)
}

The subset function in the dplyr package lets you take only the rows of a table you want to look at. The %>% in dplyr lets you pass a table through a chain of such filters. Let’s take a look at the first few rows of 2015.

mauna_loa_weekly %>% subset(year(date) == 2015) %>% head()
##            date co2ppm
## 2121 2015-01-04 399.85
## 2122 2015-01-11 400.16
## 2123 2015-01-18 399.51
## 2124 2015-01-25 400.20
## 2125 2015-02-01 400.23
## 2126 2015-02-08 399.95

The year 2015 started off at 399.85 ppm.

We should also look at the last few rows using tail.

mauna_loa_weekly %>% subset(year(date) == 2015) %>% tail()
##            date co2ppm
## 2167 2015-11-22 400.37
## 2168 2015-11-29 400.80
## 2169 2015-12-06 401.31
## 2170 2015-12-13 402.35
## 2171 2015-12-20 402.60
## 2172 2015-12-27 402.07

It appears 2015 closed a little higher at 402.07 ppm.

We should also make note of the range of the values by looking at the summary.

mauna_loa_weekly %>% subset(year(date) == 2015) %>% summary()
##       date                co2ppm     
##  Min.   :2015-01-04   Min.   :397.2  
##  1st Qu.:2015-04-03   1st Qu.:399.2  
##  Median :2015-07-01   Median :400.8  
##  Mean   :2015-07-01   Mean   :400.8  
##  3rd Qu.:2015-09-28   3rd Qu.:402.4  
##  Max.   :2015-12-27   Max.   :404.1

The highest concentration observed in 2015 was 404.1 ppm, and the lowest observed was 397.2 ppm. The average or mean of all observations that year was 400.8 ppm.

Let’s look at all of this on a plot.

ggplot(data = mauna_loa_weekly %>% subset(year(date) == 2015), aes(date, co2ppm)) +
  geom_line() +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Weekly Carbon Dioxide Concentration')

It looks like the starting point of 399.85 ppm at the beginning of the year was close to the mean value of 400.8, and from there, it went up to the maximum of 404.1 ppm around late spring early summer in the northern hemisphere, then steeply dropped in the fall.

Let’s check when those maximum and minimum measurements were taken.

mauna_loa_weekly %>% subset(year(date) == 2015) %>% subset(co2ppm == max(co2ppm))
##            date co2ppm
## 2138 2015-05-03 404.13

The maximum value was observed on May 3rd.

mauna_loa_weekly %>% subset(year(date) == 2015) %>% subset(co2ppm == min(co2ppm))
##            date co2ppm
## 2159 2015-09-27  397.2

And the minimum value was observed on September 27th.

Let’s check the previous year.

ggplot(data = mauna_loa_weekly %>% subset(year(date) == 2014), aes(date, co2ppm)) +
  geom_line() +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Weekly Carbon Dioxide Concentration')

The pattern looks very similar.

mauna_loa_weekly %>% subset(year(date) == 2014) %>% subset(co2ppm %in% c(min(co2ppm), max(co2ppm)))
##            date co2ppm
## 2085 2014-04-27 402.15
## 2105 2014-09-14 394.85

The year 2014 reached a high of 402.15 on April 27th and a low of 394.85 on September 14th.

To look at this seasonal trend in all years observed, let’s break the dates back up into year and day of year.

mauna_loa_weekly$year <- year(mauna_loa_weekly$date)
mauna_loa_weekly$yday <- yday(mauna_loa_weekly$date)
head(mauna_loa_weekly)
##         date co2ppm year yday
## 1 1974-05-19 333.29 1974  139
## 2 1974-05-26 332.94 1974  146
## 3 1974-06-02 332.16 1974  153
## 4 1974-06-09 332.16 1974  160
## 5 1974-06-16 332.27 1974  167
## 6 1974-06-23 331.71 1974  174

And let’s plot each year with a different color.

ggplot(data = mauna_loa_weekly, aes(yday, co2ppm, colour = year, group = year)) +
  geom_line() +
  xlab('Day of Year') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Weekly Carbon Dioxide Concentration') +
  scale_color_gradientn('Year', colors = rainbow(length(unique(mauna_loa_weekly$year))))

This seasonal cycle repeats each year, but is overcome by the long term increasing trend.

Monthly Data

This cycle and trend might be easier to see on a monthy basis. Returning to the source data, ESRL publishes monthly averages of the Mauna Loa observations.

ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt

...
# CO2 expressed as a mole fraction in dry air, micromol/mol, abbreviated as ppm
#
#  (-99.99 missing data;  -1 no data for #daily means in month)
#
#            decimal     average   interpolated    trend    #days
#             date                             (season corr)
1958   3    1958.208      315.71      315.71      314.62     -1
1958   4    1958.292      317.45      317.45      315.29     -1
1958   5    1958.375      317.50      317.50      314.71     -1
1958   6    1958.458      -99.99      317.10      314.85     -1
1958   7    1958.542      315.86      315.86      314.98     -1
...

Let’s import that data as well. The monthly record at Mauna Loa goes back to 1958. Raw averages are provided, but as a convenience to you, an interpolated column is provided that fills in gaps in data. We’ll take the year, month, and gap-filled CO2 concentrations.

mauna_loa_monthly <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt')
mauna_loa_monthly <- mauna_loa_monthly[, c(1, 2, 5)]
names(mauna_loa_monthly) = c('year', 'month', 'co2ppm')
mauna_loa_monthly$date <- as.Date(paste(mauna_loa_monthly$year, mauna_loa_monthly$month, '01', sep = '-'), format = '%Y-%m-%d')
summary(mauna_loa_monthly)
##       year          month            co2ppm           date           
##  Min.   :1958   Min.   : 1.000   Min.   :312.7   Min.   :1958-03-01  
##  1st Qu.:1972   1st Qu.: 4.000   1st Qu.:328.1   1st Qu.:1972-11-01  
##  Median :1987   Median : 7.000   Median :349.7   Median :1987-07-01  
##  Mean   :1987   Mean   : 6.506   Mean   :351.9   Mean   :1987-07-02  
##  3rd Qu.:2002   3rd Qu.: 9.000   3rd Qu.:373.1   3rd Qu.:2002-03-01  
##  Max.   :2016   Max.   :12.000   Max.   :407.7   Max.   :2016-11-01
ggplot(data = mauna_loa_monthly, aes(date, co2ppm)) +
  geom_line() +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Monthly Carbon Dioxide Concentration') +
  stat_smooth(method = lm, color = 'dark blue')

Uh, oh. Looking back to 1958 at the longer monthly record, the straight linear progression we found from 1974 to 2016 is starting to deviate. Look back the past half century, the increase of CO2 in the atmosphere curving up, growing at a faster rate of time.

Let’s check how far off the original linear approximation is in the longer record.

monthly_linear_trend <- lm(co2ppm ~ date, data = mauna_loa_monthly)
summary(monthly_linear_trend)
## 
## Call:
## lm(formula = co2ppm ~ date, data = mauna_loa_monthly)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3389 -2.7821 -0.3473  2.4310 11.8537 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.253e+02  2.100e-01  1548.8   <2e-16 ***
## date        4.169e-03  2.360e-05   176.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.881 on 703 degrees of freedom
## Multiple R-squared:  0.978,  Adjusted R-squared:  0.9779 
## F-statistic: 3.121e+04 on 1 and 703 DF,  p-value: < 2.2e-16

Actually, not so bad. The coefficient of determination, R-squared is 0.9779, only the slightest bit down from 0.9824.

Now back to stacking the seasonal cycles of each year.

ggplot(data = mauna_loa_monthly, aes(factor(month), co2ppm, colour = year, group = year)) +
  geom_line() +
  xlab('Month') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Monthly Carbon Dioxide Concentration') +
  scale_color_gradientn('Year', colors = rainbow(length(unique(mauna_loa_weekly$year))))

Yep, in recent years, this seasonal cycle of CO2 fluctuation has been starting higher and higher.

Yearly Data

Let’s zoom out one more time to the yearly scale to see just how fast this is increasing. ESRL publishes the yearly averages of the Mauna Loa data here: ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt

...
# CO2 expressed as a mole fraction in dry air, micromol/mol, abbreviated as ppm
#
# year     mean      unc
  1959   315.97     0.12
  1960   316.91     0.12
...

The yearly data includes an estimation of uncertainty, which is how far off you think your estimation of a variable is. For this dataset, the uncertainty is constant and very small, at 0.12 ppm.

mauna_loa_yearly <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt')
names(mauna_loa_yearly) <- c('year', 'co2ppm', 'uncertainty')
head(mauna_loa_yearly)
##   year co2ppm uncertainty
## 1 1959 315.97        0.12
## 2 1960 316.91        0.12
## 3 1961 317.64        0.12
## 4 1962 318.45        0.12
## 5 1963 318.99        0.12
## 6 1964 319.62        0.12

Let’s plot the annual data over time to see the real trend.

ggplot(data = mauna_loa_yearly, aes(year, co2ppm)) +
    geom_ribbon(data = mauna_loa_yearly, aes(ymin = co2ppm - uncertainty, ymax = co2ppm + uncertainty), alpha=0.3) +
  geom_line() +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Yearly Carbon Dioxide Concentration')

This annual trend isn’t perfectly linear. It looks like it’s curving up over time.

To see how much the trend is curving up, let’s calculate the increase in CO2 concentration year by year. The year 1959 we’ll leave blank as NA because the annual record doesn’t have 1958 to compare. So we can track increase from 1960 to the present.

mauna_loa_yearly$co2ppm.inc <- c(NA, diff(mauna_loa_yearly$co2ppm))
summary(mauna_loa_yearly)
##       year          co2ppm       uncertainty     co2ppm.inc   
##  Min.   :1959   Min.   :316.0   Min.   :0.12   Min.   :0.420  
##  1st Qu.:1973   1st Qu.:329.7   1st Qu.:0.12   1st Qu.:1.038  
##  Median :1987   Median :349.2   Median :0.12   Median :1.565  
##  Mean   :1987   Mean   :351.6   Mean   :0.12   Mean   :1.515  
##  3rd Qu.:2001   3rd Qu.:371.1   3rd Qu.:0.12   3rd Qu.:1.877  
##  Max.   :2015   Max.   :400.8   Max.   :0.12   Max.   :2.940  
##                                                NA's   :1

In all years observed, the change in yearly average CO2 concentration was a positive increase, but by how much varied from a minimum change of 0.420 ppm up to a maximum change of 2.940. The average rate of change of 1.515 ppm per year.

mauna_loa_yearly %>% na.omit() %>% subset(co2ppm.inc %in% c(min(co2ppm.inc), max(co2ppm.inc)))
##    year co2ppm uncertainty co2ppm.inc
## 7  1965 320.04        0.12       0.42
## 40 1998 366.65        0.12       2.94

The year of lowest increase was 1965, and year of highest increase was 1998. Let’s plot out this trend on a graph and see if it increases in general over time.

ggplot(data = mauna_loa_yearly, aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Mauna Loa Annual Increase in CO2 Concentration') +
  stat_smooth(method = lm, color = 'dark blue') +
  scale_x_continuous(breaks = seq(1960, 2020, 10)) + 
  theme(axis.text.x = element_text(angle = 0, vjust = 0.7))

Now we’re really starting to see the variation in this data. But the overall trend within this variation is that the rate at which CO2 is increasing from year to year is itself increasing. So now we’ve seen that not only has CO2 been accumulating in the atmosphere the past half century, it’s been accumulating faster over time.

Global Marine Surface Data

But wait a second. This is just one site. One point on Earth. Maybe it’s not representative of the planet as a whole. Maybe this increase is due to the volcanoes in Hawaii? To check this hypothesis, we’re going to have to compare to data from more locations.

ESRL publishes the monthly global average of marine surface CO2 concentration observations from 1980 on at: ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_gl.txt

...
# year month     decimal     average       trend
  1980     1    1980.042      338.45      337.82
  1980     2    1980.125      339.15      338.10
...
global_monthly <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_gl.txt')
global_monthly <- global_monthly[, c(1, 2, 4)]
names(global_monthly) <- c('year', 'month', 'co2ppm')
global_monthly$date <- as.Date(paste(global_monthly$year, global_monthly$month, '01', sep = '-'), format = '%Y-%m-%d')

And the global annual averages: ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_gl.txt

# year     mean      unc
  1980   338.80     0.10
  1981   339.99     0.10
global_annual <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_gl.txt')
names(global_annual) <- c('year', 'co2ppm', 'uncertainty')
global_annual$co2ppm.inc <- c(NA, diff(global_annual$co2ppm))
head(global_annual)
##   year co2ppm uncertainty co2ppm.inc
## 1 1980 338.80         0.1         NA
## 2 1981 339.99         0.1       1.19
## 3 1982 340.76         0.1       0.77
## 4 1983 342.43         0.1       1.67
## 5 1984 343.98         0.1       1.55
## 6 1985 345.45         0.1       1.47
combined_monthly <- rbind.data.frame(
  mauna_loa_monthly %>% mutate(Source = 'Mauna Loa'),
  global_monthly %>% mutate(Source = 'Global Marine Surface')
)

ggplot(data = combined_monthly, aes(date, co2ppm, color = Source, group = Source)) +
  geom_line(size = 1.3, alpha = 0.7) +
  xlab('Date') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Average Monthly CO2 Concentration') +
  scale_color_manual(values = c('blue', 'dark grey'))

It looks like they match up very closely. Let’s scatterplot them against each other to make sure.

combined_monthly <- inner_join(
  global_monthly %>% select(date, co2ppm) %>% rename(co2ppm.gl = co2ppm),
  mauna_loa_monthly %>% select(date, co2ppm) %>% rename(co2ppm.ml = co2ppm),
  by = 'date'
)
  
ggplot(data = combined_monthly, aes(co2ppm.ml, co2ppm.gl)) +
  geom_point() +
  xlab('Mauna Loa Monthly CO2 PPM') +
  ylab('Global Marine Surface Monthly CO2 PPM') +
  ggtitle('Mauna Loa vs. Global Marine Surface CO2 Concentration')

Yes, the Mauna Loa record very closely matches the global marine surface record from 1980 onward, so we can reasonably assume the historical data measured at Mauna Loa prior to 1980 is representative of global CO2 concentrations as well.

The Law Dome Ice Core Record

Ok, so we can conclude so far after through investigation that the concentration of CO2 in the atmosphere has been increasing the past half century, and is now (in 2016) around 408 ppm, up from 313 ppm in 1958. But what about before the past 50 years? How usual or unusual are the concentrations we’ve been measuring since we’ve had instruments to do so? Maybe this is just a part of a natural cycle?

To see our current situation in context, we’ll have to find a way to compare current concentrations to the distant past. There’s a way we can do that using ice cores. When ice is frozen at the poles, little bubbles get trapped in the ice. We can drill into ancient ice sheets and measure gas concentrations in those little bubbles at different depths and measure the atmosphere of the distant past as it was preserved in the ice.

The Oak Ridge National Laboratory’s Carbon Dioxide Information Analysis Center published a 2004 year record of ice core gas concentration data from the Law Dome ice core here: ftp://ftp.ncdc.noaa.gov/pub/data/paleo/icecore/antarctica/law/law2006.txt

law_dome <- read.table(
  'ftp://ftp.ncdc.noaa.gov/pub/data/paleo/icecore/antarctica/law/law2006.txt',
  skip = 182,
  nrows = 2004,
  header = TRUE
)

law_dome <- law_dome %>% select(YearAD.1, CO2spl) %>% rename(year = YearAD.1, co2ppm = CO2spl)

law_dome$co2ppm.inc <- c(NA, diff(law_dome$co2ppm))

summary(law_dome)
##       year            co2ppm        co2ppm.inc      
##  Min.   :   1.0   Min.   :271.6   Min.   :-0.50000  
##  1st Qu.: 501.8   1st Qu.:277.6   1st Qu.: 0.00000  
##  Median :1002.5   Median :279.3   Median : 0.00000  
##  Mean   :1002.5   Mean   :281.9   Mean   : 0.04888  
##  3rd Qu.:1503.2   3rd Qu.:282.0   3rd Qu.: 0.10000  
##  Max.   :2004.0   Max.   :374.6   Max.   : 2.00000  
##                                   NA's   :1
ggplot(data = law_dome, aes(year, co2ppm)) +
  geom_line() +
  xlab('Year') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration') +
  scale_x_continuous(breaks = seq(0, 2000, 100)) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

Wow! The 20th century is a huge spike compared to the past two millenia.

Let’s take a look at the rate of change.

ggplot(data = law_dome, aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year of Common Era') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration Annual Rate of Change') +
  scale_x_continuous(breaks = seq(0, 2000, 100)) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

It looks like CO2 concentration was very stable from the beggining of the Common Era to the year 1400.

ggplot(data = law_dome %>% subset(year <= 1400), aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year of Common Era') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration Annual Rate of Change') +
  scale_x_continuous(breaks = seq(0, 2000, 100)) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

summary(law_dome %>% subset(year <= 1400))
##       year            co2ppm        co2ppm.inc       
##  Min.   :   1.0   Min.   :275.9   Min.   :-0.200000  
##  1st Qu.: 350.8   1st Qu.:277.6   1st Qu.: 0.000000  
##  Median : 700.5   Median :279.1   Median : 0.000000  
##  Mean   : 700.5   Mean   :279.4   Mean   : 0.002502  
##  3rd Qu.:1050.2   3rd Qu.:280.7   3rd Qu.: 0.000000  
##  Max.   :1400.0   Max.   :283.9   Max.   : 0.200000  
##                                   NA's   :1

The average CO2 concentration during this stable period was 279.4 ppm, and it changed each year by no more than 0.2 ppm in either direction.

Proceeding through history, there were a couple of perturbations from 1400 to 1750.

ggplot(data = law_dome %>% subset(year >= 1400 & year <= 1750), aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year of Common Era') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration Annual Rate of Change') +
  scale_x_continuous(breaks = seq(1400, 1750, 50)) + 
  theme(axis.text.x = element_text(angle = 0, vjust = 0.7))

law_dome %>% subset(year >= 1400 & year <= 1750) %>% summary()
##       year          co2ppm        co2ppm.inc       
##  Min.   :1400   Min.   :271.6   Min.   :-0.500000  
##  1st Qu.:1488   1st Qu.:276.6   1st Qu.:-0.100000  
##  Median :1575   Median :279.6   Median : 0.000000  
##  Mean   :1575   Mean   :279.0   Mean   :-0.009971  
##  3rd Qu.:1662   3rd Qu.:281.9   3rd Qu.: 0.100000  
##  Max.   :1750   Max.   :283.2   Max.   : 0.500000

In this period, average CO2 concentration is still at 279 ppm, but in some years it change by 0.5 ppm.

ggplot(data = law_dome %>% subset(year>= 1750), aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year of Common Era') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration Annual Rate of Change') +
  scale_x_continuous(breaks = seq(1700, 2000, 10)) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

law_dome %>% subset(year >= 1750) %>% count(co2ppm.inc > 0)
## # A tibble: 2 × 2
##   `co2ppm.inc > 0`     n
##              <lgl> <int>
## 1            FALSE    66
## 2             TRUE   189

From 1750 on, most years experienced an increase in CO2 concentration.

law_dome %>% subset(year >= 1750 & year <= 1900) %>% summary()
##       year          co2ppm        co2ppm.inc     
##  Min.   :1750   Min.   :276.5   Min.   :-0.2000  
##  1st Qu.:1788   1st Qu.:279.2   1st Qu.: 0.0000  
##  Median :1825   Median :283.9   Median : 0.1000  
##  Mean   :1825   Mean   :284.0   Mean   : 0.1278  
##  3rd Qu.:1862   3rd Qu.:286.8   3rd Qu.: 0.2500  
##  Max.   :1900   Max.   :296.1   Max.   : 0.6000

But from 1750 to 1900, the maximum annual change in CO2 concentration only went up to 0.6 ppm from the 0.5 ppm of 1400 to 1750.

ggplot(data = law_dome %>% subset(year >= 1900), aes(year, co2ppm.inc)) +
  geom_line() +
  xlab('Year of Common Era') +
  ylab('Change in CO2 Concentration PPM') + 
  ggtitle('Law Dome Ice Core CO2 Concentration Annual Rate of Change') +
  scale_x_continuous(breaks = seq(1900, 2010, 10)) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

It looks like during World War II was the last time that there was ever an inter-annual decrease in CO2 concentration.

So the accumulation of CO2 in the past half century is an extreme departure from the historic concentrations of the past two millenia. And the current rate of increase of around 2 ppm per year is about four times the maximum rate of change observed in previous centuries, and the current concentration of around 400 ppm is 43% higher than the 279 ppm of previous centuries.

dome_c <- read.table(
  'ftp://ftp.ncdc.noaa.gov/pub/data/paleo/icecore/antarctica/domec/domec_co2.txt',
  skip = 11
)

names(dome_c) <- c('depth_m', 'before_1950', 'co2ppm', 'uncertainty')

dome_c$year <- 1950 - dome_c$before_1950

summary(dome_c)
##     depth_m       before_1950        co2ppm       uncertainty    
##  Min.   :352.6   Min.   : 9067   Min.   :184.4   Min.   :0.2000  
##  1st Qu.:429.0   1st Qu.:11580   1st Qu.:189.8   1st Qu.:0.4000  
##  Median :485.7   Median :14308   Median :227.2   Median :0.5500  
##  Mean   :479.8   Mean   :14703   Mean   :224.4   Mean   :0.5958  
##  3rd Qu.:533.9   3rd Qu.:17551   3rd Qu.:249.9   3rd Qu.:0.7000  
##  Max.   :578.1   Max.   :21676   Max.   :267.6   Max.   :1.3000  
##       year       
##  Min.   :-19726  
##  1st Qu.:-15601  
##  Median :-12358  
##  Mean   :-12753  
##  3rd Qu.: -9630  
##  Max.   : -7117
ggplot(data = dome_c, aes(year, co2ppm)) +
  geom_line() +
  xlab('Years Before Common Era') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Dome C Ice Core CO2 Concentration') +
  scale_x_continuous(breaks = seq(-20000, 0, 1000), labels = scales::comma) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))

It looks like that stable 270s ppm concentration began about nine millenia ago, around the time the ice age came to a pause, and our current inter-glacial climate began. So in terms of the natural cycle of ice age and interglacial periods, 270 ppm is the high plateau.

We can get an even longer, 800,000 year outlook from a composite of several ice cores.

composite <- read.table(
  'https://www1.ncdc.noaa.gov/pub/data/paleo/icecore/antarctica/antarctica2015co2composite.txt',
  skip = 138
)

names(composite) <- c('years_before_present', 'co2ppm', 'uncertainty')

composite$year <- 1950 - composite$years_before_present

summary(composite)
##  years_before_present     co2ppm       uncertainty        year        
##  Min.   :   -51       Min.   :173.7   Min.   :0.01   Min.   :-803719  
##  1st Qu.: 14606       1st Qu.:204.8   1st Qu.:0.64   1st Qu.:-502227  
##  Median : 74526       Median :232.5   Median :1.07   Median : -72576  
##  Mean   :242810       Mean   :235.6   Mean   :1.34   Mean   :-240860  
##  3rd Qu.:504177       3rd Qu.:257.9   3rd Qu.:1.80   3rd Qu.: -12656  
##  Max.   :805669       Max.   :368.0   Max.   :9.96   Max.   :   2001
ggplot(data = composite, aes(year, co2ppm)) +
  geom_line() +
  xlab('Years Before Common Era') +
  ylab('CO2 Concentration PPM') + 
  ggtitle('Composite Ice Core CO2 Concentration') +
  scale_x_continuous(breaks = seq(-800000, 2000, 50000), labels = scales::comma) + 
  theme(axis.text.x = element_text(angle = -45, vjust = 0.7))