Plotting TraCE-21k timeseries

Author

Laura Boyall

Information

In this document you will find information on how to extract data from a TraCE-21ka file and plot a timeseries for a particular region. This guideline will use the example of surface temperature, however will identify where this code can be changed to support other climate variables.

You can download the netCDF file here. You will need to make a free online account to download. The file names are relatively long, but the one you are looking for is: “trace.01-36.22000BP.cam2.TS.22000BP_decavg_400BCE.nc” for temperature… where it says ‘TS’ in the center of this file name is where it refers to temperature, so be sure you have this one!

This code is built in Quartro to visually show both the code and output, however to use this yourself simply copy the code snippets and paste in your own RStudio script.

If you happen to use parts of this code, please ensure to credit it! Any questions please email at laura.boyall.2016@live.rhul.ac.uk

Happy plotting!

Loading the required packages

R requires a series of packages to be loaded to help with streamlining the different functions. The following packages are the ones which I find useful, but if you are more familiar with alternatives, then please do use.

You only have to install packages once, however each time you begin a new R session, you have to retrieve them from the library using the library(package name) function.

Remember to then set your working directory as normal to where your netCDF files are stored.

if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
if (!requireNamespace("ncdf4", quietly = TRUE)) install.packages("ncdf4")
if (!requireNamespace("zoo", quietly = TRUE)) install.packages("zoo")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")

library(ggplot2) # used for plotting
library(ncdf4) # used for reading and extracting data from NetCDF model file
library(zoo) # helpful package for creating rolling averages
library(dplyr) # helpful tool for filtering data

Reading in the data

The data can be downloaded from here. Whilst monthly data is available, the file sizes here can be rather large, so decadal means should be sufficient. Note that there are atmospheric, land, oceanic and ice simulations… if you cannot find the variable of interest in one, check the other! (Temperature (TS), precipitation (PRECL), sea level pressure (SLP) etc is stored in atmospheric).

If you are using this on your own machine, change the source directory to where your model file is stored.

setwd("~/Mirror/TraCE files/") # change with your own directory
Temperature_data <- nc_open("trace.01-36.22000BP.cam2.TS.22000BP_decavg_400BCE.nc")

Now that we have uploaded our model file into R, we can extract the variable of interest. Each model file is going to be completely different. Some will only have one variable to plot e.g. temperature, however this still may have a number of different names. Some examples I have come across is TS, TAS, tas, T, temp, TC. A good way of checking what your variable is stored as is Temperature_data$var. This will the variables saved. E.g. in this TraCE file used in this example temperature is saved as TS for surface temperature.

TS <- ncvar_get(Temperature_data, "TS") # remember to change TS here if you are using a different variable e.g. PRESL / PSL

The same thing is true for the time variable, latitude and longitude. Instead of using Temperature_data$var as above, you should use Temperature_data$dim as this will show the dimensions of the data. NOTE: some files will also contain a fourth dimension which is height. In this TraCE temperature file the climate is only simulated at one level (2m above the surface), however other files will contain this. See below an example, however this is not run in this code example.

Time <- ncvar_get(Temperature_data, "time") # units of this are in ka BP 
Time <- Time*-1000 # converting to BP 
lat <- ncvar_get(Temperature_data, "lat")
lon <- ncvar_get(Temperature_data, "lon")
#height <- ncvar_get(Temperature_data, "height") # this is not required for TraCE temperature. 

Defining the region of interest

Now that we have the lat, lon and time dimensions loaded in, we can now define the region to which we want to extract the data from. As this script example is for a timeseries, we want to take a lat/lon average, rather than a temporal average of which we would use for a climate map etc.

Note that we subtract 273.15 from the data as the model data is in Kelvin and we would like to present this in degrees C.

lon_idx <- which(lon >= -20 | lon <= 30) # Enter your longitude boundary
lat_idx <- which(lat >= 40 & lat <= 60) # Enter your latitude boundary

# Here we are extracting and averaging the temperature data for the specified region and time
temp_region_mean <- apply(TS[lon_idx, lat_idx, ], 3, mean, na.rm = TRUE)
regional_temp <- data.frame(AgeBP = Time, temp_mean = temp_region_mean-273.15) # -273.15 from the temp_region_mean as the model data is in degrees Kelvin!

Now, all of the data required for this plot / extraction is complete. The next step is closing the model file uploaded at the beginning.

NOTE: Because you have closed this file, if you did want to change anything in the script above this (e.g. lat/lon), then you will have to re-run the script!

nc_close(Temperature_data)

Plotting a timeseries for that region

The next stage is plotting the timeseries results. There will be several examples including plotting the full record (21ka to present), from specific time points (e.g. 11ka to present), as well as creating rolling averages etc. For your own work, each of these steps will be highlighted to show where in the script things are changing.

Plotting the full record

This is a pretty basic plot, but good to see the full range of your data.

ggplot(regional_temp)+
  geom_line(aes(x=AgeBP, y=temp_mean))+
  labs(
    title = "TraCE Temperature", # Change this to whatever title
    x = "Age (years BP)",
    y = expression("Temperature "(degree * C))
  ) +
  ggpubr::theme_pubr() # this is removing the background and just makes more professional plots

Plotting only the Holocene

Now we want to change the temporal range of the data. For example, lets say we want to only plot the Holocene. The best way to do this is to filter to the dates of the Holocene rather than just changing the x-axis.

We then save the filtered data into another dataset called Holocene_temp.

Holocene_temp <- regional_temp %>% # %>% means we are using regional_temp data
  filter(AgeBP <11700) # we're then filtering it by AgeBP to be more than 11.7 ka

ggplot(Holocene_temp)+ # notice we are using this new data now
  geom_line(aes(x=AgeBP, y=temp_mean))+
  labs(
    title = "TraCE Holocene Temperature", 
    x = "Age (years BP)",
    y = expression("Temperature "(degree * C))
  ) +
  ggpubr::theme_pubr()

Smoothing the plot

The TraCE21ka dataset provides data every 10 years. This means that there is a lot of variability observed within the graph, so, we will complete a rolling mean to smooth the record to a 500-year mean.

# First, we will save the amount of data points we want to smooth it by. e.g. if we want it to be a 20 year rolling average we will save the below variable as 2. However, if we want a 500-year we will save it as 

roll <- 50 # 500-year roll, change to anything!

ggplot(Holocene_temp)+ # notice we are using this new data now
  geom_line(aes(x=AgeBP, y=rollmean(temp_mean, roll, na.pad = T)))+ # this line is where say how much we want it smoothed by
  labs(
    title = "TraCE Holocene Temperature 500-year rolling mean", 
    x = "Age (years BP)",
    y = expression("Temperature "(degree * C))
  ) +
  ggpubr::theme_pubr()
Warning: Removed 49 rows containing missing values (`geom_line()`).

Tidying up the plot

There are a few things which can be added to make the plots better, including adding colours, adding multiple lines, swapping the x-axis so it is more in line with palaeo publications, etc!

ggplot(Holocene_temp)+ 
  geom_line(aes(x=AgeBP, y=temp_mean), col = "red")+ 
  geom_line(aes(x=AgeBP, y=rollmean(temp_mean, roll, na.pad = T)), col = "black", linetype = "dashed", linewidth = 1)+ 
  scale_x_reverse(limits = c(12000,0), breaks = seq(12000,0, -2000)) +
  scale_y_continuous(limits = c(-5,1), breaks = seq(-5,1,1))+  
  labs(
    title = "Surface Air Temperature", 
    x = "Age (years BP)",
    y = expression("Temperature "(degree * C))
  ) +
  ggpubr::theme_pubr()
Warning: Removed 3 rows containing missing values (`geom_line()`).
Warning: Removed 49 rows containing missing values (`geom_line()`).

Exporting data

If the plotting isn’t for you, and you would much rather plot in another software, or you just want to export the temperature data from the grid we defined earlier. Here is how we can export the data with one line of code:

write.csv(regional_temp, "Exported temperature.csv") # where regional_temp is your data, and Exported temperature.csv is the name of the file you are exporting