PAGES LiPD file

LiPD file

This Quarto document shows some of the functions I use when working with LiPD files, along with some of the functions we covered in the previous PAGES meeting.

Most of the packages used are not on CRAN, so they need to be installed from GitHub:

#remotes::install_github("nickmckay/geoChronR") 
#remotes::install_github("nickmckay/lipdR") 

library(lipdR) 
library(geoChronR)

Welcome to geoChronR version 1.1.15!

library(magrittr)
library(dplyr) 
library(ggplot2)

The data can either be downloaded to your local device or read directly from the source. Because the dataset contains a lot of files, I normally read it directly from the URL rather than keeping a local copy.

If you want any other LiPD files, just get the download link from the LiPD directory. Here are two examples, but in this script we will use the temp2k dataset for the rest of the examples.

invisible(capture.output(
  temp2k <- lipdR::readLipd("https://lipdverse.org/Pages2kTemperature/current_version/Pages2kTemperature2_1_2.zip")
))
#iso2k <- lipdR::readLipd("https://lipdverse.org/iso2k/current_version/iso2k1_0_1.zip")

I first like to plot all of the data to see the spatial distribution of the proxies. This gives me an idea of where to focus my study.

mapLipd(temp2k, global = TRUE,size = 3) + 
  ggtitle("Spatial distribution of archives in the 2k dataset")

LiPD files are nested, with lots of layers to them, so we need to extract and ‘flatten’ each file into individual timeseries objects before we can work with the timeseries properties.

In the same code block we also extract the spatial properties (latitude and longitude) that are saved in each file.

TS <- extractTs(temp2k)
lat <- pullTsVariable(TS,"geo_latitude") # note that this name may differ between data products, e.g. some use 'lat'. temp2k and iso2k both use 'geo_latitude' though!
lon <- pullTsVariable(TS,"geo_longitude")
TS_name <- pullTsVariable(TS,"paleoData_variableName") # this extracts the variable name of every timeseries, e.g. year, depth, but also the proxy variables
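Before filtering on a variable name, it can help to see which names occur and how often. The sketch below uses a made-up vector (toy_names is hypothetical and only stands in for TS_name); it is an illustration with base R, not output from the temp2k dataset.

```r
# Hypothetical stand-in for TS_name; the real vector comes from pullTsVariable()
toy_names <- c("year", "temperature", "d18O", "temperature", "year", "temperature")

# Count how often each variable name occurs, most frequent first
name_counts <- sort(table(toy_names), decreasing = TRUE)
name_counts
```

Running the same two lines on TS_name should show whether "temperature" is really the name to filter on for the data product you are using.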

Here we set up some conditions to keep only the temperature timeseries located between 0° and 90°N and between 30°W and 30°E:

temperature_grid <- which(between(lat,0,90) & between(lon,-30,30) & TS_name == "temperature")
gTS <- TS[temperature_grid]
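To make the indexing step above a bit more concrete, here is a small sketch on made-up vectors. The toy_* names and values are hypothetical and only stand in for lat, lon and TS_name; the logic is the same combination of between() and which().

```r
library(dplyr)

# Hypothetical stand-ins for lat, lon and TS_name (one element per timeseries)
toy_lat  <- c(45, -10, 60, 20, NA)
toy_lon  <- c(10, 5, -50, 25, 0)
toy_name <- c("temperature", "temperature", "temperature", "year", "temperature")

# between() is inclusive at both ends and returns NA for missing values;
# which() keeps only the positions that are TRUE, so NA entries are dropped
toy_idx <- which(between(toy_lat, 0, 90) &
                   between(toy_lon, -30, 30) &
                   toy_name == "temperature")
toy_idx
```

Only the first element passes all three conditions here. Note that the last element, which has a missing latitude, is dropped without a warning, so it may be worth checking for missing coordinates if a record you expect does not show up.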

Now it is a good idea to plot the temporal distribution of the records to see whether we need to expand the search. This plots how the number of available records changes through time.

plotTimeAvailabilityTs(gTS,age.range = c(1,2000),age.var = "year")

And if we are happy with this then we can plot them on a map:

mapTs(gTS)+ 
  ggtitle("Specified region")

# to plot the two plots together you can use the following function:
#plotSummaryTs(gTS,age.var = "year", age.range = c(0,2000))

Plotting timeseries

If we are now happy with our data, we can start plotting the timeseries and do some more data manipulation.

Firstly, we can convert our timeseries object into a tidy data frame. This gives us more flexibility in our approach.

I have chosen to filter a bit more, as there is a lot of data for this region. If you don’t want to filter, remove or change the relevant lines.

tidyData <- tidyTs(gTS,age.var = "year") # use 'year' as the time variable

#filter for plotting
plot_df <- tidyData %>% 
  filter(between(year,1500,2000)) %>% #only years from 1500 to 2000 CE
  filter(between(geo_longitude,0,30)) %>% 
  filter(between(geo_latitude,30,60)) %>%  #between 30 and 60 N
  filter(interpretation1_variable == "T") %>% 
  group_by(paleoData_TSid) %>% #group by timeseries ID
  arrange(archiveType) #and sort by archiveType
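After filtering, I find it useful to check how many individual records are left per archive type before plotting. The sketch below does this on a made-up data frame (toy_df and its values are hypothetical); it only assumes the column names paleoData_TSid and archiveType used above.

```r
library(dplyr)

# Hypothetical tidy data: one row per observation, several rows per record
toy_df <- data.frame(
  paleoData_TSid = c("a", "a", "b", "b", "c"),
  archiveType    = c("tree", "tree", "coral", "coral", "tree"),
  year           = c(1600, 1700, 1600, 1700, 1800)
)

# Reduce to one row per record, then count records per archive type
toy_counts <- toy_df %>%
  distinct(paleoData_TSid, archiveType) %>%
  count(archiveType, name = "n_records")
toy_counts
```

Replacing toy_df with plot_df (after an ungroup()) should give the same kind of summary for the real data.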

Now we can plot the timeseries stack. Here I have sorted and coloured the records by archive type, but remove the color.var argument if you don’t want to do so.

plotTimeseriesStack(plot_df,
                    color.var =  "archiveType",
                    lab.size = 2,
                    color.ramp = c("coral","black","blue","magenta","darkgreen", "red"))