ETA course second session

Javier de la Casa & Miquel de Cáceres

Aims for the day

  • Learn about a real case scenario of ETA application
  • Try ETA with your own data
  • surprise!

The elementome is the elemental composition that characterizes an entity. Here we are going to work with ecossytem elementomes

  • Today you will use elemental meassurements from paleoecological records to reconstruct changes in the ecosystem elementome

  • This data is published in de la Casa et al. 2025. Unveiling two millennia of ecosystem changes in the Azores through elementome trajectory analysis

For this excersice we are going to move to the Azores, where we will study sediment cores from two shallow lakes

Map with the location of the example lakes

So lets start with R - Load packages and the database

library(readr)
library(dplyr)
library(ecotraj)
library(ggplot2)

ETA_azores_example <- read_delim("ETA_azores_example.csv", 
                                 delim = ";", escape_double = FALSE, trim_ws = TRUE)

Here we have:

  • the elemental composition (C, N, Ca, Fe, Fe, K, Mn, S, Si, Ti, Sr, Zr) from chronologically dated sedimet samples of the two lakes
  • C and N are measured with spectrometer, the rest with X-Ray fluorescence.
  • We will transform the data so it is comparable (first root sqare and then min-max transformation)

Lets transform

element_list <- c("C","N", "Ca",  "Fe", "K",
                  "Mn",  "S", "Si", "Ti",  "Sr",  "Zr")


azores <- ETA_azores_example %>%
  group_by(record_ID) %>%
  # Apply square root transformation dynamically
  mutate(across(all_of(element_list), ~ sqrt(.), .names = "{.col}")) %>%
  # Normalize dynamically
  mutate(across(all_of(element_list), ~ (. - min(.)) / (max(.) - min(.)), .names = "{.col}")) %>%
  # Remove duplicates and round age
  distinct(TIME_BP, .keep_all = TRUE) %>% 
  mutate(sample_ID = paste0(record_ID, TIME_BP)) %>% 
  ungroup()

Now all elements are between 0 and 1

  • record_ID identifies eack lake
  • TIME_BP is the age in years before present

There are two approaches now:

  • We can define a common multivariate space for the two sites. This is better to make comparisons between sites
  • Generate independent multivariate spaces for each site. this could be useful if the elements measured were distinct
  • Here, we will use the first approach!

For ETA you need:

  • Distance matrix: defined using the elemental composition of each lake sedimentary record
  • Sites: This will define how many independent trajectories are
  • Surveys: This is the steps withon the trajectories. Here surveys = time meassurements

Distance matrix:

azores <- azores[, c("record_ID","TIME_BP",all_of(element_list))]

#this is the distance matrix
distance_azores<-dist(azores[, -c(1:2)]) #eliminate non-element columns
distance_azores <- as.matrix(distance_azores)

Site:

site_azores <- azores[, 1]
site_azores <- site_azores[[1]] #should be a vector

Survey:

-MIQUEL I HAVE DOUBTS WITH THIS!!! BEFORE I USED MINUS BUT IT BUT NOW IT SEEMS THAT IT UNDERSTANDS TIME BP

survey_azores <- azores[, 2] 

survey_azores <- survey_azores[[1]]#should be a vector

Let’s create the trajectories:


traj_azores <- defineTrajectories(distance_azores, site_azores, times= survey_azores)

  
trajectoryPCoA(traj_azores , traj.colors = c("yellow", "green"), lwd = 2,
               survey.labels = F)
legend("topright", col=c("yellow", "green"), 
       legend=c("Caldeirao", "Empadadas"), bty="n", lty=1, lwd = 2)

To better interpret the trajectories, we can plot the position of the elements (loadinds) in the multivariate space HELP!!

We can plot each trajectory individually as well using subsetTrajectories

traj_caldeirao <- subsetTrajectories(traj_azores, site_selection = "caldeirao")
trajectoryPCoA(traj_caldeirao)

traj_empadadas <- subsetTrajectories(traj_azores, site_selection = "empadadas")
trajectoryPCoA(traj_empadadas)

This can also serve to subset specific moments of time

traj_azores_MiddleAge<- subsetTrajectories(traj_azores,  window_selection = c(450, 1450)) # remember the minus

trajectoryPCoA(traj_azores_MiddleAge , traj.colors = c("yellow", "green"), lwd = 2,
               survey.labels = F)
legend("topright", col=c("yellow", "green"), 
       legend=c("Trajectory 1", "Trajectory 2"), bty="n", lty=1, lwd = 2)

HELP MIQUEL

  • I want to plot the trajectories individually, but I also want to colour the trajectories from different periods (defined by myself. is this posible within ecotraj?

LEts calculate some metrics

metrics_azores_allperiod <- trajectoryMetrics(traj_azores)
  trajectory   n   t_start     t_end  duration   length mean_speed mean_angle
1  caldeirao  65 -35.56000 2000.9700 2036.5300 31.23727 0.01533848  109.55773
2  empadadas 116 -60.82425  592.7947  653.6189 33.56347 0.05135022   96.00328
  directionality internal_ss internal_variance
1      0.3416622    30.07345         0.4698976
2      0.3840902    75.97777         0.6606763

This gives us a mean assesment of each trajectory, however we may be interested in how this metric changes through time. - We need to use moving windows, lets do a 500 years one (for each survey, it generates a moving window and calculate the metrics within)

metrics_azores<- trajectoryWindowMetrics(traj_azores, 500, type = "times", add = TRUE)

metrics_azores <- metrics_azores%>% mutate(
  meantime = ((t_start + t_end)/2))  

Now we have a good quantitative assessment of how trajectories change through time, lets plot one of them (internal variance):

ggplot(metrics_azores, aes(x = meantime , y = internal_variance, color = trajectory, group = trajectory)) +geom_line(linewidth = 1) 

However, you see that caldeirao record collects 2035 years, while empadadas only 660. - we should use windows spanning proportional spans, lets say 15%

metrics_caldeirao<- trajectoryWindowMetrics(traj_caldeirao, 305, type = "times", add = TRUE)
metrics_caldeirao <- metrics_caldeirao%>% mutate(
  meantime = ((t_start + t_end)/2))  
  
metrics_empadadas<- trajectoryWindowMetrics(traj_empadadas, 99, type = "times", add = TRUE)
metrics_empadadas <- metrics_empadadas%>% mutate(
  meantime = ((t_start + t_end)/2))  

We will also add colors for different periods:

  • GREEN for prehuman conditions

  • BLUE for first settlers

  • RED for Portuguese colonization

Code for Caldeirao

ggplot(metrics_caldeirao, aes(x = meantime, y = internal_variance)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Internal Variance", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()
  
  ggplot(metrics_caldeirao, aes(x = meantime, y = directionality)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Directionality", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

ggplot(metrics_caldeirao, aes(x = meantime, y = mean_speed)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Speed", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()
  

Code for Empadadas

ggplot(metrics_empadadas, aes(x = meantime, y = internal_variance)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Internal Variance", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()


ggplot(metrics_empadadas, aes(x = meantime, y = directionality)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Directionality", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()


ggplot(metrics_empadadas, aes(x = meantime, y = mean_speed)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50, 
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)", 
       y = "Speed", 
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

How to interpret the metrics

Internal variance (elemental turnover)

Directionality

Speed

USE YOUR OWN DATA

SURPRISE: soundtraj