1 Objective

  • New data: Rh and Rs! (Rs was calculated as the sum of Rh and root respiration; there is a fairly small difference between the two.)
  • What do we want to do about the climate inputs?
  • Preview how the methods change the results
  • What do we want to present to the VCU team members?
  • What do we think about the EGU story line?

2 Rs vs Rh

Rs and Rh are pretty similar to one another; I calculated Rs based on Rh and ED root respiration. Is it reasonable for these two values to be so similar to one another?
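
For reference, here is a minimal sketch of that Rs reconstruction, assuming the annual output file carries a root-respiration variable (the name Root_resp below is a placeholder, not necessarily what the post-processed ED output calls it):

# Hedged sketch: rebuild Rs as Rh + root respiration from the annual output. 
# "Root_resp" is a placeholder variable name. 
list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  dplyr::filter(variable %in% c("Rh", "Root_resp")) %>% 
  select(year, value, variable, scn, met) %>% 
  tidyr::spread(variable, value) %>% 
  mutate(Rs_check = Rh + Root_resp) %>% 
  head()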

# Import the results from experiment one, carbon flux variables only. 
list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  dplyr::filter(variable %in% c("Rh", "Rs")) %>% 
  select(year, value, variable, scn, met) %>% 
  tidyr::spread(variable, value) %>% 
  mutate(value = Rs - Rh) %>% 
  ggplot() + 
  geom_histogram(aes(value)) + 
  labs(title = "What is the difference between Rs and Rh?", 
       x = "MgC ha-1 year-1", 
       y = "Frequency") + 
  THEME
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

3 Climate Inputs

While we were meeting with the VCU team last month, I got some questions about why the carbon flux runs converge even in the control runs (the future scenarios use different annual met inputs).

# Import the results from experiment one, carbon flux variables only. 
list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  dplyr::filter(grepl(pattern = "harvest_0", x = scn)) %>% 
  dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs")) -> 
  control_carbonflux_exp1

control_carbonflux_exp1 %>% 
  ggplot(aes(year, value, color = met)) + 
  geom_line() + 
  facet_wrap("description", scales = "free") + 
  THEME + 
  labs(title = "Control Runs")

I’ve gone over the input files, checked them twice, and even did some extra runs with fresh met files configured the same way, and I still ended up seeing the same behavior, where the fluxes from the met runs start to converge around 2035…

Which is really confusing to me, because in some idealized runs where the same met input is used for every year of the run we don’t see that type of convergence…
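
For context, here is a minimal sketch of one way a constant met year could be built, assuming the hourly drivers have already been read into a data frame (met_hourly, tair, precip, and swdown are all placeholder names; the real ED drivers are monthly HDF5 files, which this glosses over):

# Hedged sketch: average each time step across the observed years to get one 
# "typical" year, then repeat that year for every simulation year (2020:2050 is 
# just an illustrative run span). 
met_hourly %>% 
  dplyr::group_by(month, day, hour) %>% 
  dplyr::summarise(dplyr::across(c(tair, precip, swdown), mean), .groups = "drop") -> 
  constant_year

lapply(2020:2050, function(y) dplyr::mutate(constant_year, year = y)) %>% 
  dplyr::bind_rows() -> 
  constant_met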

# Import the results from the constant-met runs, carbon flux variables only. 
list.files(INPUT_DIR, "met_constant-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
#  dplyr::filter(scn %in% c("_met1-con", "_met2-con")) %>% 
  dplyr::mutate(scn = gsub(pattern = "_", replacement = "", x = scn)) %>% 
  dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs")) -> 
  carbonF_met_constant
  
carbonF_met_constant %>% 
  ggplot(aes(year, value, color = scn)) + 
  geom_line() + 
  facet_wrap("description", scales = "free") + 
  THEME + 
  labs(title = "Control Runs")

With the two different constant met tracks we don’t see the same sort of convergence that we were seeing in the previous runs. We also don’t see the same periodicity that showed up in those runs.

I am not sure if this is actually all that helpful… I’ve gone over some other ED studies, and while they will repeat periods of met data in a row (so I am glad that is standard practice), I haven’t seen anyone do idealized runs with average/constant inputs.

Options:

  • Use the idealized constant met for the entire run (we could potentially end up with a really large set of runs this way, but it would potentially be hard to compare the baseline with UMBS observations).
  • Continue with the converging runs (we would need an explanation for why this is okay/happening).
  • Use inputs from different RCPs (this would require processing ESM data and formatting it into the HDF5 file format, which would mean figuring out the variables/units and making sure the timing was correct, so it seems unideal).
  • Use the shuffle method, where the order of the met years is rearranged; that is what the CJF paper did (see the sketch below).
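
A minimal sketch of what the shuffle option could look like (the year spans below are placeholders, and only the simulation-year to met-year mapping is shown, not the actual relabeling of the HDF5 driver files):

# Hedged sketch of the shuffle method: permute the observed met years and tile 
# that permutation across the simulation years, so the same years are reused but 
# in a rearranged order. 
set.seed(42)
historical_years <- 2000:2019                 # placeholder span of observed met years
simulation_years <- 2020:2050                 # placeholder span of the future run

shuffled_block <- sample(historical_years)    # one random permutation of the observed years
met_schedule <- data.frame(
  sim_year = simulation_years,
  met_year = rep(shuffled_block, length.out = length(simulation_years))
)
head(met_schedule)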

# Add the severity label names to make it easy to use the FoRTE color palettes 
# Args 
#   dt: data.table of FoRTE data; must have a scn column 
# Return: the same data.table with a severity column added (e.g. "45 %")
add_severity_lables <- function(dt){
  
  out <- dt[ , severity := gsub(pattern = "harvest_|_1day_above", replacement = "", x = scn)]
  out$severity <- paste0(out$severity, " %")
  return(out)
  
}
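
# Quick toy check of add_severity_lables() (made-up rows that follow the 
# "harvest_<severity>_1day_above" pattern used throughout this document). 
toy <- data.table::data.table(scn = c("harvest_0_1day_above", "harvest_45_1day_above"), 
                              value = c(1.0, 0.8)) 
add_severity_lables(toy)$severity   # expect "0 %" "45 %"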

# There is a problem with some negative or zero values... this needs to be addressed somehow! 
# Calculate the disturbance log ratio; this is the data used to calculate the dimensions-of-stability metrics. 
# Args 
#   d: a data.table of ED data; it must include the baseline scenario! 
# Return: a data frame with value now being log(treatment/baseline)
ln_disturbance <- function(d){

  # Check inputs 
  req_names <- c('scn', 'year', 'variable', 'description', 'unit', 'value', 'met')
  assert_that(has_name(d, req_names))
  
  baseline_scn <- "harvest_0_1day_above" 
  assert_that(baseline_scn %in% d$scn)

  # Separate the control/baseline scenario values from the output for the other 
  # treatment groups. 
  baseline_values <- d[scn == baseline_scn][ , c("year", "variable", "value", "met")]
  names(baseline_values) <- c("year", "variable", "baseline", "met")
  treatment_values <- d[scn != baseline_scn][ , c("scn", "year", "variable", "description", 
                                                  "unit", "treatment" = "value", "met")]
  names(treatment_values) <- c("scn", "year", "variable", "description", 
                               "unit", "treatment", "met")

  # Combine the control and the treatment values as a wide df, then take the 
  # ln(treatment/control) per year / variable / met realization. 
  disturbance_d <- treatment_values[baseline_values, on = c("year", "variable", "met")]
  disturbance_d$year <- disturbance_d$year - 2019
  disturbance_d <- disturbance_d[year >= 0]
  disturbance_d <- disturbance_d[ , value := log(treatment/baseline)]
  
  return(disturbance_d)
}
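
# Toy numbers for the log ratio above (made up): a treatment flux at 80% of the 
# baseline gives log(0.8/1.0) ~ -0.22, identical fluxes give 0, a zero flux gives 
# -Inf, and a negative flux (possible for a variable like NEP) gives NaN -- which 
# is the negative/zero problem flagged at the top of this function's comments. 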
# Import the constant-met results, splitting the 65% disturbance scenario from the control. 
list.files(INPUT_DIR, "met_constant-yr.csv", full.names = TRUE) %>%  
  read.csv() %>% 
  dplyr::filter(grepl(pattern = "harvest_65_1day_above", x = scn)) %>%
  dplyr::mutate(met = ifelse(grepl(pattern = "met2", scn), "met2", "met1")) %>%  
  dplyr::select(year, variable, disturbance = value, met) -> 
  disturbance 
  
list.files(INPUT_DIR, "met_constant-yr.csv", full.names = TRUE) %>%  
  read.csv() %>% 
  dplyr::filter(!grepl(pattern = "harvest_65_1day_above", x = scn)) %>%
  dplyr::mutate(met = ifelse(grepl(pattern = "met2", scn), "met2", "met1")) %>%  
  dplyr::select(year, variable, control = value, met) -> 
  control 
  
 disturbance %>% 
   dplyr::left_join(control, by = c("year", "variable", "met")) %>% 
   dplyr::mutate(value = log(disturbance / control)) %>% 
   dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs")) %>% 
   na.omit %>% 
   dplyr::filter(year >= 2018) %>% 
   ggplot(aes(year, value, color = met)) + 
   geom_line() + 
   facet_wrap("variable", scales = "free") + 
   THEME + 
   labs(y = "ln(disturbance/control)", 
        title = "65% Disturbance for 2 different constant met runs")

list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  as.data.table() %>% 
  dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs"))  %>%
  ln_disturbance() %>%  
  add_severity_lables() -> 
  disturbance_df
disturbance_df %>% 
  ggplot() + 
  geom_line(aes(year, value, color = severity, linetype = met)) + 
  facet_wrap("variable", scales = "free") + 
  labs(x = 'Years since disturbance', y = "ln(disturbance/control)", 
       title = "Variable historical with different future met") + 
  scale_color_manual(values = FORTE_SEVERITY_COLORS) + 
  THEME

4 Resistance

Comparison of the two different methods used to calculate resistance.
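
For clarity, these are the two candidates as I have been computing them (my notation; t = 0 is the disturbance year, 2019):

$$R(t) = \ln\left(\frac{V_{dist}(t)}{V_{control}(t)}\right), \qquad \text{trough resistance} = \min_t R(t), \qquad \text{year-0 resistance} = R(0)$$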

# Calculate the trough resistance (the minimum of ln(disturbance/control)) and the year in which that trough occurs 
# Args:
#   data: a data.table of ED data 
# Return: a data frame of the trough resistance values 
through_resistance <- function(data){
  
  d <- ln_disturbance(data)
  
  dd <- d[, .(year, scn, variable, met, value)]
  info <- unique(data[, .(variable, description)])
  
  rslt <- do.call(what = "rbind", 
        args = split(dd, interaction(dd$scn, dd$variable, dd$met), drop = TRUE) %>% 
          lapply(function(x){
            index <- which.min(x$value)
            out <- x[index, ]
            out <- out[ , .(scn, trough_year = year, variable, met, trough_resistance = value)]
            return(out)
          }))
  
  out <- rslt[info, on = "variable"]
  out$unit <- "unitless"
  
  return(out)
  
}

list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  as.data.table() %>%  
  through_resistance() %>% 
  add_severity_lables() %>% 
  dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs"))  %>%
  na.omit() %>% 
  rename(value = trough_resistance) %>%
  mutate(type = "trough resistance") -> 
  trough_resistance

list.files(INPUT_DIR, "exp-1-yr.csv", full.names = TRUE) %>%  
  read.csv() %>%  
  as.data.table() %>% 
  dplyr::filter(variable %in% c("NPP", "Rh", "NEP", "GPP", "Rs"))  %>%
  ln_disturbance() %>%  
  na.omit() %>% 
  add_severity_lables() %>% 
  filter(year == 0) %>% 
  mutate(type = "yr resistance") -> 
  yr_resistance
pd = position_dodge(0.5)

dplyr::bind_rows(trough_resistance, yr_resistance) %>% 
  ggplot(aes(severity, value, color = type)) + 
  geom_point(position = position_jitterdodge(), alpha = 0.5, size = 2) + 
  facet_wrap("variable", scales = "free") + 
  labs(y = "ln(dist/control) unitless", 
       title = "Comparing Resistance") + 
  THEME 

5 Resilience

disturbance_df %>% 
  left_join(select(trough_resistance, met, variable, severity, trough_year)) %>% 
  dplyr::filter(year >= trough_year) %>% 
  dplyr::mutate(year = year - trough_year) %>% 
  select(-trough_year) -> 
  resillience_data1
## Joining, by = c("variable", "met", "severity")

Now let’s take a look at the data that will be feeding into the recovery and resilience calculations.

resillience_data1 %>%   
  dplyr::mutate(label = paste0(description, "(", unit, ")")) %>% 
  ggplot(aes(year, value, color = severity, linetype = met)) + 
  geom_line() + 
   facet_wrap('variable', scales = 'free',  
            labeller = label_wrap_gen(width = 30, multi_line = TRUE), ncol = 2) + 
  THEME + 
  labs(y = 'ln(Disturbance/Baseline)', 
       x = 'Year post trough resistance', 
       title = "Time series post the trough resistance") + 
  scale_color_manual(values = FORTE_SEVERITY_COLORS) + 
  guides(linetype = FALSE)

df_list <- split(resillience_data1, 
                 interaction(resillience_data1$scn, resillience_data1$variable, resillience_data1$met, sep = "~"), 
                 drop = TRUE)
df_list1 <- df_list

# Fit both a simple linear model and a segmented (broken-stick) regression to a 
# post-trough time series. 
# Args 
#   d: a data frame with year and value columns 
# Return: a list holding the lm fit and the segmented fit 
fit_data <- function(d){
  
  x <- d$year
  y <- d$value
  
  lin.mod <- lm(y ~ x)
  segmented.mod <- segmented(lin.mod, seg.Z = ~x)
  
  out <- list("lin.mod" = lin.mod, "seg.mod" = segmented.mod)
  return(out)

}
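
# Hedged toy run of fit_data() on synthetic piecewise-linear data (made-up numbers, 
# nothing from the ED output): the curve climbs for 5 years and then flattens, so 
# the segmented fit should put the break near year 5. 
set.seed(1)
toy <- data.frame(year = 0:20)
toy$value <- ifelse(toy$year <= 5, -1 + 0.15 * toy$year, -0.25) + rnorm(21, sd = 0.02)

toy_fit <- fit_data(toy)
coef(toy_fit$lin.mod)[2]     # single linear slope, blended across both regimes
slope(toy_fit$seg.mod)$x     # per-segment slopes from the segmented fit
toy_fit$seg.mod$psi          # estimated break point, should land near year 5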

# Pull the slopes (and the break point, if any) out of the fits returned by fit_data(). 
# Args 
#   l: a list with lin.mod and seg.mod elements 
#   n: the "scn~variable~met" name of the group the fits came from 
# Return: a one-row data.table summarizing the fits 
extract_fit_data <- function(l, n){
  
  ln_slope <- l[["lin.mod"]]$coefficients[2]

  # If segmented() found no break point, seg.mod is just the original lm object.
  if(!inherits(l[["seg.mod"]], "segmented")){
    seg_slope1 <- l[["seg.mod"]]$coefficients[2]
    seg_slope2 <- NA
    seg_brk <- NA 
    
  } else {
    slopes <- slope(l[["seg.mod"]])
    seg_slope1 <- slopes$x[1,1] # Extract the slope of the first segment
    seg_slope2 <- slopes$x[2,1] # Extract the slope of the second segment
    seg_brk <- l[["seg.mod"]]$psi[1, "Est."] # Extract the estimated year of the break point 
  }
  
  # Extract the data about the fit. 
  info <- unlist(strsplit(n, split = "~"))
  
  # Format the slope into a data table 
  dt <- data.table(scn = info[1], 
                   variable = info[2], 
                   met = info[3],
                   lin_slope = ln_slope, 
                   seg_slope1 = seg_slope1, 
                   seg_slope2 = seg_slope2, 
                   seg_brk = seg_brk) 
  
  return(dt)
}

fits <- lapply(df_list, fit_data)
fits1 <- fits
mapply(FUN = extract_fit_data, l = fits, n = names(fits), SIMPLIFY = FALSE) %>% 
  do.call(what = "rbind") %>%  
  add_severity_lables() -> 
  compare_slope_fits1
resillience_data1 %>%  
  filter(year <= 10) -> 
  resillience_data2

df_list <- split(resillience_data2, 
                 interaction(resillience_data2$scn, resillience_data2$variable, resillience_data2$met, sep = "~"), 
                 drop = TRUE)

df_list2 <- df_list
fits <- lapply(df_list, fit_data)
fits2 <- fits
mapply(FUN = extract_fit_data, l = fits, n = names(fits), SIMPLIFY = FALSE) %>% 
  do.call(what = "rbind") %>%  
  add_severity_lables() -> 
  compare_slope_fits2
resillience_data1$data <- "all years"
resillience_data2$data <- "first 10 years"

dplyr::bind_rows(resillience_data1, resillience_data2) %>%  
  ggplot() + 
  geom_point(aes(year, value, color = data)) + 
  facet_wrap("variable", scales = "free") + 
  labs(title = "Compare the data sets used in the different regression analyses")

compare_slope_fits1$data <- "all years"
compare_slope_fits2$data <- "1st 10 years"

dplyr::bind_rows(compare_slope_fits1, compare_slope_fits2) %>%  
  ggplot(aes(severity, lin_slope, color = data)) + 
  geom_point(position = position_jitterdodge(), alpha = 0.5, size = 2) + 
  facet_wrap("variable", scales = "free") + 
  labs(y = "slope of linear fit", 
       title = "Compare Linear Fits with data used") + 
  THEME 

dplyr::bind_rows(compare_slope_fits1, compare_slope_fits2) %>%  
  ggplot(aes(severity, seg_slope1, color = data)) + 
  geom_point(position = position_jitterdodge(), alpha = 0.5, size = 2) + 
  facet_wrap("variable", scales = "free") + 
  labs(y = "slope of the 1st segment of the segmented fit", 
       title = "Compare Segmented Slope Fits with data used") + 
  THEME 

# Generate data that compares the linear and segmented fits with the underlying values. 
do.call(what = "rbind",
        args = mapply(function(fit, d){
  y_seg <- predict(fit$seg.mod)
  y_lm  <- predict(fit$lin.mod)
  
  out <- cbind(d, data.table(seg_vals = y_seg, lm_vals = y_lm))
  return(out)
}, fit = fits1, d = df_list1, SIMPLIFY = FALSE)) -> 
  data

split(data, interaction(data$variable, data$severity)) %>%  
  lapply(function(d){
    
    var <-  unique(d$variable)
    treat <- unique(d$severity)
    
    ggplot(d) + 
      geom_point(aes(year, value)) + 
      geom_line(aes(year, lm_vals, color = "linear fit"), size = 1) + 
      geom_line(aes(year, seg_vals, color = "segmented fit"), size = 1) +
      facet_wrap("met", scales = "free") + 
      theme_bw() + 
      labs(y = "ln(dist/control)", x = "year since resistance trough", title = paste0(var, " ", treat)) -> 
      p
    
    return(p)
  }) -> 
  plot_fits1
# Generate data that compares the linear and segmented fits with the underlying values. 
do.call(what = "rbind",
        args = mapply(function(fit, d){
  y_seg <- predict(fit$seg.mod)
  y_lm  <- predict(fit$lin.mod)
  
  out <- cbind(d, data.table(seg_vals = y_seg, lm_vals = y_lm))
  return(out)
}, fit = fits2, d = df_list2, SIMPLIFY = FALSE)) -> 
  data

split(data, interaction(data$variable, data$severity)) %>%  
  lapply(function(d){
    
    var <-  unique(d$variable)
    treat <- unique(d$severity)
    
    ggplot(d) + 
      geom_point(aes(year, value)) + 
      geom_line(aes(year, lm_vals, color = "linear fit"), size = 1) + 
      geom_line(aes(year, seg_vals, color = "segmented fit"), size = 1) +
      facet_wrap("met", scales = "free") + 
      theme_bw() + 
      labs(y = "ln(dist/control)", x = "year since resistance trough", title = paste0(var, " ", treat)) -> 
      p
    
    return(p)
  }) -> 
  plot_fits2
plot_fits1$`GPP.85 %` + 
  labs(title = "Evaluate the fits from using all of the years", 
           subtitle = "GPP 85%")

plot_fits2$`GPP.85 %` + 
  labs(title = "Evaluate the fits using only the 1st 10 years", 
       subtitle = "GPP 85%")

It looks to me like using only the first 10 years of data works much better than using data from all of the years.

Now compare the resilience values reported by the two different fit methods.

compare_slope_fits2 %>% 
  add_severity_lables() %>% 
  select(scn, variable, met, lin_slope, seg_slope1, severity) %>% 
  tidyr::pivot_longer(cols = c(lin_slope, seg_slope1), names_to = "method") %>% 
  ggplot(aes(severity, value, color = method)) + 
  geom_point(position = position_jitterdodge(), alpha = 0.5, size = 2) + 
  facet_wrap("variable", scales = "free") + 
  labs(y = "") + 
  THEME 

For the carbon flux variables that are approximately linear (Rs and Rh), the different fits don’t make much of a difference, whereas for the other carbon flux variables the fit method matters more.

6 Thoughts & Questions

  • What do we want to do about the ensemble?
  • What do we want to focus on during the VCU meeting?
  • I think the EGU story line should be more related to the manuscript than to the comparison of methods…
  • What do we want to do about the FoRTE data?
    • I think that it would be really helpful to have some actual observations to compare with…