Diagnose Linearity in DGVMs

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(here)

## here() starts at /Users/benjaminstocker/dgvmlin-stineb

# remotes::install_github("geco-bern/rgeco")
library(rgeco)

source(here("R/collect_gdf_bymodl_HZ_Trendyv11.R"))
source(here("R/nc_to_df.R"))
source(here("R/read_nc_onefile.R"))

# # create script from this vignette
# knitr::purl(
#   here("vignettes/DGVMLIN_trendy_v11.rmd"), 
#   output = here("analysis/dgmvlin.R"), 
#   documentation = 1
#   )

#setwd("~/E/RA/DGVM_density_plot/trendy_v9") #h
#setwd('F:/Oxford/Chapter_four/dgvmlin')
# If you only want to draw figures, only do this: load(here("data/gdf_Trendyv11_202406.rda"))

Approach

Does (steady-state) SOC storage, simulated in DGVMs, scale with litter inputs? And do litter inputs scale with biomass? To investigate this, we can look at the following relationships:

The relative enhancement in steady-state SOC vs. the relative enhancement in NPP. \(C^\ast\) is the steady-state SOC. It is given by the initial SOC stocks in simulations (spun up to steady state). \[ \frac{\Delta C^\ast}{C^\ast} / \frac{\Delta NPP}{NPP_\text{init}} \]
The relative enhancement in steady-state SOC vs. the relative enhancement in biomass (\(B\)). \[ \frac{\Delta C^\ast}{C^\ast} / \frac{\Delta B}{B_\text{init}} \] \(\Delta\) is the difference between end and beginning of a transient simulation covering the historical period, where only CO2 changes (hence, soil turnover rates should not change in a typical 1st order kinetics-based SOC model).

Our hypothesis is that these relationships follow the 1:1 line.

\(\Delta C^\ast\) cannot be extracted directly from simulations. But it can be estimated from the imbalance between soil respiration and litter input before a steady state is reached: \[ \tau \approx \frac{C(t)}{\text{NPP}(t) - \Delta C(t)} \]

Combine this with \(\Delta C^\ast=\tau \; \text{NPP}\), to get: \[ C^\ast = \frac{C(t)}{1 - \frac{\Delta C(t)}{\text{NPP}(t)}} \] These variables are extracted from TRENDY S1 simulations as follows:

\(C(t)\): soil C at the end of the simulation (mean over 2008-2017)
\(\Delta C(t)\): Annual change in soil C during the last decade (2008-2017) (filn_cSoil_change - MUST BE DIVIDED BY 10!)
NPP\((t)\): Annual mean NPP during the last decade (2008-2017)
\(\Delta C^\ast\): \(C^\ast\) (as defined above) minus soil C during the first decade of the simulation (spun up to steady state)

The same is done with biomass (\(B\)).

Process TRENDY v7 files

(Adjust path for filnams to select v8 or v7)

From downloaded TRENDY outputs (here, simulation S1: only CO2 is changing), get global fields of:

Multi-annual mean NPP at the beginning of the simulation
Multi-annual mean NPP at the end of the simulation
Multi-annual mean SOC at the beginning of the simulation
Multi-annual mean SOC at the end of the simulation
Multi-annual mean biomass at the beginning of the simulation
Multi-annual mean biomass at the end of the simulation
Change in SOC over a multi-annual period at the end of the simulation (for \(\Delta C(t)\))

Simulations should cover 1700-2017 (318 years = time step in NetCDF outputs). But not all do.

Starting in year 1700: CABLE-POP, CLM5.0, JSBACH, LPX-Bern, LPJ-GUESS, OCN, ORCHIDEE-CNP, SDGVM
Starting in year 1701: CLASS-CTEM, ISAM, JULES, SURFEX
Starting in year 1900: DLEM
Starting in year 1901: ORCHIDEE
Starting in year 1860: VISIT

Peculiarities of outputs by model prohibiting their processing:

LPJ: No S1 outputs
DLEM: failed NPP processing
LPX-Bern: failed NPP processing
VISIT: No NPP output available

No separate output for pools (wood, leaf, root)

CLASS-CTEM
DLEM
JSBACH
JULES
OCN
SDGVM (only wood missing, could be derived by cdo -O sub SDGVM_S1_cVeg.nc SDGVM_S1_croot.nc SDGVM_S1_cLeaf.nc. cWood is then assumed zero: cdo mulc,0 SDGVM_S1_cLeaf.nc SDGVM_S1_cWood.nc, and cdo -O chname,cLeaf,cWood SDGVM_S1_cWood.nc SDGVM_S1_cWood2.nc, but is not!)
SURFEX
VISIT

Process files: multi-annual mean

Determine for which models we have all the files (cWood and cVeg) to calculate wood mass fractions and create system command.

filnams <- read_csv(here("data-raw/filnams_trendy_v11_S1.csv" )) %>% 
  setNames(c("modl", paste0("filn_", names(.)[-1])))

## Rows: 16 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (11): modl, npp, cVeg, cSoil, cWood, cLeaf, cRoot, gpp, rh, cLitter, lan...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

modls <- filnams %>% 
  pull(modl)

# This is a huge list of nc files that we will read
df <- filnams %>% 
  
  ## filter and correct typo
  filter(modl %in% modls) %>%
  mutate(dir = paste0(here(),"")) %>%
  
  # Please make sure R here() grab an address pointing to the data folder containing your nc files
  # my here() has "F:/Oxford/Chapter_four/dgvmlin", under this address it has a data folder
  mutate_at(vars(starts_with("filn")), ~str_replace(., "\\.nc", "")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "cleaf", "cLeaf")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "croot", "cRoot")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "csoil", "cSoil")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "cveg", "cVeg")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "cwood", "cWood")) %>%
  mutate_at(vars(starts_with("filn")), ~str_replace(., "crootf", "cRoot")) %>%
  
  ## get starting year
  left_join(read_csv(here("data-raw/startyear_trendy_v11_S1.csv" )), by = "modl") %>% 
  rename(startyear_init = startyear, startyear_npp_init = startyear_npp) %>% 
  mutate(endyear_init = startyear_init + 9, endyear_npp_init = startyear_npp_init + 9) %>% 
  
  rowwise() %>% 
  mutate(filn_cVeg_init  = paste0(filn_cVeg, "_INIT_MEAN"),
         filn_cVeg_final = paste0(filn_cVeg, "_FINAL_MEAN"),

         filn_cLeaf_init  = paste0(filn_cLeaf, "_INIT_MEAN"),
         filn_cLeaf_final = paste0(filn_cLeaf, "_FINAL_MEAN"),

         filn_cRoot_init  = paste0(filn_cRoot, "_INIT_MEAN"),
         filn_cRoot_final = paste0(filn_cRoot, "_FINAL_MEAN"),

         filn_cWood_init  = paste0(filn_cWood, "_INIT_MEAN"),
         filn_cWood_final = paste0(filn_cWood, "_FINAL_MEAN"),
         
         filn_cSoil_init  = paste0(filn_cSoil, "_INIT_MEAN"),
         filn_cSoil_final = paste0(filn_cSoil, "_FINAL_MEAN"),
         
         filn_cSoil_change = paste0(filn_cSoil, "_CHANGE"),
         filn_cVeg_change  = paste0(filn_cVeg, "_CHANGE"),
         filn_cLeaf_change = paste0(filn_cLeaf, "_CHANGE"),
         filn_cWood_change = paste0(filn_cWood, "_CHANGE"),
                  
         filn_npp_init  = paste0(filn_npp, "_INIT_MEAN"),
         filn_npp_final = paste0(filn_npp, "_FINAL_MEAN"),
         
         filn_gpp_init  = paste0(filn_gpp, "_INIT_MEAN"),
         filn_gpp_final = paste0(filn_gpp, "_FINAL_MEAN"),
          
         filn_rh_init  = paste0(filn_rh, "_INIT_MEAN"),
         filn_rh_final = paste0(filn_rh, "_FINAL_MEAN"),
         
         filn_cLitter_init  = paste0(filn_cLitter, "_INIT_MEAN"),
         filn_cLitter_final = paste0(filn_cLitter, "_FINAL_MEAN")
         )

## Rows: 16 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): modl
## dbl (2): startyear, startyear_npp
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Collect data

… into tidy data frame

gdf <- purrr::map(

  as.list(seq(nrow(df))),

  ~collect_gdf_bymodl(

    df$modl[.],
    df$dir[.],

    df$filn_cSoil_init[.],
    df$filn_cSoil_final[.],

    df$filn_cVeg_init[.],
    df$filn_cVeg_final[.],

    df$filn_cLeaf_init[.],
    df$filn_cLeaf_final[.],

    df$filn_cRoot_init[.],
    df$filn_cRoot_final[.],

    df$filn_cWood_init[.],
    df$filn_cWood_final[.],

    df$filn_npp_init[.],
    df$filn_npp_final[.],

    df$filn_cSoil_change[.],
    df$filn_cVeg_change[.],
    
    df$filn_gpp_init[.],
    df$filn_gpp_final[.],
    
    df$filn_rh_init[.],
    df$filn_rh_final[.],
    
    df$filn_cLitter_init[.],
    df$filn_cLitter_final[.],
    
    df$filn_landCoverFrac[.]
    )) %>% 
  
  bind_rows() %>%

  group_by(modl) %>%
  nest()

## Collecting outputs for CABLE-POP
## Collecting outputs for CLASSIC
## Collecting outputs for CLM5.0
## Collecting outputs for DLEM
## Collecting outputs for IBIS
## Collecting outputs for ISAM
## Collecting outputs for ISBA-CTRIP
## Collecting outputs for JSBACH
## Collecting outputs for JULES
## Collecting outputs for LPJ
## Collecting outputs for LPJ-GUESS
## Collecting outputs for LPX-Bern
## Collecting outputs for ORCHIDEE
## Collecting outputs for SDGVM
## Collecting outputs for VISIT
## Collecting outputs for VISIT-NIES

# Huanyuan's previous data object, load this instead of reading from file data/gdf.rds
# load(here("data/gdf_Trendyv11_202407.rda"))

Cleaning

Remove outputs for gridcells that have zero GPP or cVeg at the start or at the end.

remove_nonvegetated <- function(df){
  df %>% 
    mutate(testvar = gpp_init * gpp_final * cveg_init * cveg_final) %>% 
    mutate(across(
      .cols = matches("(_init|_final|_change)$"),
      .fns = ~ ifelse(testvar == 0, NA, .)
    ))
}

gdf <- gdf %>% 
  mutate(data = map(data, ~remove_nonvegetated(.)))

Longitude transformation for CLM5.0, ISAM, and JULES.

shift_lon <- function(df){
  df |> 
    mutate(lon = ifelse(lon > 180, lon - 360, lon))
}

gdf <- gdf |> 
  mutate(data = map2(.x = data,
    .y = modl,
    .f = ~ if (.y %in% c("CLM5.0", "ISAM", "JULES")) shift_lon(.x) else .x
  ))

Overwrite landCoverFrac info for CABLE-POP. Many NA in this file, anyway no land cover change, we replace them with 0.

gdf <- gdf |> 
  mutate(data = map2(.x = data,
    .y = modl,
    .f = ~ if (.y  == "CABLE-POP") mutate(.x, landCoverFrac = 0) else .x
  ))

Calculate relationships

calc_steady_state <- function(c_t, dcdt, npp){
  # c_t * 1/(1 - dcdt/npp)
  c_t * npp / (npp - dcdt)
}

calc_steady_state_alternative <- function(c_t, npp, rh){
  c_t * npp / rh
}

get_steadystate <- function(df){

  df %>%

    # mutate(csoil_change = csoil_final - csoil_init) %>%   # overwriting what's read from file - XXX This was wrong XXX
    # _change is the Csoil(t+1) - Csoil(t) in final ten years, not the dCsoil = Csoil(t) - Csoil(1)

    mutate(
      csoil_change = csoil_change / 10,
      cveg_change = cveg_change / 10
      ) %>%

    # mutate(csoil_star_final = csoil_final / (1.0 - csoil_change/npp_final)) %>%

    # applying clean function
    mutate(
      csoil_star_final = calc_steady_state(
        c_t = csoil_final,
        dcdt = csoil_change,
        npp = npp_final
        ),
      cveg_star_final = calc_steady_state(
        c_t = cveg_final,
        dcdt = cveg_change,
        npp = npp_final
        )
      )
}

# # Version used by Huanyuan
# get_steadystate <- function(df){
#   df %>%
#     mutate(csoil_star_init  = csoil_init) %>%
#     # mutate(csoil_change = csoil_final - csoil_init) %>%   # overwriting what's read from file - XXX This was wrong XXX
#     # _change is the Csoil(t+1) - Csoil(t) in final ten years, not the dCsoil = Csoil(t) - Csoil(1)
#     mutate(csoil_change = csoil_change / 10.0) %>%
#     mutate(csoil_star_final = csoil_final / (1.0 - (csoil_change)/npp_final )) %>%
#     mutate(fveg_XX = npp_final - cveg_change / 10) %>%
#     mutate(csoil_star_final = npp_final * (csoil_final/rh_final)) %>% # HY proposed a new equation here for Csoil_star_final
#     mutate(cveg_star_final = npp_final * (cveg_final/fveg_XX))
# }

get_deltas <- function(df){
  df %>%
    mutate(cveg_ag_init = cleaf_init + cwood_init,
           cveg_ag_final = cleaf_final + cwood_final,
           csoil_final = csoil_final + cLitter_final,
           csoil_init = csoil_init + cLitter_init
           ) %>%
    mutate(dcveg = (cveg_final - cveg_init)/cveg_init,
           dcveg_ag = (cveg_ag_final - cveg_ag_init)/cveg_ag_init,
           dnpp = (npp_final - npp_init)/npp_init,
           dcsoil = (csoil_final - csoil_init)/csoil_init,
           dcsoil_star = (csoil_star_final - csoil_init)/csoil_init,
           dgpp = (gpp_final - gpp_init)/gpp_init,
           dcleaf= (cleaf_final - cleaf_init)/cleaf_init,
           dcveg_star = (cveg_star_final - cveg_init)/cveg_init,
           dcroot = (croot_final - croot_init)/croot_init,
           dcwood = (cwood_final - cwood_init)/cwood_init
           ) %>%
    mutate(l_npp_gpp = dnpp / dgpp,
           l_croot_cveg = dcroot / dcveg,
           l_croot_cveg_star = dcroot / dcveg_star,
           l_cveg_star_npp = dcveg_star / dnpp
           )
}


filter_no_landcoverchange <- function(df){
  df %>%
    filter(landCoverFrac == 0)
}

gdf <- gdf %>%
  mutate(
    data = purrr::map(data, ~get_steadystate(.)),
    ) %>%
  mutate(
    data = purrr::map(data, ~get_deltas(.))
    ) %>%
  mutate(
    data = purrr::map(data, ~filter_no_landcoverchange(.))
    )

Regrid

Regrid all models’ outputs to a common 1x1 deg grid. Note: this takes quite long, therefore reading from file if available.

filnam <- here("data/gdf.rds")

if (!file.exists(filnam)){
  
  gdf <- gdf %>% 
      mutate(
        data_1x1deg = purrr::map(
          data, 
          ~regrid_df(
            ., 
            res = 1.0, 
            lon_start = -180,
            lon_end = 180,
            lat_start = -90,
            lat_end = 90
            )
          )
        )
  
  readr::write_rds(gdf, file = filnam)
  
} else {
  
  gdf <- readr::read_rds(file = filnam)
  
}

Model meta info

summary_table <- tibble(
  modl = c(
    'ALL',
    'CLASSIC',
    'IBIS',
    'LPJ',
    'ORCHIDEE',
    'VISIT',
    'ISBA-CTRIP',
    'CABLE-POP',
    'CLM5.0',
    'DLEM',
    'ISAM',
    'JSBACH',
    'JULES',
    'LPJ-GUESS',
    'LPX-Bern',
    'VISIT-NIES',
    'SDGVM'
    ),
  cn_coupled = c(
    NA,
    FALSE,
    FALSE,
    FALSE,
    TRUE,
    FALSE,
    FALSE,
    TRUE,
    TRUE,
    TRUE,
    TRUE,
    TRUE,
    TRUE,
    TRUE,
    TRUE,
    FALSE,
    TRUE
    )
  )

gdf <- gdf %>% 
  left_join(
    summary_table,
    by = join_by(modl)
  )

Plot relationships

Preparations.

get_p_value <- function(x, na.rm=TRUE) {
  p.value <- t.test(x, mu = 1)$p.value
  p.value
}

# sample for even representation, number determined by model with smallest 
# number of gridcells
# XXX this seems to have been updated to using more values per model. Please correct.
# nsample <- gdf %>% 
#   mutate(len = purrr::map_int(data, ~nrow(.))) %>% 
#   pull(len) %>% 
#   min()

# xxx adopted specification from paper here 
nsample <- 58697

coast <- rnaturalearth::ne_coastline(
  scale = 110, 
  returnclass = "sf"
  )

GPP vs NPP

All models pooled

set.seed(1982)

gdf_sampled <- gdf %>%
  mutate(data = purrr::map(data, ~slice_sample(., n = nsample))) %>%
  unnest(data) %>%
  dplyr::filter(
    !is.nan(dnpp), !is.nan(dgpp), 
    !is.infinite(dnpp), !is.infinite(dgpp),
    !is.nan(l_npp_gpp), !is.infinite(l_npp_gpp))

# density scatter-plot for pooled data, balanced samples, original grid
gg_densityscatter <- gdf_sampled %>%
  ggplot(aes(x = dgpp, y = dnpp)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, "GPP/GPP")),
    y = expression(paste(Delta, "NPP/NPP"))
  )

# Number for text
print("Quartiles of L_NPP:GPP")

## [1] "Quartiles of L_NPP:GPP"

print(format(quantile(gdf_sampled$l_npp_gpp, probs =  c(0.25, 0.5, 0.75), na.rm = TRUE), digits = 3))

##     25%     50%     75% 
## "0.932" "1.022" "1.119"

# Lables for Histogram  
labels <- gdf_sampled %>%
  ungroup()%>%
  mutate(l_larger = l_npp_gpp > 1) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs, digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Density plot
gg_density <- gdf_sampled %>% 
  ggplot(aes(x = l_npp_gpp, y = ..density..)) +
  geom_density(fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) + 
  geom_text(
    data = labels,
    aes(x = 0.5, y = Inf, label = perc_smaller),
    vjust = 1.5
    ) +
  geom_text(
    data = labels,
    aes(x = 1.5, y = Inf, label = perc_larger),
    vjust = 1.5
    ) +
  labs(
    x = expression(paste(italic(L)[NPP:GPP]))
  )

# Combine density scatter and density distribution plots
gg3_gpp_npp <- cowplot::plot_grid(
  gg_densityscatter, 
  gg_density, 
  align = 'h',     # 'v' for vertical alignment
  axis = 'tb',
  ncol = 2,
  labels = c("a", "b")
)

## Warning: Removed 53612 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_hex()`).

## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## Warning: Removed 58119 rows containing non-finite outside the scale range
## (`stat_density()`).

gg3_gpp_npp

ggsave(here("fig/dgpp_dnpp_trendy_s1_pooled.pdf"), plot = gg3_gpp_npp, width = 8, height = 4)

By model

set.seed(1982)

gdf_sampled <- gdf %>%
  unnest(data) %>%
  bind_rows(
    .,
    gdf %>%
      mutate(n = purrr::map_int(data, ~nrow(.))) %>%
      mutate(data = purrr::map(data, ~slice_sample(., n = nsample, replace = TRUE))) %>% # smallest set here
      unnest(data) %>%
      mutate(modl = "ALL")
  ) %>%
  dplyr::filter(
    !is.nan(dnpp), !is.nan(dgpp), 
    !is.infinite(dnpp), !is.infinite(dgpp),
    !is.nan(l_npp_gpp), !is.infinite(l_npp_gpp))

# density scatter
gdf_sampled %>%
  ggplot(aes(x = dgpp, y = dnpp)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, "GPP/GPP")),
    y = expression(paste(Delta, "NPP/NPP"))
  ) +
  facet_wrap(. ~ modl,  ncol = 3,  scales = "free") +
  theme(
    legend.position = "none",
    strip.background = element_blank(),
    strip.text = element_text(size = 12, hjust = 0)
    )

## Warning: Removed 160100 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 42 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dnpp_dgpp_trendy_s1_bymodl.pdf"), width = 9, height = 13)

## Warning: Removed 160100 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 42 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dnpp_dgpp_trendy_s1_bymodl.png"), width = 9, height = 13)

## Warning: Removed 160100 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 42 rows containing missing values or values outside the scale range
## (`geom_hex()`).

# we need to get some stat for the overall excel table
df_quantiles <- gdf_sampled %>%
  group_by(modl) %>%
  summarise(l_npp_gpp_50 = quantile(l_npp_gpp, probs = 0.5, na.rm = TRUE)) 

summary_table <- left_join(summary_table, df_quantiles, by = 'modl')

# Lables for Histogram  
labels <- gdf_sampled %>%
  mutate(l_larger = l_npp_gpp > 1) %>%
  group_by(modl) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs,digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Histograms per model
gdf_sampled %>%
  ggplot() +
  geom_density(aes(x = l_npp_gpp, y = ..density..),fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) +
  geom_text(data = labels, aes(x = 0.5, y = Inf, label = perc_smaller), vjust =  1.5) +
  geom_text(data = labels, aes(x = 1.5, y = Inf, label = perc_larger), vjust = 1.5)+
  labs(
    x = expression(paste(italic(L)[NPP:GPP]))
  ) +
  facet_wrap(. ~ modl, ncol = 3,  scales = "free")

## Warning: Removed 172454 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dnpp_dgpp_trendy_bymodl.pdf"), width = 9, height = 10.5)

## Warning: Removed 172454 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dnpp_dgpp_trendy_bymodl.png"), width = 9, height = 10.5)

## Warning: Removed 172454 rows containing non-finite outside the scale range
## (`stat_density()`).

Maps

By model.

plot_by_model <- function(use_modl, df){
  
  gdf_long <- df %>%
    unnest(data) %>%
    dplyr::filter(!is.nan(l_npp_gpp), !is.infinite(l_npp_gpp)) |> 
    filter(modl == use_modl)
  
  ggplot() +
    geom_tile(
      data = gdf_long,
      aes(x = lon, y = lat, fill = l_npp_gpp),
      show.legend = TRUE
      ) +
    geom_sf(
      data = coast,
      colour = 'black',
      linewidth = 0.3
    )  +
    coord_sf(
      ylim = c(-60, 85),
      expand = FALSE
    ) +
    khroma::scale_fill_vik(
      name = expression(paste(italic(L)[NPP:GPP])),
      limits = c(0, 2), 
      oob = scales::squish,
      midpoint = 1,
      reverse = TRUE
      ) +  # suppress individual legends
    labs(
      title = use_modl
    ) +
    theme_void() +
    theme(
      legend.position = "none",
      plot.title = element_text(hjust = 0.5, size = 10) 
      )
}

plotlist <- purrr::map(
  as.list(unique(gdf$modl)),
  ~ plot_by_model(., gdf)
)

legend_plot <- plotlist[[1]] +
    theme(legend.position = "right")

legend <- cowplot::get_legend(legend_plot)

map_grid <- cowplot::plot_grid(
  plotlist = plotlist,
  ncol = 3
)

final_plot <- cowplot::plot_grid(
  map_grid,
  legend,
  ncol = 2,
  rel_widths = c(1, 0.08)  # adjust the legend width as needed
)

final_plot

ggsave(
  filename = here("fig/map_dgpp_dnpp_bymodel.pdf"),
  height = 10,
  width = 10
  )

ggsave(
  filename = here("fig/map_dgpp_dnpp_bymodel.png"),
  height = 10,
  width = 10
  )

All models pooled.

gdf_long_1x1deg <- gdf %>%
  select(-data) %>% 
  unnest(data_1x1deg) %>%
  group_by(lon, lat) %>% 
  summarise(
    l_npp_gpp_median = median(l_npp_gpp, na.rm = TRUE),
    l_npp_gpp_sd = sd(l_npp_gpp, na.rm = TRUE)
  )

## `summarise()` has grouped output by 'lon'. You can override using the `.groups`
## argument.

gg_map_agg <- ggplot() +
  geom_tile(
    data = gdf_long_1x1deg,
    aes(x = lon, y = lat, fill = l_npp_gpp_median),
    show.legend = TRUE
    ) +
  geom_sf(
    data = coast,
    colour = 'black',
    linewidth = 0.3
  )  +
  coord_sf(
    ylim = c(-60, 85),
    expand = FALSE
  ) +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L)[NPP:GPP])),
    limits = c(0.2, 1.8),  
    oob = scales::squish,
    midpoint = 1,
    reverse = TRUE
    ) +
  theme_void()

gg_map_agg

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

Publication Fig. 1

cowplot::plot_grid(
  gg3_gpp_npp,
  gg_map_agg, 
  ncol = 1,
  labels = c("", "c"),
  rel_heights = c(1, 0.9)
)

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

ggsave(filename = here("fig/combined_npp_gpp.pdf"), height = 5, width = 6)

Cveg* vs NPP

All models pooled

set.seed(1982)

gdf_sampled <- gdf %>%
  mutate(data = purrr::map(data, ~slice_sample(., n = nsample))) %>%
  unnest(data) %>%
  dplyr::filter(
    !is.nan(dnpp), !is.nan(dcveg_star), 
    !is.infinite(dnpp), !is.infinite(dcveg_star),
    !is.nan(l_cveg_star_npp), !is.infinite(l_cveg_star_npp))

# density scatter-plot for pooled data, balanced samples, original grid
gg_densityscatter <- gdf_sampled %>%
  ggplot(aes(x = dnpp, y = dcveg_star)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, "NPP/NPP")),
    y = expression(paste(Delta, italic(C)[veg^"*"], ":", italic(C)[veg]))
  )

# Number for text
print("Quartiles of L_Cveg:NPP")

## [1] "Quartiles of L_Cveg:NPP"

print(format(quantile(gdf_sampled$l_cveg_star_npp, probs =  c(0.25, 0.5, 0.75), na.rm = TRUE), digits = 3))

##     25%     50%     75% 
## "0.695" "0.991" "1.348"

# Lables for Histogram  
labels <- gdf_sampled %>%
  ungroup()%>%
  mutate(l_larger = l_cveg_star_npp > 1) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs, digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Density plot
gg_density <- gdf_sampled %>% 
  ggplot(aes(x = l_cveg_star_npp, y = ..density..)) +
  geom_density(fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) + 
  geom_text(
    data = labels,
    aes(x = 0.5, y = Inf, label = perc_smaller),
    vjust = 1.5) +
  geom_text(
    data = labels,
    aes(x = 1.5, y = Inf, label = perc_larger),
    vjust = 1.5) +
  labs(
    x = expression(paste(italic(L)[Cveg^"*":NPP]))
  )

# Combine density scatter and density distribution plots
gg3_cveg_star_npp <- cowplot::plot_grid(
  gg_densityscatter, 
  gg_density, 
  align = 'h',     # 'v' for vertical alignment
  axis = 'tb',
  ncol = 2,
  labels = c("a", "b")
)

## Warning: Removed 56444 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 46 rows containing missing values or values outside the scale range
## (`geom_hex()`).

## Warning: Removed 101137 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/cveg_star_npp_trendy_pooled.pdf"), plot = gg3_cveg_star_npp, width = 8, height = 4)

gg3_cveg_star_npp

By model

set.seed(1982)

gdf_sampled <- gdf %>%
  unnest(data) %>%
  bind_rows(
    .,
    gdf %>%
      mutate(n = purrr::map_int(data, ~nrow(.))) %>%
      mutate(data = purrr::map(data, ~slice_sample(., n = nsample, replace = TRUE))) %>% # smallest set here
      unnest(data) %>%
      mutate(modl = "ALL")
  ) %>%
  dplyr::filter(!is.nan(dnpp), !is.nan(dcveg_star), !is.infinite(dnpp), !is.infinite(dcveg_star))

# density scatter
gdf_sampled %>%
  ggplot(aes(x = dnpp, y = dcveg_star)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, "NPP/NPP")),
    y = expression(paste(Delta, italic(C)[veg^"*"], "/", italic(C)[veg]))
  ) +
  facet_wrap(. ~ modl,  ncol = 3,  scales = "free") +
  theme(
    legend.position = "none",
    strip.background = element_blank(),
    strip.text = element_text(size = 12, hjust = 0)
    )

## Warning: Removed 168746 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 101 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dnpp_dvegc_trendy_s1_bymodl.pdf"), width = 9, height = 13)

## Warning: Removed 168746 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 101 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dnpp_dvegc_trendy_s1_bymodl.png"), width = 9, height = 13)

## Warning: Removed 168746 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 101 rows containing missing values or values outside the scale range
## (`geom_hex()`).

# we need to get some stat for the overall excel table
df_quantiles <- gdf_sampled %>%
  group_by(modl) %>%
  summarise(l_cveg_star_npp_50 = quantile(l_cveg_star_npp, probs = 0.5, na.rm = TRUE)) 

summary_table <- left_join(summary_table, df_quantiles, by = 'modl')

# Lables for Histogram  
labels <- gdf_sampled %>%
  mutate(l_larger = l_cveg_star_npp > 1) %>%
  group_by(modl) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs,digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Histograms per model
gdf_sampled %>%
  ggplot() +
  geom_density(aes(x = l_cveg_star_npp, y = ..density..), fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) +
  geom_text(data = labels, aes(x = 0.5, y = Inf, label = perc_smaller), vjust =  1.5) +
  geom_text(data = labels, aes(x = 1.5, y = Inf, label = perc_larger), vjust = 1.5)+
  labs(
    x = expression(paste(italic(L)[Cveg^"*":NPP]))
  ) +
  facet_wrap(. ~ modl, ncol = 3,  scales = "free")

## Warning: Removed 287811 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcveg_dnpp_trendy_bymodl.pdf"), width = 9, height = 10.5)

## Warning: Removed 287811 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcveg_dnpp_trendy_bymodl.png"), width = 9, height = 10.5)

## Warning: Removed 287811 rows containing non-finite outside the scale range
## (`stat_density()`).

Maps

By model.

plot_by_model <- function(use_modl, df){
  
  gdf_long <- df %>%
    unnest(data) %>%
    dplyr::filter(!is.nan(l_npp_gpp), !is.infinite(l_npp_gpp)) |> 
    filter(modl == use_modl)
  
  ggplot() +
    geom_tile(
      data = gdf_long,
      aes(x = lon, y = lat, fill = l_cveg_star_npp),
      show.legend = TRUE
      ) +
    geom_sf(
      data = coast,
      colour = 'black',
      linewidth = 0.3
    )  +
    coord_sf(
      ylim = c(-60, 85),
      expand = FALSE
    ) +
    khroma::scale_fill_vik(
      name = expression(paste(italic(L)[Cveg^"*":NPP])),
      limits = c(0, 2), 
      oob = scales::squish,
      midpoint = 1,
      reverse = TRUE
      ) +
    labs(
      title = use_modl
    ) +
    theme_void() +
    theme(
      legend.position = "none",
      plot.title = element_text(hjust = 0.5, size = 10) 
      )
}

plotlist <- purrr::map(
  as.list(unique(gdf$modl)),
  ~ plot_by_model(., gdf)
)

legend_plot <- plotlist[[1]] +
    theme(legend.position = "right")

legend <- cowplot::get_legend(legend_plot)

map_grid <- cowplot::plot_grid(
  plotlist = plotlist,
  ncol = 3
)

final_plot <- cowplot::plot_grid(
  map_grid,
  legend,
  ncol = 2,
  rel_widths = c(1, 0.08)  # adjust the legend width as needed
)

final_plot

ggsave(
  filename = here("fig/map_dcveg_dnpp_bymodel.pdf"),
  height = 10, 
  width = 10
  )

ggsave(
  filename = here("fig/map_dcveg_dnpp_bymodel.png"),
  height = 10, 
  width = 10
  )

All models pooled.

gdf_long_1x1deg <- gdf %>%
  select(-data) %>% 
  unnest(data_1x1deg) %>%
  group_by(lon, lat) %>% 
  summarise(
    l_cveg_star_npp_median = median(l_cveg_star_npp, na.rm = TRUE),
    l_cveg_star_npp_sd = sd(l_cveg_star_npp, na.rm = TRUE)
  )

## `summarise()` has grouped output by 'lon'. You can override using the `.groups`
## argument.

gg_map_agg <- ggplot() +
  geom_tile(
    data = gdf_long_1x1deg,
    aes(x = lon, y = lat, fill = l_cveg_star_npp_median),
    show.legend = TRUE
    ) +
  geom_sf(
    data = coast,
    colour = 'black',
    linewidth = 0.3
  )  +
  coord_sf(
    ylim = c(-60, 85),
    expand = FALSE
  ) +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L)[Cveg^"*":NPP])),
    limits = c(0.2, 1.8),  
    midpoint = 1,
    oob = scales::squish,
    reverse = TRUE
    ) +
  theme_void()

gg_map_agg

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

Publication Fig. 3

cowplot::plot_grid(
  gg3_cveg_star_npp,
  gg_map_agg, 
  ncol = 1,
  labels = c("", "c"),
  rel_heights = c(1, 0.9)
)

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

ggsave(filename = here("fig/combined_cveg_npp.pdf"), height = 5, width = 6)

Croot vs Cveg

Taking the ratio of Croot vs. Cveg here, both for their non-steady conditions (without applying Eq. 12 from paper). This is because root C definition is inconsistent across models and may include (or not) coarse roots with slow turnover - similar to that of wood C (reflected in total vegetation C).

All models pooled

set.seed(1982)

gdf_sampled <- gdf %>%
  mutate(data = purrr::map(data, ~slice_sample(., n = nsample))) %>%
  unnest(data) %>%
  dplyr::filter(
    !is.nan(dcroot), !is.nan(dcveg), 
    !is.infinite(dcroot), !is.infinite(dcveg),
    !is.nan(l_croot_cveg), !is.infinite(l_croot_cveg))

# density scatter-plot for pooled data, balanced samples, original grid
gg_densityscatter <- gdf_sampled %>%
  ggplot(aes(x = dcveg, y = dcroot)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, italic(C)[veg], "/", italic(C)[veg])),
    y = expression(paste(Delta, italic(C)[root], "/", italic(C)[root])),
  )

# Number for text
print("Quartiles of L_Croot:Cveg")

## [1] "Quartiles of L_Croot:Cveg"

print(format(quantile(gdf_sampled$l_croot_cveg, probs =  c(0.25, 0.5, 0.75), na.rm = TRUE), digits = 3))

##     25%     50%     75% 
## "0.902" "1.022" "1.262"

# Lables for Histogram  
labels <- gdf_sampled %>%
  ungroup()%>%
  mutate(l_larger = l_croot_cveg > 1) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs,digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Density plot
gg_density <- gdf_sampled %>% 
  ggplot(aes(x = l_croot_cveg, y = ..density..)) +
  geom_density(fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) + 
  geom_text(
    data = labels,
    aes(x = 0.5, y = Inf, label = perc_smaller),
    vjust= 1.5) +
  geom_text(
    data = labels,
    aes(x = 1.5, y = Inf, label = perc_larger),
    vjust= 1.5) +
  labs(
    x = expression(paste(italic(L)[Croot:Cveg]))
  )

# Combine density scatter and density distribution plots
gg3_l_croot_cveg <- cowplot::plot_grid(
  gg_densityscatter, 
  gg_density, 
  align = 'h',     # 'v' for vertical alignment
  axis = 'tb',
  ncol = 2,
  labels = c("a", "b")
)

## Warning: Removed 49416 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 43 rows containing missing values or values outside the scale range
## (`geom_hex()`).

## Warning: Removed 66439 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/croot_cveg_pooled.pdf"), plot = gg3_l_croot_cveg, width = 8, height = 4)

By model

set.seed(1982)

gdf_sampled <- gdf %>%
  unnest(data) %>%
  bind_rows(
    .,
    gdf %>%
      mutate(n = purrr::map_int(data, ~nrow(.))) %>%
      mutate(data = purrr::map(data, ~slice_sample(., n = nsample, replace = TRUE))) %>% # smallest set here
      unnest(data) %>%
      mutate(modl = "ALL")
  ) %>%
  dplyr::filter(!is.nan(dcroot), !is.nan(dcveg), !is.infinite(dcroot), !is.infinite(dcveg))

# density scatter
gdf_sampled %>%
  ggplot(aes(x = dcveg, y = dcroot)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, italic(C)[veg], "/", italic(C)[veg])),
    y = expression(paste(Delta, italic(C)[root], "/", italic(C)[root]))
  ) +
  facet_wrap(. ~ modl,  ncol = 3,  scales = "free") +
  theme(
    legend.position = "none",
    strip.background = element_blank(),
    strip.text = element_text(size = 12, hjust = 0)
    )

## Warning: Removed 148718 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dcroot_dcveg_bymodl.pdf"), width = 9, height = 13)

## Warning: Removed 148718 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dcroot_dcveg_bymodl.png"), width = 9, height = 13)

## Warning: Removed 148718 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

# we need to get some stat for the overall excel table
df_quantiles <- gdf_sampled %>%
  group_by(modl) %>%
  summarise(l_croot_cveg_50 = quantile(l_croot_cveg, probs = 0.5, na.rm = TRUE)) 

summary_table <- left_join(summary_table, df_quantiles, by = 'modl')

# Lables for Histogram  
labels <- gdf_sampled %>%
  mutate(l_larger = l_croot_cveg > 1) %>%
  group_by(modl) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs, digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Histograms per model
gdf_sampled %>%
  ggplot() +
  geom_density(aes(x = l_croot_cveg, y = ..density..), fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) +
  geom_text(data = labels, aes(x = 0.5, y = Inf, label = perc_smaller), vjust =  1.5) +
  geom_text(data = labels, aes(x = 1.5, y = Inf, label = perc_larger), vjust = 1.5)+
  labs(
    x = expression(paste(italic(L)[Croot:Cveg]))
  ) +
  facet_wrap(. ~ modl, ncol = 3,  scales = "free")

## Warning: Removed 196273 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcroot_dcveg_bymodl.pdf"), width = 9, height = 10.5)

## Warning: Removed 196273 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcroot_dcveg_bymodl.png"), width = 9, height = 10.5)

## Warning: Removed 196273 rows containing non-finite outside the scale range
## (`stat_density()`).

Maps

By model.

plot_by_model <- function(use_modl, df){
  
  gdf_long <- df %>%
    unnest(data) %>%
    dplyr::filter(!is.nan(l_npp_gpp), !is.infinite(l_npp_gpp)) |> 
    filter(modl == use_modl)
  
  ggplot() +
    geom_tile(
      data = gdf_long,
      aes(x = lon, y = lat, fill = l_croot_cveg),
      show.legend = TRUE
      ) +
    geom_sf(
      data = coast,
      colour = 'black',
      linewidth = 0.3
    )  +
    coord_sf(
      ylim = c(-60, 85),
      expand = FALSE
    ) +
    khroma::scale_fill_vik(
      name = expression(paste(italic(L)[Croot:Cveg])),
      limits = c(0, 2), 
      oob = scales::squish,
      midpoint = 1,
      reverse = TRUE
      ) +
    labs(
      title = use_modl
    ) +
    theme_void() +
    theme(
      legend.position = "none",
      plot.title = element_text(hjust = 0.5, size = 10) 
      )
}

plotlist <- purrr::map(
  as.list(unique(gdf$modl)),
  ~ plot_by_model(., gdf)
)

legend_plot <- plotlist[[1]] +
    theme(legend.position = "right")

legend <- cowplot::get_legend(legend_plot)

map_grid <- cowplot::plot_grid(
  plotlist = plotlist,
  ncol = 3
)

final_plot <- cowplot::plot_grid(
  map_grid,
  legend,
  ncol = 2,
  rel_widths = c(1, 0.08)  # adjust the legend width as needed
)

final_plot

# # test
# plot_by_model("DLEM", gdf)

ggsave(
  filename = here("fig/map_dcroot_dcveg_bymodel.pdf"),
  height = 10, 
  width = 10
  )

ggsave(
  filename = here("fig/map_dcroot_dcveg_bymodel.png"),
  height = 10, 
  width = 10
  )

All models pooled.

gdf_long_1x1deg <- gdf %>%
  select(-data) %>% 
  unnest(data_1x1deg) %>%
  group_by(lon, lat) %>% 
  summarise(
    l_croot_cveg_median = median(l_croot_cveg, na.rm = TRUE),
    l_croot_cveg_sd = sd(l_croot_cveg, na.rm = TRUE)
  )

## `summarise()` has grouped output by 'lon'. You can override using the `.groups`
## argument.

gg_map_agg <- ggplot() +
  geom_tile(
    data = gdf_long_1x1deg,
    aes(x = lon, y = lat, fill = l_croot_cveg_median),
    show.legend = TRUE
    ) +
  geom_sf(
    data = coast,
    colour = 'black',
    linewidth = 0.3
  )  +
  coord_sf(
    ylim = c(-60, 85),
    expand = FALSE
  ) +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L)[Croot:Cveg])),
    limits = c(0.2, 1.8),  
    midpoint = 1,
    oob = scales::squish,
    reverse = TRUE
    ) +
  theme_void()

Publication Fig. 4

cowplot::plot_grid(
  gg3_l_croot_cveg,
  gg_map_agg, 
  ncol = 1,
  labels = c("", "c"),
  rel_heights = c(1, 0.9)
)

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

ggsave(filename = here("fig/combined_croot_cveg.pdf"), height = 5, width = 6)

Croot vs Cveg*

Taking the ratio of Croot vs. Cveg where the latter is estimated for steady-state conditions.

All models pooled

set.seed(1982)

gdf_sampled <- gdf %>%
  mutate(data = purrr::map(data, ~slice_sample(., n = nsample))) %>%
  unnest(data) %>%
  dplyr::filter(
    !is.nan(dcroot), !is.nan(dcveg_star), 
    !is.infinite(dcroot), !is.infinite(dcveg_star),
    !is.nan(l_croot_cveg_star), !is.infinite(l_croot_cveg_star))

# density scatter-plot for pooled data, balanced samples, original grid
gg_densityscatter <- gdf_sampled %>%
  ggplot(aes(x = dcveg_star, y = dcroot)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, italic(C)[veg^"*"], "/", italic(C)[veg])),
    y = expression(paste(Delta, italic(C)[root], "/", italic(C)[root])),
  )

# Number for text
print("Quartiles of L_Croot:Cveg*")

## [1] "Quartiles of L_Croot:Cveg*"

print(format(quantile(gdf_sampled$l_croot_cveg_star, probs =  c(0.25, 0.5, 0.75), na.rm = TRUE), digits = 3))

##     25%     50%     75% 
## "0.746" "0.932" "1.154"

# Lables for Histogram  
labels <- gdf_sampled %>%
  ungroup()%>%
  mutate(l_larger = l_croot_cveg_star > 1) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs, digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Density plot
gg_density <- gdf_sampled %>% 
  ggplot(aes(x = l_croot_cveg_star, y = ..density..)) +
  geom_density(fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) + 
  geom_text(
    data = labels,
    aes(x = 0.5, y = Inf, label = perc_smaller),
    vjust= 1.5) +
  geom_text(
    data = labels,
    aes(x = 1.5, y = Inf, label = perc_larger),
    vjust= 1.5) +
  labs(
    x = expression(paste(italic(L)[Croot:Cveg^"*"]))
  )

# Combine density scatter and density distribution plots
gg3_l_croot_cveg_star <- cowplot::plot_grid(
  gg_densityscatter, 
  gg_density, 
  align = 'h',     # 'v' for vertical alignment
  axis = 'tb',
  ncol = 2,
  labels = c("a", "b")
)

## Warning: Removed 49701 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 42 rows containing missing values or values outside the scale range
## (`geom_hex()`).

## Warning: Removed 65860 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/croot_cveg_star_pooled.pdf"), plot = gg3_l_croot_cveg_star, width = 8, height = 4)

By model

set.seed(1982)

gdf_sampled <- gdf %>%
  unnest(data) %>%
  bind_rows(
    .,
    gdf %>%
      mutate(n = purrr::map_int(data, ~nrow(.))) %>%
      mutate(data = purrr::map(data, ~slice_sample(., n = nsample, replace = TRUE))) %>% # smallest set here
      unnest(data) %>%
      mutate(modl = "ALL")
  ) %>%
  dplyr::filter(!is.nan(dcroot), !is.nan(dcveg_star), !is.infinite(dcroot), !is.infinite(dcveg_star))

# density scatter
gdf_sampled %>%
  ggplot(aes(x = dcveg_star, y = dcroot)) +
  geom_hex(bins = 50, show.legend = FALSE) +
  theme_classic() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  khroma::scale_fill_batlowW(trans = "log", reverse = TRUE) +
  xlim(-1, 2) + 
  ylim(-1, 2) +
  labs(
    x = expression(paste(Delta, italic(C)[veg], "/", italic(C)[veg])),
    y = expression(paste(Delta, italic(C)[root], "/", italic(C)[root]))
  ) +
  facet_wrap(. ~ modl,  ncol = 3,  scales = "free") +
  theme(
    legend.position = "none",
    strip.background = element_blank(),
    strip.text = element_text(size = 12, hjust = 0)
    )

## Warning: Removed 149689 rows containing non-finite outside the scale range
## (`stat_binhex()`).

## Warning: Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dcroot_dcveg_star_bymodl.pdf"), width = 9, height = 13)

## Warning: Removed 149689 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

ggsave(here("fig/dcroot_dcveg_star_bymodl.png"), width = 9, height = 13)

## Warning: Removed 149689 rows containing non-finite outside the scale range
## (`stat_binhex()`).
## Removed 93 rows containing missing values or values outside the scale range
## (`geom_hex()`).

# we need to get some stat for the overall excel table
df_quantiles <- gdf_sampled %>%
  group_by(modl) %>%
  summarise(l_croot_cveg_star_50 = quantile(l_croot_cveg_star, probs = 0.5, na.rm = TRUE)) 

summary_table <- left_join(summary_table, df_quantiles, by = 'modl')

# Lables for Histogram  
labels <- gdf_sampled %>%
  mutate(l_larger = l_croot_cveg_star > 1) %>%
  group_by(modl) %>%
  summarise(
    n_recs = n(),
    perc_larger = round(sum(l_larger, na.rm = TRUE)/n_recs, digits = 2),
    perc_smaller = 1 - perc_larger
    )

# Histograms per model
gdf_sampled %>%
  ggplot() +
  geom_density(aes(x = l_croot_cveg_star, y = ..density..), fill = "grey70") +
  theme_classic() +
  geom_hline(yintercept = 0, linetype = "dotted") +
  # geom_vline(xintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 1, linetype = "dotted") +
  xlim(-1, 2.5) +
  geom_text(data = labels, aes(x = 0.5, y = Inf, label = perc_smaller), vjust =  1.5) +
  geom_text(data = labels, aes(x = 1.5, y = Inf, label = perc_larger), vjust = 1.5)+
  labs(
    x = expression(paste(italic(L)[Croot:Cveg^"*"]))
  ) +
  facet_wrap(. ~ modl, ncol = 3,  scales = "free")

## Warning: Removed 197568 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcroot_dcveg_star_bymodl.pdf"), width = 9, height = 10.5)

## Warning: Removed 197568 rows containing non-finite outside the scale range
## (`stat_density()`).

ggsave(here("fig/hist_dcroot_dcveg_star_bymodl.png"), width = 9, height = 10.5)

## Warning: Removed 197568 rows containing non-finite outside the scale range
## (`stat_density()`).

Maps

By model.

plot_by_model <- function(use_modl, df){
  
  gdf_long <- df %>%
    unnest(data) %>%
    dplyr::filter(!is.nan(l_npp_gpp), !is.infinite(l_npp_gpp)) |> 
    filter(modl == use_modl)
  
  ggplot() +
    geom_tile(
      data = gdf_long,
      aes(x = lon, y = lat, fill = l_croot_cveg_star),
      show.legend = TRUE
      ) +
    geom_sf(
      data = coast,
      colour = 'black',
      linewidth = 0.3
    )  +
    coord_sf(
      ylim = c(-60, 85),
      expand = FALSE
    ) +
    khroma::scale_fill_vik(
      name = expression(paste(italic(L)[Croot:Cveg^"*"])),
      limits = c(0, 2), 
      oob = scales::squish,
      midpoint = 1,
      reverse = TRUE
      ) +
    labs(
      title = use_modl
    ) +
    theme_void() +
    theme(
      legend.position = "none",
      plot.title = element_text(hjust = 0.5, size = 10) 
      )
}

plotlist <- purrr::map(
  as.list(unique(gdf$modl)),
  ~ plot_by_model(., gdf)
)

legend_plot <- plotlist[[1]] +
    theme(legend.position = "right")

legend <- cowplot::get_legend(legend_plot)

map_grid <- cowplot::plot_grid(
  plotlist = plotlist,
  ncol = 3
)

final_plot <- cowplot::plot_grid(
  map_grid,
  legend,
  ncol = 2,
  rel_widths = c(1, 0.08)  # adjust the legend width as needed
)

final_plot

ggsave(
  filename = here("fig/map_dcroot_dcveg_star_bymodel.pdf"),
  height = 10, 
  width = 10
  )

ggsave(
  filename = here("fig/map_dcroot_dcveg_star_bymodel.png"),
  height = 10, 
  width = 10
  )

All models pooled.

gdf_long_1x1deg <- gdf %>%
  select(-data) %>% 
  unnest(data_1x1deg) %>%
  group_by(lon, lat) %>% 
  summarise(
    l_croot_cveg_star_median = median(l_croot_cveg_star, na.rm = TRUE),
    l_croot_cveg_star_sd = sd(l_croot_cveg_star, na.rm = TRUE)
  )

## `summarise()` has grouped output by 'lon'. You can override using the `.groups`
## argument.

gg_map_agg <- ggplot() +
  geom_tile(
    data = gdf_long_1x1deg,
    aes(x = lon, y = lat, fill = l_croot_cveg_star_median),
    show.legend = TRUE
    ) +
  geom_sf(
    data = coast,
    colour = 'black',
    linewidth = 0.3
  )  +
  coord_sf(
    ylim = c(-60, 85),
    expand = FALSE
  ) +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L)[Croot:Cveg^"*"])),
    limits = c(0.2, 1.8),  
    midpoint = 1,
    oob = scales::squish,
    reverse = TRUE
    ) +
  theme_void()

Publication Fig. SXXX

cowplot::plot_grid(
  gg3_l_croot_cveg_star,
  gg_map_agg, 
  ncol = 1,
  labels = c("", "c"),
  rel_heights = c(1, 0.9)
)

## Warning: Removed 297 rows containing missing values or values outside the scale range
## (`geom_tile()`).

ggsave(filename = here("fig/combined_croot_cveg_star.pdf"), height = 5, width = 6)

Summary table

df <- summary_table |> 
  filter(modl != "ALL")

# Pivot to long format
column_order <- c("l_npp_gpp_50", "l_cveg_star_npp_50", "l_croot_cveg_50")

column_labels <- c(
  expression(paste(italic(L)[NPP:GPP])),
  expression(paste(italic(L)[Cveg^"*":NPP])),
  expression(paste(italic(L)[Croot:Cveg]))
  )

# By model
df_long <- df[, c(1, 3, 4, 5)] %>%
  pivot_longer(-modl, names_to = "link", values_to = "value") |> 
  mutate(link = factor(link, levels = column_order))

gg1 <- ggplot(df_long, aes(x = link, y = modl, fill = value)) +
  geom_tile(color = "white", show.legend = FALSE) +
  geom_text(aes(label = round(value, 2)), color = "grey30") +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L))),
    limits = c(0.2, 1.8),  
    midpoint = 1,
    reverse = TRUE,
    na.value = "grey70"
    ) +
  scale_x_discrete(labels = column_labels, position = "top") +
  scale_y_discrete(limits = rev(unique(df$modl))) +
  coord_fixed(ratio = 0.5) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 0, hjust = 0.5),
    panel.grid = element_blank(),
    legend.position = "bottom"
  ) +
  labs(
    fill = "Value",
    y = "",
    x = ""
    )

# C-N vs C-only
cn_id <- tibble(
  cn_coupled = c(TRUE, FALSE),
  cn_label = c("    C-N", "    C-only")
)

df_long <- df %>%
  group_by(cn_coupled) |> 
  summarise(across(all_of(c("l_npp_gpp_50", "l_cveg_star_npp_50", "l_croot_cveg_50")), ~ mean(.x, na.rm = TRUE))) |> 
  pivot_longer(-cn_coupled, names_to = "link", values_to = "value") |> 
  mutate(link = factor(link, levels = column_order)) |> 
  left_join(
    cn_id,
    by = "cn_coupled"
  )

gg2 <- ggplot(df_long, aes(x = link, y = cn_label, fill = value)) +
  geom_tile(color = "white", show.legend = FALSE) +
  geom_text(aes(label = round(value, 2)), color = "grey30") +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L))),
    limits = c(0.2, 1.8),  
    oob = scales::squish,
    midpoint = 1,
    reverse = TRUE,
    na.value = "grey70"
    ) +
  scale_x_discrete(labels = column_labels, position = "top") +
  coord_fixed(ratio = 0.5) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    panel.grid = element_blank(),
    legend.position = "bottom"
  ) +
  labs(
    fill = "Value",
    y = "",
    x = ""
    )

# aggregated across all models
df_long <- df %>%
  summarise(across(all_of(c("l_npp_gpp_50", "l_cveg_star_npp_50", "l_croot_cveg_50")), ~ mean(.x, na.rm = TRUE))) |> 
  pivot_longer(cols = everything(), names_to = "link", values_to = "value") |> 
  mutate(link = factor(link, levels = column_order)) |> 
  mutate(label = "     ALL")

gg3 <- ggplot(df_long, aes(x = link, y = label, fill = value)) +
  geom_tile(color = "white", show.legend = FALSE) +
  geom_text(aes(label = round(value, 2)), color = "grey30") +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L))),
    limits = c(0.2, 1.8),  
    oob = scales::squish,
    midpoint = 1,
    reverse = TRUE,
    na.value = "grey70"
    ) +
  scale_x_discrete(labels = column_labels, position = "top") +
  coord_fixed(ratio = 0.5) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    panel.grid = element_blank(),
    legend.position = "bottom"
  ) +
  labs(
    fill = "Value",
    y = "",
    x = ""
    )

# get legend
# Dummy plot just to extract the legend
legend_plot <- ggplot(df_long, aes(x = link, y = label, fill = value)) +
  geom_tile() +
  khroma::scale_fill_vik(
    name = expression(paste(italic(L))),
    limits = c(0.2, 1.8),  
    oob = scales::squish,
    midpoint = 1,
    reverse = TRUE,
    na.value = "grey70"
    ) +
  theme(legend.position = "bottom")

# Extract legend object
legend <- cowplot::get_legend(legend_plot)

heatmaps <- cowplot::plot_grid(
  gg1,
  gg2, 
  gg3,
  ncol = 1,
  align = "v",    # vertical alignment
  axis = "lr",    # align left and right margins (y-axis)
  rel_heights = c(1, 0.182, 0.127),
  labels = letters[1:3]
)

## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_text()`).

final_plot <- cowplot::plot_grid(
  heatmaps,
  legend,
  ncol = 1,
  rel_heights = c(1, 0.1)  # adjust legend height as needed
)

# Display
final_plot

ggsave(here("fig/summary_table.pdf"), width = 3.5, height = 9)

Diagnose Linearity in DGVMs

Huanyuan Zhang-Zheng and Benjamin Stocker

2025-07-30

Approach

Process TRENDY v7 files

Process files: multi-annual mean

Collect data

Cleaning

Calculate relationships

Regrid

Model meta info

Plot relationships

GPP vs NPP

All models pooled

By model

Maps

Publication Fig. 1

Cveg* vs NPP

All models pooled

By model

Maps

Publication Fig. 3

Croot vs Cveg

All models pooled

By model

Maps

Publication Fig. 4

Croot vs Cveg*

All models pooled

By model

Maps

Publication Fig. SXXX

Summary table