I calibrated Hector climate parameters to best emulate the ESM temperature results from the CMIP5 concentration driven experiments. I used CMIP5 model ensemble means as the comparison data. This is just a first pass at looking at the results.
# Library
library(ggplot2)
library(dplyr)
library(tidyr)
library(gridExtra)
# Input directory
INPUT_DIR <- '/Users/dorh012/Documents/2019/hectorcal/singleESMcalibration'
# Import results from the single ESM calibration.
#
# Args:
# file_list: A list of the rds file lists.
# Return: A data frame of the calibration convergence, run, model, ensemble, the minimized MSE
# value, the optim run count, parameter values, experiment, variable, and the experiment variable MSE.
import_calibration_rslts <- function(file_list){
dplyr::bind_rows(lapply(file_list , function(file = X){
name <- gsub('.rds', '', basename(file))
rslt <- readRDS(file)
# Save the run information as a tibble.
tibble::tibble(convergence = rslt$optim_rslt$convergence,
name = name,
value = rslt$optim_rslt$par,
param = names(rslt$optim_rslt$par),
minimized = rslt$optim_rslt$value,
count = rslt$optim_rslt$counts[['function']]) %>%
tidyr::spread(param, value) %>%
tidyr::separate(name, into = c('run', 'model', 'ensemble'), sep = '_') ->
info
if(info$convergence == 0){
info %>%
dplyr::mutate(join = 1) %>%
dplyr::left_join(tibble::tibble(experiment = rslt$MSE$experiment,
variable = rslt$MSE$variable,
exp_var_MSE = rslt$MSE$MSE,
join = 1),
by = 'join') %>%
dplyr::select(-join)
} else {
info %>%
dplyr::bind_cols(tibble::tibble(experiment = NA,
variable = NA,
exp_var_MSE = NA))
}}))
}
ensembleMean_rslts <- import_calibration_rslts(list.files(INPUT_DIR, '_ensembleMean', full.names = TRUE))
ensembleMean_rslts %>%
select(model, ensemble) %>%
distinct() %>%
nrow
## [1] 35
unique(ensembleMean_rslts$convergence)
## [1] 0
Since all of the convergence values equal 0 it means that all of the runs converged!
summary(ensembleMean_rslts$minimized)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.05121 0.09744 0.12070 0.12803 0.15528 0.26166
best_fit_model <- unique(ensembleMean_rslts[which.min(ensembleMean_rslts$minimized), ]$model)
best_fit_rslts <- readRDS(list.files(INPUT_DIR, paste0(best_fit_model, '_ensembleMean'), full.names = TRUE))
best_fit_rslts$comparison_plot
best_fit_rslts$residual_plot
worst_fit_model <- unique(ensembleMean_rslts[which.max(ensembleMean_rslts$minimized), ]$model)
worst_fit_rslts <- readRDS(list.files(INPUT_DIR, paste0(worst_fit_model, '_ensembleMean'), full.names = TRUE))
worst_fit_rslts$comparison_plot
worst_fit_rslts$residual_plot
It looks like best and the worst model fits Hector dips below the CMIP models from 1955 to around 2000. That begs the question if this happens in all of the fits which will require looking at all of the model fits. The dip below the CMIP models in the comparison plot manifests itself as a period of prelonged negative residuals, so I think the best way to check to see if this happens in all of the models it to look at all of the residual plots.
I think that the best way to determine this is to look at the plots of the residuals (the difference between the normalized hector values and the normalized esm values). I’ve added a horizontal line at y equals 0 to make it easier to see when Hector underestimates the ESM temperature.
# Import the residual plots
lapply(list.files(INPUT_DIR, 'ensembleMean', full.names = TRUE),
function(X){
plot <- readRDS(X)[['residual_plot']]
plot +
geom_hline(aes(yintercept = 0)) +
theme_bw(base_size = 5) +
theme(legend.position = 'none')
}) ->
residual_plots
grid.arrange(grobs = residual_plots)
So see a trend in all of the residuals, including those that only calibrated to the historical period (which I wasn’t expecting). At the start of the historical period the residuals are large and positive and decrease as time approaches 2006.
Here are my initial thoughts on this..
Summary stats of what the calibrated parameter space looks like.
ensembleMean_rslts %>%
select(model, ensemble, alpha, diff, S) %>%
distinct() %>%
gather(param, value, alpha, diff, S) %>%
group_by(param) %>%
dplyr::summarise(min = min(value),
mean = mean(value),
max = max(value))
Some questions I would like to look into and explore.