library(tidyverse)
library(matilda)
gcp_data <- read.csv("data/gcp_data.csv", stringsAsFactors = F)
ocean_uptake_df <- gcp_data %>%
select(Year, ocean_sink) %>%
rename(ocean_uptake = ocean_sink,
year = Year) %>%
filter(year > 1799) %>%
na.omit()
criterion_ocean_uptake <- new_criterion(var = OCEAN_UPTAKE(),
years = ocean_uptake_df$year,
obs_values = ocean_uptake_df$ocean_uptake)
When scoring models using historcial data it may be important to understand the effect each criterion has on final weights of model iterations, as the weights will eventually affect the final distribution of the variable of interest.
One way to assess the affect of different criterion on model weights is a sensitivity test whereby we use different scoring mechanisms and compare the final results among them.
In matilda
there is a way we can build a sensitivity
test using multi_criterion_weighting
. This function takes a
score list of different criterion (e.g.,
criterion_co2_obs()
and criterion_gmst_obs()
)
and produces a combined score. The primary utility of this function is
we can use multiple line of evidence to weight models. But secondarily,
we can use this function to conduct sensitivity tests. If we think of
all our separate criterion as dials, we can change the volume of each
criterion as we compute combined multi-criterion score. By default, all
the dials are balanced. But, if we want temperature to be louder in the
final multi-criterion scores, we can turn it up, and turn CO2 down.
The data for this tutorial should be an unweighted model result. For
this criterion, we are using a data frame with only one scenario.
However, if you are analyzing multiple scenarios (a list of unweighted
model results) this tutorial is transferable to an lapply
workflow.
We will weight this result with different criterion and show how
scores change as we adjust the “criterion dials” with
multi-criterion
weighting.
The data we are using looks like this:
## scenario year variable value units run_number
## 1 ssp245 1800 gmst -0.05391733 degC 1
## 2 ssp245 1801 gmst -0.04609289 degC 1
## 3 ssp245 1802 gmst -0.03992357 degC 1
## 4 ssp245 1803 gmst -0.03456811 degC 1
## 5 ssp245 1804 gmst -0.02978247 degC 1
## 6 ssp245 1805 gmst -0.02637711 degC 1
If we score this result using observed temperature
(criterion_gmst_obs()
), like this:
# Compute model scores based on observed temperature
temp_scores <-
score_runs(model_result_df, # model result as a data frame
criterion_gmst_obs(), # criterion used in scoring
score_bayesian) # score function we want to use
We get this result:
# Print first five lines of the data frame
head(temp_scores)
## weights run_number
## 1 1.578293e-02 1
## 2 5.270561e-03 2
## 3 6.127983e-06 3
## 4 1.027169e-06 4
## 5 2.881162e-02 5
## 6 2.153625e-02 6
If we score the result using observed CO2 concentrations
(criterion_co2_obs()
), like this
# Compute model scores based on observed CO2 concentrations
co2_scores <- score_runs(model_result_df,
criterion_co2_obs(),
score_bayesian)
We get this result:
# Print first five lines of the data frame
head(co2_scores)
## weights run_number
## 1 1.326210e-03 1
## 2 1.268191e-02 2
## 3 4.241526e-03 3
## 4 1.289947e-06 4
## 5 1.845426e-03 5
## 6 4.090668e-03 6
multi_criteria_weighting
Now that we have scores, we can play with the criterion dials.
The multi_criteria_weighting
function take two
arguments:
score_list
: a list of the score data frames - these
are the two data frames we previously created that score using observed
temperature and CO2. The length of this list has not limit.
criterion_weights
: a list of weights applied to each
scored data frame - this list should add to 1.0 and can be thought of as
a proportion of a total. The list corresponds to the order of the score
list. In other words, df1
in the score list corresponds to
value1
in criterion_weights
vector
list.
We can test the logic of this function.
Based on how the function works, if we turn the “temperature dial” up to 100% and the co2 dial down to 0%. the scores produced should equal what we have already calculated using observed temperature.
# list of scored data frames
scores <- list(temp_scores, co2_scores)
# running multi_criteria_weighting function
mc_weights <- multi_criteria_weighting(scores_list = scores, criterion_weights = c(1.0, 0.0))
# View first five lines of the mc_weights result
head(mc_weights)
## run_number mc_weight
## 1 1 1.578293e-02
## 2 2 5.270561e-03
## 3 3 6.127983e-06
## 4 4 1.027169e-06
## 5 5 2.881162e-02
## 6 6 2.153625e-02
# View first five lines of the temp_scores result
head(temp_scores)
## weights run_number
## 1 1.578293e-02 1
## 2 5.270561e-03 2
## 3 6.127983e-06 3
## 4 1.027169e-06 4
## 5 2.881162e-02 5
## 6 2.153625e-02 6
We could do the same thing by turning the “CO2 dial” up and temperature down.
If we compute multi_criteria_weights
with a balance
(default). We should get a compromised score.
In this example I add a new scoring criterion -
ocean_c_uptake
.
We can build our own function to complete sensitivity tests:
score_sensitivity <- function(result_df, criterion_weights = NULL, mc_weights_name = "mc_weights"){
# Scoring with temperature
scores_temp = score_runs(result_df,
criterion_gmst_obs(),
score_bayesian)
# Scoring with co2
scores_co2 = score_runs(result_df,
criterion_co2_obs(),
score_bayesian)
# Scoring with ocean_c_uptake
scores_ocean_c = score_runs(result_df,
criterion_ocean_uptake,
score_bayesian)
# creating score_list
score_list = list(scores_temp, scores_co2, scores_ocean_c)
# computing criterion weights
if (is.null(criterion_weights)) {
mc_weights = multi_criteria_weighting(score_list, criterion_weights = criterion_weights)
} else {
mc_weights = multi_criteria_weighting(score_list)
}
# merge with data frame
result_scored_df <- merge(result_df, scores_temp, by = "run_number")
result_scored_df <- merge(result_scored_df, scores_co2, by = "run_number")
result_scored_df <- merge(result_scored_df, scores_ocean_c, by = "run_number")
result_scored_df <- merge(result_scored_df, mc_weights, by = "run_number")
# rename mc_weights column
names(result_scored_df)[names(result_scored_df) == "mc_weight"] <- mc_weights_name
names(result_scored_df)[names(result_scored_df) == "weights.x"] <- "temp_score"
names(result_scored_df)[names(result_scored_df) == "weights.y"] <- "co2_score"
names(result_scored_df)[names(result_scored_df) == "weights"] <- "ocean_c_score"
return(result_scored_df)
}
With this function we can quickly substitute in and out different “dial settings” to emphasize different criterion for weighting our model results.
These scores can then be added to our result data frame and be used when computing metric and probabilities, for example:
# computing a temperature scored df
result_scored <- score_sensitivity(model_result_df, criterion_weights = c(0.4, 0.1, 0.6), mc_weights_name = "oc_temp")
We can check to see how scoring has worked with a simple visualization. However, the spaghetti plot is only goes so far. In order to really see the possible impact, completing the probabilistic workflow to produce a stacked bar plot of probabilities will be the most informative.
gmst_spaghetti_plot <-
ggplot(data = result_scored) + # sets the plotting environment
geom_line( # identifies plot type
aes(
x = year, # identifies x-axis variable
y = value, # identifies y-axis variable
group = run_number, # identifies how data should be grouped
color = temp_score, # how color should be applied
alpha = temp_score), # how transparency should be applied
linewidth = 0.5) + # sets line width - personal preference
scale_color_gradient(low = "skyblue", high = "dodgerblue4", name = "Weights") + # what colors should be used
scale_alpha_continuous(range(c(0.8, 1))) + # how extreme should the transparency gradient be - this is optional
guides(alpha = "none") + # remove alpha legend
facet_wrap(~ variable, scales = "free_y") + # panels the figure by scenario name
labs(x = "Years", y = "Ocean Carbon Uptake") + # adds labels to axes
ggtitle(label = "Matilda Ensemble weighted by historical temperature") +
theme_light() # sets a general theme - personal preference
gmst_spaghetti_plot