Function Guide

This guide explains two helper functions used in this project:

  1. normalize_gmst()
  2. produce_metrics()

These functions are defined in our R project, so they do not need to be installed from an R package. To use them, source the .R file that contains them.

For example, run a line of code like:

source("scripts/source/helper_functions.R")

After sourcing the file, the functions are available in your R session.
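
If you want to confirm that sourcing worked, a quick check like the one below may help. It is only a sketch: it assumes the path shown above is correct relative to your working directory, and helper_path and helpers_loaded are illustrative names, not part of the project.

```r
# Path taken from the example above; adjust it if the file lives elsewhere.
helper_path <- "scripts/source/helper_functions.R"

if (file.exists(helper_path)) {
  source(helper_path)
} else {
  message("Helper file not found at: ", helper_path)
}

# Report which of the two helpers are defined as functions in the session.
helpers_loaded <- vapply(
  c("normalize_gmst", "produce_metrics"),
  function(nm) exists(nm, mode = "function"),
  logical(1)
)
print(helpers_loaded)
```

If either entry is FALSE, check the working directory with getwd() and the file path before continuing.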

Use this document as a reference for the functions we have written for our analysis.

normalize_gmst()

Purpose

normalize_gmst() normalizes global mean surface temperature anomaly data (gmst) to a selected reference period.

In climate model analysis, we often want to compare temperature change relative to a baseline period. This function subtracts the average gmst value during the reference period from all gmst values in the same model run.

By default, the reference period is 1850–1900.

Function

normalize_gmst(data, ref_start = 1850, ref_end = 1900)

Arguments

Argument    Description
data        A data frame containing model output. Must include the columns run_number, variable, year, and value.
ref_start   The first year of the reference period. The default is 1850.
ref_end     The last year of the reference period. The default is 1900.

What the function does

The function:

  1. Splits the data by run_number.
  2. Finds the rows where variable == "gmst".
  3. Calculates the mean gmst value during the reference period.
  4. Subtracts that reference-period mean from all gmst values in the same run.
  5. Leaves all non-gmst variables unchanged.
  6. Combines the results back into one data frame.
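
The steps above can be sketched in base R as follows. This is an illustrative re-implementation under the column names listed in the Arguments section, not the project's actual code; normalize_gmst_sketch and the toy data are hypothetical.

```r
# Hypothetical re-implementation for illustration only; the project's own
# normalize_gmst() in helper_functions.R may be written differently.
normalize_gmst_sketch <- function(data, ref_start = 1850, ref_end = 1900) {
  # 1. Split the data by run_number.
  runs <- split(data, data$run_number)

  normalized <- lapply(runs, function(run) {
    # 2. Find the gmst rows, and those that fall in the reference period.
    is_gmst <- run$variable == "gmst"
    in_ref <- is_gmst & run$year >= ref_start & run$year <= ref_end

    # 3. Mean gmst over the reference period for this run.
    ref_mean <- mean(run$value[in_ref])

    # 4. Subtract that mean from every gmst value in the run.
    # 5. Rows for other variables are not touched.
    run$value[is_gmst] <- run$value[is_gmst] - ref_mean
    run
  })

  # 6. Combine the runs back into one data frame.
  out <- do.call(rbind, normalized)
  rownames(out) <- NULL
  out
}

# Toy input (made-up numbers): two runs, gmst plus one other variable.
toy <- data.frame(
  run_number = rep(1:2, each = 6),
  variable = rep(c("gmst", "gmst", "gmst", "npp", "npp", "npp"), times = 2),
  year = rep(c(1850, 1900, 2000), times = 4),
  value = c(1, 3, 5, 10, 11, 12, 2, 4, 9, 20, 21, 22)
)

toy_normalized <- normalize_gmst_sketch(toy)
# gmst values become -1, 1, 3 (run 1) and -1, 1, 6 (run 2);
# the npp values are returned unchanged.
```

In this toy case the reference-period means are 2 for run 1 and 3 for run 2, so each run is shifted by its own baseline.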

Example use

This example assumes our model result is named model_output and is available in the R session (it should be listed in the global environment, in the upper-right panel).

normalized_data <- normalize_gmst(
  data = model_output,
  ref_start = 1850,
  ref_end = 1900)

Because the default reference period is 1850–1900, you can also write:

normalized_data <- normalize_gmst(model_output)

Output

The function returns a data frame with the same general structure as the input data. The gmst values are normalized to the selected reference period. All other variables are returned unchanged.
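
One way to sanity-check the result: because the reference-period mean is subtracted within each run, the mean of the normalized gmst values over the reference period should be approximately zero in every run. A minimal sketch of that check follows; check_reference_mean is a hypothetical helper, and the data frame is a hand-made stand-in for real output.

```r
# Hypothetical helper: mean normalized gmst per run over the reference period.
check_reference_mean <- function(data, ref_start = 1850, ref_end = 1900) {
  ref <- data[data$variable == "gmst" &
                data$year >= ref_start &
                data$year <= ref_end, ]
  tapply(ref$value, ref$run_number, mean)
}

# Hand-made stand-in for normalized output (made-up numbers).
toy_normalized <- data.frame(
  run_number = c(1, 1, 1, 2, 2, 2),
  variable = "gmst",
  year = c(1850, 1900, 2000, 1850, 1900, 2000),
  value = c(-1, 1, 3, -0.5, 0.5, 2)
)

ref_means <- check_reference_mean(toy_normalized)

# TRUE when every run's reference-period mean is (numerically) zero.
all(abs(ref_means) < 1e-8)
```

With real data, pass the same ref_start and ref_end that were used for normalization.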


produce_metrics()

Purpose

produce_metrics() calculates a summary metric for a selected model output variable over a selected range of years.

For example, it can calculate the mean, median, minimum, or maximum value of a variable for each model run over 2081–2100, or over any other range of years.

By default, the function calculates late-century mean values from 2081–2100.

Function

produce_metrics(data, var, years = 2081:2100, FUN = mean)

Arguments

Argument   Description
data       A model result data frame.
var        The variable to calculate a metric for. This variable must be present in the model result.
years      The years used to calculate the summary metric. The default is 2081:2100.
FUN        The function used to calculate the metric, such as mean, median, max, or min. The default is mean.

What the function does

The function:

  1. Creates a new metric using the selected variable, years, and summary function.
  2. Calculates that metric for each model run.
  3. Adds a variable column showing which variable was summarized.
  4. Renames the metric result column to value.
  5. Returns a summary data frame.
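
The steps above can be sketched in base R as follows. This is an illustrative re-implementation, not the project's actual code. It assumes the model result has the columns run_number, variable, year, and value (the columns required by normalize_gmst()), and that the output identifies runs in a run_number column; produce_metrics_sketch and the toy data are hypothetical.

```r
# Hypothetical re-implementation for illustration only; the project's own
# produce_metrics() may be written differently.
produce_metrics_sketch <- function(data, var, years = 2081:2100, FUN = mean) {
  # Keep only the selected variable and years.
  keep <- data$variable == var & data$year %in% years
  subset_data <- data[keep, ]

  # Apply the summary function once per model run; the result column
  # keeps the name value.
  out <- aggregate(value ~ run_number, data = subset_data, FUN = FUN)

  # Record which variable was summarized and order the columns.
  out$variable <- var
  out[, c("run_number", "variable", "value")]
}

# Toy input (made-up numbers): two runs, two variables, two years each.
toy <- data.frame(
  run_number = rep(1:2, each = 4),
  variable = rep(c("gmst", "gmst", "npp", "npp"), times = 2),
  year = rep(c(2081, 2100), times = 4),
  value = c(2, 4, 50, 60, 3, 7, 70, 90)
)

gmst_mean <- produce_metrics_sketch(toy, var = "gmst")           # values 3 and 5
npp_max <- produce_metrics_sketch(toy, var = "npp", FUN = max)   # values 60 and 90
```

Note that FUN is passed as a function object (mean, not "mean" or mean()), which matches the calls shown in this guide.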

Example use

This example assumes our model result is named model_output and is available in the R session (it should be listed in the global environment, in the upper-right panel).

Calculate the mean gmst value from 2081–2100:

gmst_metrics <- produce_metrics(
  data = model_output,
  var = "gmst",
  years = 2081:2100,
  FUN = mean)

Or, calculate the median gmst value from 2081–2100:

gmst_median_metrics <- produce_metrics(
  data = model_output,
  var = "gmst",
  years = 2081:2100,
  FUN = median)

Or, calculate the mean value for a different variable:

npp_metrics <- produce_metrics(
  data = model_output,
  var = "npp",
  years = 2081:2100,
  FUN = mean)

Output

The function returns a summary data frame with one metric value for each model run. The output includes the summarized variable name and the metric value.