Background

To this point you have worked through cases in which matilda can be used to compute probabilistic climate projections. A large number of runs (10k+) is often needed to get a stable solution that is robust enough to be used for inference. In these cases you may have noticed that running matilda functions (specifically iterate_model()) can be slow. One way we can increase the computational efficiency of the matilda workflow is to take advantage of localized parallel computing (also called single-node parallel computing).

By leveraging localized parallel computing, the workload can be split among multiple cores on a single machine (the "workers"), allowing runs to occur simultaneously and thereby increasing the speed of computation. This approach can significantly reduce the time required to generate many thousands of Hector runs using matilda.

For this tutorial we will take advantage of functions in the parallel package.

Loading packages

First, install and load the packages we need to complete this tutorial. We will need matilda and parallel. If you have not installed these packages before, you will need to do so before loading them. Here is an example of installing matilda:

# Install Matilda from Github
remotes::install_github("jgcri/matilda")

If you already have matilda installed we can proceed with loading the package:

# Loading packages
library(matilda)
## Loading required package: hector
library(parallel) 

Building .ini List

matilda is integrated with the Hector simple climate model, and therefore we run the model in much the same way we would run Hector.

For this tutorial we will run matilda with four different SSP scenarios:

# locate the ini directory
ini_dir <- paste0(system.file("input", package = "hector"), "/")

# read in ini files into a list
ini_list <- list(ssp119 = paste0(ini_dir, "hector_ssp119.ini"),
                 ssp245 = paste0(ini_dir, "hector_ssp245.ini"),
                 ssp370 = paste0(ini_dir, "hector_ssp370.ini"),
                 ssp585 = paste0(ini_dir, "hector_ssp585.ini"))

By creating a list we can loop a function across its elements. In this case each element in the list is the path to an ini file.
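
As an optional check, we can loop a simple base-R function over this list to confirm that each ini file exists on disk:

# optional check: confirm each ini file in the list exists on disk
sapply(ini_list, file.exists)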

Building Perturbed Parameter Set

Before running the model we will use one of the ini files to initialize a core that will be used to produce the perturbed parameter set for the model.

# Using generate_params() to build the perturbed parameter set

# set seed for replication
set.seed(123)

# set sample size (we will run with a smallish sample size now)
n <- 10

# initiate a core using the first ini file in ini_list
params_core <- newcore(ini_list[[1]])

# produce parameter set 
params <- generate_params(core = params_core,
                          draws = n)
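
If you want a quick look at the perturbed parameter set before moving on, you can preview the first few rows (each column holds the draws for one of the perturbed Hector parameters):

# preview the first few rows of the perturbed parameter set
head(params)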

Splitting the Jobs

When we run Hector in matilda we iterate the model a number of times, each time with a different set of parameter values drawn from our perturbed parameter set.

When we use parallel computing, we split the perturbed parameter set into a list of "jobs". Doing this distributes the workload among multiple parallel worker processes. This splitting allows each worker to handle a subset of the parameters independently. Here we will split params into five param_chunks.

# split params into chunks
param_chunks <- split(params, 1:5)

This creates a new list object called param_chunks. The list contains 5 elements, each a parameter data frame (equivalent to the params data frame we would normally pass to iterate_model()).
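
As a quick check, we can confirm the number of chunks and how many parameter draws ended up in each one. With n = 10 draws split into 5 chunks, each chunk should contain 2 rows:

# number of chunks
length(param_chunks)

# number of parameter draws (rows) in each chunk
sapply(param_chunks, nrow)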

Run Model with Parallel Computing

In order to run the model with parallel computing we first need to initialize a cluster. This will identify the number of cores or processors we want to act as workers, each handling a portion of the workload concurrently.

To initialize the cluster we use makeCluster() from parallel. We need to pass this function the number of cores (workers) we want to use. The number of cores or processors available will depend on the machine you are working on. To find out how many cores you have available you can use detectCores().

# detect the number of cores you have
detectCores()
## [1] 8

This tells me that I have 8 cores available on my machine. For this tutorial I want to keep at least one core free to run other tasks on my computer, so we will initialize a cluster using all available cores minus 1.

# initializing a cluster
cl <- makeCluster(detectCores() - 1)

Another necessary step when using parallel computing is exporting objects and functions from the main R session to each of the workers in the cluster. This ensures they are available to the workers when executing tasks. We will export them to the cluster we just created (cl).

In our case we want to export the following objects and functions:

  • param_chunks
  • ini_list
  • newcore()
  • iterate_model()

# Export required functions and objects to the cl cluster we just created
clusterExport(cl, c("param_chunks",
                    "ini_list",
                    "newcore",
                    "iterate_model"))

With this pre-processing completed we can run the model. For this step we will use parLapply(), which requires the cluster we built as its first argument; beyond that it is very similar to running lapply().
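
If parLapply() is new to you, here is a minimal stand-alone illustration of the pattern, unrelated to matilda: it applies a function to each element of a vector, distributing the calls across the workers in cl:

# toy example: square each element of 1:5, distributing the work across cl
parLapply(cl, 1:5, function(x) x^2)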

# run the model with parallel computing
result <- parLapply(cl, names(ini_list), function(scenario_name){
  
  # extract the scenario information from the ini_list 
  # using the scenario name
  scenario <- ini_list[[scenario_name]]
  
  # initialize model core for the current scenario
  core <- newcore(scenario, name = scenario_name)
  
  # run the model looping across param_chunks for the current core
  result_list <- lapply(param_chunks, function(chunk) {
    
    iterate_model(core = core, 
                  params = chunk, 
                  save_years = 1800:2100,
                  save_vars = "gmst")
  })
  
  ## This step ensures correct run_numbers are assigned to each model run ##
  # Starting with the second data frame of the current scenario
  for (i in 2:length(result_list)) {
    
    # calculate the max value of the previous element in the result list
    max_run_number <- max(result_list[[i - 1]]$run_number)
    
    # Add the max value of the previous element to the run_number of the current 
    # element to get an updated run_number that is continuous from the previous element.
    result_list[[i]]$run_number <- result_list[[i]]$run_number + max_run_number
    
  }
  
  return(result_list)
  
})

In this code we are using parLapply() to loop through the scenarios in ini_list, initialize a core for each one, and then loop through each chunk in param_chunks to run the model.

However, there is a subtle issue with this result, which we fix with the following lines in the code above:

# DON'T RUN #

## This step ensures correct run_numbers are assigned to each model run ##
  # Starting with the second data frame of the current scenario
  for (i in 2:length(result_list)) {
    
    # calculate the max value of the previous element in the result list
    max_run_number <- max(result_list[[i - 1]]$run_number)
    
    # Add the max value of the previous element to the run_number of the current 
    # element to get an updated run_number that is continuous from the previous element.
    result_list[[i]]$run_number <- result_list[[i]]$run_number + max_run_number
    
  }

If we did not run these lines to correct the run_number, when we view result we would see that each chunk's data frame has a run_number column starting at run_number = 1. We know this is not correct because with n = 10 parameter draws split into 5 chunks, the first data frame in the result should include runs 1-2 and the last data frame should include runs 9-10. Therefore, the only data frame with the correct run_number would be the first one.

This happens because each call to iterate_model() receives a new chunk from param_chunks and numbers its runs independently, starting again from 1.

This is the same code without the correction. You can compare the two outputs to see the difference.

# run the model with parallel computing
result_incorrect <- parLapply(cl, names(ini_list), function(scenario_name){
  
  # extract the scenario information from the ini_list 
  # using the scenario name
  scenario <- ini_list[[scenario_name]]
  
  # initialize model core for the current scenario
  core <- newcore(scenario, name = scenario_name)
  
  # run the model looping across param_chunks for the current core
  result_list <- lapply(param_chunks, function(chunk) {
    
    iterate_model(core = core, 
                  params = chunk, 
                  save_years = 1800:2100,
                  save_vars = "gmst")
  })
  
  return(result_list)
  
})
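
One way to see the difference is to compare the range of run_number in each chunk of a single scenario. This is just a quick illustrative check on the two objects created above:

# run_number ranges per chunk for the first scenario, with the correction
lapply(result[[1]], function(df) range(df$run_number))

# run_number ranges per chunk for the first scenario, without the correction
# (every chunk starts again at run 1)
lapply(result_incorrect[[1]], function(df) range(df$run_number))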

Bind Results

Once we have results with corrected run_numbers we can bind our results. Notice that result is a nested list. The first level is the scenario level: each of the 4 elements at this top level contains the results for a different SSP scenario. Going one level down we have the results of each param_chunk for the current scenario. We can think of the nested lists like this:

  • result

    • scenario 1

      • param_chunk 1

      • param_chunk 2

      • param_chunk 3

    • scenario 2

      • param_chunk 1

      • param_chunk 2

…and so on…
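
A quick way to confirm this nesting in your own session is to inspect the first two levels of result with str():

# inspect the top two levels of the nested result list
str(result, max.level = 2)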

We can bind the param_chunks directly in our model run code like so:

# run the model with parallel computing
result <- parLapply(cl, names(ini_list), function(scenario_name){
  
  # extract the scenario information from the ini_list 
  # using the scenario name
  scenario <- ini_list[[scenario_name]]
  
  # initialize model core for the current scenario
  core <- newcore(scenario, name = scenario_name)
  
  # run the model looping across param_chunks for the current core
  result_list <- lapply(param_chunks, function(chunk) {
    
    iterate_model(core = core, 
                  params = chunk, 
                  save_years = 1800:2100,
                  save_vars = "gmst")
  })
  
  ## This step ensures correct run_numbers are assigned to each model run ##
  # Starting with the second data frame of the current scenario
  for (i in 2:length(result_list)) {
    
    # calculate the max value of the previous element in the result list
    max_run_number <- max(result_list[[i - 1]]$run_number)
    
    # Add the max value of the previous element to the run_number of the current 
    # element to get an updated run_number that is continuous from the previous element.
    result_list[[i]]$run_number <- result_list[[i]]$run_number + max_run_number
    
  }
  
  # bind param_chunks
  result <- do.call(rbind, result_list)
  
  return(result)
  
})

After that we can bind the scenario data frames together when we are ready for plotting, etc.:

# bind result to create a result data frame
result_df <- do.call(rbind, result)
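
Finally, we can take a quick look at the combined data frame and, once all of the parallel work is done, shut down the cluster so the worker processes are released:

# quick look at the combined result data frame
head(result_df)

# shut down the cluster now that the parallel work is finished
stopCluster(cl)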