2022-02-09

Acknowledgements – Funding

Acknowledgements – People

  • Ben Bolker – McMasterPandemic Creator/Architect
  • Mac-Theobio COVID Modelling Group Covid Modelling Blog: https://mac-theobio.github.io/covid-19/
    • Irena Papst, Mike Li (Public Health Agency of Canada, McMaster)
    • Ben Bolker, Jonathan Dushoff, David Earn (McMaster)
  • Weiguang Guan – C++ Developer (Sharcnet)
  • Help Understanding Models / Alpha Testing / Test Cases
    • Zach Levine (Weizmann Institute of Science)
    • Lisa Kanary (Yukon University)
    • Caroline Colijn (Simon Fraser University)

Background

What does Refactoring mean?

  • What?
    • Change the internals of an existing code base
    • Match behaviour of original code
  • Why?
    • Maintainability
    • Readability
    • Speed
    • Extensibility

  • Related concepts
    • Non-functional requirements
    • Technical debt

What is McMasterPandemic?

  • Forecasts
    • Ontario Science Advisory Table
    • Public Health Agency of Canada
    • Chief Public Health Officer
    • World Health Organization (2020, Afro-Region, 49 countries)
  • Small user base
    • Number of forecasters \(\approx 3\)

Prediction of the Third Wave

Prediction of the Third Wave

Why Refactor McMasterPandemic?

Why Refactor? – slow forecasting

  • Confidence bands slow to compute
  • Based on repeated simulation
  • Most time-consuming step for PHAC
  • Two approaches:
    • Translate to C++ (done)
    • Parallelize (todo)

Why Refactor? – slow calbration

  • Calibration: tuning parameters to data
  • Example
    • observed case reports – dots
    • simulated case reports – line
    • optimize transmission rate using maximum likelihood
    • negative binomial error model

Why Refactor? – slow calbration

  • Calibration: tuning parameters to data
  • Based on trajectory matching
    • select parameter set
    • simulate data
    • compare simulated and observed data
    • update parameter set
    • repeat
  • Again – repeated simulations are slow

Why Refactor? – slow scenario exploration

  • Public health modeller has to make weekly forecasts
  • Model fitting/calibration takes two days
  • Only three days to
    • explore scenarios
    • write the report
  • No time to refine the model
  • No time to refine messaging

Why Refactor? – slow scenario exploration

  • Public health modeller has to make weekly forecasts
  • Model fitting/calibration takes two days
  • Only three days to
    • explore scenarios
    • write the report
  • No time to refine the model
  • No time to refine messaging

What if two days becomes two hours?

More time modelling and communicating, less time waiting for results

Why Refactor? – slow model development

Need a bit more background before answering …

What is a Compartmental Model?

Base McMasterPandemic Model

Figure from Irena Papst

Model with Vaccination and Variants

Figure from Irena Papst

Matrices and Linear Algebra

As models get more complex, quantities that looked like this …

Matrices and Linear Algebra

… become more complexThis is actually a simplification

Why Refactor? – slow model development

  • New variant emerges, as immunity from vaccination wanes
  • Public health modeller has to create new model structure
  • Requires a few weeks of software development
  • Public health situation changes over this time
  • Model is less relevant when complete

Figure from Irena Papst

Why Refactor? – slow model development

  • New variant emerges
  • Immunity from vaccination wanes
  • Public health modeller has to create new model structure
  • Requires a few weeks of software development
  • Public health situation changes over this time
  • Model is less relevant when complete

Figure from Irena Papst

What if a few weeks becomes an afternoon?

Forecasts are delivered quickly, resulting in bigger public health impact

Architecture

Architecture

  • Call stack diagram
    • Arrows from (layer of) functions that call other functions
    • Arrows to (layer of) functions being called by other functions
  • Two use cases
    • Calibration
    • Forecasting
  • Both interfaces utilize same simulation engine

Refactored Architecture

  • Refactored components in grey
  • Focus on two things
    • model definition conveniences (focused human time)
    • simulation speed (human waiting time)

Increasing Simulation Speed

Technology Choice – TMB

  • Template Model Builder – TMB
  • R package (convenient)
  • Objective functions in C++ (fast)
  • Uses automatic differentiation (fast)
  • Options for statistical approaches

Aside – Automatic Differentiation

  • Classical methods of computer-assisted differentiation
    • Numerical differentiation by finite differencing
    • Symbolic differentiation by computer algebra systems
    • These are both slow with many inputs
  • Automatic differentiation
    • Automatically applies the chain rule to standard mathematical functions
    • Will not give general answers (like symbolic differentiation)
    • Give very fast and accurate evaluations of derivatives at specific values

Benchmark Example

TMB: Time difference of 7.8s
R: Time difference of 1741.0s
Speedup: 224x

But to be conservative we were seeing 60x speedups regularly

Does this mean that a workflow that took an hour, will now take one minute?

Not necessarily

Managing Expectations – Theory

Split up code into two sets:

  1. Refactored code
  2. Not refactored code

Simple but important insight:

If the code that is not refactored takes \(x\) minutes to run, the fastest run time that could be achieved by refactoring is \(x\) minutes

Managing Expectations – Theory

Two type of code that are in the “Not refactored” set

  • The code in MacPan that is being executed by the user’s job, but is not refactored
  • The code that the user wrote (don’t forget about this!)

Managing Expectations – Theory

Run-times of two sets of code, in different states

  • \(r_{\text{before}}\): refactored code, before refactoring
  • \(r_{\text{not_done}}\): unrefactored code
  • \(r_{\text{after}}\): refactored code, after refactoring

The total run-time before refactoring: \[ t_{\text{before}} = r_{\text{before}} + r_{\text{not_done}} \]

The total run-time after refactoring: \[ t_{\text{after}} = r_{\text{after}} + r_{\text{not_done}} \]

Managing Expectations – Theory

Dimensionless measures of this system:

  • Realized speedup: \(\tau = \frac{t_\text{before}}{t_\text{after}}\)
  • Naive speedup: \(\alpha = \frac{r_\text{before}}{r_\text{after}}\)
  • Fraction refactored: \(\rho = \frac{r_\text{before}}{t_{\text{before}}}\)

Expression for the realized speedupsThis is Amdahl’s Law: \[ \tau = \frac{1}{\frac{\rho}{\alpha} + 1 - \rho} \]

Managing Expectations – Theory

  • Conservative but typical realized speedups: \(\tau \approx 6\)
  • Conservative but typical naive speedups: \(\alpha \approx 60\)

Estimate fraction refactored, \(\rho\), by rearranging Amdahl’s Law

\[ \rho = \frac{1 - \frac{1}{\tau}}{1 - \frac{1}{\alpha}} \approx 0.85 \]

Performance Enhancements in Practice

  • Removing performance bottlenecks rarely results in speed ups
  • People try even more challenging computations
  • Measuring performance gains in units of ‘times speed-up’ misses the point
  • More of a binary thing:

Has the performance optimization allowed people to do things that they couldn’t do before?

Testimonial

I just wanted to send out this email to express how impressed/happy I am running the new TMB version.

It is insanely fast and definitely saves me hours everyday and allows me to explore more scenarios.

I can do many more things I couldn’t before due to computation limitations.

Thank you, you guys are heroes!

Increasing Model Flexibility

Original McMasterPandemic Architecture

Original McMasterPandemic Architecture

  • Create any compartment topology
  • Fairly general specification of rates of flow among compartments

Example – SIR

state = c(S = 20000, I = 100, R = 0)
params = c(gamma = 0.06, beta = 0.15, N = sum(state))
sir_model = (init_model(
    params, state,
    start_date = "2000-01-01",
    end_date = "2000-05-01",
  )
  %>% add_rate("S", "I", ~ (1/N) * (beta) * (I))
  %>% add_rate("I", "R", ~ (gamma))
)

Example – SIR

(sir_model
 %>% simulate
 %>% filter(variable %in% names(state))
 %>% rename(compartment = variable, state = value)
 %>% mutate(compartment = factor(compartment, names(sir_model$state)))
 %>% ggplot
  +  geom_line(aes(x = Date, y = state, colour = compartment))
)

Example – SIR

Example – SIR

(sir_model
 %>% simulate(format = 'wide')
 %>% rename(`Force of Infection` = S_to_I)
 %>% ggplot
  +  geom_line(aes(x = Date, y = `Force of Infection`))
)

Example – SIR

Example – Matrix-Valued Rates

Create symbolic vectors and matrices

C = vec('Ca', 'Cp', '(Cm) * (1 - iso_m)', '(Cs) * (1 - iso_s)')

Use vectors and matrices to define rates

%>% add_rate("S", "E", sum(I * C * beta * inverse_N))

Keep things compact even when the model gets complex

%>% vec_rate(
  "S" %_% vax_cat,
  "E" %_% vax_cat,
  kronecker(vax_trans_red, t(baseline_trans_rates)) %*% Istate
)

Example – Erlang SEIR Model

Example – Erlang SEIR Model

m = 5 # number of E boxes
(chained_E_boxes = "E" %_% 1:m)
## [1] "E_1" "E_2" "E_3" "E_4" "E_5"
# sojourn through exposed compartments
  %>% rep_rate(
    chained_E_boxes[1:(m-1)],
    chained_E_boxes[2:m],
    ~ (m) * (sigma)
  )

Full example onlinehttps://canmod.net/misc/flex_intro#erlang-seir

Example – BC Omicron Model

Library of Examples

Interface Lessons

  • Simple models are easy anyways
  • Need to focus on complex structured models as the primary use case
  • Focus on modelling, not data manipulation or visualization – plug into standard tools

In Progress

Architecture

In Progress

  • Refactor data fitting components in C++
  • Utilize automatic differentiation
    • More speed
    • Better optimization steps
    • Converge in fewer iterations
  • Flexibility in statistical methodology
    • Latent variables
    • Process noise
    • Splines

Thank you!