---
title: "Exercise 4.1 & 4.2 — 1918 Spanish Influenza & Measles Dynamics"
author: "Kornelius Langga Son, MPH"
date: today
format:
html:
toc: true
toc-depth: 3
toc-title: "Table of Contents"
toc-location: left
number-sections: true
theme: darkly
code-fold: true
code-tools: true
highlight-style: dracula
execute:
warning: false
message: false
---
```{=html}
<style>
.quarto-title-block {
background: linear-gradient(135deg, #0f3460, #16213e, #1a1a2e) !important;
padding: 50px 40px !important;
border-radius: 12px !important;
border-left: 6px solid #e94560 !important;
}
.quarto-title h1 {
color: #e94560 !important;
font-weight: 800 !important;
}
.quarto-title-meta-heading {
color: #a8dadc !important;
letter-spacing: 2px !important;
text-transform: uppercase !important;
}
.quarto-title-meta-contents p {
color: #ffffff !important;
}
</style>
```
---
## Exercise 4.1
Table 4.4 shows the weekly number of influenza cases reported in Gothenburg, Sweden during the 1918 Spanish influenza pandemic (population = 196,943). We aim to:
- **a)** Calculate the growth rate during the first and second waves and estimate the net (Rn) and basic (R0) reproduction numbers, assuming 30% and 50% of individuals were immune at the start of each wave respectively, and that the average pre-infectious and infectious periods were both 2 days.
- **b)** Calculate R0 using the final epidemic size.
- **c)** Determine which estimate of R0 is most reliable.
- **d)** Interpret what the results tell us about transmissibility between waves.
```{r setup}
library(pacman)
pacman::p_load(tidyverse, ggplot2, knitr, kableExtra)
# Parameters
S1 <- 1 - 0.3 # susceptible fraction, Wave 1 (30% immune)
S2 <- 1 - 0.5 # susceptible fraction, Wave 2 (50% immune)
growth_rate1 <- 0.367 # growth rate Wave 1 (per day)
growth_rate2 <- 0.104 # growth rate Wave 2 (per day)
D <- 2 # infectious period (days)
D2 <- 2 # pre-infectious period (days)
population <- 196943
```
## Part a) Growth Rate & Reproduction Numbers
Growth rates estimated from the slope of ln(cumulative cases) vs time:
- \*\*Wave 1:\*\* Λ1 = 0.367 per day (06/07–27/07/1918)
- \*\*Wave 2:\*\* Λ2= 0.104 per day (07/09–19/10/1918)
**\### Formula 1: SIR — 1 + ΛD**
Assumes no latent period; individuals become infectious immediately after infection.
**\### Formula 2: SEIR — (1 + ΛD)(1 + ΛD')**
Accounts for both latent (D') and infectious (D) periods, both assumed exponentially distributed.
**\### Formula 3 & 4: Erlang Staged Model — m=n=10 and m=n=100**
Uses multiple sub-stages for latent (m) and infectious (n) periods. As m and n increase, period durations become less variable and more realistic. Convergence between m=n=10 and m=n=100 confirms the estimate is stable.
```{r part-a}
# --- Formula 1: SIR (ignores latent period) -----------------
Rn1_SIR <- 1 + growth_rate1 * D
Rn2_SIR <- 1 + growth_rate2 * D
R0_1_SIR <- Rn1_SIR / S1
R0_2_SIR <- Rn2_SIR / S2
# --- Formula 2: SEIR (exponential latent + infectious) ------
Rn1_SEIR <- (1 + growth_rate1 * D) * (1 + growth_rate1 * D2)
Rn2_SEIR <- (1 + growth_rate2 * D) * (1 + growth_rate2 * D2)
R0_1_SEIR <- Rn1_SEIR / S1
R0_2_SEIR <- Rn2_SEIR / S2
# --- Formula 3 & 4: Erlang staged model ---------------------
erlang_Rn <- function(Lambda, D, D2, m, n) {
numerator <- Lambda * D2 * (1 + Lambda * D / m)^m
denominator <- 1 - (1 + Lambda * D / n)^(-n)
numerator / denominator
}
Rn1_m10 <- erlang_Rn(growth_rate1, D, D2, 10, 10)
Rn2_m10 <- erlang_Rn(growth_rate2, D, D2, 10, 10)
Rn1_m100 <- erlang_Rn(growth_rate1, D, D2, 100, 100)
Rn2_m100 <- erlang_Rn(growth_rate2, D, D2, 100, 100)
R0_1_m10 <- Rn1_m10 / S1; R0_2_m10 <- Rn2_m10 / S2
R0_1_m100 <- Rn1_m100 / S1; R0_2_m100 <- Rn2_m100 / S2
# --- Summary table ------------------------------------------
results <- tibble(
Formula = c("1 + \u039bD",
"(1+\u039bD)(1+\u039bD')",
"Erlang m=n=10",
"Erlang m=n=100"),
Rn_Wave1 = round(c(Rn1_SIR, Rn1_SEIR, Rn1_m10, Rn1_m100), 2),
Rn_Wave2 = round(c(Rn2_SIR, Rn2_SEIR, Rn2_m10, Rn2_m100), 2),
R0_Wave1_s0.7 = round(c(R0_1_SIR, R0_1_SEIR, R0_1_m10, R0_1_m100), 2),
R0_Wave2_s0.5 = round(c(R0_2_SIR, R0_2_SEIR, R0_2_m10, R0_2_m100), 2)
)
kable(results,
col.names = c("Formula", "Rn Wave 1", "Rn Wave 2",
"R0 Wave 1 (s=0.7)", "R0 Wave 2 (s=0.5)"),
caption = "Table 1. Estimated Rn and R0 using four model structures") |>
kable_styling(bootstrap_options = c("striped","hover","condensed"),
full_width = FALSE) |>
row_spec(3:4, bold = TRUE, background = "#d4edda") # highlight best estimates
```
## Part b) R0 from Final Epidemic Size
considering the first wave of the pandemic (taken to be during the period of 6/7/1918 - 31/8/1918), 4,657 individuals were reported to have experience disease. If 70% of the individuals were susceptible at the start of the first wave (S0=0.7). the proportion that were susceptible at the end of the first wave is given by the difference between 0.7 and the proportion of the population who experienced disease during the first wave. This calculation assumes that all of those who were reported as cases became immune (see below). We therefore obtain the following result:
$$R_0 = \frac{\ln(s_f) - \ln(s_0)}{s_f - s_0}$$
### Wave 1 (06/07/1918 – 31/08/1918)
```{r final-size-wave1}
cases_w1 <- 4657
s0_w1 <- S1
sf_w1 <- s0_w1 - (cases_w1 / population)
R0_fs_w1 <- (log(sf_w1) - log(s0_w1)) / (sf_w1 - s0_w1)
cat(sprintf("Cases : %d\n", cases_w1))
cat(sprintf("s0 : %.3f\n", s0_w1))
cat(sprintf("sf : %.3f\n", sf_w1))
cat(sprintf("R0 : %.2f\n", R0_fs_w1))
```
### Wave 2 (after 07/09/1918)
Considering the second wave of the pandemic (the period after 7/9/1918), 19,484 individuals were reported to have experienced
disease. Assuming that 50% of
individuals were susceptible at the start of this wave (*s~0~*=0.5), then applying a similar reasoning to that used to calculate *s~f~* for the
first wave, we obtain the following for the proportion of the population that was susceptible at the end of the second wave:
```{r final-size-wave2}
cases_w2 <- 19484
s0_w2 <- S2
sf_w2 <- s0_w2 - (cases_w2 / population)
R0_fs_w2 <- (log(sf_w2) - log(s0_w2)) / (sf_w2 - s0_w2)
cat(sprintf("Cases : %d\n", cases_w2))
cat(sprintf("s0 : %.3f\n", s0_w2))
cat(sprintf("sf : %.3f\n", sf_w2))
cat(sprintf("R0 : %.2f\n", R0_fs_w2))
```
### Comparison: Growth Rate vs Final Size
```{r comparison-table}
comparison <- tibble(
Method = c("Growth rate (Erlang m=n=10)", "Final epidemic size"),
R0_Wave1 = round(c(R0_1_m10, R0_fs_w1), 2),
R0_Wave2 = round(c(R0_2_m10, R0_fs_w2), 2)
)
kable(comparison,
col.names = c("Method", "R0 Wave 1", "R0 Wave 2"),
caption = "Table 2. R0 comparison: growth rate vs final epidemic size") |>
kable_styling(bootstrap_options = c("striped","hover","condensed"),
full_width = FALSE)
```
## Part c) Which Estimate is Most Reliable?
The growth rate method is more reliable for three reasons:
1. **Under-reporting independence** — the growth rate estimate is unaffected by the absolute number of cases reported, as long as the *proportion* reported stays consistent during the early exponential phase.
2. **Incomplete epidemic** — the final size formula assumes the epidemic is over, but cases were still occurring in January 1919, meaning the attack rate is underestimated, and therefore R0 is underestimated.
3. **Shared limitation** — both methods still require assumptions about the proportion susceptible at the start of each wave (s0 = 0.7 and 0.5). These values are plausible but unverified.
```{r reliability}
cat("Growth rate R0 | Wave 1:", round(R0_1_m10, 2),
"| Wave 2:", round(R0_2_m10, 2), "\n")
cat("Final size R0 | Wave 1:", round(R0_fs_w1, 2),
"| Wave 2:", round(R0_fs_w2, 2), "\n")
cat("\nFinal size underestimates R0 in both waves due to under-reporting.\n")
```
## Part d) Transmissibility Between Waves
```{r transmissibility}
change <- ((R0_2_m10 - R0_1_m10) / R0_1_m10) * 100
cat(sprintf("Wave 1 R0 : %.2f\n", R0_1_m10))
cat(sprintf("Wave 2 R0 : %.2f\n", R0_2_m10))
cat(sprintf("Change : %.1f%%\n", change))
```
The lower R0 in Wave 2 suggests **transmissibility decreased** between waves in Gothenburg. However, two caveats apply:
- Wave 1 R0 (\~4.2) is **higher than typical estimates** for the 1918 pandemic elsewhere (\~2–3), suggesting the apparent growth rate during Wave 1 may have been inflated by **increasing case reporting** as awareness of the outbreak grew.
- Despite lower R0, Wave 2 produced **more cases** (peak 3,287 vs 1,120) because a larger susceptible pool had accumulated by autumn 1918.
## Exercise 4.2
### Measles Inter-Epidemic Period
#### a) Estimating R0 from T = 2 years (Equation 4.3.1)
Rearranging the inter-epidemic period formula:
$$T=2\pi\sqrt{\frac{L(D+D')}{R_0-1}}\implies R_0 \ approx 1 + \frac{4\pi^2 L(D+D')}{T^2}$$
```{r}
L <- 70 * 365 # life expectancy in days
T1 <- 2 * 365 # inter-epidemic period 1950-1968 (days)
D_measles <- 7 # infectious period (days)
D2_measles <- 8 # pre-infectious period (days)
R0_measles <- 1 + (4 * pi^2 * L * (D_measles + D2_measles)) / T1^2
cat(sprintf("L : %d days (%.0f years)\n", L, L/365))
cat(sprintf("T : %d days (%.0f years)\n", T1, T1/365))
cat(sprintf("D + D' : %d days\n", D_measles + D2_measles))
cat(sprintf("R0 (approx): %.1f\n", R0_measles))
```
**Part b) Average Age at Infection from T = 3 years (Equation 4.32)**
After vaccination was introduced in 1968, the inter-epidemic period
lengthened to \~3 years. Using the simplified formula:
$T \approx 2\pi\sqrt{A(D+D')} \implies A \approx \frac{T^2}{4\pi^2(D+D')}$
```{r}
T2 <- 3 * 365 # inter-epidemic period 1968-1988 (days)
A_days <- T2^2 / (4 * pi^2 * (D_measles + D2_measles))
A_years <- A_days / 365
cat(sprintf("T : %d days (%.0f years)\n", T2, T2/365))
cat(sprintf("A : %.0f days ≈ %.1f years\n", A_days, A_years))
```
### Limitations
Two key limitations of this estimate:
1. **Random mixing assumption** — the formula assumes individuals contact each other randomly. In reality, contact is strongly age-dependent (children mix with children), which affects transmission dynamics.
2. **Changing inter-epidemic period** — vaccination after 1968 caused T to change over time. Using a single fixed T ignores this dynamic transition, reducing the accuracy of the estimated average age at infection.
### Exercise 4.2
> *"What do models tell us about the dynamics of infections?"* — Chapter 4, Vynnycky & White
The inter-epidemic period (*T*) is the time between successive epidemic waves, governed by:
###
$T = 2\pi\sqrt{\frac{L(D+D')}{R_0-1}}$
For a new epidemic to ignite, 6 steps must occur:
1. Susceptible (S) newborns continuously enter the population
2. S eventually exceeds the threshold 1/R0 → incidence starts rising
3. The epidemic grows, consuming susceptibles faster than births replace them
4. Incidence peaks when S hits exactly 1/R0
5. S falls below 1/R0 → incidence declines
6. Births slowly replenish S → cycle repeats
**Vaccination & the Honeymoon Period**
Vaccination reduces the number of susceptibles entering the population, lengthening *T*. This quiet phase of very low incidence following a successful programme is known as the **Honeymoon Period**.
**Why Do Outbreaks Persist After SIAs?**
SIAs lengthen \*T\* on average, but they do not eliminate susceptible pockets. The dramatic drop in cases breeds complacency, routine immunization weakens, and susceptibles silently accumulate again.
The honeymoon ends, sometimes with a larger outbreak than before.
\> SIAs complement, but can never replace, strong routine immunization.\*\*
## **About This Document**
This Quarto document was developed in \*\*R\*\* as a free learning resource for anyone interested in infectious disease modelling.
::: {.callout-note title="Book Source"}
Vynnycky E, White RG. \*An Introduction to Infectious Disease Modelling.\* Oxford University Press; 2010.
With an introduction by \*\*Professor Paul E.M. Fine.\*\*
:::
::: {.callout-tip title="Note for Readers"}
The code and worked solutions are intended to complement self-study of the textbook. Readers are encouraged to modify parameters and explore model behaviour independently.
:::
::: {.callout-important title="Disclaimer"}
This document is a free educational resource.
:::