class: center, middle, inverse, title-slide # The first wave of COVID-19 in Iceland ## Data, methods and story ### Birgir Hrafnkelsson & Brynjólfur Gauti Jónsson ### University of Iceland & Icelandic Heart Association ### 2020/07/22 --- class: inverse, center, middle # Covid in Iceland ---   --- # The people * Who run the world - Þórólfur Guðnason, Chief Epidemiologist - Alma Möller, Director of Health - Víðir Reynisson, Civil Protection and Emergency Management at National Commissioner of Police * Data science team - Brynjólfur Gauti Jónsson, PhD student in Biostatistics - Birgir Hrafnkelsson, Professor of Statistics - Jóhanna Jakobsdóttir, Assistant Professor of Biostatistics - Rögnvaldur Sæmundsson, Professor of Industrial Engineering - Tómas P. Rúnarsson, Professor of Industrial Engineering - Thor Aspelund, Professor of Biostatistics - Unnur A. Valdimarsdóttir, Professor of Epidemiology - Þórarinn Jónmundsson, PhD Student in Biostatistics - And many more! --- # Aims * Predict date and magnitude of maximum burden for health care system - Especially ICU admissions * Model should be operational as soon as possible   --- # Aims cont. We use a top-down approach to simplify predictions. Some might call this the *Fermi Method*. 1. Predict diagnosed COVID-19 cases - Generalized Logistic Growth Model 2. Sample age distribution of predicted cases - Simple multinomial sampling using observed age distribution in Iceland 3. Predict hospital admissions - Binomial sampling by joining age distribution of cases with Table 1 from Ferguson et al. 4. Predict ICU admissions - Same as above. Age distributed hospital admissions along with Table 1 from Ferguson et al. Informed guesses for length of infection, length of stay (hospital and ICU) and time from infection to hospital admission. --- class: inverse, center, middle # Hierarchical Generalized Logistic Growth Curves --- class: middles   --- # The Richards Growth Model $$ `\begin{aligned} f(t) &= \frac{S}{(1 + \nu e^{-\nu\tilde\beta (t - \alpha)})^{\frac{1}{\nu}}} \\ \frac{d}{dt}f(t) &= \tilde\beta \cdot f \cdot \left(1 - \left(\frac{f}{S}\right)^{\nu}\right), \end{aligned}` $$  --- # Special cases $$ `\begin{aligned} f(t) &= \frac{S}{(1 + \nu e^{-\nu\tilde\beta (t - \alpha)})^{\frac{1}{\nu}}} \\ \frac{d}{dt}f(t) &= \tilde\beta \cdot f \cdot \left(1 - \left(\frac{f}{S}\right)^{\nu}\right), \end{aligned}` $$ * Logistic growth model: `\(\nu = 1\)` * Gompertz model: `\(\nu \rightarrow 0^+\)` and `\(\nu\tilde\beta\)` a constant. * Truncated exponential model: `\(\nu \rightarrow \infty\)`  --- # Our parametrization $$ `\begin{aligned} f(t) &= \frac{S}{(1 + \nu e^{-\beta (t - \alpha)})^{\frac{1}{\nu}}} \\ \frac{d}{dt}f(t) &= \frac{\beta}{\nu} \cdot f \cdot \left(1 - \left(\frac{f}{S}\right)^{\nu}\right) \\ \beta &= \tilde\beta\nu \end{aligned}` $$ Same special cases but easier to approximate Gompertz model. * Logistic growth model: `\(\nu = 1\)` * Gompertz model: `\(\nu \rightarrow 0^+\)`. * Truncated exponential model: `\(\nu \rightarrow \infty\)` --- # Inflection Points The point of inflection *(i.e. maximum growth)* is at $$ \left(t, f(t) \right) = \left(\alpha, \frac{S}{(1 + \nu)^{\frac{1}{\nu}}}\right) $$ * `\(\alpha\)`: Day of maximum growth * `\(\nu\)`: One-to-one correspondence with percent of asymptote, `\(S\)`, at which maximum growth occurs. $$ `\begin{aligned} \frac{S}{(1 + \nu)^{\frac{1}{\nu}}} &\xrightarrow[\nu \to 0^+]{} S e^{-1} \sim 0.37 \cdot S \\ \frac{S}{(1 + \nu)^{\frac{1}{\nu}}} &\xrightarrow[\nu \to \infty]{} S \end{aligned}` $$ Possible inflections points as percents of asymptote are between `\(e^{-1}\)` and `\(1\)`. --- # Inflection Points cont. * COVID-19 growth curves exhibit greater amounts of right-skew than allowed by current parameterization.  --- # Inflection Points cont. * Estimation fails as countries reach the lower limit  --- # Proposition * Points of inflection can be arbitrarily close to asymptote, S. * So we mirror the model around the line `\(x = \alpha\)` $$ `\begin{aligned} g(t) &= S - \frac{S}{(1 + \nu e^{\beta (t - \alpha)})^{\frac{1}{\nu}}} = S - f(2\alpha - t) \\ \frac{d}{dt}g(t) &= \beta \cdot (S - g(t)) \cdot \left(1 - \left(\frac{(S - g(t))}{S}\right)^{\nu}\right) \end{aligned}` $$ * Maximum growth can now happen between `\(0 \cdot S\)` and `\((1 - e^{-1})\cdot S\)` ##### Special cases * `\(\nu = 1\)`: Logistic growth model * `\(\nu \rightarrow \infty\)`: Negative exponential model * `\(\nu \rightarrow 0^+\)`: Negative Gompertz? --- class: inverse, center, middle # Bayesian Hierarchical Model --- # First steps * `\(P_{i, t}\)`: Population in country `\(i\)` at time `\(t\)`. * `\(I_{i, t}\)`: Cumulative number of infected in country `\(i\)` at time `\(t\)`. $$ `\begin{aligned} I_{i, t} &\sim \hspace{2mm} \mathrm{NegBin}(\mu_{i, t} \cdot P_{i, t}, \phi_i) \\ \mu_{i, t} &= \hspace{10mm} \frac{S_i}{(1 + \nu_i e^{-\beta_i (t - \alpha_i)})^{\frac{1}{\nu_i}}} &\text{(Richards)} \\ \mu_{i, t} &= S_i - \frac{S_i}{(1 + \nu_i e^{\beta_i (t - \alpha_i)})^{\frac{1}{\nu_i}}} &\text{(Mirrored Richards)} \end{aligned}` $$ * `\(\alpha_i\)`, `\(\beta_i\)`, `\(\nu_i\)`, `\(S_i\)`, `\(\phi_i\)` are country-specific parameters * `\(\mathrm{NegBin}(\mu, \phi)\)` is negative binomial with mean `\(\mu\)` and variance `\(\mu + \phi\mu^2\)` --- # Daily Diagnosed Cases * Implicit correlation in `\(I_{i, t}\)` for different `\(t\)`. * Might be better to model daily diagnosed cases `\(D_{i, t} = I_{i, t} - I_{i, t - 1}\)` * Simplifying assumption - If we model `\(I_{i, t}\)` by `\(f(t)\)`, then - Model `\(D_{i, t}\)` by `\(\frac{d}{dt} f(t)\)` * Model on logarithmic scale for computational reasons <br> <br> $$ `\begin{gathered} D_{i, t} \sim \mathrm{NegBin}(\mu_{i, t} \cdot P_{i, t}, \phi_i) \\ \ln(\mu_{i, t}) = \ln\left(\frac{d}{dt}f_i(t)\right) = \ln(\beta_i) + \ln(S_i) + z_{i, t} - \left(1 + \frac{1}{\nu_i}\right)\cdot\left(1 + \nu_i e^{z_{i, t}}\right) \\ z_{i, t} = - \beta_i \cdot (t - \alpha_i) \quad \text{(Richards)} \\ z_{i, t} = \beta_i \cdot (t - \alpha_i) \quad \text{(Mirrored Richards)} \end{gathered}` $$ --- # Hierarchical Parameters * Logistic growth parameters as multivariate normal. * Overdispersion as hierarchically log-normal. $$ `\begin{gathered} \begin{pmatrix} \ln(\alpha_i) \\ \ln(\beta_i) \\ \ln(\nu_i) \\ \text{logit}(S_i) \end{pmatrix} \sim \mathrm{MVN} \left( \begin{pmatrix} \mu_\alpha \\ \mu_\beta \\ \mu_\nu \\ \mu_S \end{pmatrix} , \tau\Omega\tau' \right) \\ \\ \mu_\alpha \sim \mathrm{N}(3, 1), \quad \mu_\beta \sim \mathrm{N}(-2, 1) \\ \mu_\nu \sim \mathrm{N}(0, 1), \quad \mu_S \sim \mathrm{N}(-4, 1) \\ \\ \mathrm{diag}(\tau) \sim \mathrm{Exponential}(1), \quad \Omega \sim \mathrm{LKJCorr}(2) \\ \\ \ln(\phi_i) \sim \mathrm{N}(\mu_\phi, \sigma_\phi) \\ \mu_\phi \sim \mathrm{N}(-1, 1), \quad \sigma_\phi \sim \mathrm{Exponential}(1) \end{gathered}` $$ --- # Software * Hamiltonian Monte Carlo using Stan. * Stan code executed from R using package `rstan`. - Code freely available.  --- # Software cont. * Non-centered parametrizations * Univariate - If `\(X \sim \mathrm{N}(\mu, \sigma)\)` - Write `\(X = \mu + \sigma \cdot z\)`, where `\(z \sim \mathrm{N}(0, 1)\)` * Multivariate - If `\(X \sim \mathrm{MVN}(\mu, \Sigma)\)` - Write `\(X = \mu + \tau \Lambda Z\)`, `\(Z_{ij} \sim \mathrm{N}(0, 1)\)`, `\(\Sigma = \tau\Omega\tau^T\)`, `\(\Omega = \Lambda\Lambda^T\)`  --- # Data * Data from European Center for Disease Prevention and Control *(ECDC)* on 65 countries. * Start of first wave - When cumulative cases exceeded 20 per million capita * End of first wave - Fit GAMs to daily diagnosed cases - Use smoothed predictions for stable derivatives - When derivative is > 0 after having previously been < 0 say second wave starts.  --- class: inverse, center, middle # Results --- # Predictions  --- # Predictions cont. 