Disclaimer: The views and opinions expressed in this document are those of the author and do not necessarily reflect the official policy or position of any of the affiliations above.

Contents

Aim

This is a relatively naive analysis, which purpose is just to illustrate, with an example, Generalized Linear Model fitting and usage of Monte Carlo simulations to obtain empirical confidence intervals. The analysis assume that daily number of deaths associated to COVID-19 can be explained with just the covariate time. Hence, results shown in this report should be interpreted with caution.

Methods

Data

Data source: European Centre for Disease Prevention and Control (ECDC). Updated data can be automatically loaded using the R script available here. Data include daily mortality counts from the first day showing at least 20 deaths (2020-03-10) to the last available day (2020-07-04). Below we illustrate potential impacts of changing the source of data on the results.

Model

We fitted Poisson regression models, allowing for overdispersion using quasi-Poisson, to model the daily mortality count as a function of time. Specifically, we analyzed two different options for the mean mortality for day \(t\), \(\lambda_t\): \[\begin{equation} \label{eq:lambda1} \begin{aligned} \text{Model 1:}\quad\lambda_t &= \exp{(\beta_0 + \beta_1 t + \beta_2 t^2)} \\ \text{Model 2:}\quad\lambda_t &= t^\gamma\,\exp{(\beta_0 + \beta_1 t + \beta_2 t^2)} \end{aligned} \end{equation}\] Note that Model 1 is the particular case of Model 2 when \(\gamma = 0\). In both models, \(\beta_1\) is related to the initial exponential growth of the daily mortality and \(\beta_2\), which is expected to be negative, is related to the long-term decreasing of mortality. In addition, \(\gamma\) allows to asymmetry in the curve. Under these models, the peak location, \(T\), and the expected mortality count peak, \(M\), can be easily derived. Specifically, under Model 2, \[\begin{equation} \label{eq:peak2} T_2 = - \frac{\beta_1}{4 \beta_2} \left( 1 + \sqrt { 1 - \frac{8\beta_2 \gamma}{\beta_1^2} } \right) \quad\text{and}\quad M_2 = T_2^\gamma\,\exp{\left(\beta_0 + \frac{1}{2}\beta_1 T_2 - \frac{\gamma}{2}\right)} , \end{equation}\] while, under Model 1, \(T\) and \(M\) can be obtained as particular cases of \(T_2\) and \(M_2\) for \(\gamma = 0\), \[\begin{equation} \label{eq:peak1} T_1 = - \frac{\beta_1}{2 \beta_2} \quad\text{and}\quad M_1 = \exp{\left(\beta_0 - \frac{\beta_1^2}{4 \beta_2}\right)} . \end{equation}\] Under these models, the daily percent change in the expected mortality, \(\Delta_\%\lambda_{t-1 \to t}\), can also be easily derived. Specifically, under Model 2, \[\begin{equation} \label{eq:deltalambda2} \Delta_\%\lambda_{t-1 \to t} = \left\{ \left( \frac{t}{t - 1} \right) ^\gamma \exp { \left[ \beta_1 + \beta_2 (2 t - 1) \right] - 1 } \right\} \times 100\% , \end{equation}\] while, under Model 1, it is \[\begin{equation} \label{eq:deltalambda1} \Delta_\%\lambda_{t-1 \to t} = \left\{ \exp { \left[ \beta_1 + \beta_2 (2 t - 1) \right] - 1 } \right\} \times 100\% . \end{equation}\]

All analyses were performed using R version 3.6.3. In R, models can be fitted using

that perform maximum likelihood estimation (MLE) of \(\beta_0\), \(\beta_1\), \(\beta_2\) and \(\gamma\). Under MLE, the estimate of the vector \((\beta_0, \beta_1, \beta_2, \gamma)\) is normally distributed with vector of means and covariance matrix estimated in the model fitting. Hence, computing both point and interval estimates for any linear combination of \(\beta_0\), \(\beta_1\), \(\beta_2\) and \(\gamma\), or a monotonic function of such a linear combination as, for instance, \(\Delta_\%\lambda_{t-1 \to t}\), is straightforward based on propierties of the multivariate normal distribution. However, obtaining such a formula for expressions \(M\) and \(T\) is hard (if not impossible). Hence, 95% empirical confidence intervals for \(M\) and \(T\) were obtained using 5000 Monte Carlo simulations according to the probability distribution of MLE of \((\beta_0, \beta_1, \beta_2, \gamma)\).

Back to top

Results

Parameters estimates

Fitting the models to the daily time series of mortality counts from the first day showing at least 20 deaths (2020-03-10) to the last available day (2020-07-04), the following estimates were obtained:

\(\boldsymbol{\beta_0}\) (95% CI) \(\boldsymbol{\beta_1}\) (95% CI) \(\boldsymbol{\beta_2}\) (95% CI) \(\boldsymbol{\gamma}\) (95% CI)
Model 1 5.37 ( 4.02 , 6.44) 0.06 ( 0.00 , 0.12) -0.0008 (-0.0016 , -0.0002)
Model 2 -2.96 (-9.96 , 2.32) -0.21 (-0.38 , -0.06) 0.0007 (-0.0002 , 0.0016) 4.52 ( 1.85 , 7.84)

Animated daily update of the fitted model

The following figure shows the daily evolution of the results of fitting the model using available data until the day indicated in the plot.

Mortality peak

The following figure shows the daily evolution of the estimate of mortality peak.

Back to top

Daily change in mortality

The following figure shows the evolution of the estimated daily percent change of mortality.

Going back to lower mortality

The following figure explores the estimated dates to get back to 500, 300 and 100 daily deaths, according to the model fitted to the whole available daily time series (i.e. data until 2020-07-04).

Specifically:

Deaths Model 1 estimate (95% CI) Model 2 estimate (95% CI)
500 04-26 (04-01 , 05-12) 04-23 (04-11 , 05-02)
300 05-09 (04-23 , 05-25) 05-04 (04-26 , 05-14)
100 05-29 (05-16 , 06-22) 05-26 (05-16 , 06-08)

The following figure shows the daily evolution of previous estimates.

Despite the estimate of the peak under both models is relatively good and stable through the time, the estimates of the time needed to drop daily mortality to 500, 300 and 100 cases is not stable, showing a trend that can be interpretated as a lack of ability of these models to capture the asymmetry of the curve. Model 2 seems to perform slightly better than Model 1 when fitting the cues of the curve.

Back to top

Impact of the data source

To illustrative potential impact of the source data are gathered from, we considered as an alternative source of data the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE. The following figure shows discrepancies when comparing both data sources (left) and estimates of the parameter \(\beta_1\) and \(\beta_2\) in Model 1 (right), using data since the first day with reported deaths (2020-03-05).