Rt Comparison

Luis Alvarez, Jean-David Morel and Jean-Michel Morel

2022-05-19

Overview

In this vignette, we compare the results of the reproduction number estimation, Rt, obtained by the methods:

We will compare the different methods using the daily incidence reported cases in France, Germany, the USA and the UK. In this comparison, we will focus on the similarity of the different Rt estimations computing an optimal temporal shift between them. We will also analyze the regularity of the different estimations and the behavior of the estimations when we approach the current time. We do not intend, to conclude here, what is the best method. We want just, to contribute to a better understanding of the relation between the Rt estimations provided by these methods

Data and required packages

We attach some required packages

library(ggplot2)
library(dplyr)
library(grid)
library(EpiEstim)
library(EpiNow2)
library(EpiInvert)

Loading stored data on COVID-19 daily incidence up to 2022-05-05 for France, Germany, the USA and the UK:

data(incidence)
tail(incidence)
#>           date   FRA    DEU    USA    UK
#> 828 2022-04-30 49482  11718  23349     0
#> 829 2022-05-01 36726   4032  16153     0
#> 830 2022-05-02  8737 113522  81644    32
#> 831 2022-05-03 67017 106631  61743 35518
#> 832 2022-05-04 47925  96167 114308 16924
#> 833 2022-05-05 44225  85073  72158 12460

Loading some festive days for the same countries:

data(festives)
head(festives)
#>          USA        DEU        FRA         UK
#> 1 2020-01-01 2020-01-01 2020-01-01 2020-01-01
#> 2 2020-01-20 2020-04-10 2020-04-10 2020-04-10
#> 3 2020-02-17 2020-04-13 2020-04-13 2020-04-13
#> 4 2020-05-25 2020-05-01 2020-05-01 2020-05-08
#> 5 2020-06-21 2020-05-21 2020-05-08 2020-05-25
#> 6 2020-07-03 2020-06-01 2020-05-21 2020-06-21

Loading the serial interval proposed in Nishiura et al. which follows a log-normal distribution with a median equals to 4 days, a mean equals to 4.7 days and standard deviation of 2.9 days:

si_distrib <- read.csv(url("https://www.ctim.es/covid19/nishiura.csv"),header=FALSE)
head(si_distrib)
#>           V1
#> 1 0.00000000
#> 2 0.04169583
#> 3 0.16207936
#> 4 0.20316240
#> 5 0.17512594
#> 6 0.13039113

In what follows we will estimate Rt using the different methods in the case of France. For the other countries, we can do it in the same way, changing, in the script ‘FRA’ by ‘DEU’, ‘USA’ or ‘UK’

Rt estimation using the different methods

EpiEstim (EE)

We prepare the incidence data for EpiEstim:

  FRA_incid <- data.frame(
    dates = as.Date(incidence$date),
    I = incidence$FRA
  )
  FRA_incid$I[incidence$FRA < 0] <- 0

We compute the EpiEstim Rt estimate

  N<-length(FRA_incid$I)
  FRA_EpiEstim <- estimate_R(FRA_incid,
                            method="non_parametric_si",
                            config = make_config(list(
                            t_start = seq(30, N-6),
                            t_end = seq(30+6, N),
                            si_distr = si_distrib$V1))
  )
  tail(FRA_EpiEstim$R$Mean)
  [1] 0.7337554 0.7319374 0.7672001 0.7490753 0.7473938 0.7532675

Wallinga - Teunis (WT)

We compute the Wallinga-Teunis Rt estimate:

  FRA_Wall_Teun <- wallinga_teunis(FRA_incid,
                                   method="non_parametric_si",
                                   config = list(
                                   t_start = seq(30, N-6),
                                   t_end = seq(30+6, N),
                                   si_distr =si_distrib$V1,
                                   n_sim =0))
  tail(FRA_Wall_Teun$R$`Mean(R)`)
  [1] 0.6699915 0.6393988 0.6294066 0.5260823 0.4289424 0.3322501

EpiInvert (EI)

We compute the EpiInvert Rt estimate:

  FRA_EpiInvert <- EpiInvert(incidence$FRA,
                             "2022-05-05",
                             festives$FRA,
                             select_params(list(si_distr =si_distrib$V1,
                                                max_time_interval=9999)))
 tail(FRA_EpiInvert$Rt)
  [1] 0.7637433 0.7652671 0.7659675 0.7661722 0.7661985 0.7661985

EpiNow2 (EN)

EpiNow2 does not use the serial interval. Instead, it uses the generation time and the incidence is computed backward using distributions for the incubation period and reporting delay.

First we upload the information about these new required distributions:

reporting_delay <- estimate_delay(rlnorm(1000, log(2), 1),
                                  max_value = 15, bootstraps = 1)
generation_time <- get_generation_time(disease = "SARS-CoV-2", source = "ganyani")
incubation_period <- get_incubation_period(disease = "SARS-CoV-2", source = "lauer")

We were unable to run EpiNow2 for the full sequences of the incidence data with 833 samples. We have used a subsample of 490 data corresponding to the incidence after January 1, 2021. In the following instruction we eliminate the data of previous dates:

  d <- dplyr::filter(incidence,date >= "2021-01-01")

We prepare the incidence data to call to EpiNow2

  reported_cases <- data.frame(
    date = as.Date(d$date),
    confirm = d$FRA
  )

We execute EpiNow2 which includes a 7 day forecast of Rt

FRA_EpiNow2 <- epinow(reported_cases = reported_cases,
                    generation_time = generation_time,
                    delays = delay_opts(incubation_period, reporting_delay),
                    rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
                    stan = stan_opts(cores = 4))
tail(summary(FRA_EpiNow2, type = "parameters", params = "R")$mean)
[1] 0.8171678 0.8171678 0.8171678 0.8171678 0.8171678 0.8171678

Limitations of this study concerning the use of EpiNow2: The inputs of EpiNow2 are quite different to the ones of the previous methods. In particular, the reporting delay is country depending and we have not information about it for the countries we study, so, we use a synthetic example proposed in the EpiNow2 GitHub webpage. On the other hand the execution of EpiNow2 for the incidence data we use produced, in some cases, a warning. Next, we present a table with the summary of the EpiNow2 execution:

Country Execution time Warning message
France 10 hours -
Germany 4 hours There were 3 divergent transitions after warmup
the USA 6 hours There were 2 divergent transitions after warmup
the UK 2 hours There were 15 divergent transitions after warmup

Comparison results

Optimal shift bewteen the different Rt estimates

First, we will compare the Rt estimations provided by the different methods by estimating the optimal shift between every pair of methods. We assume that the Rt estimates are defined in the time interval [0,T] where the value in T corresponds to the last value of Rt provided by the method. Let R and R’ be two estimations of Rt provided by 2 methods. We define the optimal shift, s(R,R’), by minimizing the average error d(R,R’:s) :

\((1) \ \ \ \ s(R,R^{\prime})=\underset{s\in\lbrack-25,25]}{\arg\min} \ \ d(R,R^{\prime}\!:\!s)\equiv\sum_{t=30}^{T-30}\frac{|R(t)-R^{\prime}(t+s)|+|R(t-s)-R^{\prime}(t)|}{2(T-60)}\)

The value 30 is introduced in the formula to avoid boundary effects in the comparison. The lower is the value of d(R,R’:s) the better is the match between R and R’ after a shift given by s. Notice that the optimal shift is anti-symmetric, that is s(R,R’)=-s(R’,R) and d(R,R’:s) is symmetric, that is d(R,R’:s)=d(R’,R:s). To evaluate R(t-s) or R’(t+s) we use linear interpolation.

In the following 4 figures, we show, for the different countries, the plot of Rt obtained by the methods EE (EpiEstim), EI (EpiInvert), WT (Wallinga-Teunis) and EN (EpiNow2) after applying the optimal shift estimated between EI and the other methods. That is, we compute s(EI,EE), s(EI,WT) and s(EI,EN).

We observe a quite good agreement between the different Rt estimates after applying the optimal shift. The obtained shifted Rt estimates that we show in the above figures can be downloaded from:

  FRA_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/FRA_Rt_shift.csv"),sep=";")
  DEU_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/DEU_Rt_shift.csv"),sep=";")
  USA_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/USA_Rt_shift.csv"),sep=";")
  UK_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/UK_Rt_shift.csv"),sep=";")

Next, we present 2 tables with the mean and standard deviation of s(R,R’) and d(R,R’,s(R,R’)) for the different countries.


Table 1 : mean and standard deviation of s(R,R’) for the different countries

EI EE WT EN
EI 0 \(\pm\) 0 7.36 \(\pm\) 0.02 3.02 \(\pm\) 0.19 -8.92 \(\pm\) 0.43
EE -7.36 \(\pm\) 0.02 0 \(\pm\) 0 -4.32 \(\pm\) 0.03 -16.71 \(\pm\) 0.41
WT -3.02 \(\pm\) 0.19 4.32 \(\pm\) 0.03 0 \(\pm\) 0 -12.08 \(\pm\) 0.44
EN 8.92 \(\pm\) 0.43 16.71 \(\pm\) 0.41 -12.08 \(\pm\) 0.44 0 \(\pm\) 0

Table 2 : mean and standard deviation of d(R,R’,s(R,R’)) for the different countries

EI EE WT EN
EI 0 \(\pm\) 0 0.034 \(\pm\) 0.011 0.026 \(\pm\) 0.009 0.057 \(\pm\) 0.021
EE 0.034 \(\pm\) 0.011 0 \(\pm\) 0 0.025 \(\pm\) 0.007 0.058 \(\pm\) 0.016
WT 0.026 \(\pm\) 0.009 0.025 \(\pm\) 0.007 0 \(\pm\) 0 0.046 \(\pm\) 0.015
EN 0.057 \(\pm\) 0.021 0.058 \(\pm\) 0.016 0.046 \(\pm\) 0.015 0 \(\pm\) 0

In PNAS (2021), we show that the expected value of s(EI,EE) is about 3 +m where m is closed to a central value (median or mean) of the serial interval distribution. Since the used serial interval has a median of 4 and a mean of 4.7, the expected value of s(EI,EE) is between 7 and 7.7. Notice that, as shown in table 1, the obtained mean value of s(EI,EE) for the 4 countries we use is 7.36 which confirms our expectations. On the other hand, in
PNAS (2021), we also show that if R and R’ are the Rt estimations for WT and EE respectively, then

\(R(t)=\sum_k\Phi(k)R'(t+k)\)

the second term of this equation can be approximated in the following way:

\(\sum_k\Phi(k)R'(t+k) \approx R'(t+m)\)

where m is a central value of the serial interval \(\Phi\). Therefore, the expected value of s(WT,EE) is between 4 and 4.7 which is confirmed by the obtained value, 4.32, showed in table 1. On the other hand, as it is also expected, we have that, using the values of table 1, s(EI,EE)=7.36 is closed to s(EI,WT)+s(WT,EE)=3.02+4.32=7.34.

Concerning the shift of EN with respect to the other methods, we have that the most significant part of this shift is due to the 7-day forecast included in the EN estimation. In spite of the method EN is quite different to the other ones, the stability of the shift value between EN and the other methods for the different countries suggests that it must exist a theoretical reason which justify such shift value.

In the following plot we show the optimal shift s(R,R’) versus the average error d(R,R’,s(R,R’)) for any pair of methods and the different countries. As stated numerically in tables 1 and 2, we can observe that the value of s(R,R’) is very stable for the different countries and that the largest values of d(R,R’,s(R,R’)) corresponds to the comparison of EN with the other methods and the lowest values to the comparison of WT with EE and EI.

Regularity of the different Rt estimates

To measure the regularity of the different Rt estimates, we use the following averages of the absolute values of the first and second derivatives of Rt.

\((2) \ \ \ \ \text{FIRST DERIVATIVE AVERAGE:} \ \ ||dR/dt||=\sum_{t=30}^{T-30}\frac{|R(t+h)-R(t-h)|}{2(T-60)}\)

\((3) \ \ \ \ \text{SECOND DERIVATIVE AVERAGE:} \ \ ||d^2R/dt^2||=\sum_{t=30}^{T-30}\frac{|R(t+h)-2R(t)+R(t-h)|}{T-60}\)

In the following plot we compare the regularity of the different Rt estimates. We observe that by far, EN provides the smoother estimate, EE provides the most irregular estimates and EI and WT provide similar results. Concerning the regularity we have to take into account that we can tune the regularity of the methods changing the method parameters. For instance, in the case of WT and EE we can change the number of weeks to average the incidence and in the case of EI there exists a parameter to fit the regularity in the estimation. In this comparison we use the default parameters provided by the methods.


Comparison of the Rt estimations when we approach the current time

Next, we study, in more details, the behavior of the shifted Rt estimates when we approach the current time. In the next figure we show, for the different countries the shifted Rt estimates after March 1, 2022. We observe, as expected, that the precision of WT deteriorates as the current time approaches. The reason is that QT is designed to calculate Rt retrospectively, that is, to calculate Rt at time t it uses the incidence data up to time t+max(support(\(\Phi\))) (where support(\(\Phi\))=\(\{k: \Phi(k)>0\}\)). Therefore, the quality of the estimate deteriorates as we approach the current time. Regarding the methods EI, EE and EN, when the estimate varies smoothly in the 3 cases, as is the case for France, the final estimate of Rt is quite similar in the 3 cases. When the estimation of EE, which is the most irregular, has strong oscillations, the estimation of the 3 methods can be quite different. In the case of Germany, we can observe that the estimates of EE and EI are decreasing until reaching the last value of EE , but then the estimate of EI grows in the following days approaching the value of EN, this may be due, as showed in PNAS, 2021, to the fact that EI reacts before EE to a change in trend. As can be seen, the estimation of EN is very smooth. A smooth estimate has the advantage that it is more stable but on the other hand we can expect it to take longer to react to a change in trend. Therefore, it is necessary to seek a certain balance between the stability of the estimate and its ability to react to a change in trend. In any case, as already mentioned, by modifying the parameters of the methods it is possible to tune the regularity of the estimation.