We developed a mathematical model to provide epidemic predictions for the COVID-19 epidemic in Tunisia. We use reported case data up to 12 April 2020 from the Tunisian Center for Disease Control and Prevention to parameterize the model. We identify the number of unreported cases. We then use the model to project the epidemic forward with varying levels of public health interventions. The model predictions emphasize the importance of major public health interventions in controlling COVID-19 epidemics.
We followed the research work of Liu et al (2020). We were guided by its reasoning, procedures, and conclusions.
corona virus; reported and unreported cases; isolation; quarantine; public closings; epidemic mathematical model
Our objective is to develop a mathematical model, which recovers from data of reported cases, the number of unreported cases for the COVID-19 epidemic in Tunisia. For this epidemic, a modeling approach has been developed in Tang et al. 2020, which did not consider unreported cases. Our work continues the investigation in Magal et al. (2020), Ducrot et al. (2020) and Liu et al. (2020) of the fundamental problem of parameter identification in mathematical epidemic models. We address the following fundamental issues concerning this epidemic: How will the epidemic evolve in Tunisia with respect to the number of reported cases and unreported cases? How will the number of unreported cases influence the severity of the epidemic? How will public health measures, such as isolation, quarantine, and public closings, mitigate the final size of the epidemic?
Our model consists of the following system of ordinary differential equations:
\[\begin{eqnarray} S'(t) &= &-\tau S(t) [I(t) + U(t)], \\ I'(t) &=& \tau S(t) [I(t) + U(t)] - \nu I(t), \\ R'(t) &=& \nu_1 I(t) - \eta R(t), \\ U'(t) &=& \nu_2 I(t) - \eta U(t). \\ \end{eqnarray}\]Here, \(t \geq t_0\) is time in days, \(t_0\) is the beginning date of the epidemic, \(S(t)\) is the number of individuals susceptible to infection at time \(t\), \(I(t)\) is the number of asymptomatic infectious individuals at time \(t\), \(R(t)\) is the number of reported symptomatic infectious individuals (i.e. symptomatic infectious with sever symptoms) at time \(t\), and \(U(t)\) is the number of unreported symptomatic infectious individuals (i.e. symptomatic infectious with mild symptoms) at time \(t\). This system is supplemented by initial data \[S(t_0) = S0 > 0,\; I(t_0) = I_0 > 0,\; R(t_0) = 0\qquad and\qquad U(t_0) = U_0 \geq 0.\]
The flow chart of the model and the parameters are listed below:
We use a set of reported data to model the epidemic in Tunisia: data from the Tunisian Health Commission for Tunisia. It represents the epidemic transmission in Tunisia. The first case was detected on March 2, 2020.
Infected <- c(1,1,1,1,2,5,6,7,13,16,18,20,24,29)
Infected <- c(Infected,39,54,60,75,89,114,173,197,229,257,312,362,394,423,455)
Infected <- c(Infected,495,553,574,596,623,628,643,671,685,707,726,747,780,822,864,866,879,884,901,909,918,922,949, 967, 975, 980,994,998,1009,1013,1018,1022,1025,1026)
#Infected <- c(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 110.0, 110.0, 110.0, 330.0, 330.0, 330.0, 550.0, 660.0, 660.0, 880.0, 880.0, 990.0, 1100.0, 1320.0, 1540.0, 1980.0, 2090.0, 2420.0, 2420.0, 2530.0, 2640.0, 2750.0, 2750.0, 3080.0, 3410.0, 3740.0, 3740.0, 3850.0, 4070.0, 4070.0, 4070.0)
death <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0)
death <- c(death,1,1,1,3,3,3,5,6,6,8,8,9,10,12,14)
death <- c(death,18,19,22,22,23,24,25,25,28,31,34,34,35,37,37,37,38,38,38,38,38,38,38,39, 40,40,41,41,42,42,43,43,43,44)
Day <- 1:(length(Infected))
day <- 1:(length(death))
S0 <- 11694720
data <- data.frame(Day,Infected)
plot1 <-ggplot(data = data, aes(x=Day, y=Infected))+geom_point()
plot1 <- plot1 + geom_smooth(method="loess",formula=y~x,se=T)+xlab('')+ ylab('Count')
plot1 <- plot1 + labs(title='Total infections COVID-19 Tunisia')
plot1 <- plot1 + theme(axis.text.x=element_text(angle=45, hjust=1))
dat <- data.frame(day,death)
plot2 <-ggplot(dat,aes(x=day, y=death))+geom_point()
plot2 <- plot2 + geom_smooth(method='loess',formula=y~x,se=T)
plot2 <- plot2 +xlab('')+ ylab('Count')+ labs(title='Local Death Cases')
plot2 <- plot2+theme(axis.text.x=element_text(angle=45, hjust=1))
## show two plots side by side
grid.arrange(plot1, plot2, ncol=2)
We assume that the removal rate \(\nu\) takes the following form: \(\nu = \nu_1 + \nu_2\), where \(\nu_1\) is the removal rate of reported symptomatic infectious individuals, and \(\nu_2\) is the removal rate of unreported symptomatic infectious individuals due to all other causes, such as mild symptom, or other reasons.
The cumulative number of reported symptomatic infectious cases at time \(t\) is denoted by \(CR(t)\). We assume that \(CR(t)\) has the following special form: \[CR(t) = \chi_1 exp (\chi_2t) - \chi_3.\] We obtain the model starting time of the epidemic \(t_0\): \[CR(t_0) = 0 \qquad\Leftrightarrow\qquad \chi_1exp(\chi_2t_0)-\chi_3 = 0 \qquad\Rightarrow\qquad t_0 = \frac{1}{\chi_2} (ln(\chi_3)-ln(\chi_1)).\] We fix S0 = 11694720, which corresponds to the total population of Tunisia. We assume that the variation in \(S(t)\) is small during the period considered, and we fix \(\nu, \eta, f\). We can estimate the parameters \(\nu_1, \nu_2, \tau\) and the initial conditions \(U_0\) and \(I_0\) from the cumulative reported cases \(CR(t)\). We then construct numerical simulations and compare them with data.
We obtain \(I(t) = I_0\; exp (\chi_2 (t - t_0))\) and \(I_0 = \frac{\chi_3 \chi_2}{f \nu}\). We must have \(U(t) = U_0\; exp(\chi_2(t-t_0))\). So, by substituting these expressions into previous identities, we obtain: \(\chi_2\; I_0 = \tau S_0\; (I_0 + U_0) - \nu I_0\), \(\chi_2\; U_0 = \nu_2\; I_0 - \eta U\), \[\tau = \frac{\chi_2+\nu}{S_0} \frac{\eta +\chi_2}{(1 -f) * \nu + \eta + \chi_2}\] and \(U_0 = \frac{(1-f)\nu}{\eta + \chi_2} I_0\).
Here, we fix \(\tau\) in such a way that the value \(\chi_2\) becomes the dominant eigenvalue of \[A = \left(\begin{matrix} \tau S_0 - \nu & \tau S_0\\ \nu (1-f) & - \eta \end{matrix}\right)\] and \((I_0, U_0)\) is the positve eigenvector associated with this dominant eigenvalue \(\chi_2\). Thus, we apply implicitly the Perron–Frobenius theorem. Moreover, the exponentially growing solution \((I(t), U(t))\) that we consider (which is starting very close to \((0,0)\)) follows the direction of the positive eigenvector associated with the dominant eigenvalue \(\chi_2\).
The basic reproductive number becomes: \[ R_0 = \frac{\chi_2+\nu}{\nu} \frac{\eta + \chi_2}{(1-f)\nu+\eta+\chi_2} (1+\frac{(1-f)\nu}{\eta}).\]
We can find multiple values of \(\eta\), \(\nu\) and \(f\) which provide a good fit for the data. For application of our model, \(\eta\), \(\nu\) and \(f\) must vary in a reasonable range. For the corona virus COVID-19 epidemic in Tunisia at its current stage, the values of \(\eta\), \(\nu\) and \(f\) are not known. From preliminary information, we use the value
f = 0.8
eta = 1/7
nu = 1/7
mod <- nlsLM(Infected ~ (a * exp(b * Day) - c), start=list(a=0, b=0, c=0))
## Warning in nls.lm(par = start, fn = FCT, jac = jac, control = control, lower = lower, : lmdif: info = -1. Number of iterations has reached `maxiter' == 50.
(RSS.p <- sum(residuals(mod)^2)) # Residual sum of squares
## [1] 372677.8
(TSS <- sum((Infected - mean(Infected))^2)) # Total sum of squares
## [1] 9557633
1 - (RSS.p/TSS) # R-squared measure
## [1] 0.9610073
chi <- coef(mod)
fit_mod <- chi["a"] * exp(chi["b"] * Day) - chi["c"]
head(fit_mod)
## [1] -132.88688 -113.25244 -93.57678 -73.85982 -54.10145 -34.30161
df <- data.frame(Day, fit_mod)
head(df)
plot3 <- ggplot(data=df, aes(x=Day, y=fit_mod), col="red")+geom_point(col="red")
#grid.arrange(plot1, plot3, ncol=2)
plot3 <- plot1 + geom_point(data=df,aes(x=Day, y=fit_mod), color="red")
plot3
chi2 <- chi["b"]
chi2
## b
## 0.002097267
chi1 <- chi["a"]
chi2 <- chi["b"]
chi3 <- chi["c"]
tau = (chi2+nu)/S0 * (eta + chi2)/((1 - f) * nu + eta + chi2)
print(tau)
## b
## 1.035402e-08
Opt_par <- c(tau, f, nu, eta)
t0 <- 1/chi2 * (log2(chi3) - log2(chi1))
print(Opt_par)
## b
## 1.035402e-08 8.000000e-01 1.428571e-01 1.428571e-01
print(c(chi1, chi2, chi3, t0))
## a b c b
## 9.332510e+03 2.097267e-03 9.484990e+03 1.114838e+01
In date of April 13th and according to data set \(t = 0\) of the epidemy will correspond to 20 Februray, in fact the value \(t_0 = 10.97\) means that the starting time of the epidemic is 2 Mars + 11 days. In the other hand, as long as the number of reported cases follows, we can predict the future values of \(CR(t)\). For \(\chi_1 = 76.7\), \(\chi_2 = 0.07\), and \(\chi_3 = 128\), we obtain:
n <- length(Day)
n
## [1] 63
Dday <- c(37:45)
print(Dday)
## [1] 37 38 39 40 41 42 43 44 45
fit_mod <- chi["a"] * exp(chi["b"] * Dday) - chi["c"]
fit_mod
## [1] 600.5514 621.7257 642.9444 664.2077 685.5156 706.8683 728.2657 749.7081
## [9] 771.1956
We have to deal carefully: the exponential formula overestimates the number reported after day 30.
A <- matrix(c(tau*S0-nu, tau*S0, nu*(1-f), -eta), nrow = 2)
eigen(A)
## eigen() decomposition
## $values
## [1] -0.166724220 0.002097267
##
## $vectors
## [,1] [,2]
## [1,] -0.1933855 0.7674604
## [2,] 0.9811229 0.6410963
By using formula for the basic reproduction number, we obtain from the data in Infected that:
x <- seq(0.01, 1, by=0.1)
y <- seq(1, 10, by=1)
z <- tau*S0*y * (1 + (1-x)/(y*eta))
library(scatterplot3d)
scatterplot3d(x, y, z, highlight.3d = TRUE, col.axis = "blue",
col.grid = "lightblue", main = "Evolution of basic reproductive number R0", pch = 20,
xlab="f", ylab="1/nu", zlab="R0")
#library(rgl)
#myColorRamp <- function(colors, values) {
# v <- (values - min(values))/diff(range(values))
# x <- colorRamp(colors)(v)
# rgb(x[,1], x[,2], x[,3], maxColorValue = 255)
#}
#cols <- myColorRamp(c("red", "blue"), z)
#plot3d(x = x, y = y, z = z, col = cols)
I0 <- chi3 * chi2 / (f * nu)
print(I0)
## c
## 174.0599
U0 <- I0 * (1 - f) * nu / (eta + chi2)
print(U0)
## c
## 34.3083
R0 <- tau*S0/nu * (1 + (nu*(1-f))/eta)
print(head(R0))
## b
## 1.017134
print(c(S0, I0, U0, R0))
## c c b
## 1.169472e+07 1.740599e+02 3.430830e+01 1.017134e+00
#crt <- I0 * exp(chi2 * (Day-t0))
crt <- chi1 * exp(chi2 * Day) - chi3
ut <- U0 * exp(chi2 * (Day-t0))
df <- data.frame(Day, crt, ut)
#head(df)
plot3 <- ggplot(data=df, aes(x=Day, y=crt), col="red", colour="Infected")+geom_point(col="red")
plot3 <- plot3 + geom_line(aes(x = seq(1,n,1), y = Infected), alpha = .5, col="black")
plot3 <- plot3 + geom_point(aes(x = seq(1,n,1), y = Infected), alpha = .2, col="black")
plot3 <- plot3 + geom_point(data=df,aes(x=Day+t0, y=U0*ut), color="green")
plot3
See ON Bjørnstad (2018) Epidemics: Models and Data using R. Springer.
In the following, we take into account the fact that very strong isolation measures have been imposed for all Tunisia since 24 Mars. Specifically, since 24 Mars, families in Tunisia are required to stay at home. In order to take into account such a public intervention, we assume that the transmission of COVID-19 from infectious to susceptible individuals stopped after 26 Mars.
N <- S0
SEIR <- function(time, state, parameters) {
par <- as.list(c(state, parameters))
with(par, {
dS <- -Rt *S * I / N
dI <- alpha * U - gamma * I
dU <- Rt * S * I / N - alpha * U
dR <- gamma * I
list(c(dS, dI, dU, dR))
})
}
times<-seq(0,240,1)
parms<-c(alpha=0.6458206, gamma=0.24919, Rt=1.85908113
)
xstart<-c(S=S0, I=Infected[1], U=0, R=0)
model<-function(t,x,parms){
#taut <- ifelse(t<26, 9, (chi2+nu)/S0 * (eta + chi2)/((1 - f) * nu + eta + chi2))
S <-x[1]
I <-x[2]
U <-x[3]
R <-x[4]
with(as.list(parms),{
dS <- -Rt *S * I / N
dI <- alpha * U - gamma * I
dU <- Rt * S * I / N - alpha * U
dR <- gamma * I
list(c(dS, dI, dU, dR))
})
}
out<-as.data.frame(lsoda(xstart,times,model,parms))
p <- ggplot()
p <- p + geom_line(out, mapping=aes(x=times,y=out$S), col="black")
p <- p + geom_line(out, mapping=aes(x=times,y=out$I), col="red")
p <- p + geom_line(out, mapping=aes(x=times,y=out$U), col="blue")
#p <- p + scale_color_discrete(name = "Y series", labels = c("Y2", "Y1"))
p <- p + geom_line(out, mapping=aes(x=times,y=out$R), col="green")
p <- p + scale_color_manual(labels =c("Sensitive", "Infected", "Exposes", "Removed"), values = c("black", "red", "blue", "green"))
p <- p + scale_fill_discrete(name="Legend",
breaks=c("out$S", "out$I", "out$U", "out$R"),
labels=c("Sensitive", "Infected", "Exposed", "Removed"))
p <- p + labs(title="Evolution of Infected cases (red) and Exposed (blue)", x ="Days from 02 March", y = "Estimated number in Tunisia")
#p <- p + theme_bw() + guides(color=guide_legend("my title"))
p + theme(legend.position="top")
This is a strange year! In Tunisia since the beginning of March, and the appearance of the first case of COVID-19, we are more than ever worried. Here are some simulations following an SIR model. Data are missing, since for this model to be relevant it would require at least 10 more days of data’s evolution.
I will use the same model already used in Epidemiology: How contagious is Novel Coronavirus (2019-nCoV)?; https://blog.ephorie.de/epidemiology-how-contagious-is-novel-coronavirus-2019-ncov
library(deSolve)
# I followed the SIR used in https://www.r-bloggers.com/covid-19-the-case-of-germany/
SIR <- function(time, state, parameters) {
par <- as.list(c(state, parameters))
with(par, {
dS <- -beta/N * I * S
dI <- beta/N * I * S - gamma * I
dR <- gamma * I
list(c(dS, dI, dR))
})
}
N <- S0
init <- c(S = N-Infected[1], I = Infected[1], R = 0)
RSS <- function(parameters) {
names(parameters) <- c("beta", "gamma")
out <- ode(y = init, times = Day, func = SIR, parms = parameters)
fit <- out[ , 3]
sum((Infected - fit)^2)
}
Opt <- optim(c(0.5, 0.5), RSS, method = "L-BFGS-B", lower = c(0, 0), upper = c(1, 1)) # optimize with some sensible conditions
Opt$message
## [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
## [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
Opt_par <- setNames(Opt$par, c("beta", "gamma"))
Opt_par
## beta gamma
## 0.5600396 0.4399604
## beta gamma
t <- 1:120 # time in days
fit <- data.frame(ode(y = init, times = t, func = SIR, parms = Opt_par))
col <- 1:3 # colour
matplot(fit$time, fit[ , 3:3], type = "l", xlab = "Day", ylab = "Number of subjects", lwd = 2, lty = 1, col = col)
matplot(fit$time, fit[ , 2:4], type = "l", xlab = "Day", ylab = "Number of subjects", lwd = 2, lty = 1, col = col, log = "y")
## Warning in xy.coords(x, y, xlabel, ylabel, log = log): 1 y value <= 0
## omitted from logarithmic plot
## Warning in xy.coords(x, y, xlabel, ylabel, log = log): 1 y value <= 0
## omitted from logarithmic plot
points(Day, Infected)
legend("bottomright", c("Susceptible", "Infected", "Recovered"), lty = 1, lwd = 2, col = col, inset = 0.05)
title("SIR model Covid-19 Tunisia", outer = TRUE, line = -2)
R0 <- setNames(Opt_par["beta"] / Opt_par["gamma"], "R0")
print(R0)
## R0
## 1.272932
fit[fit$I == max(fit$I), "I", drop = FALSE] # height of pandemic
max_infected <- max(fit$I)
max_infected / 5 # severe cases
## [1] 58068.04
max_infected * 0.06 # cases with need for intensive care
## [1] 17420.41
# https://www.newscientist.com/article/mg24532733-700-why-is-it-so-hard-to-calculate-how-many-people-will-die-from-covid-19/
max_infected * 0.005 # deaths with supposed 0.7% fatality rate
## [1] 1451.701
An epidemic outbreak of a new human coronavirus COVID-19 will occur in Tunisia. For this outbreak, the unreported cases and the disease transmission rate are not known. In order to recover these values from reported medical data, we present the mathematical model for outbreak diseases. By knowledge of the cumulative reported symptomatic infectious cases, and assuming the fraction \(f\) of asymptomatic infectious that become reported symptomatic infectious cases, the average time \(1/\nu\) asymptomatic infectious are asymptomatic, and the average time \(1/\eta\) symptomatic infectious remain infectious, we estimate the epidemiological parameters in model. We then make numerical simulations of the model to prodict forward in time the severity of the epidemic. We observe that public health measures, such as isolation, quarantine, and public closings, greatly reduce the final size of the epidemic, and make the turning point much earlier than without these measures.
We observe that the predictive capability of our model requires valid estimates of the parameters \(f\), \(\nu\), and \(\eta\), which depend on the input of medical and biological epidemiologists. Our results can contribute to the prevention and control of the COVID-19 epidemic in Tunisia. As a consequence of our study, we note that public health measures, such as isolation, quarantine, and public closings, greatly reduce the final size of this epidemic, and make the turning point much earlier than without these measures. With our method, we fix \(f\), \(\nu\), and \(\eta\) and get the same turning point for the data set. We choose \(f = 0.8\), which means that around \(80\%\) of cases are reported in the model, since cases are very well documented in China. Thus, we only assume that a small fraction, \(20\%\), were not reported. This assumption may be confirmed later on.
We also vary the parameters \(f\), \(\nu\), and \(\eta\), and we do not observe a strong variation of the turning point. Nevertheless, the number of reported cases are very sensitive to the data set, as shown in the figures. Formula for \(CR(t)\) is very descriptive until 26 Mars for the reported case data for Tunisia data. This suggests that the turning point is very robust, while the number of cases is very sensitive. We can find multiple values of \(f\), \(\nu\), and \(\eta\) that provide a good fit for the data. This means that \(f\), \(\nu\), and \(\eta\) should also be evaluated using other methods. The values \(1/\eta = 7\) days and \(1/\nu = 7\) days are taken from information concerning earlier corona viruses, and are used now by medical authorities.
Liu, Zhihua, et al. “Understanding unreported cases in the COVID-19 epidemic outbreak in Wuhan, China, and the importance of major public health interventions.” Biology 9.3 (2020): 50.
Magal, P.; Webb, G. The parameter identification problem for SIR epidemic models: Identifying Unreported Cases. J. Math. Biol. 2018, 77, 1629–1648.
Tang, B.; Wang, X.; Li, Q.; Bragazzi, N.L.; Tang, S.; Xiao, Y.; Wu, J. Estimation of the Transmission Risk of 2019-nCov and Its Implication for Public Health Interventions. J. Clin. Med. 2020, 9, 462. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3525558 (accessed on 30 January 2020).
MJ Keeling and P Rohani (2008) Modeling Infectious Diseases in Humans and Animals. Princeton University Press.