Forecasting COVID-19 Deaths

A simple model for the estimation of the maximum percentage of deaths due to COVID-19

============================================================================

NOTE:

  1. I am not an epidemiologist.
  2. The predictions are not official, nor backed up by any organization nor government.
  3. I am providing them to get an idea of when the COVID-19 peaks may occur.
  4. I will update the predictions after 10:30pm CT every day.
  5. They will be based on the data release from CSSE at Johns Hopkins University. Hence, the data may reflect the day before.

“Disclaimer: Content from this website is STRICTLY ONLY for educational and research purposes and may contain errors. The model and data are inaccurate to the complex, evolving, and heterogeneous realities of different countries. Predictions are uncertain by nature. Readers must take any predictions with caution. Over-optimism based on some predicted end dates is dangerous because it may loosen our disciplines and controls and cause the turnaround of the virus and infection, and must be avoided.”

Statement copied from: https://ddi.sutd.edu.sg/

============================================================================

The predictions are based on a simple logistic model and the following assumptions:

70% of the population is exposed to SARS-CoV-2
15% infection efficacy, i.e. 15% of the exposed subjects turn into a COVID-19 case (10% before the 12/10/2020 update)
2.0% mortality

For each country we will use 100% of the urban population and 30% of the rural population.
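
As a rough worked example of how these assumptions combine, the sketch below multiplies the three rates and applies the result to a country's population at risk. It uses the 15% infection probability adopted in the 12/10/2020 update. The population figures and the names `urbanPopulation`, `ruralPopulation`, and `populationAtRisk` are made up for illustration and do not come from the data.

```r
# Assumed rates from the text (15% infection probability per the 12/10/2020 update)
exposureRate <- 0.70
infectionProbability <- 0.15
mortalityRate <- 0.02

# Expected maximum fraction of deaths in the population at risk
expectedFraction <- exposureRate * infectionProbability * mortalityRate

# Hypothetical country: these population figures are illustrative only
urbanPopulation <- 80e6
ruralPopulation <- 20e6
populationAtRisk <- 1.00 * urbanPopulation + 0.30 * ruralPopulation

# Expected maximum number of deaths under these assumptions
expectedMaxDeaths <- expectedFraction * populationAtRisk
print(expectedFraction)
print(expectedMaxDeaths)
```

Under these assumptions the expected maximum is 0.21% of the population at risk.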

I will estimate the death rates based on current trends fitted to the two-logistic model function.
The last days will be used for the peak estimations.
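
One way to picture a two-logistic function is sketched below. This is only an illustration and not necessarily the function used in the source code: the helper names (`logisticCDF`, `twoLogisticCDF`), the parameters (`peakDay`, `rateUp`, `rateDown`), their values, and the choice to join the two curves at the peak are all assumptions.

```r
# Standard logistic curve: cumulative fraction of the expected deaths at day t
logisticCDF <- function(t, peakDay, rate) {
  1 / (1 + exp(-rate * (t - peakDay)))
}

# Hypothetical two-logistic curve: one rate before the peak and another after it.
# Both pieces equal 0.5 at peakDay, so the joined curve is continuous.
twoLogisticCDF <- function(t, peakDay, rateUp, rateDown) {
  ifelse(t < peakDay,
         logisticCDF(t, peakDay, rateUp),
         logisticCDF(t, peakDay, rateDown))
}

# Illustrative parameter values (assumptions, not fitted to any country)
days <- 1:200
cdf <- twoLogisticCDF(days, peakDay = 60, rateUp = 0.15, rateDown = 0.06)

# The daily fraction of deaths is the day-to-day change of the cumulative curve
dailyFraction <- diff(cdf)
modelPeakDay <- days[which.max(dailyFraction)]
print(modelPeakDay)
```

A slower `rateDown` than `rateUp` gives a curve that rises quickly and decays slowly, which is the kind of asymmetry a single logistic cannot represent.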

I’ll show the model peak and the observed peak.

=====================================================================================
1/5/2021 Update:
Changed the algorithms to handle second waves and made the plots less cluttered.

1/1/2021 Update:
Changed the algorithms to handle second waves; the confidence intervals (CI) are now based on bootstrapping.
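
A bootstrapped confidence interval for a peak day could be sketched as follows. This is a minimal residual bootstrap on made-up data, not the procedure from the source code: the synthetic wave, the noise level, the number of resamples `nBoot`, and the use of `supsmu` as the curve estimate are all assumptions.

```r
set.seed(42)  # reproducible illustration

# Synthetic daily deaths: a bell-shaped wave plus noise (made-up data)
days <- 1:120
trueCurve <- 100 * exp(-((days - 50)^2) / (2 * 15^2))
dailyDeaths <- trueCurve + rnorm(length(days), sd = 8)

# Smooth once and keep the residuals around the smoothed curve
fit <- supsmu(days, dailyDeaths)
fitResiduals <- dailyDeaths - fit$y

# Residual bootstrap: resample the residuals, re-smooth, and record the peak day
nBoot <- 500
peakDays <- numeric(nBoot)
for (b in seq_len(nBoot)) {
  resampled <- fit$y + sample(fitResiduals, replace = TRUE)
  peakDays[b] <- days[which.max(supsmu(days, resampled)$y)]
}

# Percentile 95% confidence interval of the peak day
peakCI <- quantile(peakDays, c(0.025, 0.975))
print(peakCI)
```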

12/10/2020 Update:
I’m changing the probability of getting COVID-19 from 10% to 15%.

8/7/2020 Update:
The modeling function now has two logistics: one for the uptrend and the second for the downtrend. The two functions are blended for the final fitting.

6/17/2020 Update:
Changed initial fit conditions and fit stop criteria.

5/24/2020 Update:

I’m updating the second wave estimations. It seems that we have learned something, so I’ll change the death rate from 4.0% to 2.0%.

5/23/2020 Update:

I’m getting ready for the second wave, so I have to adjust the mortality rate from 1.0% to 4.0%, based on NYC data.

5/8/2020 Update:

Code update. I simplified the code and corrected minor bugs.

5/7/2020 Update:

I replaced lowess smoothing with Friedman’s SuperSmoother.

4/28/2020 Update:

The deaths per day are now smoothed by a median filter.
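
For illustration, a running median can be applied with base R's `runmed`. The counts below are made up, and the 5-day window is an assumption rather than the setting used in the source code.

```r
# Made-up daily death counts with a reporting spike (60) and a dip (2)
dailyDeaths <- c(10, 12, 15, 60, 18, 20, 2, 24, 27, 30, 29, 31)

# Running median over a 5-day window (k must be odd); window size is an assumption
smoothedDeaths <- as.numeric(runmed(dailyDeaths, k = 5))
print(smoothedDeaths)
```

A median filter suppresses isolated spikes and dips, such as delayed weekend reports, without spreading them over neighboring days the way a moving average would.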

4/16/2020 Update:

Now the plots include the 95% confidence intervals of the expected peak. The estimations are done for the original estimations and the optimistic ones.

Notes:

  1. If the current observations do not match the estimated CDF, do not trust the predictions.
  2. If the predicted CDF does not reach 1.0, then there is the potential for more infection in the future, or COVID-19 is not as deadly as I estimated.

Source code:

https://github.com/joseTamezPena/COVID_Forecasting

# The expected fraction of deaths in each country:
# exposure * probability of infection * mortality

expectedtotalFatalities <- 0.70*0.15*0.02
optGain <- 3
# The number of observations used for the trends
daysWindow <- 17

today <- Sys.Date()

currentdate <- paste(as.character(today),":") 

Loading the data

The data is the global deaths time series from CSSE at Johns Hopkins University:

https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv
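
The loading code is not shown here, so the following is only a sketch of how a per-country table such as `time_covid19Country` might be built. It assumes the file's layout as read by `read.csv` (columns `Province.State`, `Country.Region`, `Lat`, `Long`, then one column per date); the helper `aggregateByCountry` and the toy table are hypothetical and contain no real counts.

```r
# Sum province-level rows into one row per country.
# Assumes columns: Province.State, Country.Region, Lat, Long, then one column per date.
aggregateByCountry <- function(deaths) {
  metaColumns <- c("Province.State", "Country.Region", "Lat", "Long")
  dateColumns <- setdiff(colnames(deaths), metaColumns)
  byCountry <- rowsum(deaths[, dateColumns, drop = FALSE],
                      group = deaths$Country.Region)
  as.data.frame(byCountry)
}

# Tiny made-up table in the same layout (not real counts)
toyDeaths <- data.frame(
  Province.State = c("A", "B", ""),
  Country.Region = c("CountryX", "CountryX", "CountryY"),
  Lat = c(0, 1, 2),
  Long = c(0, 1, 2),
  X1.22.20 = c(0, 1, 0),
  X1.23.20 = c(2, 3, 1)
)

# Rows are countries, columns are dates with cumulative deaths
time_covid19Country <- aggregateByCountry(toyDeaths)
print(time_covid19Country)
```

With the real file, `toyDeaths` would be replaced by the data frame returned by `read.csv` on the downloaded CSV.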

Plotting some trends

# Rank the countries by their most recent cumulative death count (last column)
Country.Region <- rownames(time_covid19Country)
totaldeaths <- as.numeric(time_covid19Country[,ncol(time_covid19Country)])
names(totaldeaths) <- Country.Region
totaldeaths <- totaldeaths[order(-totaldeaths)]


# Plot the country with the most deaths, starting at its first reported death
ydata <- as.numeric(time_covid19Country[names(totaldeaths[1]),])
ydata <- ydata[ydata > 1e-6]
plot(ydata,main="# Fatalities",xlab="Days",ylab="Fatalities",xlim=c(1,ncol(time_covid19Country)))
text(length(ydata)-1,ydata[length(ydata)],names(totaldeaths[1]))

# Overlay the cumulative curves of the 30 countries with the most deaths
for (ctr in names(totaldeaths[1:30]))
{
  ydata <- as.numeric(time_covid19Country[ctr,])
  ydata <- ydata[ydata > 1e-6]
  lines(ydata)
  text(length(ydata)-1,ydata[length(ydata)],ctr)
}


# Keep only the countries with more than 5000 reported deaths
totaldeaths  <- totaldeaths[!is.na(totaldeaths)]
totaldeaths  <- totaldeaths[totaldeaths > 5000]

Estimating the peaks