Purpose

Predict the number of COVID-19 cases in North Dakota using the Kermack-McKendrick (KM) SIR epidemiological model.

Current Observations

Number of new cases per day.

Cumulative number of cases by day.

Observations based on: https://www.health.nd.gov/diseases-conditions/coronavirus/north-dakota-coronavirus-cases-mobile-friendly

Model Limitations and Assumptions

Closed SIR Model Limitations | Things to Keep in Mind:

Kermack-McKendrick (KM) SIR Model Assumptions:

  1. Large population size: in small populations, stochastic effects dominate

  2. Exponentially-distributed waiting times in epidemiological compartments

  3. Closed population: no entry into or departure from the population, except possibly through disease-induced death

  4. The time scale of the disease is assumed to be much faster than the time scale of births and deaths (so that demographic effects on the population may be ignored)

  5. Homogeneous mixing: each individual in the population has an equal chance of interacting with any other
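
For reference, the closed SIR system that these assumptions describe is usually written as below. The symbols are standard textbook notation rather than definitions taken from this report: S, I and R are the susceptible, infected and removed counts, N is the constant population size, β is the transmission rate and γ is the removal rate.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \\
S + I + R &= N, \qquad R_0 = \frac{\beta}{\gamma}.
\end{aligned}
```

The mass-action term βSI/N encodes homogeneous mixing (assumption 5), the constant per-capita rate γ corresponds to exponentially distributed time in the infected compartment (assumption 2), and the constant total N reflects the closed population without births or deaths (assumptions 3 and 4).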

Calculating R0

Using the available incidence data (June 18, 2020) to estimate R0.

## 
## Call:
## lm(formula = log(cumulative) ~ date, data = cases)
## 
## Coefficients:
## (Intercept)         date  
## -1189.02144      0.06502
## 
## Call:
## lm(formula = log(cumulative) ~ date, data = subset(cases, status == 
##     "confined"))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.2632 -0.4468  0.3386  0.7837  0.9655 
## 
## Coefficients:
##                 Estimate   Std. Error t value            Pr(>|t|)    
## (Intercept) -1189.021443    65.600856  -18.12 <0.0000000000000002 ***
## date            0.065021     0.003569   18.22 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.03 on 98 degrees of freedom
## Multiple R-squared:  0.7721, Adjusted R-squared:  0.7697 
## F-statistic: 331.9 on 1 and 98 DF,  p-value: < 0.00000000000000022
## estimate of R0 is 1.4876552614195 | CI = ( 0.434538294548643 , 1.54077222829037 )

Estimated R0 is 1.4877 | 97.5% CI = (0.4345, 1.5408)
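
The report does not show how the regression slope is turned into R0. One common choice is the linearised SIR relation R0 = 1 + r·D, where r is the exponential growth rate (the `date` coefficient) and D is the mean infectious duration. The Python sketch below (the original analysis was done in R) assumes that relation with D = 7.5 days. That value is an assumption, chosen because it reproduces the reported point estimate; the critical value is the approximate 0.975 quantile of Student's t with 98 degrees of freedom.

```python
# Values copied from the regression summary in the report.
SLOPE = 0.065021       # "date" coefficient: growth rate of log(cumulative) per day
SLOPE_SE = 0.003569    # standard error of that coefficient

# ASSUMPTIONS (not stated in the report):
T_CRIT = 1.9845                # approx. 0.975 quantile of Student's t, 98 df
ASSUMED_DURATION_DAYS = 7.5    # mean infectious duration D in R0 = 1 + r * D


def r0_from_growth_rate(growth_rate, duration_days=ASSUMED_DURATION_DAYS):
    """Linearised SIR relation R0 = 1 + r * D for early exponential growth."""
    return 1.0 + growth_rate * duration_days


def r0_interval(slope, slope_se, t_crit=T_CRIT):
    """Map a two-sided interval for the slope onto the R0 scale."""
    half_width = t_crit * slope_se
    return (r0_from_growth_rate(slope - half_width),
            r0_from_growth_rate(slope + half_width))


r0_hat = r0_from_growth_rate(SLOPE)
r0_low, r0_high = r0_interval(SLOPE, SLOPE_SE)
print(f"R0 = {r0_hat:.4f}, interval = ({r0_low:.4f}, {r0_high:.4f})")
```

Under these assumptions the sketch gives R0 ≈ 1.4877 with an interval of roughly (1.4345, 1.5408). The point estimate and upper bound agree with the reported values. The reported lower bound of 0.4345 equals r·D at the lower slope limit without the added 1, so that figure may be worth rechecking.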

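# According to the Kermack-McKendrick model, knowing R0 determines how many people will have been infected by the end of the epidemic: the cumulative number infected $z(\infty)$ satisfies $1 - z(\infty)/N = e^{-R_0 z(\infty)/N}$, which rearranges to $R_0 = -\ln\bigl(1 - z(\infty)/N\bigr) / \bigl(z(\infty)/N\bigr)$.
# If R0 is known from the beginning of the epidemic and the Kermack-McKendrick model is assumed to hold, the final size follows directly: with R0 = 1.8, about 73% of the population will have been infected by the end.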
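
The final-size relation has no closed-form solution for the infected fraction, but it is easy to solve numerically. The following Python sketch uses plain bisection; the function names are illustrative and it assumes a closed population that starts almost entirely susceptible.

```python
import math


def final_size_fraction(r0, tol=1e-12):
    """Solve 1 - z = exp(-r0 * z) for the non-zero root z in (0, 1).

    z is the fraction of a closed, initially susceptible population that is
    eventually infected. For r0 <= 1 the only root is z = 0.
    """
    if r0 <= 1.0:
        return 0.0

    def f(z):
        return 1.0 - z - math.exp(-r0 * z)

    # f is positive just above 0 and negative at 1 when r0 > 1,
    # so the non-zero root is bracketed by (lo, hi).
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r0_from_final_size(z):
    """Invert the relation: R0 = -ln(1 - z) / z for 0 < z < 1."""
    return -math.log(1.0 - z) / z


print(round(final_size_fraction(1.8), 3))
```

For R0 = 1.8 this gives a final infected fraction of about 0.73, and `r0_from_final_size` recovers 1.8 from that fraction.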

SIR Model Projections (Based on Closed SIR System)

Plot showing the predicted vs. observed number of cases over time, in days since the first known case.

“Predicting the Storm”

A table of the predicted numbers of cases and susceptibles over time, in days.
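
A table of this kind can be produced by integrating the closed SIR equations numerically. The Python sketch below uses a fixed-step fourth-order Runge-Kutta scheme. It is not the author's R code, and apart from the R0 estimate every input is a hypothetical value chosen for illustration: the population size, the 7.5-day infectious period and the single initial case.

```python
def simulate_sir(population, r0, infectious_days, initial_infected=1.0,
                 days=200, steps_per_day=10):
    """Integrate the closed SIR equations with fixed-step RK4.

    Returns a list of (day, S, I, R) tuples, one per whole day.
    """
    gamma = 1.0 / infectious_days      # removal rate
    beta = r0 * gamma                  # transmission rate, since R0 = beta / gamma
    dt = 1.0 / steps_per_day

    def deriv(s, i):
        new_infections = beta * s * i / population
        return -new_infections, new_infections - gamma * i

    s, i = population - initial_infected, initial_infected
    rows = [(0, s, i, 0.0)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            k1s, k1i = deriv(s, i)
            k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
            k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
            k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
            s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
            i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
        # R follows from conservation of the closed population.
        rows.append((day, s, i, population - s - i))
    return rows


# Hypothetical inputs; only the R0 value is taken from the report.
POPULATION = 100_000
rows = simulate_sir(POPULATION, r0=1.4877, infectious_days=7.5, days=400)
for day, s, i, r in rows[::50]:
    print(f"day {day:3d}  susceptible {s:9.0f}  infected {i:8.0f}")
```

Because the inputs are placeholders, the printed numbers are not expected to match the report's projections; the sketch only shows how the susceptible and infected columns evolve together.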

Estimated R0 is 1.4877 | 97.5% CI = (0.4345, 1.5408)

Predictions

Projected maximum number of cases: 7132

Projected days to reach maximum number of cases: ~105

Projected day for maximum number of cases: 2020-03-11 + 105 = 2020-06-24
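
The date arithmetic above can be checked directly, and the closed SIR model also has a closed-form expression for the height of the infection peak. The Python sketch below shows both. It assumes that "maximum number of cases" refers to the peak of I(t) and that the epidemic starts with almost everyone susceptible; neither assumption is confirmed by the report.

```python
import math
from datetime import date, timedelta


def peak_infected_fraction(r0):
    """Peak of I(t)/N in a closed SIR model, assuming S(0) ~ N and I(0) ~ 0.

    The peak occurs when S/N = 1/R0, which gives 1 - (1 + ln R0) / R0.
    """
    if r0 <= 1.0:
        return 0.0
    return 1.0 - (1.0 + math.log(r0)) / r0


start_date = date(2020, 3, 11)    # start date used in the report
days_to_peak = 105                # reported projection
peak_date = start_date + timedelta(days=days_to_peak)
print(peak_date.isoformat())      # 2020-06-24

peak_fraction = peak_infected_fraction(1.4877)
print(f"peak infected fraction: {peak_fraction:.4f}")
```

Multiplying the peak fraction by the population size used in the model gives the projected peak count under these assumptions.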

Discussion

The difference between the predicted and observed case counts could have several explanations, including:

  • Differences from the actual number of positive cases, since asymptomatic infections are not counted
  • Mitigation (wearing masks, 6-ft distancing, cleaning surfaces, “Stay at Home” Orders, etc.)
  • Lab and data quality

Acknowledgements

Thank you for doing your part in protecting yourself and others.

North Dakota Department of Health (2020), “Coronavirus Cases,” Diseases & Conditions, Retrieved from: https://www.health.nd.gov/diseases-conditions/coronavirus/north-dakota-coronavirus-cases-mobile-friendly


For R code, questions, or ideas for collaboration, I am happy to hear from you: Rasha Elnimeiry, MPH, MAS, CPH | Relnime1(at)jhu.edu