Abstract

The COVID-19 pandemic began in December 2019 in Wuhan, China. Because of its high transmission rate, the SARS-CoV-2 virus spread from nation to nation and had reached all major countries by March 2020. The disease overwhelmed hospitals and other medical care centers because its effects on the respiratory system were at first unknown and under-researched. Whether it infected its host alone or was compounded by another disease, COVID-19 quickly proved to be unpredictable and dangerous in 2020.

March 2020 marked the first major exponential growth in the number of newly confirmed cases per day. As noted above, the virus had already been circulating for about three months by that time. However, international travel and the shortage of personal protective equipment at the time allowed it to pass easily from person to person, producing the trend analyzed in this report.

Quantitative analysis of how various factors affect the spread of the epidemic helps clarify the transmission characteristics of SARS-CoV-2, and so provides a theoretical basis for governments developing epidemic prevention and control strategies. Epidemiological data, such as those sourced for this research, are vital for building mathematical models that predict the spread of the virus and help combat its effects.

The analysis yielded a rate constant (k) of 0.04874147 algebraically and 0.1149080 using the NLS (nonlinear least squares) method, the latter with a standard error of 0.0009938.

Data and Calculations

Source:

The data used for this calculation were sourced from OurWorldInData.org, a website that publishes academically sourced data and research. The raw data are available on that site.

Because the COVID-19 pandemic has been ongoing since December 2019, a large amount of epidemiological data has been collected both nationally and internationally. In March 2020, the world saw a spike in the number of global cases, which led to the shutdown of schools, workplaces, and other establishments. The data used for this project are therefore specific to what many consider the first effective month of the pandemic, from March 1, 2020 to April 1, 2020 (n=32 days), as seen below.

Day <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
         21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)

Cases <- c(2379, 1980, 2612, 2322, 2711, 3930, 4131, 3854, 4323, 4786, 7504,
               6756, 13194, 10888, 11233, 14567, 15174, 15560, 27087, 29532, 32433,
               34178, 42551, 41888, 51480, 60810, 63913, 69564, 56503, 65003,
               78439, 86322) 

Comment: Originally, data from the first three to six months of the pandemic were used. However, epidemiological data tend to look linear rather than exponential over extended time periods (t>30 days). This is due to a multitude of factors, such as virus adaptation, virus evolution (e.g. new variants), and non-pharmaceutical interventions (e.g. national lockdowns).

Finding the rate constant, k:

The nls function (nonlinear least squares) was used to determine the rate constant, k. It is an optimization technique that estimates the parameters of a regression model that is nonlinear in those parameters by minimizing the sum of squared residuals. Here the model is Cases = 2379*exp(k*Day), with k as the only parameter to be estimated.

A <- 2379*exp(0.04874147*Day)

where: A(final) = 86322, A(initial) = 2379, and k was algebraically found to be 0.04874147.
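The algebraic step can be sketched as follows, assuming k is obtained by solving A(final) = A(initial)*exp(k*t) for k with t = 32 days. The variable names are illustrative and do not appear in the original analysis. Solving an equation in exp() calls for the natural logarithm, which gives a value of about 0.112, close to the NLS estimate. The reported 0.04874147 matches the same ratio taken with a base-10 logarithm; the report does not state which logarithm was intended.

```r
# Endpoint values taken from the data set above
A_initial <- 2379    # new cases on Day 1
A_final   <- 86322   # new cases on Day 32
n_days    <- 32      # assumed elapsed time, matching Day = 32 in the model

# Solving A_final = A_initial * exp(k * t) for k uses the natural log
k_natural <- log(A_final / A_initial) / n_days

# Same ratio with a base-10 logarithm, shown only for comparison
k_base10 <- log10(A_final / A_initial) / n_days

print(c(natural_log = k_natural, base10_log = k_base10))
```

Either value works as a starting guess for nls; the fit reported below started from 0.04874147 and converged in five iterations.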

The NLS estimate of k was found to be 0.1149080, with a standard error of 0.0009938.

The residual standard error was found to be 5871 cases on 31 degrees of freedom.
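To show how the estimate and its standard error can be read together, the sketch below builds an approximate 95% confidence interval for k from the reported values and converts k into a doubling time. Both quantities are illustrations added here, not results from the original analysis. They assume the fitted exponential model holds and that a t-based interval on 31 degrees of freedom is adequate.

```r
# Values copied from the nls summary reported in this document
k_hat <- 0.1149080   # NLS estimate of the rate constant (per day)
se_k  <- 0.0009938   # standard error of the estimate
df    <- 31          # residual degrees of freedom

# Approximate 95% interval: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df)
ci_k   <- k_hat + c(-1, 1) * t_crit * se_k

# Under Cases = A0 * exp(k * Day), cases double every log(2) / k days
doubling_days <- log(2) / k_hat

print(ci_k)
print(doubling_days)
```

Under these assumptions, the fitted curve doubles roughly every six days.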

Including Plots

The number of new confirmed COVID-19 cases per day is shown in the plot below, together with the fitted NLS curve in red.

plot(Day, Cases,main="New Confirmed COVID-19 Cases in the World Between
     March-April 2020",xlab="Day",ylab="Number of Cases")

A <- 2379*exp(0.04874147*Day)

tryfit<-nls(Cases~2379*exp(k*Day), start=c(k=0.04874147))
summary(tryfit)
## 
## Formula: Cases ~ 2379 * exp(k * Day)
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)    
## k 0.1149080  0.0009938   115.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5871 on 31 degrees of freedom
## 
## Number of iterations to convergence: 5 
## Achieved convergence tolerance: 1.817e-06
lines(Day, predict(tryfit), col='red')

Comment: As the graph above shows, there is a clear exponential trend in the number of new confirmed COVID-19 cases in the world between March and April 2020.
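One common way to check an exponential trend is to plot the logarithm of the counts against time, since exponential growth appears as a straight line on that scale. The sketch below is an added diagnostic, not part of the original analysis. Unlike the nls fit, which fixes the initial value at 2379, the log-linear regression estimates both the intercept and the slope, and it weights small and large counts more evenly, so its slope need not equal the NLS estimate of k.

```r
# Data copied from the report so the example runs on its own
Day <- 1:32
Cases <- c(2379, 1980, 2612, 2322, 2711, 3930, 4131, 3854, 4323, 4786, 7504,
           6756, 13194, 10888, 11233, 14567, 15174, 15560, 27087, 29532, 32433,
           34178, 42551, 41888, 51480, 60810, 63913, 69564, 56503, 65003,
           78439, 86322)

# Ordinary least squares on the log scale: log(Cases) = a + b * Day
loglin <- lm(log(Cases) ~ Day)
slope  <- coef(loglin)[["Day"]]   # plays the role of k on the log scale

# Points close to a straight line support an exponential trend
plot(Day, log(Cases), xlab = "Day", ylab = "log(Number of Cases)",
     main = "Log of New Confirmed COVID-19 Cases")
abline(loglin, col = "red")

print(coef(loglin))
```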

Conclusion

This analysis could be taken a step further by conducting quantitative research on the trends in individual countries at the time. Looking at the cumulative number of cases each day, as opposed to only the number of newly confirmed cases, may also give epidemiologists and other researchers valuable insight into the transmission of COVID-19 from March 2020 to April 2020. The Monte Carlo method may also be used in future analysis to quantify the uncertainty in the fitted model.
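As one possible form such a Monte Carlo analysis could take, the sketch below uses a residual bootstrap: it resamples the residuals of the fitted model, adds them back to the fitted values, and refits k on each simulated data set. This is an assumption about what a follow-up might look like, not a method used in this report. The seed, the number of simulations, and all variable names other than Day, Cases, and k are illustrative, and the approach treats the residuals as exchangeable, which may not hold for counts that grow this quickly.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

Day <- 1:32
Cases <- c(2379, 1980, 2612, 2322, 2711, 3930, 4131, 3854, 4323, 4786, 7504,
           6756, 13194, 10888, 11233, 14567, 15174, 15560, 27087, 29532, 32433,
           34178, 42551, 41888, 51480, 60810, 63913, 69564, 56503, 65003,
           78439, 86322)

# Refit the model from the report to obtain fitted values and residuals
fit <- nls(Cases ~ 2379 * exp(k * Day), start = c(k = 0.04874147))
fitted_cases <- as.numeric(predict(fit))
res <- as.numeric(residuals(fit))

n_sim <- 200  # illustrative number of simulated data sets
k_sim <- replicate(n_sim, {
  # Simulated data: fitted curve plus residuals resampled with replacement
  sim_cases <- fitted_cases + sample(res, replace = TRUE)
  sim_fit <- tryCatch(
    nls(sim_cases ~ 2379 * exp(k * Day), start = c(k = coef(fit)[["k"]])),
    error = function(e) NULL  # skip any simulation that fails to converge
  )
  if (is.null(sim_fit)) NA_real_ else coef(sim_fit)[["k"]]
})

# Spread of the simulated estimates as a rough uncertainty range for k
print(quantile(k_sim, c(0.025, 0.975), na.rm = TRUE))
```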