Since cases of a respiratory illness caused by a novel coronavirus were first reported from Wuhan, China, late in 2019, the disease, COVID-19, has grown into a pandemic.
Almost every country in the world is affected to a greater or lesser degree.
Data on the daily count of new cases of the infection and of deaths from it are reported by health authorities and collated by various international agencies and universities. This report draws on a publicly available data set that the European Centre for Disease Prevention and Control (ECDC) updates every day and publishes for download on its website.
I used R and RStudio to download the data, load it into R, and manipulate it to produce the summaries and charts that describe the global picture. This report was created in R Markdown. The charts were produced with ggplot2 (Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4).
The data file comprises 11 columns and upwards of 14,000 rows, and it grows every day as new data are added. The main columns (or fields) of interest are:
\(Reporting Date\): The date of reporting
\(cases\): The number of cases reported in the last 24 hours
\(deaths\): The number of deaths reported in the last 24 hours
\(country\): The official name of the Country or Territory
\(popdata2018\): The 2018 or most recent population figure for the country
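To make the layout concrete, here is a minimal sketch in base R of a toy data frame with the fields listed above, together with the running totals that can be derived from the daily counts. The column name `reporting_date`, the country labels and all the numbers are made up for illustration; they are not taken from the ECDC file, and this is not necessarily how the report's own code is written.

```r
# Toy data in the layout described above (all values are illustrative)
covid <- data.frame(
  reporting_date = as.Date(c("2020-05-01", "2020-05-02", "2020-05-03",
                             "2020-05-01", "2020-05-02", "2020-05-03")),
  country     = c("A", "A", "A", "B", "B", "B"),
  cases       = c(10, 20, 30, 5, 0, 15),
  deaths      = c(1, 0, 2, 0, 1, 1),
  popdata2018 = c(1e6, 1e6, 1e6, 5e5, 5e5, 5e5)
)

# Sort by country and date so the running totals accumulate in time order
covid <- covid[order(covid$country, covid$reporting_date), ]

# Cumulative cases and deaths within each country
covid$cum_cases  <- ave(covid$cases,  covid$country, FUN = cumsum)
covid$cum_deaths <- ave(covid$deaths, covid$country, FUN = cumsum)

print(covid)
```

Because the file holds daily counts, cumulative totals of this kind are what any time trend of total cases or deaths would be built from.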
The report is structured as follows:
Global headlines
Country-wise comparison
Time trends for number of cases by country
Time trends for number of deaths by country
Daily incidence of cases by country
Daily incidence of deaths by country
Doubling time for cases by country
7a. Most recent daily growth rate by country
Doubling time for deaths by country
The situation in India
Across the 210 countries and territories of the world, there have been a total of 5,460,055 cases and 347,912 deaths.
In India, 145,380 cases and 4,167 deaths have been reported.
The United States has been excluded because its huge number of cases would have distorted the chart.
The 16 countries with the most cases account for 79.2% of the world's tally. The following chart plots the time trend of cumulative cases for these 16 countries. It's important to note that the numbers vary greatly, and so the y-axis is scaled differently for each country. The trend (steeply or gently upward, or flattening) should be the focus of attention.
The next chart is similar to the previous one, but shows the growth in the number of reported deaths from COVID-19. The same caveats apply: the y-axis is scaled differently for each country.
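A share of this kind is simple arithmetic on the per-country totals. The sketch below shows one way it could be computed in base R; the helper `top_share()` and the five totals are hypothetical and serve only to illustrate the calculation.

```r
# Hypothetical cumulative case totals per country (illustrative numbers only)
totals <- c(A = 500, B = 300, C = 120, D = 50, E = 30)

# Percentage of the overall tally held by the n countries with the most cases
top_share <- function(totals, n) {
  top <- sort(totals, decreasing = TRUE)[seq_len(min(n, length(totals)))]
  100 * sum(top) / sum(totals)
}

top_share(totals, 2)  # (500 + 300) / 1000 = 80
```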
The total number of cases up to the present time measures how many people have been affected. The daily number of cases is a measure of how active the epidemic continues to be. Tracking the day-by-day incidence is a good indicator of the effectiveness of control measures. Remember, the data are based on the reporting date, not the date of onset of symptoms or the date of a positive test. Due to the usual administrative problems of weekends and holidays, cases may be reported with some lag. The resulting fluctuations in day-to-day numbers are ironed out by taking a 3-day moving average.
The next chart is similar to the previous one: it plots the 3-day moving average of reported daily deaths. The same caveats apply.
The time taken for the number of cases to double is a measure of how infectious the virus is and how effective the control measures have been. I worked out a notional doubling time, in days, for each day by taking the growth rate from one day to the next and applying a standard algebraic formula:
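As a rough illustration of the smoothing step, the sketch below computes a 3-day moving average in base R with `stats::filter()`. It assumes a trailing window (the current day and the two days before it); the report does not say whether its window is trailing or centred, and the helper name `moving_avg_3()` and the daily counts are made up.

```r
# Trailing 3-day moving average: each value is the mean of that day and the
# two days before it. The first two entries are NA because the window is
# incomplete there.
moving_avg_3 <- function(x) {
  as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 1))
}

daily_cases <- c(3, 6, 9, 12, 0, 6)  # illustrative daily counts
smoothed <- moving_avg_3(daily_cases)
print(smoothed)
```

Passing `sides = 2` instead would give a centred window, which places each average on the middle day rather than the last one.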
\(D = \log(2) / \log(N_b / N_a)\) where \(D\) is the doubling time, \(N_b\) is the number on a given day and \(N_a\) is the number on the previous day.
Because this number fluctuates widely from one day to the next, I have used a 3-day moving average to smooth out the trend. The y-axis is scaled differently for each country, and so the charts need to be interpreted with caution.
Countries are at different stages of the epidemic curve. In section 1 above we saw a bar chart comparing countries according to how many people have been infected to date. It may also be useful to compare countries according to the most recent daily growth rate, which translates into the current doubling time: the higher the growth rate, the shorter the doubling time. India, in 11th position in terms of the number of cases reported to date, is witnessing a high growth rate, exceeded only by that of Peru among the 16 countries with the most cases.
India has had relatively few cases and deaths thus far, and this is even before we take into account the size of the population. It is quite possible that the reported cases and deaths underestimate the true state of affairs. The following 5 charts present trends in India using the reported data.
Note: The doubling times are computed for each day, but because they tend to fluctuate from day to day, the charts show 3-day moving averages of the computed numbers.
How are the different States in India performing? Are there differences that may reveal lessons to be learned? I looked at the data for a selection of states with the most COVID-19 cases, after correcting for population size.
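The doubling-time formula can be sketched in a few lines of base R. The function name `doubling_time()` and the counts are illustrative, and the sketch assumes the formula is applied to cumulative counts; the 3-day smoothing would then be applied to its output.

```r
# Day-on-day doubling time from a cumulative series: D = log(2) / log(Nb / Na)
doubling_time <- function(cum) {
  n <- length(cum)
  ratio <- cum[-1] / cum[-n]   # Nb / Na for each consecutive pair of days
  d <- log(2) / log(ratio)
  d[!is.finite(d)] <- NA       # no growth (ratio of 1) gives an infinite doubling time
  c(NA, d)                     # pad so the result lines up with the input days
}

cum_cases <- c(100, 200, 400, 440)  # illustrative cumulative counts
dt <- doubling_time(cum_cases)
print(dt)
```

In this toy series the count doubles on each of the first two steps, so the doubling time is 1 day. The final 10% rise corresponds to a doubling time of a little over 7 days.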
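The link between growth rate and doubling time can be made concrete with a small base R sketch. The three countries and their counts are hypothetical; the doubling time uses the same \(\log(2) / \log(N_b / N_a)\) relationship as before.

```r
# Illustrative cumulative counts for the last two days, one row per country
latest <- data.frame(
  country   = c("A", "B", "C"),
  yesterday = c(1000, 2000, 500),
  today     = c(1050, 2040, 550)
)

# Most recent daily growth rate (in percent) and the doubling time it implies
latest$growth_pct    <- 100 * (latest$today / latest$yesterday - 1)
latest$doubling_days <- log(2) / log(latest$today / latest$yesterday)

# Highest growth rate (shortest doubling time) first
ranked <- latest[order(-latest$growth_pct), ]
print(ranked)
```

Country B has the most cases here but the slowest growth, which is why a ranking by growth rate can look quite different from a ranking by total cases.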
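One common way to correct for population size is to express cases per million inhabitants. The sketch below assumes that approach, which the report does not spell out; the state names, case counts and populations are all made up.

```r
# Hypothetical state-level figures (illustrative, not real data)
states <- data.frame(
  state      = c("State X", "State Y", "State Z"),
  cases      = c(5000, 1200, 300),
  population = c(100e6, 12e6, 1.5e6)
)

# Cases per million inhabitants
states$cases_per_million <- states$cases / states$population * 1e6

# Rank states by population-adjusted cases, highest first
states_ranked <- states[order(-states$cases_per_million), ]
print(states_ranked)
```

In this toy example the state with the fewest cases has the highest rate per million, so the adjustment reverses the ranking.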