Since cases of a respiratory illness caused by a novel coronavirus were first reported from Wuhan, China, late in 2019, the disease COVID-19 has grown into a pandemic.
Almost every country in the world is affected to a greater or lesser degree.
Data on the daily counts of new cases and deaths are reported by health authorities and collated by various international agencies and universities. This report draws on data from two sources:
International country-specific data come from a publicly available data set that the European Centre for Disease Prevention and Control (ECDC) updates every day and publishes on its website: https://opendata.ecdc.europa.eu/covid19/casedistribution/csv
Data on Indian States come from https://t.co/lfRdu7epRj?amp=1
I used R and RStudio to download the data, load it and carry out the data manipulation needed to produce the charts. This report was created in RMarkdown. The charts were produced with ggplot2 (Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4).
The report is structured as follows:
Global headlines and country-wide comparisons
Country-wise comparison
The situation in India
Across the 210 countries and territories of the world there have been a total of 11,455,325 cases and 534,329 deaths.
India has so far reported 697,413 cases and 19,693 deaths.
The 10 worst affected countries have a combined population of 2.41 billion people, 31.4% of the world’s total, and account for 65.8% of all infections and 71.1% of all deaths.
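The shares quoted above are simple ratios of sums. As a minimal sketch (in Python for illustration, whereas this report was produced in R), the function below ranks countries by total cases and returns the percentage of population, cases and deaths held by the top n. The function name, the record layout and the figures are hypothetical, and ranking "worst affected" by total cases is an assumption.

```python
def top_n_share(records, n=10):
    """Percentage of population, cases and deaths held by the n countries
    with the most cases. Ranking by cases is an assumption of this sketch."""
    ranked = sorted(records, key=lambda r: r["cases"], reverse=True)
    top = ranked[:n]
    shares = {}
    for field in ("population", "cases", "deaths"):
        total = sum(r[field] for r in records)
        top_total = sum(r[field] for r in top)
        shares[field] = 100.0 * top_total / total if total else 0.0
    return shares


# Made-up figures for three fictional countries (not real data).
records = [
    {"country": "A", "population": 100, "cases": 50, "deaths": 5},
    {"country": "B", "population": 300, "cases": 30, "deaths": 3},
    {"country": "C", "population": 600, "cases": 20, "deaths": 2},
]
shares = top_n_share(records, n=2)
print(shares)
```

In this toy example the two countries with the most cases hold a minority of the population but most of the cases and deaths, which is the same kind of imbalance the paragraph describes.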
The United States and Brazil have been excluded from this chart because the huge number of cases in these two countries distorts it, squashing all the other countries into the left of the chart.
Comparing the number of cases across countries is problematic because countries differ greatly in their population characteristics. Large countries will naturally have more cases. The proportion of a country’s total cases that occurred in the last 14 days is a measure that I call the Recency Quotient. It is a measure of how ‘young’ a country’s epidemic is: whether it is still growing or is petering out due to effective control measures. The measure is internally referenced, because each country’s recent cases are compared with its own total, and so it allows comparisons independent of population characteristics. In essence it measures the ongoing performance of each country’s control measures.
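A minimal sketch of the Recency Quotient as defined above: cases in the last 14 days divided by total cases to date. It is written in Python for illustration (the report itself was produced in R); the function name and the sample series are hypothetical, and the input is assumed to be a chronological list of daily new-case counts.

```python
def recency_quotient(daily_cases, window=14):
    """Share of all cases to date that occurred in the last `window` days.

    `daily_cases` is assumed to be daily new-case counts, oldest first.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of days")
    total = sum(daily_cases)
    if total == 0:
        return 0.0  # no cases yet, so nothing is 'recent'
    return sum(daily_cases[-window:]) / total


# Made-up series: one epidemic still growing, one petering out.
growing = [1] * 14 + [10] * 14
waning = [10] * 14 + [1] * 14
rq_growing = recency_quotient(growing)
rq_waning = recency_quotient(waning)
print(rq_growing, rq_waning)
```

A value near 1 means most of the cases are recent and the epidemic is still young; a value near 0 means most of the cases lie in the past.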
The recency quotient can also be calculated for each day to generate a time series for each country. A time series plot can reveal the rate at which a country’s epidemic is growing or slowing down. For clarity, this chart shows the time trends in the recency quotient for the 8 most affected countries only.
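One way the daily series could be computed is with a running 14-day sum divided by the cumulative total up to each day. This is a Python sketch under that assumption, not the code used for the chart; the function name and the sample data are hypothetical.

```python
def recency_quotient_series(daily_cases, window=14):
    """Recency quotient for every day: cases in the trailing `window` days
    divided by cumulative cases up to and including that day.

    Days before the first reported case yield None.
    """
    series = []
    cumulative = 0
    recent = 0
    for day, count in enumerate(daily_cases):
        cumulative += count
        recent += count
        if day >= window:
            # Drop the day that has just fallen out of the trailing window.
            recent -= daily_cases[day - window]
        series.append(recent / cumulative if cumulative else None)
    return series


# Made-up data with a short window so the arithmetic is easy to follow.
series = recency_quotient_series([1, 1, 1, 1], window=2)
print(series)
```

With a constant daily count the quotient falls steadily once the window is full, because the cumulative total keeps growing while the recent total stays flat.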
Many countries are reporting a drop in the daily incidence of new infections. Not so in India, where the reported daily number of cases has been mounting ever since the start and is still on an upward trend.
The epidemic in India started later than in many of the countries that were badly affected at the start of the pandemic.
India has reported relatively few deaths, given both the size of its population and the number of infections. The unusual spike in the data for India is a reporting artefact: on June 17 Maharashtra and Delhi reported an unusually large number of deaths as a correction for earlier under-reporting.
Comparisons across countries are potentially misleading unless they take account of differences in population size. It is possible to calculate a crude population incidence (cases per million population) and a crude mortality indicator (deaths per 100 cases). Note that deaths per 100 cases is not the same as the case fatality rate, which requires following up a defined cohort.
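The two crude indicators are straightforward ratios. The Python sketch below shows the arithmetic; the function name and the first set of figures are hypothetical, while the second calculation uses the India totals quoted earlier in this report.

```python
def crude_indicators(cases, deaths, population):
    """Return (cases per million population, deaths per 100 cases).

    Both are crude indicators; the second is not a case fatality rate.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    cases_per_million = cases * 1_000_000 / population
    deaths_per_100_cases = deaths * 100 / cases if cases else None
    return cases_per_million, deaths_per_100_cases


# Made-up figures for a fictional country (not real data).
incidence, mortality = crude_indicators(cases=5000, deaths=100, population=2_000_000)
print(incidence, mortality)

# Deaths per 100 cases from the India totals quoted in this report.
india_deaths_per_100 = 19_693 * 100 / 697_413
print(round(india_deaths_per_100, 2))
```

For the India totals this works out to roughly 2.8 deaths per 100 reported cases.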
The picture within India varies greatly across the States. Maharashtra is by far the most affected State, followed by Tamil Nadu, Delhi and Gujarat. These 4 States make up 65% of the total for India.