This blog came about from discussions between Tim Morris, George Davey Smith and Nic Timpson (of the MRC-IEU, Bristol), Danny Dorling (School of Geography, Oxford), Sam Watson (Warwick) and Richard Lilford (Birmingham). However, all errors are my own and any dubious conclusions should be laid squarely at my feet. The code errors are mine too. The GitHub link for the code is at the bottom. All of this is based on today’s ONS data (21/04/20), which covers the period ending 10/04/20.
Update: I have rerun the analysis on 28/04/20, including ONS data up to 17/04/20. I haven’t had time to do further work (including some sensible Twitter suggestions, which I will get to).
Introduction:
The COVID-19 pandemic has caused excess levels of mortality in many affected areas, including the highest ever weekly reported mortality in England and Wales last week. Similarly dramatic increases in all-cause mortality have been recorded in many regions of other countries, including Lombardy, New York, and Madrid. Today, the ONS reported the highest weekly mortality figures (18,516 deaths in England and Wales) since 2000.
Given the lack of routine testing and the potential for errors on death certificates, it is difficult to reliably identify what is causing this increased mortality in the UK. The most obvious explanation is that these deaths all result from undiagnosed cases of COVID-19. We know that death certificates are not always 100% accurate, and that many patients may not have been tested (especially those in community settings, such as care homes). But are there any other potential explanations for this rise?
At first glance, this seems unlikely, until you realise how dramatically health behaviour and access to care have changed over the last few months. Anecdotally, there have been significant changes in health behaviour, such as reduced attendances at emergency departments. Alongside that, huge reductions in nearly all elective and semi-urgent work have reduced hospital bed occupancy to around 50% in some regions, from recent all-time highs. The lockdown has led to dramatic changes in behaviour, which are likely to have significant positive and negative long-term effects on health.
So, there are two plausible explanations: deaths are directly linked to COVID-19 but have been miscoded; or deaths are linked in some way to policy interventions or behavioural change.
So, firstly, let’s look at cases of COVID-19 regionally, and compare them to ONS reported deaths. In this plot, we compare cases of COVID-19 (lagged by one week) to deaths. The lag is introduced as cases tend to occur before deaths, and we would expect regional cases to rise before regional deaths.
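To make the one-week lag concrete, here is a minimal sketch in Python (not the R used in the linked repository). It assumes daily case counts keyed by date and ONS-style weeks ending on a Friday. The function names and the numbers are invented for illustration and are not taken from the real data.

```python
from datetime import date, timedelta


def weekly_case_totals(daily_cases, week_endings):
    """Sum daily case counts over the 7-day window ending on each date."""
    totals = {}
    for end in week_endings:
        start = end - timedelta(days=6)
        totals[end] = sum(n for day, n in daily_cases.items() if start <= day <= end)
    return totals


def lag_by_weeks(weekly_totals, lag_weeks=1):
    """Pair each week of deaths with the case total from `lag_weeks` earlier."""
    lagged = {}
    for end in weekly_totals:
        earlier = end - timedelta(weeks=lag_weeks)
        if earlier in weekly_totals:  # the first week has nothing to lag from
            lagged[end] = weekly_totals[earlier]
    return lagged


# Invented numbers for one hypothetical region: 10 cases a day, then 20 a day.
daily = {date(2020, 3, 28) + timedelta(days=i): (10 if i < 7 else 20) for i in range(14)}
weeks = [date(2020, 4, 3), date(2020, 4, 10)]

weekly = weekly_case_totals(daily, weeks)
lagged = lag_by_weeks(weekly)
print(lagged)
```

With this layout, deaths registered in the week ending 10 April are set against the cases summed over the week ending 3 April.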
The similarity is striking, especially in the early part of the outbreak. It is clear that COVID-19 reported deaths map almost perfectly onto cases early on in the outbreak, although deaths then seem to increase relative to cases. As the cases are reported daily, this rather interestingly suggests a fairly stable 7:1 relationship between cases and deaths initially, and that the initial testing strategy was similar in all regions. This would fit with the PHE case definitions being applied broadly similarly across all hospitals.
However, the West Midlands and London have now diverged from this, with significant increases in COVID deaths above testing incidence.
It is also clear that overall deaths officially unrelated to COVID-19 have risen. In this plot, we simply take the 10 year average for March and April for each region as a baseline. There are more complex methods, but this gives us a pretty good idea of what the ‘normal’ deaths should be. We then calculate the overall deaths, and subtract the known COVID deaths.
Plotting this for overall deaths shows a clear increase.
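The baseline and subtraction described above can be sketched as follows. This is an illustrative Python version, not the code from the repository. It assumes the baseline is a plain mean of the same weeks in previous years, and all figures are invented.

```python
from statistics import mean


def baseline_deaths(historic_weekly_deaths):
    """Ten-year average for a region: a plain mean of past values."""
    return mean(historic_weekly_deaths)


def excess_deaths(total_deaths, covid_deaths, baseline):
    """Return (total excess, non-COVID excess) relative to the baseline."""
    total_excess = total_deaths - baseline
    non_covid_excess = (total_deaths - covid_deaths) - baseline
    return total_excess, non_covid_excess


# Invented figures for one hypothetical region.
historic = [600, 620, 580, 610, 590, 605, 595, 615, 585, 600]
baseline = baseline_deaths(historic)
total_excess, non_covid_excess = excess_deaths(
    total_deaths=900, covid_deaths=200, baseline=baseline
)
print(baseline, total_excess, non_covid_excess)
```

A non-COVID excess near zero would mean that reported COVID deaths account for essentially all of the rise above the average.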
Nearly all regions have seen an increase in overall deaths, hence the highest ever deaths figure reported today by the ONS.
When we remove the reported COVID deaths, however, the relationship regionally becomes more interesting:
In some regions, particularly Wales and the North East, the data is almost flat! Once COVID is removed, these regions do not appear to have much ‘excess mortality’, compared to the fairly clunky ten year average. In comparison, many regions have hundreds more deaths than predicted, even after removing the COVID-19 deaths. This is a bit odd. Are there really excess (non-COVID) deaths in London and the West Midlands, but none in Wales?
If we compare COVID and non-COVID excess deaths, suddenly, it becomes a bit clearer.
What we can see is that, although COVID and non-COVID excess deaths were pretty well matched last week, there has been a general increase in COVID deaths this week, with excess non-COVID deaths falling. There are many possible explanations - but perhaps the most plausible is a change in coding practices. Given the extensive media coverage, it is plausible that clinicians have recently become more comfortable recording likely COVID cases as COVID on the death certificate. For example, Wales and the South West have both seen decreases in excess non-COVID deaths, with comparable increases in COVID deaths.
So, if the coding changes, can we simply look at total excess deaths per region?
Here are excess deaths (of any kind), compared to the ten year average for that month, compared with lagged cases. At first glance, it seems the excess mortality for each region is pretty standard, except perhaps the North East and Wales. However, this graph does not show that the baseline level of deaths in each region varies significantly (~600 in Wales, ~1,400 in the North West). Therefore, it is important to consider excess deaths as a proportion of total deaths for each region.
So, for the last week recorded (ending 10th April 2020), the proportion of excess deaths (compared to baseline) is reported here, alongside the weekly sum of cases for the week prior. In this case, we simply calculated the proportion of excess deaths in the most recent week, and compared it to baseline figures:
London, with the most cases, has the largest proportional increase in deaths, and there does appear to be a fairly linear relationship. Let’s plot this, firstly for the most recent week (I have also fitted a linear model to the data here), and then including the week before (the first week where we really started getting death data).
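A quick sketch of why the proportion matters, in Python rather than R. The baselines are the rough figures quoted above; the observed deaths are invented, and treating the proportion as observed deaths divided by baseline is an assumption about how it was computed.

```python
def proportional_increase(observed_deaths, baseline):
    """Observed deaths as a multiple of the baseline (1.0 means no change)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return observed_deaths / baseline


# (observed, baseline): baselines are the approximate values quoted in the
# text; observed counts are invented so both regions have 300 excess deaths.
regions = {"Wales": (900, 600), "North West": (1700, 1400)}

ratios = {
    name: proportional_increase(observed, base)
    for name, (observed, base) in regions.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x baseline")
```

The same absolute excess is a much larger relative rise in the region with the smaller baseline, which is what a plot of raw excess deaths hides.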
So, there appears to be a linear relationship between overall deaths and cases in the week prior, which is not a huge surprise, but the y-intercept is slightly above 1. If we include the week prior, the same relationship exists.
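For readers who want to see where the y-intercept comes from, here is a minimal ordinary least squares sketch in plain Python. It is not the model code from the repository, and the data points are invented to lie exactly on a line, so they say nothing about the real estimate.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented regions: lagged weekly cases and deaths as a multiple of baseline,
# constructed as 1.1 + 0.0001 * cases.
cases = [500, 1000, 2000, 4000]
death_ratio = [1.15, 1.20, 1.30, 1.50]

intercept, slope = fit_line(cases, death_ratio)
print(round(intercept, 3), round(slope, 6))
```

The intercept is the fitted deaths ratio for a region with zero cases, so a value above 1 is what would suggest a rise in deaths not explained by case numbers.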
How do we interpret this? Well, firstly, it adds weight to the idea that the incidence of disease and excess mortality in a region are closely linked, and I think this is the key feature of this graph.
Secondly, there is a hint (and I am very conscious that this may be wrong and there are alternative explanations) that there is a general increase in deaths, such that for a fictional region where there were no cases, we would still expect an increase in mortality above baseline. It may be that more data adds substantially to this theory, or that this simply reflects that the purported 1 week lag between cases and deaths is not true in all regions, or some other explanation. As the epidemic wanes, and deaths start to drop in some regions, we will end up with a clearer picture of how much of this proportional increase is related to COVID-19 disease.
As always, further ONS data will be valuable - and I will update the blog next week with further data from all the regions.
Summary:
In summary, there has been an excess of deaths across all regions of the UK related to COVID-19. The classification of these deaths has likely changed over the weeks (suggesting an increase in deaths coded as COVID-19). Early evidence is unclear as to whether the lockdown is having an effect on overall mortality independent of cases.
Update 27/04/20:
What has garnered most interest is the proportional increase vs cases data. I will replicate it here in a bit more detail. So, first, the key graph:
So this seems to show that there is a general increase in deaths regardless, but that they are mostly related to cases. Formally, if we fit a linear model, and measure the y-intercept it is:
tidy(m1.1)
So, that’s quite a significant increase. If this result was ‘true’, what would that mean? It would mean that the best explanation of the data (using a simple linear model) is that there is a general 20% (ish) increase in deaths across the board, plus a small but consistent percentage increase in deaths in each region related to cases. So, for a fictional region with NO cases, there would still be a 20% increase in deaths. Sobering stuff. However, is this model robust? All we have done is sum the cases per region for the week before and plot them against the proportional increase in deaths.
Why choose the week before? It did fit nicely with the data, and it seems that cases in a region match nicely with deaths a week later, so it seems reasonable:
But what happens if we choose, say, two weeks - does that mess up our estimates? I mean, we just chose it sort of empirically:
Ok, so the shape is very different, but what about the intercept?
tidy(m1.2)
Yeah, a bit different. Still significantly above 1, but quite different. A fundamental flaw in this methodology is that it relies on the premise that cases tie to deaths in a region. We know that is true, but calculating excess deaths based on this simple ‘one-week’ gap is not really appropriate. Some deaths occur a while after cases, some occur shortly afterwards. Regardless, there is still some evidence of excess mortality across the board.
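The sensitivity to the lag can be illustrated with a toy example in Python (again, not the repository's R code). Three hypothetical regions each have three weeks of invented case totals, and the same deaths ratios are regressed on cases from one and then two weeks earlier. The helper names are made up for this sketch.

```python
def ols_intercept(xs, ys):
    """Intercept of a least squares line through the (x, y) pairs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return y_bar - (sxy / sxx) * x_bar


def intercept_for_lag(weekly_cases, death_ratio, lag_weeks):
    """Regress the latest deaths ratio on cases from `lag_weeks` earlier.

    weekly_cases maps region -> weekly totals, oldest first, with the last
    entry covering the same week as the deaths ratio.
    """
    regions = list(death_ratio)
    xs = [weekly_cases[r][-1 - lag_weeks] for r in regions]
    ys = [death_ratio[r] for r in regions]
    return ols_intercept(xs, ys)


# Invented data for three hypothetical regions.
weekly_cases = {"A": [100, 400, 500], "B": [300, 1000, 900], "C": [800, 2000, 1500]}
death_ratio = {"A": 1.3, "B": 1.6, "C": 2.1}

intercepts = {
    lag: intercept_for_lag(weekly_cases, death_ratio, lag) for lag in (1, 2)
}
for lag, value in intercepts.items():
    print(f"lag {lag} week(s): intercept {value:.3f}")
```

Even with identical deaths data, the two lags give different intercepts, which is why the size of the 'no cases' effect should be read cautiously.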
Data sources
ONS data was extracted at weekly level from the ONS website for the last ten years (2010 to 2020) for each of the 10 regions of England and Wales. Since January 2020, COVID-19 deaths per region have been reported per week, and this data was also extracted. Case level data is reported by Public Health England (in England) and Public Health Wales (in Wales). This is reported at a daily level, and was extracted directly from the website (for PHE), and from https://github.com/tomwhite/covid-19-uk-data/blob/master/data/covid-19-totals-wales.csv for Wales.
All my code is available at https://github.com/gushamilton/ons_c19. Please don’t @me for my terrible coding. Please do @me to improve it.