Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Annual mortality, 1838-2020

path <- paste0(
  "https://www.ons.gov.uk/file?uri=/",
  "peoplepopulationandcommunity/",
  "birthsdeathsandmarriages/",
  "deaths/",
  "adhocs/",
  "12735annualdeathsandmortalityrates1938to2020provisional/"
)
file <- "2020annualprovisionaladhoc2.xls"
getFile(path,file)

These data are provided by the Office for National Statistics in response to a user request, reference 12735 [link]. Data are for England and Wales, combined. Total deaths and mortality rates are given from 1838; age-standardised mortality rates are given from 1942. Rates for 2020 were provisional at the time of publication. The data were originally published on 12 January 2021; an update was made on 8 March 2021 solely to correct some typographical errors in the supporting text, without affecting the data.

if (!file.exists(file)) {
  warning(paste("Couldn't find",file,"on disk"))
} else {
  message(paste("Reading",file,"from disk"))
  Annual <- read_excel(
    file,
    sheet="Table",
    col_types=c("text",
                "numeric",
                "numeric",
                "numeric",
                "numeric"
                ),
    col_names=c("date","deaths","population","raw","age standardised"),
    skip=15,
    na=":"
  )
  Annual$date <- dmy(paste("31 12",Annual$date))
  ggplot(melt(Annual,measure.vars = c('raw','age standardised'),variable.name = 'mortality'))+
    geom_step(aes(x=date,y=value,colour=mortality),direction="vh",na.rm=TRUE)+
    scale_x_date(date_labels="%Y")+
    scale_y_continuous("mortality /100,000",limits = c(0,NA))+
    labs(
      title = "Mortality by year of registration",
      subtitle = 'England and Wales',
      caption = acknowledgement
      )
}

Age-standardisation

knitr::kable(ESP2013[c('age','population')],caption = "European Standard Population 2013",padding = 1)
European Standard Population 2013
age population
<1 1000
1-4 4000
5-9 5500
10-14 5500
15-19 5500
20-24 6000
25-29 6000
30-34 6500
35-39 7000
40-44 7000
45-49 7000
50-54 7000
55-59 6500
60-64 6000
65-69 5500
70-74 5000
75-79 4000
80-84 2500
85-89 1500
90-94 800
95+ 200

Age-standardised mortality is standardised to the European Standard Population given by populations \(P_i\) (1000, 4000 etc.) for each of the age groups \(i\) (<1, 1-4 etc.) according to the following equation:

\[ a = \sum_i\frac{M_i}{N_i}P_i\] where the real number of people in age group \(i\) is \(N_i\) and the observed number of deaths in age group \(i\) is \(M_i\). The quantity \(M_i/N_i\) is the age-specific mortality for age group \(i\); when weighted by \(P_i\) and summed over \(i\) the result is the age-standardised mortality, \(a\) (as the \(P_i\) sum to 100,000, \(a\) is “per hundred thousand population”).

Except that it isn’t really “per hundred thousand population”, although it always seems to be expressed as such, by convention, because \(M_i\) is, implicitly, a rate - the number of people in age group \(i\) who die in a year. Where mortality is given at intervals other than a year this is, implicitly, a per annum rate. It’s a pity that the convention doesn’t make that clear. It would probably be better to describe, for example, an age-standardised mortality of 1000 as “1% p.a.” That’s what it means, although that seems not to be how it’s done.

If the population distribution matched the standard population, such that \(N_i/\sum_iN_i=P_i/\sum_iP_i\quad\forall i\), the age-standardised mortality would match the raw mortality:

\[r = \frac{\sum_iM_i\sum_iP_i}{\sum_iN_i}\] This is close to the current situation, because the current population of England and Wales is close to the 2013 European Standard Population. In contrast, if the population distribution were younger than the standard population the age-standardised mortality would exceed the actual mortality. This was the case in decades past; with advances in life expectancy and reductions in fertility the population has grown older having been younger before. This is why age-standardised mortality is useful as we can factor out the effect of an aging population on changes in mortality. Without it, mortality would appear to get worse, or not get better as fast as it really had, if advances in health care had led to increased life expectancy. This is most clearly visible in the data for the late 20th century: raw mortality was more or less constant even as the population aged - aging driven in part by advances in health care which led to reduced age-standardised mortality. It’s also visible in the 2010s: raw mortality increased slightly and that is explicable in terms of an aging population as is shown by the fact that age-standardised mortality was flat.

ONS also quote a “95% confidence interval” on age-standardised mortality, as a function of the total number of deaths, as follows:

\[ a = \left( 1 \pm \frac{1.96}{\sum_iM_i} \right)\sum_i\frac{M_i}{N_i}P_i \] This interval is not useful. It assumes that the only source of uncertainty is Poisson counting statistics on the total number of deaths. This is an underestimate because age-specific mortality is not a Poisson-distributed variable because death rates are inevitably not constant and perhaps also because counting intervals, being a variable number of days worked by registrars, etc., also vary, but the underestimate is probably not large. The resulting uncertainty estimates are very small, however - inevitably well under 1% - and do not capture the effect of uncertainty in the population estimates, \(N_i\). Mid-year estimates of population are used. The uncertainty in these estimates is unknown (to me, at least, and probably to everyone; besides, the population, by definition, varies across the year) but it is hard to believe that it is not at least a few percent and therefore much larger than the counting uncertainty. An uncertainty of so-many-percent in \(N_i\) propagates to the same percentage absolute uncertainty in age-specific mortality and contributes to the uncertainty in total mortality - both raw and age-standardised. It is almost certain that the uncertainty in both raw and age-standardised mortality is dominated by the unknown uncertainties in the total population and its age distribution. If you only trust population estimates to a few percent, you should also only trust mortality rate estimates to a few percent - and you should ignore the confidence intervals given by ONS because they are too small.

Age-standardised mortality by month, from January 2001

path <- paste0(
  "https://www.ons.gov.uk/file?uri=/",
  "peoplepopulationandcommunity/",
  "birthsdeathsandmarriages/",
  "deaths/",
  "datasets/",
  "monthlymortalityanalysisenglandandwales/",
  "september2021/"
  )
file <- "monthlymortalityanalysissep2021.xlsx"

getFile(path,file)

These data are provided on a monthly schedule by the Office for National Statistics [link]. They provide monthly mortality data from January 2001, by month of registration. Data are given separately for England and for Wales, for men and for women, and for different age groups. The total number of deaths are given as well as the age-standardised rate; here, I’m interested in the latter, in order to be able to make a comparison over time. The latest data are for September 2021 [link].

Monthly <- NULL
if (!file.exists(file)) {
  warning(paste("Couldn't find",file,"on disk"))
} else {
  message(paste("Reading",file,"from disk"))
  
  table <- c(
    paste("Table",c(rep(1,3),rep(2,3),rep(4,3),rep(5,3),rep(6,3),rep(7,3))),
    paste0("Table ",c(rep("8",12),rep("9",12)),
           rep(c(rep("a",4),rep("b",4),rep("c",4)),2))
    )
  country <- c(rep(c(rep("England",3),rep("Wales",3)),3),
               c(rep("England",12),rep("Wales",12)))
  age <- c(rep("all ages",6),rep("<75",6),rep("75+",6),
           rep(c("75-79","80-84","85-89","90+"),6))
  sex <- c(rep(c("men","women","both sexes"),6),
           rep(c(rep("men",4),rep("women",4),rep("both sexes",4)),2))
  cols <- list(
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3))
    )
  for (i in 1:length(table)) {
    df <- read_excel(
      file,
      sheet=table[i],
      col_types = cols[[i]],
      col_names=c("date","deaths","rate"),
      skip=5
      )
    df[["country"]] <- country[i]
    df[['sex']] <- sex[i]
    df[['age']] <-age[i]
    Monthly <- rbind(Monthly,df)
  }
  Monthly$date <- ceiling_date(dmy(paste("1",Monthly$date),quiet=T),"month")-1
}
Monthly<-na.omit(Monthly)
age <- "all ages"
sex <- "both sexes"
plotdata <- Monthly[Monthly[['sex']]==sex&Monthly[['age']]==age,]
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=country),direction="vh")+
  #scale_colour_discrete(breaks=c('men','women','both sexes'),type=c('blue','deeppink','black'))+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Age-standardised mortality by month of registration",
    subtitle = paste0(sex,', ',age),
    caption = acknowledgement
    )

country='England'
sex='both sexes'
for (age in list(c("all ages","<75","75+"),
                 c("all ages","<75","75-79","80-84","85-89","90+"))) {
  plotdata <- Monthly[Monthly[['country']]==country&
                      Monthly[['sex']]==sex&
                      Monthly[['age']]%in%age,]
  p<-ggplot(plotdata)+
    geom_step(aes(x=date,y=rate,colour=age),direction="vh")+
    scale_y_continuous("age-standardised mortality /100,000",limits = c(0,NA))+
    scale_colour_discrete(breaks=age)+
    scale_x_date(limits=c(as.Date("2001-01-01"),NA),
                 date_breaks="5 years",
                 date_labels="%b %Y",date_minor_breaks="1 year")+
    labs(
      title = "Age-standardised mortality by month of registration",
      subtitle = paste0(country,', ',sex),
      caption = acknowledgement
      )
  print(p)
}

country='England'
age='all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['age']]==age,]
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh")+
  scale_colour_discrete()+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),
                 date_breaks="5 years",
                 date_labels="%b %Y",date_minor_breaks="1 year")+
  labs(
    title = "Age-standardised mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )

Previous year’s mortality by month

Here, for each available month, the average mortality for the previous twelve months is shown.

plotdata <- Monthly[Monthly[['country']]=='England'&
                    Monthly[['sex']]=='both sexes'&
                    Monthly[['age']]=='all ages',]
N <- (length(plotdata[[1]])-11)
plotdata <- plotdata[1:N,]
for (i in 1:N) {
 plotdata[['rate']][i] <- mean(plotdata[['rate']][i:(i+11)])
}
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Previous year's mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )

Previous season’s mortality by month

Here, for each available month, the average mortality for the previous three months is shown.

plotdata <- Monthly[Monthly[['country']]=='England'&
                    Monthly[['sex']]=='both sexes'&
                    Monthly[['age']]=='all ages',]
N <- (length(plotdata[[1]])-2)
plotdata <- plotdata[1:N,]
for (i in 1:N) {
 plotdata[['rate']][i] <- mean(plotdata[['rate']][i:(i+2)])
}
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Previous season's mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )

Excess mortality by month

Excess mortality is often defined as the difference between the current mortality and a baseline given by the average of the last five years’ mortality (for monthly data, given by the average of the same month’s mortality over the five previous years). Here, for each available month, current, baseline and excess mortality are shown for the whole population in England. Data for men only and women only are similar and only clutter the graph.

country <- 'England'
sex <- 'both sexes'
age <- 'all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['sex']]==sex&
                    Monthly[['age']]==age,]
N <- (length(plotdata[[1]])-60)
plotdata <- data.frame(date=plotdata[['date']],
                 current=plotdata[['rate']],
                 baseline=NA,
                 excess=NA)

for (i in 1:N) {
 plotdata[i,'baseline'] <- mean(plotdata[i+seq(12,60,12),'current'])
}
plotdata['excess'] <- plotdata['current']-plotdata['baseline']
ggplot(melt(plotdata,measure.vars = c('current','baseline','excess'),variable.name = 'mortality'))+
  geom_step(aes(x=date,y=value,colour=mortality),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(-500,2000))+
  labs(
    title = "Excess mortality by month of registration",
    subtitle = paste0(country,', ',sex,', ',age),
    caption = acknowledgement
    )

Excess mortality for a given month and over the preceding three and twelve months, inclusive, show seasonal and annual trends.

country <- 'England'
sex <- 'both sexes'
age <- 'all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['sex']]==sex&
                    Monthly[['age']]==age,]
N <- (length(plotdata[[1]])-60)
plotdata <- data.frame(date=plotdata[['date']],
                 current=plotdata[['rate']],
                 baseline=NA,
                 `given month`=NA,
                 `previous quarter`=NA,
                 `previous year`=NA,
                 `previous two years`=NA)

for (i in 1:N) {
 plotdata[i,'baseline'] <- mean(plotdata[i+seq(12,60,12),'current'])
}
plotdata['given month'] <- plotdata['current']-plotdata['baseline']
plotdata['previous quarter'] <- c(stats::filter((plotdata['given month']),rep(1/3,3),sides=1)[3:length(plotdata[[1]])],rep(NA,2))
plotdata['previous year'] <- c(stats::filter((plotdata['given month']),rep(1/12,12),sides=1)[12:length(plotdata[[1]])],rep(NA,11))
plotdata['previous two years'] <- c(stats::filter((plotdata['given month']),rep(1/24,24),sides=1)[24:length(plotdata[[1]])],rep(NA,23))
ggplot()+
  geom_step(aes(x=date,y=value,colour=`excess mortality`),data=melt(plotdata,measure.vars = c('given month','previous quarter','previous year','previous two years'),variable.name = 'excess mortality'),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2006-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(-500,1000))+
  labs(
    title = "Excess mortality by month of registration",
    subtitle = paste0(country,', ',sex,', ',age),
    caption = acknowledgement
    )

---
# Analysis of age-standardised mortality data from the Office for
# National Statistics
# Copyright (C) 2021  Simon Platt

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.

title: "Analysis of age-standardised mortality data from the Office for National Statistics"
author: "Simon Platt"
date: "`r format(Sys.time(), '%e %B %Y')`"
output: 
  html_document:
    code_folding: hide
    code_download: true
---

```{=html}
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.
```

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

library(readxl)
library(lubridate)
library(reshape2)
library(dplyr)
library(ggplot2)
theme_set(theme_bw())

readUrl <- function(path,filename) {
  warn <- err <- NULL
  status <- withCallingHandlers(
    tryCatch(
      {
        tmp <- tempfile()
        download.file(
          paste0(path,filename),
          tmp, 
          quiet=T, 
          mode="wb"
          )
      },
      error = function(e) {
        err <<- conditionMessage(e)
        1
        }
      ),
    warning = function(w) {
      warn <<- append(warn, conditionMessage(w))
      invokeRestart("muffleWarning")
      1
      }
    )
  if (status==0) {
    status <- !file.rename(tmp,filename)
  }
  return(list(success=!as.logical(status),warning=warn,error=err))
}

getFile <- function(path,file) {
  md5file <- paste0(file,'.md5')
  if (!file.exists(file) || 
      !(file.exists(md5file) && 
        readLines(md5file,1)==tools::md5sum(file))) {
    download <- readUrl(path,file)
    if (download$success) {
      writeLines(tools::md5sum(file),md5file)
      message(paste("Got data from",paste0(path,file)))
    }
  }
}

ESP2013 <- data.frame(
  age=c(
    "<1",
    "1-4",
    "5-9",
    "10-14",
    "15-19",
    "20-24",
    "25-29",
    "30-34",
    "35-39",
    "40-44",
    "45-49",
    "50-54",
    "55-59",
    "60-64",
    "65-69",
    "70-74",
    "75-79",
    "80-84",
    "85-89",
    "90-94",
    "95+"
    ),
  breaks=c(0,1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95),
  population=c(
    1000,
    4000,
    5500,
    5500,
    5500,
    6000,
    6000,
    6500,
    7000,
    7000,
    7000,
    7000,
    6500,
    6000,
    5500,
    5000,
    4000,
    2500,
    1500,
    800,
    200
    )
  )

acknowledgement <- 'Source: ONS licensed under the Open Government Licence. © S. P. Platt 2021 CC BY-SA 4.0'

```

## Annual mortality, 1838-2020

```{r get-annual-mortality-data,message=FALSE}
path <- paste0(
  "https://www.ons.gov.uk/file?uri=/",
  "peoplepopulationandcommunity/",
  "birthsdeathsandmarriages/",
  "deaths/",
  "adhocs/",
  "12735annualdeathsandmortalityrates1938to2020provisional/"
)
file <- "2020annualprovisionaladhoc2.xls"
getFile(path,file)
```

These data are provided by the Office for National Statistics in response to a user request, reference 12735 [[link](`r "https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/adhocs/12735annualdeathsandmortalityrates1938to2020provisional"`)].
Data are for England and Wales, combined.
Total deaths and mortality rates are given from 1838; age-standardised mortality rates are given from 1942.
Rates for 2020 were provisional at the time of publication.
The data were originally published on 12 January 2021; an update was made on 8 March 2021 solely to correct some typographical errors in the supporting text, without affecting the data.

```{r plot-annual-mortality-data, message=FALSE}
if (!file.exists(file)) {
  warning(paste("Couldn't find",file,"on disk"))
} else {
  message(paste("Reading",file,"from disk"))
  Annual <- read_excel(
    file,
    sheet="Table",
    col_types=c("text",
                "numeric",
                "numeric",
                "numeric",
                "numeric"
                ),
    col_names=c("date","deaths","population","raw","age standardised"),
    skip=15,
    na=":"
  )
  Annual$date <- dmy(paste("31 12",Annual$date))
  ggplot(melt(Annual,measure.vars = c('raw','age standardised'),variable.name = 'mortality'))+
    geom_step(aes(x=date,y=value,colour=mortality),direction="vh",na.rm=TRUE)+
    scale_x_date(date_labels="%Y")+
    scale_y_continuous("mortality /100,000",limits = c(0,NA))+
    labs(
      title = "Mortality by year of registration",
      subtitle = 'England and Wales',
      caption = acknowledgement
      )
}
```

### Age-standardisation

```{r ESP2013,results='asis'}
knitr::kable(ESP2013[c('age','population')],caption = "European Standard Population 2013",padding = 1)
```
Age-standardised mortality is standardised to the European Standard Population given by populations $P_i$ (`r ESP2013$population[1:2]` etc.) for each of the age groups $i$ (`r ESP2013$age[1:2]` etc.) according to the following equation:

$$ a = \sum_i\frac{M_i}{N_i}P_i$$
where the real number of people in age group $i$ is $N_i$ and the observed number of deaths in age group $i$ is $M_i$.
The quantity $M_i/N_i$ is the _age-specific_ mortality for age group $i$; when weighted by $P_i$ and summed over $i$ the result is the _age-standardised_ mortality, $a$ (as the $P_i$ sum to `r format(sum(ESP2013['population']),scientific=F,big.mark=',')`, $a$ is "per hundred thousand population").

Except that it isn't really "per hundred thousand population", although it always seems to be expressed as such, by convention, because $M_i$ is, implicitly, a rate - the number of people in age group $i$ who die _in a year_.
Where mortality is given at intervals other than a year this is, implicitly, a per annum rate.
It's a pity that the convention doesn't make that clear.
It would probably be better to describe, for example, an age-standardised mortality of 1000 as "1% p.a."
That's what it means, although that seems not to be how it's done.

If the population distribution matched the standard population, such that $N_i/\sum_iN_i=P_i/\sum_iP_i\quad\forall i$, the age-standardised mortality would match the raw mortality:

$$r = \frac{\sum_iM_i\sum_iP_i}{\sum_iN_i}$$
This is close to the current situation, because the current population of England and Wales is close to the 2013 European Standard Population.
In contrast, if the population distribution were younger than the standard population the age-standardised mortality would exceed the actual mortality.
This was the case in decades past; with advances in life expectancy and reductions in fertility the population has grown older having been younger before.
This is why age-standardised mortality is useful as we can factor out the effect of an aging population on changes in mortality.
Without it, mortality would appear to get worse, or not get better as fast as it really had, if advances in health care had led to increased life expectancy.
This is most clearly visible in the data for the late 20th century: raw mortality was more or less constant even as the population aged - aging driven in part by advances in health care which led to reduced age-standardised mortality.
It's also visible in the 2010s: raw mortality increased slightly and that is explicable in terms of an aging population as is shown by the fact that age-standardised mortality was flat. 

ONS also quote a "95% confidence interval" on age-standardised mortality, as a function of the total number of deaths, as follows:

$$ a = \left( 1 \pm \frac{1.96}{\sum_iM_i} \right)\sum_i\frac{M_i}{N_i}P_i $$
This interval is not useful. 
It assumes that the only source of uncertainty is Poisson counting statistics on the total number of deaths.
This is an underestimate because age-specific mortality is not a Poisson-distributed variable because death rates are inevitably not constant and perhaps also because counting intervals, being a variable number of days worked by registrars, etc., also vary, but the underestimate is probably not large.
The resulting uncertainty estimates are very small, however - inevitably well under 1% - and do not capture the effect of uncertainty in the population estimates, $N_i$.
Mid-year estimates of population are used.
The uncertainty in these estimates is unknown (to me, at least, and probably to everyone; besides, the population, by definition, varies across the year) but it is hard to believe that it is not at least a few percent and therefore much larger than the counting uncertainty.
An uncertainty of so-many-percent in $N_i$ propagates to the same percentage absolute uncertainty in age-specific mortality and contributes to the uncertainty in total mortality - both raw and age-standardised.
It is almost certain that the uncertainty in both raw and age-standardised mortality is dominated by the unknown uncertainties in the total population and its age distribution.
If you only trust population estimates to a few percent, you should also only trust mortality rate estimates to a few percent - and you should ignore the confidence intervals given by ONS because they are too small.

## Age-standardised mortality by month, from January 2001
```{r get-monthly-mortality-data,message=FALSE}
path <- paste0(
  "https://www.ons.gov.uk/file?uri=/",
  "peoplepopulationandcommunity/",
  "birthsdeathsandmarriages/",
  "deaths/",
  "datasets/",
  "monthlymortalityanalysisenglandandwales/",
  "september2021/"
  )
file <- "monthlymortalityanalysissep2021.xlsx"

getFile(path,file)
```

These data are provided on a monthly schedule by the Office for National Statistics [[link](`r "https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/monthlymortalityanalysisenglandandwales"`)].
They provide monthly mortality data from January 2001, by month of registration.
Data are given separately for England and for Wales, for men and for women, and for different age groups.
The total number of deaths are given as well as the age-standardised rate;
here, I'm interested in the latter, in order to be able to make a comparison over time.
The latest data are for September 2021 [[link](`r paste0(path,file)`)].

```{r read-monthly-data,message=FALSE}
Monthly <- NULL
if (!file.exists(file)) {
  warning(paste("Couldn't find",file,"on disk"))
} else {
  message(paste("Reading",file,"from disk"))
  
  table <- c(
    paste("Table",c(rep(1,3),rep(2,3),rep(4,3),rep(5,3),rep(6,3),rep(7,3))),
    paste0("Table ",c(rep("8",12),rep("9",12)),
           rep(c(rep("a",4),rep("b",4),rep("c",4)),2))
    )
  country <- c(rep(c(rep("England",3),rep("Wales",3)),3),
               c(rep("England",12),rep("Wales",12)))
  age <- c(rep("all ages",6),rep("<75",6),rep("75+",6),
           rep(c("75-79","80-84","85-89","90+"),6))
  sex <- c(rep(c("men","women","both sexes"),6),
           rep(c(rep("men",4),rep("women",4),rep("both sexes",4)),2))
  cols <- list(
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",15)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3)),
    
    c("text","numeric","numeric",rep("skip",21)),
    c("text",rep("skip",6),"numeric","numeric",rep("skip",15)),
    c("text",rep("skip",12),"numeric","numeric",rep("skip",9)),
    c("text",rep("skip",18),"numeric","numeric",rep("skip",3))
    )
  for (i in 1:length(table)) {
    df <- read_excel(
      file,
      sheet=table[i],
      col_types = cols[[i]],
      col_names=c("date","deaths","rate"),
      skip=5
      )
    df[["country"]] <- country[i]
    df[['sex']] <- sex[i]
    df[['age']] <-age[i]
    Monthly <- rbind(Monthly,df)
  }
  Monthly$date <- ceiling_date(dmy(paste("1",Monthly$date),quiet=T),"month")-1
}
Monthly<-na.omit(Monthly)
```

```{r plot-monthly-data-by-country,message=FALSE}
age <- "all ages"
sex <- "both sexes"
plotdata <- Monthly[Monthly[['sex']]==sex&Monthly[['age']]==age,]
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=country),direction="vh")+
  #scale_colour_discrete(breaks=c('men','women','both sexes'),type=c('blue','deeppink','black'))+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Age-standardised mortality by month of registration",
    subtitle = paste0(sex,', ',age),
    caption = acknowledgement
    )
```

```{r plot-monthly-data-by-age,message=FALSE}
country='England'
sex='both sexes'
for (age in list(c("all ages","<75","75+"),
                 c("all ages","<75","75-79","80-84","85-89","90+"))) {
  plotdata <- Monthly[Monthly[['country']]==country&
                      Monthly[['sex']]==sex&
                      Monthly[['age']]%in%age,]
  p<-ggplot(plotdata)+
    geom_step(aes(x=date,y=rate,colour=age),direction="vh")+
    scale_y_continuous("age-standardised mortality /100,000",limits = c(0,NA))+
    scale_colour_discrete(breaks=age)+
    scale_x_date(limits=c(as.Date("2001-01-01"),NA),
                 date_breaks="5 years",
                 date_labels="%b %Y",date_minor_breaks="1 year")+
    labs(
      title = "Age-standardised mortality by month of registration",
      subtitle = paste0(country,', ',sex),
      caption = acknowledgement
      )
  print(p)
}
```

```{r plot-monthly-data-by-sex,message=FALSE}
country='England'
age='all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['age']]==age,]
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh")+
  scale_colour_discrete()+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),
                 date_breaks="5 years",
                 date_labels="%b %Y",date_minor_breaks="1 year")+
  labs(
    title = "Age-standardised mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )
```

## Previous year's mortality by month

Here, for each available month, the average mortality for the previous twelve months is shown.

```{r plot-previous-year-mortality}
plotdata <- Monthly[Monthly[['country']]=='England'&
                    Monthly[['sex']]=='both sexes'&
                    Monthly[['age']]=='all ages',]
N <- (length(plotdata[[1]])-11)
plotdata <- plotdata[1:N,]
for (i in 1:N) {
 plotdata[['rate']][i] <- mean(plotdata[['rate']][i:(i+11)])
}
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Previous year's mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )
```

## Previous season's mortality by month

Here, for each available month, the average mortality for the previous three months is shown.

```{r plot-previous-season-mortality}
plotdata <- Monthly[Monthly[['country']]=='England'&
                    Monthly[['sex']]=='both sexes'&
                    Monthly[['age']]=='all ages',]
N <- (length(plotdata[[1]])-2)
plotdata <- plotdata[1:N,]
for (i in 1:N) {
 plotdata[['rate']][i] <- mean(plotdata[['rate']][i:(i+2)])
}
ggplot(plotdata)+
  geom_step(aes(x=date,y=rate,colour=sex),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(0,2500))+
  labs(
    title = "Previous season's mortality by month of registration",
    subtitle = paste0(country,', ',age),
    caption = acknowledgement
    )
```

## Excess mortality by month

Excess mortality is often defined as the difference between the current mortality and a baseline given by the average of the last five years' mortality (for monthly data, given by the average of the same month's mortality over the five previous years).
Here, for each available month, current, baseline and excess mortality are shown for the whole population in England.
Data for men only and women only are similar and only clutter the graph.

```{r plot-excess-mortality}
country <- 'England'
sex <- 'both sexes'
age <- 'all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['sex']]==sex&
                    Monthly[['age']]==age,]
N <- (length(plotdata[[1]])-60)
plotdata <- data.frame(date=plotdata[['date']],
                 current=plotdata[['rate']],
                 baseline=NA,
                 excess=NA)

for (i in 1:N) {
 plotdata[i,'baseline'] <- mean(plotdata[i+seq(12,60,12),'current'])
}
plotdata['excess'] <- plotdata['current']-plotdata['baseline']
ggplot(melt(plotdata,measure.vars = c('current','baseline','excess'),variable.name = 'mortality'))+
  geom_step(aes(x=date,y=value,colour=mortality),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2001-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(-500,2000))+
  labs(
    title = "Excess mortality by month of registration",
    subtitle = paste0(country,', ',sex,', ',age),
    caption = acknowledgement
    )
```

Excess mortality for a given month and over the preceding three and twelve months, inclusive, show seasonal and annual trends.

```{r plot-excess-mortality-filtered}
country <- 'England'
sex <- 'both sexes'
age <- 'all ages'
plotdata <- Monthly[Monthly[['country']]==country&
                    Monthly[['sex']]==sex&
                    Monthly[['age']]==age,]
N <- (length(plotdata[[1]])-60)
plotdata <- data.frame(date=plotdata[['date']],
                 current=plotdata[['rate']],
                 baseline=NA,
                 `given month`=NA,
                 `previous quarter`=NA,
                 `previous year`=NA,
                 `previous two years`=NA)

for (i in 1:N) {
 plotdata[i,'baseline'] <- mean(plotdata[i+seq(12,60,12),'current'])
}
plotdata['given month'] <- plotdata['current']-plotdata['baseline']
plotdata['previous quarter'] <- c(stats::filter((plotdata['given month']),rep(1/3,3),sides=1)[3:length(plotdata[[1]])],rep(NA,2))
plotdata['previous year'] <- c(stats::filter((plotdata['given month']),rep(1/12,12),sides=1)[12:length(plotdata[[1]])],rep(NA,11))
plotdata['previous two years'] <- c(stats::filter((plotdata['given month']),rep(1/24,24),sides=1)[24:length(plotdata[[1]])],rep(NA,23))
ggplot()+
  geom_step(aes(x=date,y=value,colour=`excess mortality`),data=melt(plotdata,measure.vars = c('given month','previous quarter','previous year','previous two years'),variable.name = 'excess mortality'),direction="vh",na.rm=TRUE)+
  scale_x_date(limits=c(as.Date("2006-01-01"),NA),date_breaks="5 years",date_labels="%b %Y",date_minor_breaks="1 year")+
  scale_y_continuous("age-standardised mortality /100,000",limits = c(-500,1000))+
  labs(
    title = "Excess mortality by month of registration",
    subtitle = paste0(country,', ',sex,', ',age),
    caption = acknowledgement
    )
```
