HMD

Harold Nelson

3/14/2022

Human Mortality Database

This is an initial exploration of the Human Mortality Database (HMD), available at https://www.mortality.org/.

Setup

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.0.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout

USA Data

Load the data for USA males. Add a variable country and set it to “USA”.

Select country, Year, Age and qx.

Make Age numeric.

Eliminate any missing data.

Solution

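USA <- read_table("~/Dropbox/HMD/lt_male/mltper_1x1/USA.mltper_1x1.txt", skip = 2) %>%
  mutate(country = "USA") %>%
  select(country, Year, Age, qx) %>%
  # The open-ended age group "110+" cannot be converted and becomes NA (hence the warning).
  mutate(Age = as.numeric(Age)) %>%
  # Drop the rows where Age is now NA.
  drop_na()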
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   Year = col_double(),
##   Age = col_character(),
##   mx = col_double(),
##   qx = col_double(),
##   ax = col_double(),
##   lx = col_double(),
##   dx = col_double(),
##   Lx = col_double(),
##   Tx = col_double(),
##   ex = col_double()
## )
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
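
The coercion warning above is expected. HMD 1x1 life tables end with an open-ended age group labeled "110+", which as.numeric() cannot parse, so it becomes NA and drop_na() then removes that row. A minimal sketch of the effect on a toy table (the qx values are made up for illustration, not taken from the HMD):

```r
library(dplyr)
library(tidyr)

# Toy life-table fragment; the qx values are invented for illustration.
toy <- tibble(
  Age = c("0", "1", "109", "110+"),
  qx  = c(0.006, 0.0004, 0.55, 1)
)

toy_clean <- toy %>%
  # "110+" is not a number, so it is coerced to NA (warning suppressed here).
  mutate(Age = suppressWarnings(as.numeric(Age))) %>%
  # drop_na() removes the row whose Age became NA.
  drop_na()

toy_clean
```

Only the "110+" row is lost, so the warning does not indicate a problem with the ages the graphs below use.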

Canada

Do the same for Canada.

Solution

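CAN <- read_table("~/Dropbox/HMD/lt_male/mltper_1x1/CAN.mltper_1x1.txt", skip = 2) %>%
  mutate(country = "Canada") %>%
  select(country, Year, Age, qx) %>%
  # The open-ended age group "110+" cannot be converted and becomes NA (hence the warning).
  mutate(Age = as.numeric(Age)) %>%
  # Drop the rows where Age is now NA.
  drop_na()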
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   Year = col_double(),
##   Age = col_character(),
##   mx = col_double(),
##   qx = col_double(),
##   ax = col_double(),
##   lx = col_double(),
##   dx = col_double(),
##   Lx = col_double(),
##   Tx = col_double(),
##   ex = col_double()
## )
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
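
Since the USA and Canada steps differ only in the file path and the country label, they could be wrapped in one function. The sketch below uses a hypothetical helper name, read_hmd_qx(), and assumes every file has the same layout as the two above (two lines before the column header). It is one possible approach, not part of the original exercise.

```r
library(readr)
library(dplyr)
library(tidyr)

# Hypothetical helper: read one HMD period life table and keep qx by year and age.
read_hmd_qx <- function(path, country_name) {
  read_table(path, skip = 2) %>%
    mutate(country = country_name) %>%
    select(country, Year, Age, qx) %>%
    # The open-ended "110+" group becomes NA and is then dropped.
    mutate(Age = suppressWarnings(as.numeric(Age))) %>%
    drop_na()
}

# Usage with the paths from this document (not run here, since it needs the files):
# USA <- read_hmd_qx("~/Dropbox/HMD/lt_male/mltper_1x1/USA.mltper_1x1.txt", "USA")
# CAN <- read_hmd_qx("~/Dropbox/HMD/lt_male/mltper_1x1/CAN.mltper_1x1.txt", "Canada")
```

A helper like this would also make it easier to add further countries later without copying the pipeline again.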

Combine

Combine the two dataframes into USA_CAN using rbind().

Solution

USA_CAN <- rbind(USA, CAN)
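
A quick way to confirm that the combination worked is to count the rows per country. The sketch below uses two small made-up data frames in place of USA and CAN so that it runs on its own. It also shows bind_rows() from dplyr, an alternative to rbind() that matches columns by name.

```r
library(dplyr)

# Made-up stand-ins for USA and CAN; the qx values are invented.
usa_toy <- tibble(country = "USA", Year = c(1940, 1941), Age = 0, qx = c(0.05, 0.048))
can_toy <- tibble(country = "Canada", Year = c(1940, 1941), Age = 0, qx = c(0.06, 0.058))

# Stack the two tables; columns are matched by name.
combined <- bind_rows(usa_toy, can_toy)

# One row per country with the number of observations in each.
combined %>% count(country)
```

If one country shows far fewer rows than expected, the most likely cause is a different span of years in its file.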

Infant Mortality USA and Canada

Create a graph of male infant mortality (qx at age 0) for the USA and Canada, beginning in 1940.

Solution

USA_CAN %>% 
  filter(Age == 0 & Year >= 1940) %>% 
  ggplot(aes(x = Year, y = qx, color = country)) +
  geom_point() +
  ggtitle("Male Infant Mortality - USA and Canada")

USA/Canada 2

Create a graph comparing USA and Canadian mortality at age 79.

Solution

USA_CAN %>% 
  filter(Age == 79 & Year > 1940) %>% 
  ggplot(aes(x = Year, y = qx, color = country)) +
  geom_point() +
  ggtitle("Age 79 Male Mortality - USA and Canada")
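
To compare the two countries more directly than by eye, one option is to place them side by side and compute the ratio of the two qx values for each year. The following is a sketch with made-up numbers standing in for USA_CAN. The column name ratio is an assumption chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for USA_CAN at a single age; the qx values are invented.
toy <- tibble(
  country = c("USA", "USA", "Canada", "Canada"),
  Year    = c(1950, 1951, 1950, 1951),
  Age     = 79,
  qx      = c(0.10, 0.09, 0.08, 0.09)
)

ratios <- toy %>%
  # One column per country, one row per Year and Age.
  pivot_wider(names_from = country, values_from = qx) %>%
  # A ratio above 1 means higher mortality in the USA than in Canada.
  mutate(ratio = USA / Canada)

ratios
```

Applied to the real data, the same steps would follow a filter() on the age of interest, and the ratio could then be plotted against Year.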

Questions

Imagine the questions we could answer with the HMD.

Do other countries have the same pattern of excess male deaths?

Has the USA always had this pattern?

What should we look at for Friday?