Standardised Mortality Ratio

This is a reminder to myself (and to whoever finds this usefull) on the calculation of:

the Standardised Mortality Ratio (SMR)
The Confidence Interval of the SMR

What you can learn in this reminder

Some use of regular expressions
creating functions and use them to avoid repeat myself
Read (multiple) excel data into R
Function programming basics using the package purrr, intro, usefull
The rowwise function
list.files function and the set_names from purrr to make it easy to map the files into functions and keep a record of what comes from

The SMR

The SMR is calculated as the ratio of observed hospital mortality over predicted hospital mortality.

I have data on ~1000 patients of the iCU and the following info:

SAPS II score
ICU mortality (death/alive)

The data are in sepate excel workbooks named 2012.xlsx, 2013.xlsx etc.. until 2017, in the folder “ECDC data” in my working directory (my project)

library(tidyverse)
library(readxl)

# create a vector of the file names
list_of_files = 
  list.files(path = "ECDC data", pattern = ".xlsx$", full.names = TRUE) %>% 
  # name the vectors. It will be usefull to map the names to the read_xlsx function to get all data at once (purrr magic..)
  set_names()


# Map the file names to the readxl:: read_xlsx function to read all excel files and once. 
# Combine all the workbooks into a dataframe. 
# Also, capture the filename as an `id` variable in the dataframe

all_data = 
map_df( list_of_files,
        ~ readxl::read_excel(path = .,
                             # get some columns only (i have a lot)
                             range = readxl::cell_cols("A:M")),
        .id = "dataset") %>% 
  #get the year from the dataset name - use some regular expressions
  mutate(year = str_extract(dataset, pattern ="[0-9]+" )) %>% 
  # Do some cleaning 
  janitor::clean_names() %>% # the column names
  # remove no saps score and no info on death
  filter(!is.na(saps_ii),
         !is.na(outcome_at_icu_discharge)) %>% 
  mutate(death = if_else(outcome_at_icu_discharge == "Alive", 0, 1)) %>% 
  # Select relevant variables only
  select(year,death, saps_ii, type_of_admision)

Here is a bit of an output

all_data %>% 
  head(10) %>% 
  knitr::kable()

year	death	saps_ii	type_of_admision
2012	0	49	medical
2012	0	24	Unschedule surgical
2012	0	84	medical
2012	0	50	medical
2012	0	12	medical
2012	1	64	medical
2012	0	31	Unschedule surgical
2012	0	35	medical
2012	0	63	medical
2012	0	26	medical

Lets aggregate by year, the observed mortality, and the median SAPS II score

  all_data_aggr = 
  all_data %>% 
    group_by(year) %>% 
    summarise(n = n(),
              obs_mortality = sum(death),
              mortality_rate = mean(death),
              # I get the median SAPS score
              med_saps = median(saps_ii)
              ) 

  all_data_aggr %>% knitr::kable()

year	n	obs_mortality	mortality_rate	med_saps
2012	144	26	0.1805556	48
2013	165	33	0.2000000	47
2014	159	28	0.1761006	48
2015	198	41	0.2070707	49
2016	184	32	0.1739130	60
2017	216	47	0.2175926	57

Now I need to calculate the predicted mortality by year.
The predicted mortality is derived from the SAPS II score as follows:

\(logit = -7.7631 + 0.0737*Score + 0.9971*ln(Score+1)\)

and then

\(Mortality = \frac{e^{logit}}{1+e^{logit}}\)

Here I need a function that has the saps II score as an input and returns the predicted mortality

predict_mortality = function(score){
  logit = -7.7631 + 0.0737 * score + 0.9971 * log(score+1)
  mortality = exp(logit)/ (1+exp(logit))
  return(mortality)
}

Now lets use our function predict_mortality to calculate the predicted mortality out of the SAPS II score

  smr_table = 
  all_data_aggr %>% 
    mutate(pred_mortality_rate = predict_mortality(med_saps),
           # get expecte counts of deaths
           pred_mortality = pred_mortality_rate * n,
           # The SMR 
           smr = obs_mortality/ pred_mortality) 

  #Have  a look
  smr_table %>% knitr::kable(digits = 2)

year	n	obs_mortality	mortality_rate	med_saps	pred_mortality_rate	pred_mortality	smr
2012	144	26	0.18	48	0.41	59.70	0.44
2013	165	33	0.20	47	0.39	64.67	0.51
2014	159	28	0.18	48	0.41	65.92	0.42
2015	198	41	0.21	49	0.44	86.63	0.47
2016	184	32	0.17	60	0.68	125.28	0.26
2017	216	47	0.22	57	0.62	133.76	0.35

The Confidence Interval

An approximate 95% confidence interval (CI) for the SMR was calculated by using the method proposed by Vandenbroucke JP. A shortcut method for calculating the 95 percent confidence interval of the standardized mortality ratio. (Letter). Am J Epidemiol 1982; 115:303-4.

I found the formulas of all possible CI calculations for the SMR here. It is the documentation of the this and this online calculators

I tried out a few formulas in the documentation, specifically the ones that are called approximations

I haven’t tried the Exact Tests calculations - I think they need more programming + I couldn’t figure out how I am supposed to do the iterative process. Perhaps its easy and don’t have time to think about it :)

Well i tried this approximation by Vanderbroucke

Beware that the \(\sqrt(α)\) refers to the observed mortality and the \(λ\) to the predicted mortality. DO NOT confuse with the \(α\) of the Z score (i.e. the 1.96)

So I created this function, that reads the observed and predicted mortality (actual counts) and returns a vector of 2 values. The 1st is the lower limit, and the 2nd the upper limit

smr_conf =  function(observed, predicted){
  
  lower = ((sqrt(observed) - 1.96*0.5)^2)/ predicted
  upper = ((sqrt(observed) + 1.96*0.5)^2)/ predicted
  
  return(c(lower, upper))
  
}

Now lets use the smr_table we calculated earlier

NOTE HERE the use of rowwise function of dplyr

The rowwise is needed since the inputs to the the function are taken rowwise

smr_table %>% 
  rowwise() %>% 
  mutate( lower_95 = smr_conf(obs_mortality, pred_mortality)[1],
          upper_95 = smr_conf(obs_mortality, pred_mortality)[2]) %>% 
  knitr::kable(digits = 2)

year	n	obs_mortality	mortality_rate	med_saps	pred_mortality_rate	pred_mortality	smr	lower_95	upper_95
2012	144	26	0.18	48	0.41	59.70	0.44	0.28	0.62
2013	165	33	0.20	47	0.39	64.67	0.51	0.35	0.70
2014	159	28	0.18	48	0.41	65.92	0.42	0.28	0.60
2015	198	41	0.21	49	0.44	86.63	0.47	0.34	0.63
2016	184	32	0.17	60	0.68	125.28	0.26	0.17	0.35
2017	216	47	0.22	57	0.62	133.76	0.35	0.26	0.46

Standardised Mortality Ratio

Lefkios Paikousis

10 May 2019

The SMR

The Confidence Interval