ED Service times: Calculating percentiles from observed averages

Intro: Service times in Emergency Departments
Input data
Reshape the data for plotting
Final plot to find percentiles
Footnotes

Intro: Service times in Emergency Departments

I came across the plot below in a paper on modeling wait times in Emergency Departments (EDs) in the UK (Mayhew and Smith, 2007).¹

Apparently this is called a “ready-reckoner”?² Here’s how the authors describe it:

We needed a convenient method of moving between average completion times and the inferred probability distributions [of Emergency Department service times]. This can be done using ready-reckoners which plot various percentiles calculated using different average completion times.

This is an interesting idea, because if you’re measuring service standards in an Emergency Department (ED), one of the simplest metrics you could use is the average service time. It’s easy to understand, and it’s easy to calculate on an ongoing basis.

However, you know that there can be large deviations from the average, and you’re concerned about the patients who have the longest service times - say, 90th percentile and upwards. Is there any simple way to understand how changes in the average will impact the service times of these patients?

Yes, that’s exactly what the plot above is supposed to show. Let’s try to recreate it with simulated data.

Input data

We’ll create a set of exponential distributions with different rates/means.

Note that the kind of graph we’re aiming for only makes sense for single-parameter distributions, or a set of multi-parameter distributions that vary only along a single parameter.

df1.exponentials <- 
    data.frame(rate = seq(.01, .05, .001)) %>%
    mutate(mean = 1/rate)

Next we’ll write a simple function to extract the required percentiles from each individual exponential distribution.

# create function for adding percentiles: 
perc_function <- function(rate, quantile.param = 0.5){
    # generate sample from exponential dist: 
    random.vars <- rexp(100000, rate)
    
    # return specified quantile: 
    return(quantile(random.vars, quantile.param) %>% 
               unname %>% unlist)
}

Next, map the function across the rows:

# add in the percentiles: 
df1.exponentials %<>% 
    mutate(`5th percentile` = map_dbl(rate,  # 1st arg - 1 for each row in df1
                           perc_function,  # function to map 
                           quantile.param = 0.05),  # 2rd arg - fixed
                     
           `25th percentile` = map_dbl(rate,  
                            perc_function,  
                            quantile.param = 0.25),  
           
           `50th percentile` = map_dbl(rate,  
                            perc_function,  
                            quantile.param = 0.50),  
           
           `75th percentile` = map_dbl(rate,  
                            perc_function,  
                            quantile.param = 0.75),  
           
           `95th percentile` = map_dbl(rate,  
                            perc_function,  
                            quantile.param = 0.95)  
           )

Here’s what the input data looks like in the end:

df1.exponentials %>% 
    head %>% 
    kable() %>% 
    kable_styling(bootstrap_options = c("striped",
                                        "condensed", 
                                        "responsive"), 
                  full_width = FALSE, 
                  position = "left")

rate	mean	5th percentile	25th percentile	50th percentile	75th percentile	95th percentile
0.010	100.00000	5.138027	28.61727	69.20532	139.57056	299.7246
0.011	90.90909	4.629355	25.90980	62.86759	126.27764	274.8270
0.012	83.33333	4.342354	24.00929	57.06597	114.64599	249.6673
0.013	76.92308	3.951129	22.00624	53.64135	106.85737	227.4449
0.014	71.42857	3.702087	20.60945	49.30826	98.91373	213.1688
0.015	66.66667	3.411454	19.24347	46.02070	91.80310	200.7153

Reshape the data for plotting

Although it was convenient to generate the data in “wide” form, it will be easier to plot it if it’s in “long” form.³

df2.reshaped <- 
    df1.exponentials %>% 
    gather(key = "key", 
           value = "value", 
           -c(rate, mean)) %>% 
    
    # set factor levels: 
    mutate(key = factor(key, 
                        levels = c("5th percentile", 
                                   "25th percentile",
                                   "50th percentile",
                                   "75th percentile",
                                   "95th percentile")))

Here’s what it looks like now:

df2.reshaped %>% 
    head(5) %>%
    kable() %>% 
    kable_styling(bootstrap_options = c("striped",
                                        "condensed", 
                                        "responsive"), 
                  full_width = FALSE, 
                  position = "left")

rate	mean	key	value
0.010	100.00000	5th percentile	5.138027
0.011	90.90909	5th percentile	4.629355
0.012	83.33333	5th percentile	4.342354
0.013	76.92308	5th percentile	3.951129
0.014	71.42857	5th percentile	3.702087

Final plot to find percentiles

Note that for any given decrease in the mean service time, there is a larger decrease in the service times of patients at the 95th percentile. For example, decreasing the mean from 80 min to 70 min will cause the 95th percentile of wait times to fall from 248 min to 213 min.

# create plot: --------
p1.ready.reckoner <- 
    df2.reshaped %>% 
    
    ggplot(aes(x = value, 
               y = mean, 
               group = key, 
               colour = key)) + 
    
    geom_line() + 
    gghighlight(label_params = list(segment.colour = "grey60",
                                    segment.size = .1)) + 
    
    scale_y_continuous(limits = c(20, 120), 
                       breaks = seq(20, 120, 20)) +
    
    labs(title = "ED service times: calculating percentiles from observed averages", 
         subtitle = "Assuming service time is exponentially distributed, we can use the observed mean to \ninfer the full probability distribution \n", 
         y = "Observerd mean service time (minutes)", 
         x = "Service time associated with given percentile (minutes)", 
         caption  = "\n\nNote: Plot is based on simulated data") + 
    
    theme_classic(base_size = 12) + 
    theme(plot.caption = element_text(hjust = -0, 
                                      size = 8)); p1.ready.reckoner

## label_key: key

Footnotes

Mayhew and Smith. Using queueing theory to analyse the Government’s 4-hour completion time target in Accident and Emergency departments. Health Care Management Science. 2007.↩
I suspect that’s a weird Britishism that no one else uses.↩
Update from future me: this may have been stupid; I probably could have just generated the data in long form by adding a column with the percentiles and using purrr::map2↩