I came across the plot below in a paper on modeling wait times in Emergency Departments (EDs) in the UK (Mayhew and Smith, 2007).1
Apparently this is called a “ready-reckoner”?2 Here’s how the authors describe it:
We needed a convenient method of moving between average completion times and the inferred probability distributions [of Emergency Department service times]. This can be done using ready-reckoners which plot various percentiles calculated using different average completion times.
This is an interesting idea, because if you’re measuring service standards in an Emergency Department (ED), one of the simplest metrics you could use is the average service time. It’s easy to understand, and it’s easy to calculate on an ongoing basis.
However, you know that there can be large deviations from the average, and you’re concerned about the patients who have the longest service times - say, 90th percentile and upwards. Is there any simple way to understand how changes in the average will impact the service times of these patients?
Yes, that’s exactly what the plot above is supposed to show. Let’s try to recreate it with simulated data.
We’ll create a set of exponential distributions with different rates/means.
Note that the kind of graph we’re aiming for only makes sense for single-parameter distributions, or a set of multi-parameter distributions that vary only along a single parameter.
df1.exponentials <-
data.frame(rate = seq(.01, .05, .001)) %>%
mutate(mean = 1/rate)
Next we’ll write a simple function to extract the required percentiles from each individual exponential distribution.
# create function for adding percentiles:
perc_function <- function(rate, quantile.param = 0.5){
# generate sample from exponential dist:
random.vars <- rexp(100000, rate)
# return specified quantile:
return(quantile(random.vars, quantile.param) %>%
unname %>% unlist)
}
Next, map the function across the rows:
# add in the percentiles:
df1.exponentials %<>%
mutate(`5th percentile` = map_dbl(rate, # 1st arg - 1 for each row in df1
perc_function, # function to map
quantile.param = 0.05), # 2rd arg - fixed
`25th percentile` = map_dbl(rate,
perc_function,
quantile.param = 0.25),
`50th percentile` = map_dbl(rate,
perc_function,
quantile.param = 0.50),
`75th percentile` = map_dbl(rate,
perc_function,
quantile.param = 0.75),
`95th percentile` = map_dbl(rate,
perc_function,
quantile.param = 0.95)
)
Here’s what the input data looks like in the end:
df1.exponentials %>%
head %>%
kable() %>%
kable_styling(bootstrap_options = c("striped",
"condensed",
"responsive"),
full_width = FALSE,
position = "left")
| rate | mean | 5th percentile | 25th percentile | 50th percentile | 75th percentile | 95th percentile |
|---|---|---|---|---|---|---|
| 0.010 | 100.00000 | 5.138027 | 28.61727 | 69.20532 | 139.57056 | 299.7246 |
| 0.011 | 90.90909 | 4.629355 | 25.90980 | 62.86759 | 126.27764 | 274.8270 |
| 0.012 | 83.33333 | 4.342354 | 24.00929 | 57.06597 | 114.64599 | 249.6673 |
| 0.013 | 76.92308 | 3.951129 | 22.00624 | 53.64135 | 106.85737 | 227.4449 |
| 0.014 | 71.42857 | 3.702087 | 20.60945 | 49.30826 | 98.91373 | 213.1688 |
| 0.015 | 66.66667 | 3.411454 | 19.24347 | 46.02070 | 91.80310 | 200.7153 |
Although it was convenient to generate the data in “wide” form, it will be easier to plot it if it’s in “long” form.3
df2.reshaped <-
df1.exponentials %>%
gather(key = "key",
value = "value",
-c(rate, mean)) %>%
# set factor levels:
mutate(key = factor(key,
levels = c("5th percentile",
"25th percentile",
"50th percentile",
"75th percentile",
"95th percentile")))
Here’s what it looks like now:
df2.reshaped %>%
head(5) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped",
"condensed",
"responsive"),
full_width = FALSE,
position = "left")
| rate | mean | key | value |
|---|---|---|---|
| 0.010 | 100.00000 | 5th percentile | 5.138027 |
| 0.011 | 90.90909 | 5th percentile | 4.629355 |
| 0.012 | 83.33333 | 5th percentile | 4.342354 |
| 0.013 | 76.92308 | 5th percentile | 3.951129 |
| 0.014 | 71.42857 | 5th percentile | 3.702087 |
Note that for any given decrease in the mean service time, there is a larger decrease in the service times of patients at the 95th percentile. For example, decreasing the mean from 80 min to 70 min will cause the 95th percentile of wait times to fall from 248 min to 213 min.
# create plot: --------
p1.ready.reckoner <-
df2.reshaped %>%
ggplot(aes(x = value,
y = mean,
group = key,
colour = key)) +
geom_line() +
gghighlight(label_params = list(segment.colour = "grey60",
segment.size = .1)) +
scale_y_continuous(limits = c(20, 120),
breaks = seq(20, 120, 20)) +
labs(title = "ED service times: calculating percentiles from observed averages",
subtitle = "Assuming service time is exponentially distributed, we can use the observed mean to \ninfer the full probability distribution \n",
y = "Observerd mean service time (minutes)",
x = "Service time associated with given percentile (minutes)",
caption = "\n\nNote: Plot is based on simulated data") +
theme_classic(base_size = 12) +
theme(plot.caption = element_text(hjust = -0,
size = 8)); p1.ready.reckoner
## label_key: key
Mayhew and Smith. Using queueing theory to analyse the Government’s 4-hour completion time target in Accident and Emergency departments. Health Care Management Science. 2007.↩
I suspect that’s a weird Britishism that no one else uses.↩
Update from future me: this may have been stupid; I probably could have just generated the data in long form by adding a column with the percentiles and using purrr::map2↩