Antibiograms

Setup

This section contains the setup and the various utility functions used throughout.

Libraries used:

library(data.table)
library(ggplot2)
library(ggrepel)
library(mvtnorm)
library(parallel)
library(extraDistr)

Introduction

An antibiogram is a \(J \times K\) matrix (see figure below for example) that tells you the local resistance pattern \(j\) for a specific pathogen \(k\). For example, an antibiogram might indicate that Escherichia coli are relatively sensitive (84% susceptibility) to fluoroquinolones. However, an antibiogram does not give disease-specific advice.

Figure 1: Figure: Example antibiogram

The lack of disease-specific advice is problematic because many different bacterian may cause the same illness. Moreover, clinicians typically have to prescribe antibiotics before test results are available to confirm which pathogen (or pathogens) are implciated in the disease. Combined, these factors can lead to excessive use of broad-spectrum agents in order to achieve effective treatment.

An alternative to the antibiogram is the weighted-incidence syndromic combination antiobiogram (WISCA) which answers the question “Which antibiotic regimen will work for this particular syndrome/disease?”

For each infection/disease considered, a WISCA indicates the covereage for a specific antibiotic regimen and relevant organisms. Coverage is thus defined as the proportion of infections that would be treated by the treatment regimen at a stage where the causative bacteria is still unknown. This measure is conditional on both the susceptibility of pathogen to regimens and their relative distribution in the population. In other words, the WISCA gives a weighted mean of the susceptibilities of the bacteria with the weights defined by the relative incidence of each pathogen.

For example, if pathogen \(X\) was 100% susceptible to Ampicillin, but an illness \(Y\) was only caused by \(X\) in 55% of the cases, then treating all the cases with Ampicillin would only result in successful treatment rate of around 55%. Conversely, if illness \(Y\) was only caused by pathogen \(X\) and \(X\) was 55% susceptible to Ampicillin, then treating all cases with Ampicillin would again result in success around 55% of the time.

An example of a WISCA tool summarising the conditional probability of coverage (a precursor of successful treatment) for a series of common antibiotic regimens used to treat unrinary tract infections (UTI) and abdominal-biliary infections (ABI) is shown below.

Figure 2: Figure: Example WISCA

Again, the WISCA gives the coverage as a weighted average of the pathogen susceptibility with the weights defined by the relative incidence of the pathogens.

A nice way to visualise the WISCA is via a decision tree, see below snippet.

Figure 3: Figure: Decision tree (conditional probability tree)

At the root of the tree is what the clinician must attend to. For example, in this situation we have a suspected BSI requiring antibiotics and the clinician must decide between 3 different antibiotics. The decision node is denoted by a square.

The next level is a chance node, denoted by a circle, which characterises the distribution (relative occurrence) of pathogens causing BSI. Finally, is the conditional susceptibility for each pathogen and antibiotic regimen.

To build a WISCA decision aid, the minimum information required includes:

the total number of bacterial isolates from relevant blood cultures
the number of isolates of specific bacterial species or genera
the number of isolates tested for susceptibility to the antibiotics relevant for establishing coverage offered by the pre-specified regimens of interest, and
the number of isolates found to be susceptible to these antibiotics.

Simulation

Simulating the data can be a great tool for understanding. Consider a situation whether there are 3 pathogens that cause a disease and there are two possible treatment regimens (antibiotics 1 and 2). The three pathogens respectively cause the disease 30%, 50% and 20% of the time.

Assume that antibiotic 2 is a better treatment across all pathogens, specifically assume:

Pathogen 1 is 20% and 40% susceptible to antibiotics 1 and 2 respectively.
Pathogen 2 is 70% and 80% susceptible to antibiotics 1 and 2 respectively.
Pathogen 3 is 10% and 70% susceptible to antibiotics 1 and 2 respectively.

And for simplicitly, assume that both treatments were used 50% of the time.

set.seed(11)
N_isolates <- 10000
# relative occurrence of causative pathogen
p_pathogen <- c(0.3, 0.5, 0.2)
# pathogen j susceptible under regimen k
p_susceptible1 <- c(0.2, 0.4)
p_susceptible2 <- c(0.7, 0.8)
p_susceptible3 <- c(0.1, 0.7)
# prob trt
p_regimen = c(0.5, 0.5)

d <- data.table(
  id = 1:N_isolates,
  pathogen = max.col(t(rmultinom(N_isolates, 1, p_pathogen))),
  regimen = rbinom(N_isolates, 1, p_regimen) + 1
)
d[, successful_trt := NA_integer_]
d[pathogen == 1, successful_trt := rbinom(.N, 1, p_susceptible1[regimen])]
d[pathogen == 2, successful_trt := rbinom(.N, 1, p_susceptible2[regimen])]
d[pathogen == 3, successful_trt := rbinom(.N, 1, p_susceptible3[regimen])]
head(d)

##    id pathogen regimen successful_trt
## 1:  1        2       1              0
## 2:  2        2       1              1
## 3:  3        3       2              1
## 4:  4        2       2              1
## 5:  5        1       1              0
## 6:  6        2       1              1

We can summarise the simulated data to see if things make sense. First, frequency of occurrence of each pathogen:

d[, .(.N), keyby = .(pathogen)]

##    pathogen    N
## 1:        1 3037
## 2:        2 4937
## 3:        3 2026

Susceptibility conditional on treatment:

dtmp1 <- dcast(
  d[, .(y=.N), keyby = .(regimen, pathogen, successful_trt)],
  regimen + pathogen ~ successful_trt, value.var = "y")
dtmp1[, p_susc := `1`/(`1`+`0`)]
dtmp1

##    regimen pathogen    0    1     p_susc
## 1:       1        1 1222  286 0.18965517
## 2:       1        2  771 1712 0.68948852
## 3:       1        3  899   89 0.09008097
## 4:       2        1  900  629 0.41137999
## 5:       2        2  470 1984 0.80847596
## 6:       2        3  302  736 0.70905588

Use all the data to derive the observed distribution of the pathogens

dtmp2 <- d[, .(wgt = .N/N_isolates), keyby = .(pathogen)]
dtmp2 <- merge(dtmp1, dtmp2, by = c("pathogen") )
setkey(dtmp2, "regimen")
dtmp2

##    pathogen regimen    0    1     p_susc    wgt
## 1:        1       1 1222  286 0.18965517 0.3037
## 2:        2       1  771 1712 0.68948852 0.4937
## 3:        3       1  899   89 0.09008097 0.2026
## 4:        1       2  900  629 0.41137999 0.3037
## 5:        2       2  470 1984 0.80847596 0.4937
## 6:        3       2  302  736 0.70905588 0.2026

We can compute the coverage under each regimen as a weighted combination of the occurrence and susceptibility. Here we can see what we already know now reflected in the coverage, namely that generally, regimen 2 (Y) is the better choice.

dtmp2[, .(p_cov = sum(wgt * p_susc)), by = regimen ]

##    regimen     p_cov
## 1:       1 0.4162492
## 2:       2 0.6677354

Compared to the true values of:

sum(p_pathogen * c(p_susceptible1[1], p_susceptible2[1], p_susceptible3[1]))

## [1] 0.43

sum(p_pathogen * c(p_susceptible1[2], p_susceptible2[2], p_susceptible3[2]))

## [1] 0.66

Analysis

Prior to considering Bielicki 2019, I will introduce the relevant distributional assumptions that they adopt.

Multinomial

The multinomial distribution is a generalisation of the Binomial. Instead of counts of successes and failures, the multinomial keeps track of the number of events associated with each possible category. We might consider the pathogens that can cause BSI such a set where we keep track of the number of isolates of each particular bacterial species.

Specifically, the story for the multinomial is that we have \(K\) categories and each individual independently “selects” category \(k\) with probability \(\pi_k\) such that \(\sum_{k=1}^K \pi_k = 1\). If \(X_1\) is the number of individuals in category 1, \(X_2\) the number of individuals in category 2 etc, we have \(X_1 + \dots + X_k = n\) and \(X \sim \mathsf{Mult}(n, \pi_k)\) where the the pmf for \(X\) is

\[ \begin{equation} \begin{split} p(x) &= \frac{n!}{x_1! \dots x_K!} \pi_1^{x_1} \dots \pi_K^{x_K} \\ &= n! \prod_{k=1}^K \frac{\pi_k^{x_k}}{y_k!} \end{split} \tag{1} \end{equation} \]

and marginally \(X_k \sim \mathsf{Binomial}(n, \pi_k)\). Similar to the Beta-Binomial conjugate prior, the multinomial has a Dirichlet conjugate prior (where the Dirichlet is simply a multivariate version of the Beta distribution). Specifically, if \(\pi_k\) is assigned a Dirichlet prior with concentration \(a = (a_1, \dots, a_K)\) with \(a_k > 0\) for all \(k\) and \(a_0 = \sum_{k=1}^K a_k\) then

\[ \begin{equation} \begin{split} p(\pi) &= \frac{1}{\mathsf{Beta}(a)} \prod_{k=1}^K \pi_k^{a_k - 1} \end{split} \tag{2} \end{equation} \]

with \(\sum_{k = 1}^K \pi_k = 1\) and \(\mathsf{Beta}\) being the beta function

\[ \begin{equation} \begin{split} \mathsf{Beta}(a) = \frac{\prod_{k = 1}^K \Gamma(a_k)}{\Gamma(\sum_{k = 1}^K a_k)} \end{split} \tag{3} \end{equation} \]

and predictably, the marginal is \(\pi_k \sim \mathsf{Beta}(a_k, a_0 - a_k)\). So, under a Dirichlet prior, the posterior for \(X \sim \mathsf{Mult}(n, \pi_k)\) is

\[ \begin{equation} \begin{split} p(\pi | x) &\propto p(x|\pi)p(\pi) \\ &\propto \biggl( \prod_{k=1}^K \pi_k^{x_k} \biggr) \biggl( \prod_{k=1}^K \pi_k^{a_k - 1} \biggr) \\ &= \prod_{k=1}^K \pi_k^{a_k + x_k - 1} \end{split} \tag{4} \end{equation} \]

and after normalisation this leads to the posterior \(p(\pi | x) \sim \mathsf{Dirichlet}(a + x)\).

Binomial

The Binomial distribution is most likely familiar. It has a conjugate Beta prior which leads to a similar straight forward update to derive the posterior. Specifically, if \(Y \sim \mathsf{Binomial}(n, \theta)\) where \(n\) and \(\theta\) are the number of (independent) trials and probability of success respectively then with \(p(\theta) \sim \mathsf{Beta}(\alpha, \beta)\) we have

\[ \begin{equation} \begin{split} p(\theta|y) \sim \mathsf{Beta}(\alpha + y, \beta + n - y) \end{split} \tag{5} \end{equation} \]

Bielicki 2019

Bielicki 2019 use conjugate distributions to sample for relative incidence and susceptibility conditional on treatment. A simplified (and slow) implementation based on the dataset simulated above would look something like this:

# None of this guarantees vector lengths that you would need to incorporate 
# for small n, nor does it account for differential priors for e.g resistant 
# pathogens.

nsim <- 10000
# susceptible and resistant isolates
m1 <- cbind(
  d[successful_trt == 1, .(N1 = .N), keyby = .(regimen, pathogen, successful_trt)],
  d[successful_trt == 0, .(N0 = .N), keyby = .(regimen, pathogen, successful_trt)][, .(N0)]
)
  
coverage <- rbindlist(parallel::mclapply(1:nsim, FUN = function(i){
  # non-informative prior
  a <- rep(1, 3)
  # use the population view of relative incidence?
  X <- d[regimen == 1, .(.N), keyby = .(pathogen)][, N]
  # weights
  relative_incidence <- rdirichlet(1, a + X)
  
  # posterior probability of susceptibility
  m2 <- matrix(rbeta(nrow(m1), 1 + m1[, N1], 1 + m1[, N0]),
               nrow = 2, byrow = TRUE)
  # coverage - product of relative incidence and probability of susceptible
  data.table(relative_incidence %*% t(m2))
}, mc.cores = 10))
names(coverage) <- c("regimen1", "regimen2")

From the simulation, you can plot the posterior density for coverage, see below. I have added a vertical dashed line to show the mean coverage based on the product of the probability of each pathogen and their susceptibility under each treatment regimen. Note that the results are based on a single dataset so you would expect some difference between the mean of the empirical result and the mean of the data generating parameters (true coverage). However, if the data simulation step were repeated many times then we would expect that the population parameters for coverage would be approximately recovered. For the particular example here, the posterior and true means line up quite well, but you should not expect this to be the case for other random data sets.

dfig <- melt(coverage, measure.vars = names(coverage))
dfig[, variable:= as.character(variable)]
dfig[, variable := as.numeric(substr(variable, nchar(variable), nchar(variable)))]

dtmp <- data.table(
  variable = 1:2,
  coverage = c(
    sum(p_pathogen * p_susceptible1),
    sum(p_pathogen * p_susceptible2)
  )
)

## Warning in p_pathogen * p_susceptible1: longer object length is not a multiple
## of shorter object length

## Warning in p_pathogen * p_susceptible2: longer object length is not a multiple
## of shorter object length

ggplot(data = dfig, aes(x = value)) +
  geom_density() +
  theme_bw() +
  geom_vline(data = dtmp, 
         aes(xintercept = coverage),
         lty = 2) +
  scale_y_continuous("Density") +
  scale_x_continuous("Coverage") +
  facet_wrap(~ paste0("Regimen ", variable), scales = "free")

Figure 4: Figure: posterior density for coverage by regimen

Alternatives

An alternative, probably preferable, approach might investigate the use of a more complex model to induce shrinkage on both the pathogens and treatment regimes.

Summary

A weighted-incidence syndromic combination antiobiogram (WISCA) is an alternative to traditional decision aids that better guides a clinician towards superior treatment strategies based on coverage probabilities for given antibiotic regimens for specific diseases. Bielicki 2019 use a decision tree in conjunction with conjugate priors and then simulates directly from the posteriors to derive a view on coverage by regimen. An alternative strategy might use hierarchical modeling to give shrinkage over both pathogens and regimens. From a predictive standpoint, this would be more robust than the analysis adopted by Bielicki.

References

Bielicki 2019 Evaluation of the Coverage of 3 Antibiotic Regimens for Neonatal Sepsis in the Hospital Setting Across Asian Countries