Setup

This section contains the setup and the various utility functions used throughout.

Libraries used:

library(data.table)
library(ggplot2)
library(parallel)

Various support functions

Introduction

In this post I will:

Here we go.

Exponential distribution

The exponential distribution assumes a fixed hazard \(h(t) = \lambda\), which implies \(H(t) = \lambda t\). Now, since \(S(t) = \exp(-H(t))\) we have \(S(t) = \exp(-\lambda t)\). So, to find the parameters needed to simulate data with a specific median time to event (say 4 days) do:

\[ \begin{align*} S(t) &= \exp(-4\lambda) = 0.5 \\ \lambda &= -\frac{\log(0.5)}{4} \\ &= 0.173 \end{align*} \]

Simulate data

The following function generates time to event data for two groups.

get_data <- function(N = 100, h0 = -log(0.5)/4, hr = 1, admin_censor = 20){
  
  y0 <- rexp(N, h0)
  y1 <- rexp(N, h0*hr)
  
  d <- data.table(
    id = 1:(2*N),
    trt = rep(0:1, each = N),
    treatment = rep(c("Control", "Active"), each = N),
    y = c(y0, y1),
    evt = 1
  )
  d[y > admin_censor, `:=`(evt = 0, y = admin_censor)]
  d
}
d <- get_data(1000)
ggplot(d, aes(x = y, group = treatment, col = treatment)) + 
  geom_density() +
  theme_bw() +
  theme(legend.pos= "bottom")

Exponential-Gamma model

The Gamma distribution is the conjugate prior to the Exponential likelihood. If the prior for the rate parameter is \(\lambda \sim \mathcal{G}(a, b)\), the likelihood is exponential, the number of observations is \(N\), events and censoring are represented as \(\delta = (0, 1)\) and the total observed time is \(\sum_i^N y_i\) then the posterior is

\[ \begin{align*} p(\lambda | y_i) \sim \mathcal{G}(a + \sum_{i=1}^N \delta_i, b + \sum_i^N y_i) \end{align*} \]

So, for example, here is a little simulation based inference and visualisation of the prior and posterior distribution for the rate parameter \(\lambda\):

N <- 100
l_tru <- 0.1
censor <- 20
y <- rexp(N, l_tru)
evt <- as.integer(y < censor)
y[evt == 0] <- censor
a <- 1; b <- 3
l_pri <- rgamma(1e4, a , b )
l_post <- rgamma(1e4, a + sum(evt), b + sum(y))

d <- data.table(
  Prior = l_pri, Posterior = l_post
)
d <- melt(d, measure.vars = names(d))
ggplot(d, aes(x  = value)) +
  geom_density() +
  geom_vline(xintercept = l_tru) +
  theme_bw() + 
  facet_wrap(~variable, scales = "free")

For convenience here is a function to wrap up the posterior computations for two groups (per our experiment setup).

fit <- function(a = 1, b = 3, delta0 = 100, delta1 = 100, y0tot, y1tot){
  l_post0 <- rgamma(1000, a + delta0, b + y0tot)
  l_post1 <- rgamma(1000, a + delta1, b + y1tot)
  hr <- l_post1/l_post0
  hr
}

Power

With a data generation mechanism and the model in place, we can run a formal analysis. For example we could estimate the hazard ratio as follows

d <- get_data(hr = 2)
dfig <- data.table(
  post_hr = fit(
    delta0 = d[evt == 1 & trt == 0, .N],
    delta1 = d[evt == 1 & trt == 1, .N],
    y0tot = d[trt == 0, sum(y)], 
    y1tot=d[trt == 1, sum(y)]
    )
)
ggplot(dfig, aes(x  = post_hr)) +
  geom_density() +
  theme_bw() 

In order to determine the frequentist power we need to construct some form of null and alternative hypothesis. For simplicity, we assume that the null hypothesis is rejected if there is a 0.975 or more probability that the hazard ratio (denoted by \(\gamma\)) exceeds 1, i.e.

\[ \begin{align*} Pr(\gamma > 1) > 0.975 \end{align*} \]

Now, in order to estimate the power, we just repeat this simulation process a lot of times and calculate the proportion for which we reject the null hypothesis

dsim <- data.table(
  hr_sim = c(1.00, 1.14, 1.33, 1.60, 2.00)
)
dsim[, pwr := numeric(.N)]
dsim[, E_E_hr := numeric(.N)]
threshold <- 0.975
for(i in 1:nrow(dsim)){
  res <- do.call(rbind, mclapply(1:1000, FUN = function(j){
    d <- get_data(hr = dsim[i, hr_sim])
    post_hr <- fit(
      delta0 = d[evt == 1 & trt == 0, .N],
      delta1 = d[evt == 1 & trt == 1, .N],
      y0tot = d[trt == 0, sum(y)], 
      y1tot=d[trt == 1, sum(y)])
    win <- mean(post_hr > 1) > threshold
    hr <- mean(post_hr)
    c(win, hr)
    }, mc.cores = 8))
  dsim[i, pwr := mean(res[, 1])]
  dsim[i, E_E_hr := mean(res[, 2])]
}
ggplot(dsim, aes(x = hr_sim, y = pwr))+
  geom_line() +
  geom_hline(yintercept = 0.8, lty = 2) +
  geom_hline(yintercept = 0.05, lty = 2) +
  scale_x_continuous("True HR", breaks = seq(1, 2, by = 0.1))+
  theme_bw()

ggplot(dsim, aes(x = hr_sim, y = E_E_hr))+
  geom_point() +
  geom_line(aes(x = hr_sim, y = hr_sim), inherit.aes = F)+
  scale_x_continuous("True HR", breaks = seq(1, 2, by = 0.1))+
  scale_y_continuous("E[E[hr]]", breaks = seq(1, 2, by = 0.1))+
  theme_bw()

When the hazard ratio is simulated as 1, any instance where we reject the null hypothesis is an incorrect decision. Specifically, in a frequentist framework this is characterised as a type-i error and nominally it is set somewhere around 5%. Also, people tend to aim at studies with a power around 80% so that there is a reasonable chance of detecting an effect should one exist. In the first figure above, I have added the 5% and 80% lines for reference. This suggests that if we assumed that the hazard ratio was likely to be about 1.5 then with a sample size of 100 per arm we would probably have a little under 80% power.

In the second figure above I show the expectation of the expected value for the hazard ratio. This shows that, on average, the simulated treatment effect appears to be recovered.

Here is the results of a second power simulation using the same simulation parameters with the exception that the data is subject to more censoring, i.e. our follow up is shortened (because I have only implemented administrative censoring).

dsim <- data.table(
  hr_sim = c(1.00, 1.14, 1.33, 1.60, 2.00)
)
dsim[, pwr := numeric(.N)]
dsim[, E_E_hr := numeric(.N)]
threshold <- 0.975
for(i in 1:nrow(dsim)){
  res <- do.call(rbind, mclapply(1:1000, FUN = function(j){
    d <- get_data(hr = dsim[i, hr_sim], admin_censor = 5)
    post_hr <- fit(
      delta0 = d[evt == 1 & trt == 0, .N],
      delta1 = d[evt == 1 & trt == 1, .N],
      y0tot = d[trt == 0, sum(y)], 
      y1tot=d[trt == 1, sum(y)])
    win <- mean(post_hr > 1) > threshold
    hr <- mean(post_hr)
    c(win, hr)
    }, mc.cores = 8))
  dsim[i, pwr := mean(res[, 1])]
  dsim[i, E_E_hr := mean(res[, 2])]
}
ggplot(dsim, aes(x = hr_sim, y = pwr))+
  geom_line() +
  geom_hline(yintercept = 0.8, lty = 2) +
  geom_hline(yintercept = 0.05, lty = 2) +
  scale_x_continuous("True HR", breaks = seq(1, 2, by = 0.1))+
  theme_bw()

ggplot(dsim, aes(x = hr_sim, y = E_E_hr))+
  geom_point() +
  geom_line(aes(x = hr_sim, y = hr_sim), inherit.aes = F)+
  scale_x_continuous("True HR", breaks = seq(1, 2, by = 0.1))+
  scale_y_continuous("E[E[hr]]", breaks = seq(1, 2, by = 0.1))+
  theme_bw()

In this scenario it is clear that our power has reduced somewhat.