Survival - non-proportional hazards

Introduction

In this post I will:

Generate some data that by design will violate the proportional hazards assumption.
Review the data with Kaplan Meier plot.
Assess for proportional hazards using a heuristic test.
Fit a coxPH model and test for proportional hazards.
Extend the coxPH model to account for time-varying effects.
Compute the restricted mean survival time.

Here we go. Generate data for two groups with each arising from an underlying independent piecewise hazard. For the sake of the narrative I will refer to time in days.

set.seed(5)
hz1 = function(t)  {
  hh <- rep(NA, length(t))
  hh[t<20] <- 0.02
  hh[t>=20] <- 0.05
  hh
}
hz2 = function(t)  {
  hh <- rep(NA, length(t))
  hh[t<10] <- 0.08
  hh[t>=10] <- 0.01
  hh
}
N <- 1000
t1 <- rng_surv_time(hz1, N)
t2 <- rng_surv_time(hz2, N)
d <- data.table(
  trt = rep(c("Control", "Active"), each = N),
  y = c(t1, t2)
)
# 10% random censoring and admin censoring at t = 200
d[, evt := rbinom(.N, 1, 0.8)]
d[evt==0, y := runif(.N, 0, 400)]
d[y > 150, `:=`(evt = 0, y = 150)]
d[, trt := factor(trt, levels = c("Control", "Active"))]

A Kaplan Meier survival plot shows the lines crossing, which is an indicator of a violation of proportional hazards. The survival curve is characterising the probability of surviving beyond time $t$ conditional on having survived up to time $t$ (the crosses denote censored times).

lm1 <- survfit(Surv(y, evt)~trt, data = d)
# get surv at all timepoints
dtmp <- summary(lm1)
n0 <- sum(dtmp$strata == "trt=Control")
n1 <- sum(dtmp$strata == "trt=Active")
dout <- data.table(
  trt = c(rep("Control", n0), rep("Active", n1)),
  t = c(
    dtmp$time[dtmp$strata == "trt=Control"],
    dtmp$time[dtmp$strata == "trt=Active"]),
  s = c(
    dtmp$surv[dtmp$strata == "trt=Control"],
    dtmp$surv[dtmp$strata == "trt=Active"]),
  c = c(
    dtmp$n.censor[dtmp$strata == "trt=Control"],
    dtmp$n.censor[dtmp$strata == "trt=Active"])
)

ggplot(dout, aes(x = t, y = s, group = trt, col = trt)) +
  geom_line() +
  geom_point(
    data = dout[c>0], 
    aes(x = t, y = s, group = trt, col = trt), shape = 4) +
  theme_bw() +
  scale_colour_discrete("Treatment") +
  theme(legend.position = "bottom") +
  scale_x_continuous("Days") +
  scale_y_continuous("Survival")

Figure 1: KM survival curves

The median survival time is the time corresponding to a survival probability of 50% appears quite different between the two groups with the treatment group having a lower value. Note - if you compute the median survival time among those that had the event then you will get an incorrect estimate of the median (unless there is no cesoring).

lm1

## Call: survfit(formula = Surv(y, evt) ~ trt, data = d)
## 
##                n events median 0.95LCL 0.95UCL
## trt=Control 1000    810   31.5    29.5    33.4
## trt=Active  1000    688   34.4    25.8    44.6

The cumulative hazard is related to survival by the negative of the natural logarithm:

\[ \begin{equation} \begin{split} H(t) = -\log(S(t)) \end{split} \tag{1} \end{equation} \]

and thus the proportion hazard assumption (the ratio of the hazards remain constant throughout) is equivalent to saying that the difference between the logarithms of the hazards does not change with time. Equivalently, the difference between the logarithm of the cumulative hazard functions for each group is constant, computed as shown in the following code.

dout[, H := log(-log(s))]

A plot of which shows this is not the case.

ggplot(dout, aes(x = t, y = H, group = trt, col = trt)) +
  geom_line() +
  theme_bw() +
  scale_colour_discrete("Treatment") +
  theme(legend.position = "bottom")

Figure 2: Heuristic check of proportional hazard

While the power of the log-rank test decreases when the proportional hazards assumption is violated, it can still be used to compare the KM curves.

survdiff(Surv(y, evt)~trt, data = d)

## Call:
## survdiff(formula = Surv(y, evt) ~ trt, data = d)
## 
##                N Observed Expected (O-E)^2/E (O-E)^2/V
## trt=Control 1000      810      748      5.18      10.6
## trt=Active  1000      688      750      5.17      10.6
## 
##  Chisq= 10.6  on 1 degrees of freedom, p= 0.001

However, fitting a CoxPH model will give us the dubious results suggesting that the treatment arm is associated with decreased rate of the events over the complete time horizon. This is clearly not the case because the data were simulated such that the treatment group had a higher hazard of the event prior to the time reaching 10 days and a lower hazard thereafter. Nevertheless, it may be the case that the treatment is, overall, beneficial as suggested could be the case by the logrank test.

lm2 <- coxph(Surv(y, evt)~trt, data = d, x = TRUE)
summary(lm2)

## Call:
## coxph(formula = Surv(y, evt) ~ trt, data = d, x = TRUE)
## 
##   n= 2000, number of events= 1498 
## 
##               coef exp(coef) se(coef)      z Pr(>|z|)   
## trtActive -0.17110   0.84274  0.05254 -3.256  0.00113 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## trtActive    0.8427      1.187    0.7603    0.9342
## 
## Concordance= 0.486  (se = 0.007 )
## Likelihood ratio test= 10.64  on 1 df,   p=0.001
## Wald test            = 10.6  on 1 df,   p=0.001
## Score (logrank) test = 10.63  on 1 df,   p=0.001

If we formally assess the proportional hazards assumption we see that it has been violated. Additionally this test gives a visualisation that characterises the change in the parameter estimate (hazard ratio) over time. True to the data simulated, the hazard suggests increased rate to between day 10 and 20 and then decreased rate afterwards (although the rate is increasing again towards the end of the available data, presumably due to sparsity).

zp <- cox.zph(lm2)
print(zp)

##        chisq df      p
## trt      223  1 <2e-16
## GLOBAL   223  1 <2e-16

plot(zp)
abline(0,0, col =2)
abline(h = lm2$coef[1], col = 3, lwd = 2, lty = 2)

The standard way to address non-proportional hazards is to introduce time into the covariate for which the violation occurs. Formally, the Cox model is extended to allow the hazard to vary over time as follows:

\[ \begin{equation} \begin{split} \lambda(t) = \lambda_0(t) e^{\beta(t)X} \end{split} \tag{2} \end{equation} \]

In this instance, we only have a treatment term. If we want to estimate the variation in the treatment effect over time, we can split the data into time dependent times. Choice of the time at which the split is made is arbitrary, but for the sake of argument, we will create a split at day 15. The variables for episode and id simply create indicators for the time period the record is associated with and the id for the original single participant record.

d2 <- data.table(
  survSplit(Surv(y, evt)~trt, 
            data = d, cut = c(15), episode= "tgrp", id="id")
  )
head(d2, 10)

##         trt id tstart          y evt tgrp
##  1: Control  1      0  11.170611   1    1
##  2: Control  2      0  15.000000   0    1
##  3: Control  2     15  35.117845   1    2
##  4: Control  3      0  15.000000   0    1
##  5: Control  3     15 150.000000   0    2
##  6: Control  4      0  15.000000   0    1
##  7: Control  4     15  58.045788   0    2
##  8: Control  5      0   5.527041   1    1
##  9: Control  6      0  15.000000   0    1
## 10: Control  6     15 101.627319   0    2

Reviewing the above data snippet, you can get a sense of what has happened. And now the model can be refitted as shown below. In the formula we specify that a treatment term and an interaction between treatment and time period be included.

lm3 <- coxph(Surv(tstart, y, evt)~trt*strata(tgrp), data = d2)
summary(lm3)

## Call:
## coxph(formula = Surv(tstart, y, evt) ~ trt * strata(tgrp), data = d2)
## 
##   n= 3345, number of events= 1498 
## 
##                                  coef exp(coef) se(coef)      z Pr(>|z|)    
## trtActive                     0.94050   2.56126  0.08454  11.12   <2e-16 ***
## trtActive:strata(tgrp)tgrp=2 -1.98360   0.13757  0.11338 -17.50   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                              exp(coef) exp(-coef) lower .95 upper .95
## trtActive                       2.5613     0.3904    2.1702    3.0229
## trtActive:strata(tgrp)tgrp=2    0.1376     7.2689    0.1102    0.1718
## 
## Concordance= 0.625  (se = 0.006 )
## Likelihood ratio test= 343.6  on 2 df,   p=<2e-16
## Wald test            = 314.4  on 2 df,   p=<2e-16
## Score (logrank) test = 340.2  on 2 df,   p=<2e-16

On the log scale, the interpretation is the usual additive fashion such that in the first time period treatment is detrimental with a log hazard ratio equal to 0.94 and in the second period is beneficial with a log hazard ratio equal to 0.94 + -1.98 = -1.04.

Note that strata here is basically the same as interaction in base R. Also note that if you had other terms in the model, you might want to use trt:strata(tgrp) instead of trt*strata(tgrp).

In this instance, it looks like our split has inadequately addressed the violation of proportional hazards.

cox.zph(lm3)

##                  chisq df       p
## trt               3.65  1   0.056
## trt:strata(tgrp) 36.42  1 1.6e-09
## GLOBAL           69.50  2 8.1e-16

Another way to deal with the PH violation is to assume come functional form for the way that the effect changes over time. This is equally as arbitrary as data splitting approach, but is shown below for completeness. See [https://stats.stackexchange.com/questions/534516/survival-curve-from-time-dependent-coefficients-in-the-cox-model] for more of a discussion on tt

d2[trt == "Control", trt2 := 0]
d2[trt == "Active", trt2 := 1]
lm4 <- coxph(Surv(tstart, y, evt)~trt2 + tt(trt2), 
             tt = function(x, t, ...) x * log(t + 20), 
             data = d2)

The fact that both terms are significant supports that there is a time varying effect, which would be written as $(t) = $ 4.54 $+$ -1.26 $\times \log(t + 20)$.

summary(lm4)

## Call:
## coxph(formula = Surv(tstart, y, evt) ~ trt2 + tt(trt2), data = d2, 
##     tt = function(x, t, ...) x * log(t + 20))
## 
##   n= 3345, number of events= 1498 
## 
##             coef exp(coef) se(coef)      z Pr(>|z|)    
## trt2      4.5409   93.7784   0.3854  11.78   <2e-16 ***
## tt(trt2) -1.2642    0.2825   0.1027 -12.31   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##          exp(coef) exp(-coef) lower .95 upper .95
## trt2       93.7784    0.01066    44.064  199.5804
## tt(trt2)    0.2825    3.54032     0.231    0.3454
## 
## Concordance= 0.623  (se = 0.006 )
## Likelihood ratio test= 172.2  on 2 df,   p=<2e-16
## Wald test            = 160.4  on 2 df,   p=<2e-16
## Score (logrank) test = 178.7  on 2 df,   p=<2e-16

d2[, trt2:= NULL]

An alternative to all this carry on is to simply use the restricted mean survival time (RMST) as the basis for comparison across groups. This is defined as the area under the survival curve up to a specified time and is interpreted as when we follow up patients for (specified time $X$), patients will survive for $\mu$ on average where $\mu$ is the RMST. Of course, if there was no censoring, then the simple mean survival time could be used (which integrates the survival function between 0 and infinity).

One of the benefits of the RMST is that it can be estimated even under heavy censoring. From the previous survfit we have

print(lm1, print.rmean = T)

## Call: survfit(formula = Surv(y, evt) ~ trt, data = d)
## 
##                n events rmean* se(rmean) median 0.95LCL 0.95UCL
## trt=Control 1000    810   50.7      1.57   31.5    29.5    33.4
## trt=Active  1000    688   64.0      2.04   34.4    25.8    44.6
##     * restricted mean with upper limit =  150

The survRM2 package contains a routine to compare the restricted means. Here, we set the time window of interest up to 150 days, which was the time of administrative censoring. The results (use plot to visualise if you want) all suggest an increase in survival over the 150 day period in the treatment group.

out <- rmst2(d$y, d$evt, as.numeric(d$trt == "Active"), tau = 150)
print(out)

## 
## The truncation time: tau = 150  was specified. 
## 
## Restricted Mean Survival Time (RMST) by arm 
##                Est.    se lower .95 upper .95
## RMST (arm=1) 63.987 2.038    59.992    67.982
## RMST (arm=0) 50.738 1.573    47.655    53.820
## 
## 
## Restricted Mean Time Lost (RMTL) by arm 
##                Est.    se lower .95 upper .95
## RMTL (arm=1) 86.013 2.038    82.018    90.008
## RMTL (arm=0) 99.262 1.573    96.180   102.345
## 
## 
## Between-group contrast 
##                        Est. lower .95 upper .95 p
## RMST (arm=1)-(arm=0) 13.250     8.203    18.296 0
## RMST (arm=1)/(arm=0)  1.261     1.156     1.376 0
## RMTL (arm=1)/(arm=0)  0.867     0.819     0.916 0

Anyway, that is enough for now. The question is, how to do all this based on Bayesian models?

Survival - non-proportional hazards

2022-02-17

Setup

Introduction