2024-10-01

Overview

  1. Statistical models
  2. Compare two survival curves
  3. Treatment switching

Statistical models

Overview: Time-to-event analyses

Survival data can be described by four quantities:

  • survival function \(S(t)\)
  • hazard function \(h(t)\)
  • probability density function \(f(t)\)
  • cumulative distribution function \(F(t)\)

Math entities and transformation between them
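
For a continuous event time \(T\), the standard identities linking the survival function \(S(t)\), hazard function \(h(t)\), density \(f(t)\) and cumulative distribution function \(F(t)\) can be written as follows (the symbols are notation assumed here):

\[F(t) = P(T \le t) = 1 - S(t), \qquad f(t) = \frac{dF(t)}{dt} = -\frac{dS(t)}{dt}\]

\[h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \qquad S(t) = \exp\left(-\int_0^t h(u)\,du\right)\]

Knowing any one of the four therefore determines the other three.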

Overview: Time-to-event analyses

  • Main assumption: non-informative censoring

  • Additional simplifying assumptions:

    • No cohort effect on survival
    • Right censoring only
    • Events are independent of each other

Kaplan-Meier curve

  • Non-parametric method that estimates the survival probability directly from the observed survival times.
  • Does not require any assumption about the distribution of survival times.
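
As a minimal sketch of the product-limit calculation behind the Kaplan-Meier curve, the following Python function multiplies the conditional survival fractions at each distinct event time. The function name `kaplan_meier` and the toy data are hypothetical and only illustrate the idea.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_j, S(t_j))] at each distinct event time.

    times  : follow-up time of each subject
    events : 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Conditional probability of surviving past t given at risk at t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]  # remove events and censored subjects
    return curve


# Hypothetical toy data: six subjects, two of them censored
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
for t, s in km:
    print(f"t = {t}: S(t) = {s:.4f}")
```

Censored subjects contribute to the risk set until their censoring time but never trigger a drop in the curve.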

Proportional hazards models

  • Proportional hazards assumption: the hazard may vary over time, but the hazard ratio between any two individuals is constant over time.

  • Assessing PH assumption

    • Kaplan-Meier plot
    • Plot \(\log(-\log(S(t)))\) against (a function of) \(t\)
    • Schoenfeld residuals:
      • Plot residuals against (function of) time
      • Grambsch-Therneau test
    • Time-by-covariate interactions

Handling violation of PH assumption

  • Time-by-covariate interactions
  • Stratified Cox regression
  • Accelerated failure-time model

Cox PH model

Hazard function: \[h(t|X) = h_0(t)e^{\beta X}\]

  • \(h_0\): baseline hazard
  • \(e^{\beta X}\): function of covariates
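
As a short check of the proportional hazards property, the ratio of hazards for two covariate values \(X_1\) and \(X_2\) (symbols introduced here for illustration) under this model is:

\[\frac{h(t|X_1)}{h(t|X_2)} = \frac{h_0(t)e^{\beta X_1}}{h_0(t)e^{\beta X_2}} = e^{\beta (X_1 - X_2)}\]

The baseline hazard cancels, so the ratio does not depend on \(t\).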

Extended Cox model for time-varying covariates

\[h(t|X) = h_0(t)e^{\beta X(t)}\]

Parametric models

  • Parametric proportional hazards model: the baseline hazard function is specified parametrically

  • Accelerated failure-time (AFT) models: \[\log T = Y = \beta X + W \] where \(T\) is the event time, \(X\) is the covariate vector, \(W\) is the random error, and \(\beta\) is the vector of regression parameters (log time ratios, i.e. log acceleration factors)

    • Hazard-based form: \[\lambda(t|X) = \exp(-\beta X) \lambda_0 (\exp(-\beta X)t)\]

    where \(\lambda_0(t)\) is the baseline hazard function corresponding to \(X = 0\)

    • Assumptions:
      • Constant-over-time log time ratios (i.e. log acceleration factors)
      • Linear relationship between each continuous covariate and the log event time.
    • Model fit assessment:
      • Information criteria: AIC, BIC
      • Plot the model-based cumulative hazard against the KM estimated cumulative hazard.

Compare two survival curves

Compare two survival curves

  • Null hypothesis: the risk of mortality after treatment A is the same as the risk of mortality after treatment B at all time points.

  • Quantify the difference in treatment benefits:
    • hazard ratio (HR)
    • median survival time (MT)
    • the (cumulative) survival rate
    • restricted mean survival time (RMST).

Log-rank test

  • Test statistic: \[Z= \frac{\sum_{j=1}^{k}(O_j-E_j)}{\sqrt{\sum_{j=1}^{k}V_j}} =\frac{\sum_{j=1}^{k}(d_{1,j} - d_j\frac{n_{1,j}}{n_j})} {\sqrt{\sum_{j=1}^{k}\frac{n_{0,j}n_{1,j}d_j(n_j-d_j)}{n_j^2(n_j-1)}}}\]

  • Common and classical choice under proportionality assumption

  • Non-proportionality: power loss

Weighted Log-rank test

  • Test statistic: \[Z=\frac{\sum_{j=1}^{k}w_j(O_j-E_j)}{\sqrt{\sum_{j=1}^{k}w_j^2V_j}}= \frac{\sum_{j=1}^{k}w_j(d_{1,j} - d_j\frac{n_{1,j}}{n_j})} {\sqrt{\sum_{j=1}^{k}w_j^2\frac{n_{0,j}n_{1,j}d_j(n_j-d_j)}{n_j^2(n_j-1)}}}\]

Weighted log-rank test statistics take the form of a weighted sum of the differences between the estimated hazard functions at each observed failure time.

  • Tests whether the hazard difference between the treatment group and the control group is zero.
  • In the standard log-rank test: \(w_j = 1\)
  • In the non-PH setting, the relative difference between the two hazard functions is not constant over time \(\rightarrow\) weighting time points differently (rather than equally, as in the log-rank statistic) can improve the efficiency of the test statistic.

Weighted Log-rank test (cont.)

  • The Fleming-Harrington \((\rho, \gamma)\) test uses the weights \(w_j = \hat{S}(t_j-)^\rho (1-\hat{S}(t_j-))^\gamma\), denoted \(FH(\rho, \gamma)\)

    \(\hat{S}(t)\): Kaplan-Meier estimate of the survival curve in the pooled data (both treatment arms)

    \(t_j-\): the time just before \(t_j\)

  • \(FH(0,0)\): the log-rank statistic, most powerful under the proportional hazards assumption

  • \(FH(\rho, 0)\) with \(\rho > 0\): early separation (diminishing effect)

  • \(FH(0, \gamma)\) with \(\gamma > 0\): late separation (delayed effect)

  • \(FH(\rho, \gamma)\) with \(\rho = \gamma > 0\): the biggest separation of two hazard functions occurs in the middle
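
The weighted log-rank statistic with Fleming-Harrington weights can be sketched in a few lines of Python. This is an illustrative implementation of the formulas above, not a validated routine; the function name `fh_weighted_logrank` and the toy data are hypothetical. Setting `rho = gamma = 0` gives the standard log-rank statistic.

```python
import math


def fh_weighted_logrank(times, events, groups, rho=0.0, gamma=0.0):
    """Standardized FH(rho, gamma) weighted log-rank statistic Z.

    groups: 1 = treatment, 0 = control. A negative Z means fewer events
    than expected in the treatment group.
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e == 1})
    s_pooled = 1.0  # pooled Kaplan-Meier estimate just before t_j
    num = 0.0
    var = 0.0
    for tj in event_times:
        n = sum(1 for t, _, _ in data if t >= tj)              # at risk, pooled
        n1 = sum(1 for t, _, g in data if t >= tj and g == 1)  # at risk, treatment
        d = sum(1 for t, e, _ in data if t == tj and e == 1)   # events, pooled
        d1 = sum(1 for t, e, g in data if t == tj and e == 1 and g == 1)
        w = (s_pooled ** rho) * ((1.0 - s_pooled) ** gamma)
        num += w * (d1 - d * n1 / n)                           # w_j (O_j - E_j)
        if n > 1:
            var += w * w * (n - n1) * n1 * d * (n - d) / (n * n * (n - 1))
        s_pooled *= 1.0 - d / n                                # update S after t_j
    return num / math.sqrt(var)


# Hypothetical toy data: control events occur early, treatment events late
times = [2, 3, 5, 6, 7, 9]
events = [1, 1, 1, 1, 0, 1]
groups = [0, 0, 0, 1, 1, 1]

z_logrank = fh_weighted_logrank(times, events, groups)           # FH(0, 0)
z_late = fh_weighted_logrank(times, events, groups, gamma=1.0)   # FH(0, 1)
print(z_logrank, z_late)
```

With \(\gamma > 0\) the first event time always receives weight zero, because the pooled survival estimate just before it equals 1.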

Max-Combo test

  • Test statistic:
    \[Z_{max} = \max_{\rho, \gamma} \{Z_{FH_{(\rho,\gamma)}}\} \] where \(Z_{FH_{(\rho,\gamma)}}\) is the standardized Fleming-Harrington weighted log-rank statistic.

  • Original MaxCombo test is interested in the combination of \(FH(0,0)\), \(FH(0,1)\), \(FH(1,1)\) and \(FH(1,0)\)

  • Modified MaxCombo test:

    • Option 1: \(FH(0,0)\), \(FH(0,0.5)\), \(FH(0.5,0)\), \(FH(0.5,0.5)\): conservative and less sensitive to tail events.

    • Option 2: \(FH(0,0)\), \(FH(0,0.5)\), \(FH(0.5,0.5)\): if delayed effect is only possibility

  • Requires appropriate multiplicity control that accounts for the correlation between the component test statistics

  • The treatment effect estimate is HR obtained from the weighted Cox model corresponding to the weighted log-rank test with the smallest p-value.

Methods based on difference in median survival times

  • Depends only on information at the median survival point (where the survival rate equals 0.5)
  • Not applicable with a high censoring rate or insufficient follow-up time, because the survival rate may never reach 0.5.
  • A test based on a difference in MT is not applicable when the crossing point of the survival curves is located near 0.5.

Methods based on difference in RMST

  • Does not require the proportional hazards assumption.
  • Calculated only up to a specified time point.
  • Crossing survival curves: invalid estimates of treatment differences, reduced power
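
RMST up to a time point \(\tau\) is the area under the survival curve from 0 to \(\tau\). The sketch below integrates a Kaplan-Meier step function in Python; the function name `rmst`, the two step curves and the choice \(\tau = 5\) are hypothetical values used only for illustration.

```python
def rmst(km_curve, tau):
    """Area under a Kaplan-Meier step function from 0 to tau.

    km_curve: list of (event_time, survival) pairs sorted by time,
              with S(t) = 1 before the first event time.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in km_curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)    # last rectangle, truncated at tau
    return area


# Hypothetical step curves for two arms
arm_a = [(1, 0.8), (3, 0.5), (6, 0.2)]
arm_b = [(2, 0.9), (4, 0.6)]
tau = 5

rmst_a = rmst(arm_a, tau)
rmst_b = rmst(arm_b, tau)
rmst_diff = rmst_b - rmst_a
print(rmst_a, rmst_b, rmst_diff)
```

The difference in RMST is the signed area between the two curves up to \(\tau\), so positive and negative areas can cancel when the curves cross.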

Methods based on area between two survival curves (ABS)

  • Reflects the absolute benefit of treatment effects between groups.
  • Robust regardless of non-proportionality or crossing survival curves.

Example

PFS curves from the KEYNOTE-042 trial: compare pembrolizumab with chemotherapy in first-line, metastatic non–small-cell lung cancer.

Mok TSK (2019). Pembrolizumab versus chemotherapy for previously untreated, PD-L1-expressing, locally advanced or metastatic non-small-cell lung cancer (KEYNOTE-042): A randomised, open-label, controlled, phase 3 trial.

Example (cont.)

  • Standard log rank test: not significant, HR = 1.07 (95% CI: 0.94, 1.21)

  • Late-emphasis weighted log-rank test: reject the null hypothesis in favor of pembrolizumab with a one-sided \(P < .0001\)

  • Max-Combo test:

    • reject the null hypothesis in favor of pembrolizumab (one-sided \(P < .0001\))
    • same data, reject the null hypothesis in favor of chemotherapy (one-sided \(P < .0001\))
  • RMST up to 8 months: rejects the null hypothesis in favor of chemotherapy, with a one-sided \(P < .0001\)

Reference: Freidlin B (2019). Methods for accommodating nonproportional hazards in clinical trials: Ready for primary analysis

Recommendations

  • Kaplan-Meier curves comprehensively display treatment effects of study arms
  • HR and log-rank test as primary analysis tools. Under non-proportional hazards, the HR from the primary analysis can still be meaningfully interpreted as an average HR over time unless there is extensive crossing of the survival curves.
  • Methods for accommodating nonproportional hazards can be useful secondary analyses

Treatment switching

Estimand framework

Reference: Manitz J (2022). Estimands for Overall Survival in Clinical Trials with Treatment Switching in Oncology

Intercurrent events

Reference: Jin M (2020). Estimand framework: Delineating what to be estimated with clinical questions of interest in clinical trials

Mix of treatment switching scenarios

Reference: Manitz J (2022). Estimands for Overall Survival in Clinical Trials with Treatment Switching in Oncology

Estimands in trials with treatment switching

Reference: Manitz J (2022). Estimands for Overall Survival in Clinical Trials with Treatment Switching in Oncology

Treatment crossover

Reference: Latimer NR (2016). Treatment switching: Statistical and decision-making challenges and approaches

Impact of treatment crossover

Simple methods

  • Intent to treat:
    • Pros: maintain randomisation balance \(\rightarrow\) reducing the possibility of bias affecting results
    • Cons: underestimate the effect of experimental treatment
  • Per-protocol (excluding switchers or censoring at switch):
    • Cons: prone to selection bias. Randomisation balance is broken if patients with a good or poor prognosis are more likely to switch.
  • Treatment as a time-varying covariate: \[\lambda_i(t)= \lambda_0(t) \exp[\beta X_i(t)] \] where \(\lambda_0(t)\) is the baseline hazard function and \(X_i(t) = 0\) while patient \(i\) receives the control treatment and \(X_i(t) = 1\) while the patient receives the experimental treatment.
    • Cons: prone to selection bias if switching is related to prognosis.

Rank preserving structural failure time model (RPSFT)

Produce counter-factual event times to estimate a causal treatment effect.

Split observed event time for patient \(i\): \(T_i=T_i^{off}+T_i^{on}\), where \(T_i^{off}\) and \(T_i^{on}\) represent the time spent off and on treatment, respectively.

Rank preserving structural failure time model (RPSFT)

Counterfactual event times: \(U_i = T_i^{off}+T_i^{on}\exp(\psi)\), where \(\exp(-\psi)\) is the acceleration factor.

RPSFT

Estimation:

  • g-estimation: grid search over candidate values of \(\psi\) to find the ‘true’ treatment effect \(\psi_0\) such that \(U_i\) is independent of the randomized arm \(R_i\).
  • After identifying \(\psi_0\), calculate survival times adjusted for treatment switching for the control group.

Estimate treatment effect (g-estimation) and untreated (counterfactual) survival times

RPSFT

Re-censoring:

  • Consider a patient whose observed event time \(T_i\) is extended by switching to a superior treatment, so that the patient is censored, whereas the event would have been observed without the switch. Censoring on the \(U_i\) scale then depends on the treatment received, so moving from the \(T_i\) scale to the \(U_i\) scale requires re-censoring some patients.

Let \(C_i\) be the administrative censoring time for participant \(i\) on the \(T_i\) scale. A participant is re-censored (on the \(U_i\) scale) at the minimum possible censoring time:

\[D^*_i(\psi)=\min(C_i,C_i \exp(\psi))\]

If \(D^*_i(\psi)<U_i(\psi)\), then set \(U_i = D^*_i\) and the censoring indicator to 0.

  • For a treatment arm in which switching does not occur, there can be no informative censoring, so re-censoring is not applied

Reference: Allison A (2017). rpsftm: An R Package for Rank Preserving Structural Failure Time Models.

RPSFT

Illustration of calculation of underlying quantities in estimation procedure:

  • Patients A and B have latent survival time \(U_i = 3\) months and administrative censoring time \(C_i = 4\) months. The active treatment is beneficial, with \(\psi = \ln(0.5)\)

  • Patient A is randomized to control and crosses over at time \(t_i = 2\), so is exposed to active treatment for 2 months and has an observed survival time of \(T_i = 4\) months (3 months + 1 month extra)

  • Patient B is randomized to active, so is exposed to active treatment from \(t_i = 0\) to 4 months and would have a survival time of \(T_i = 5\) months (3 months + 2 months extra), which is administratively censored, so we observe \(T_i = 4\).

  • \(D^*_i(\psi)=\min(C_i,C_i \exp(\psi)) = 2\) months, so both patients are re-censored at 2 months
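
The arithmetic of this illustration can be reproduced with a short Python sketch. The helper names `counterfactual_time` and `recensor` are hypothetical; they implement \(U_i = T_i^{off} + T_i^{on}\exp(\psi)\) and the re-censoring rule \(D^*_i(\psi) = \min(C_i, C_i \exp(\psi))\), applied here to both patients as in the illustration.

```python
import math


def counterfactual_time(t_off, t_on, psi):
    """U_i = T_off + T_on * exp(psi)."""
    return t_off + t_on * math.exp(psi)


def recensor(u, event, c, psi):
    """Re-censor at D* = min(C, C * exp(psi)) when D* < U.

    Returns (time on the U scale, event indicator).
    """
    d_star = min(c, c * math.exp(psi))
    if d_star < u:
        return d_star, 0
    return u, event


psi = math.log(0.5)  # beneficial treatment, as in the illustration
c = 4.0              # administrative censoring time in months

# Patient A: control arm, switches at month 2, event observed at month 4
u_a = counterfactual_time(t_off=2.0, t_on=2.0, psi=psi)
# Patient B: active arm from month 0, censored at month 4 (no event observed)
u_b = counterfactual_time(t_off=0.0, t_on=4.0, psi=psi)

result_a = recensor(u_a, event=1, c=c, psi=psi)
result_b = recensor(u_b, event=0, c=c, psi=psi)
print(u_a, result_a)
print(u_b, result_b)
```

Patient A's observed event at 4 months maps back to the latent 3 months, but the event is lost to re-censoring at 2 months; Patient B's censored observation already sits at 2 months on the \(U_i\) scale.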

Reference: Korhonen P (2012) Correcting Overall Survival for the Impact of Crossover Via a Rank-Preserving Structural Failure Time (RPSFT) Model in the RECORD-1 Trial of Everolimus in Metastatic Renal-Cell Carcinoma, Journal of Biopharmaceutical Statistics

RPSFT

Estimate adjusted hazard ratio

RPSFT

Assumptions and Considerations:

  • Randomization assumption
  • “Common treatment effect” assumption:
    • The treatment effect is the same for all participants no matter when treatment is received.

RPSFT

Assumptions and Considerations:

  • “Common treatment effect” assumption (cont.)
    • clinically implausible: treatment switching is often only permitted after disease progression \(\rightarrow\) the capacity for a patient to benefit may be different compared to before progression
    • approximately true? - whether the treatment effect received by switchers can at least be expected to be similar to the effect received by patients initially randomized to the experimental group
    • Extension: RPSFT with a treatment-effect modifier variable, allowing the treatment effect to vary across participants

References:

  • Latimer NR (2014). Adjusting survival time estimates to account for treatment switching in randomized controlled trials - an economic evaluation context: methods, limitations, and recommendations. Med Decis Making
  • Allison A (2017). rpsftm: An R Package for Rank Preserving Structural Failure Time Models.

RPSFT

Assumptions and Considerations (cont.):

  • Counterfactual survival model requires that patients are either ‘on treatment’ or ‘off treatment’ at any 1 time
    • problematic if the control treatment is active
    • additional assumption: the treatment effect is only received while a patient is ‘on treatment’; it disappears as soon as treatment is discontinued \(\rightarrow\) clinical plausibility?
    • if a continuing treatment effect is expected: assume a lagged treatment effect, or define exposure on a ‘treatment group basis’
      • patients randomized in experimental group: always ‘on-treatment’
      • switchers: remain ‘on-treatment’ from time of treatment switching to death

Reference: Latimer NR (2014). Adjusting survival time estimates to account for treatment switching in randomized controlled trials - an economic evaluation context: methods, limitations, and recommendations. Med Decis Making

RPSFT

Example

  • Trial compares two policies (immediate or deferred treatment) of zidovudine treatment in symptom free participants infected with HIV
  • Immediate treatment arm: received treatment at randomisation
  • Deferred arm: received treatment either at onset of AIDS related complex or AIDS (CDC group IV disease) or development of persistently low CD4 count.
  • Analysis endpoint: time from study entry to progression to AIDS, or CDC group IV disease, or death (i.e. progression-free survival)

Reference: Allison A (2017). rpsftm: An R Package for Rank Preserving Structural Failure Time Models.

RPSFT

Example - Compare intent-to-treat vs. RPSFT results | ITT result

Fitting a Weibull AFT model to the full analysis set shows that immediate treatment extends survival time by a factor of 1.158, but the effect is not statistically significant (ETR = 1.158, 95% CI: 0.996, 1.347).

## $HR
##            HR        LB       UB
## imm 0.8043545 0.6437549 1.005019
## 
## $ETR
##         ETR        LB       UB
## imm 1.15844 0.9960953 1.347244

RPSFT

Example - Compare intent-to-treat vs. RPSFT results (cont.)| RPSFT result

Using the log-rank test for g-estimation, the RPSFTM estimates \(\hat{\psi} = -0.181\), so the acceleration factor is \(\exp(-\hat{\psi})= 1.199\). This means that immediate treatment extends survival time by a factor of 1.199 (95% CI: 0.998, 1.419).


RPSFT

Example - Compare intent-to-treat vs. RPSFT results (cont.)

Inverse probability of censoring weighting (IPCW)

  • An extension of the per-protocol censoring approach

  • Treatment switchers: artificially censored at the time of switch.

Censor switchers at the time of switch

IPCW

Estimate weights for non-switchers [1]

  • Computed separately for each arm: for a patient \(i\) who has not switched by time interval \(t\), the unstabilized weight \(w_{i,t}\) and the stabilized weight \(sw_{i,t}\) are given by:

    \[w_{i,t} = \frac{1}{\prod_{k=0}^t P(C(k)_i = 0|C(k-1)_i=0,X_i,Z(k)_i)} \]

    \[sw_{i,t} = \frac{\prod_{k=0}^t P(C(k)_i = 0|C(k-1)_i=0,X_i)}{\prod_{k=0}^t P(C(k)_i = 0|C(k-1)_i=0,X_i,Z(k)_i)} \] where \(X_i\) are baseline covariates, \(Z(k)_i\) are time-dependent prognostic factors.
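
Given per-interval probabilities of remaining unswitched, the cumulative weights follow directly from the two formulas above. The Python sketch below assumes those probabilities have already been estimated (for example from a pooled logistic or Cox model, which is not shown); the function name `ipcw_weights` and the probability values are hypothetical.

```python
def ipcw_weights(p_full, p_base=None):
    """Cumulative IPCW weights for one patient across intervals 0..t.

    p_full[k]: P(not switched in interval k | not switched before, X, Z(k))
    p_base[k]: same probability conditional on baseline covariates X only;
               if given, stabilized weights are returned.
    """
    weights = []
    numer = 1.0
    denom = 1.0
    for k, p in enumerate(p_full):
        denom *= p                 # running product in the denominator
        if p_base is not None:
            numer *= p_base[k]     # running product in the numerator
        weights.append(numer / denom)
    return weights


# Hypothetical estimated probabilities over three intervals
p_full = [0.9, 0.8, 0.5]
p_base = [0.9, 0.85, 0.7]

w = ipcw_weights(p_full)           # unstabilized weights w_{i,t}
sw = ipcw_weights(p_full, p_base)  # stabilized weights sw_{i,t}
print(w)
print(sw)
```

Stabilized weights stay closer to 1 than unstabilized weights, which generally reduces the variance of the weighted analysis.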

IPCW

Estimate weights for non-switchers [2]

Estimate weights for non-censored patients, based on predictors of the probability of switching

IPCW

Estimate adjusted treatment effect

Estimate adjusted treatment effect by incorporating weights within standard survival analysis

IPCW

Assumptions & Limitations

  • “No unmeasured confounders” (exchangeability) assumption: all factors that influence both switching and survival are included in the weight calculation

  • Problematic in relatively small samples: convergence issues, wide confidence intervals.

  • Substantial error when there are very few non-switchers

Reference: Latimer NR (2016). Treatment switching: Statistical and decision-making challenges and approaches

IPCW

Example

  • SHIVA clinical trial, comparing molecularly targeted therapy based on tumour molecular profiling (MTA) versus conventional therapy (CT) for advanced cancer.
  • Switch to the other arm was scheduled to be proposed at disease progression for patients in both arms
  • Endpoint of analysis: overall survival
  • Baseline time-fixed covariates: age at randomization, gender, number of previous lines of treatment, the dichotomized Royal Marsden Hospital score (0 or 1 vs. 2 or 3) and the altered molecular pathway (distinguishing 3 pathways, namely hormone receptors pathway, PI3K/ AKT/mTOR pathway, and RAF/MEK pathway).
  • Time-varying confounders: the Eastern Cooperative Oncology Group (ECOG) performance status, the presence of concomitant treatments and the need of platelet transfusions

Reference: Graffeo N (2019). ipcwswitch: An R package for inverse probability of censoring weighting with an application to switches in clinical trials. Computers in Biology and Medicine

IPCW

Example - Compare intent-to-treat vs. IPCW results

IPCW

Example - Compare intent-to-treat vs. IPCW results (cont.)| ITT result

The ITT analysis gives an estimated hazard ratio of 1.19 (95% CI = [0.84, 1.68]).

## Call:
## coxph(formula = Surv(os_time, status) ~ bras.f + agerand + sex.f + 
##     tt_Lnum + rmh_alea.c + pathway.f, data = SHIdat)
## 
##                              coef  exp(coef)   se(coef)      z        p
## bras.fMTA               0.1729732  1.1888343  0.1768705  0.978   0.3281
## agerand                 0.0004777  1.0004778  0.0074874  0.064   0.9491
## sex.fFemale            -0.3758205  0.6867256  0.1832455 -2.051   0.0403
## tt_Lnum                 0.0140618  1.0141612  0.0357184  0.394   0.6938
## rmh_alea.c              0.9274363  2.5280198  0.1846264  5.023 5.08e-07
## pathway.fHR            -0.0593481  0.9423786  0.2794362 -0.212   0.8318
## pathway.fPI3K/AKT/mTOR -0.0284340  0.9719665  0.2820677 -0.101   0.9197
## 
## Likelihood ratio test=34.66  on 7 df, p=1.295e-05
## n= 197, number of events= 134
##     2.5 %    97.5 % 
## 0.8405603 1.6814104

IPCW

Example - Compare intent-to-treat vs. IPCW results (cont.)| IPCW result

IPCW provides an estimated causal hazard ratio of 1.30 (95%CI = [0.81, 2.08])

## Call:
## coxph(formula = Surv(tstart, tstop, event) ~ bras.f + agerand + 
##     sex.f + tt_Lnum + rmh_alea.c + pathway.f, data = SHIres, 
##     weights = SHIres$weights.trunc, cluster = id)
## 
##                             coef exp(coef)  se(coef) robust se      z        p
## bras.fMTA               0.262762  1.300518  0.240393  0.239143  1.099 0.271869
## agerand                -0.001184  0.998816  0.009506  0.009876 -0.120 0.904541
## sex.fFemale            -0.392972  0.675048  0.231436  0.234035 -1.679 0.093130
## tt_Lnum                 0.006429  1.006449  0.044150  0.040456  0.159 0.873742
## rmh_alea.c              0.809997  2.247902  0.237453  0.237956  3.404 0.000664
## pathway.fHR            -0.046975  0.954111  0.335226  0.336144 -0.140 0.888860
## pathway.fPI3K/AKT/mTOR -0.080538  0.922620  0.334524  0.327150 -0.246 0.805544
## 
## Likelihood ratio test=18.09  on 7 df, p=0.01156
## n= 9745, number of events= 83
##     2.5 %    97.5 % 
## 0.8138748 2.0781404

Comparison of RPSFT and IPCW

Approach
  • RPSFT:
    • Randomization-based approach
    • Counterfactual survival time
  • IPCW:
    • Extended per-protocol censoring approach
    • Adjusts the treatment effect for informative censoring

Key assumption
  • RPSFT: common treatment effect
  • IPCW: exchangeability

Pros
  • RPSFT: less sensitive to small patient numbers
  • IPCW: flexible switching patterns (active control, two-way switching)

Cons
  • RPSFT: problematic if the control treatment is active
  • IPCW:
    • Small RCTs: convergence issues, wide confidence intervals
    • Substantial error in weight estimation
    • Cannot work if there is a perfect predictor of switching

Recommendations

Treatment switching possible analyses

Reference: Roche’s Treatment Switching Guidance document.