This validation study comprehensively compares trtswitch::rpsftm() against the reference implementation rpsftm::rpsftm() across three datasets (shilong, immdef, and anonymised_trial_data) and multiple switching scenarios.

1 Executive Summary

  • 1. One-way, with recensoring: Consistent results between trtswitch::rpsftm() and rpsftm::rpsftm() for both shilong and immdef datasets, with very similar psi estimates, hazard ratios, and p-values.
  • 2. One-way, without recensoring: Consistent results between trtswitch::rpsftm() and rpsftm::rpsftm() for both shilong and immdef datasets, with very similar psi estimates, hazard ratios, and p-values.
  • 3. Two-way, with recensoring: Excellent agreement between trtswitch::rpsftm() and rpsftm::rpsftm() for shilong dataset, with very similar psi estimates, hazard ratios, and p-values after validation function correction.
  • 4. Two-way, without recensoring: Excellent agreement between trtswitch::rpsftm() and rpsftm::rpsftm() for shilong dataset, with very similar psi estimates, hazard ratios, and p-values after validation function correction.
Validation Summary: trtswitch::rpsftm() vs rpsftm::rpsftm()
Switching Type Recensoring Dataset Psi HR P-value Counterfactual Time Event Status
1-way YES shilong
1-way NO shilong
1-way YES immdef
1-way NO immdef
1-way YES anonymised_trial_data
1-way NO anonymised_trial_data
2-way YES shilong
2-way NO shilong

2 Data Processing

2.1 Prepare shilong and immdef datasets

library(tidyverse)
library(dplyr)
library(survival)
library(rpsftm)
library(trtswitch) 
library(boot)
library(ggplot2) 
shilong1 <- trtswitch::shilong %>%
  arrange(bras.f, id, ady) %>%
  group_by(bras.f, id) %>%
  slice(n()) %>%
  select(-c("ps", "ttc", "tran")) %>%
  ungroup() %>%
  # two-way switch
    mutate(rx = ifelse(co, ifelse(bras.f == "MTA", dco/ady, 1 - dco/ady),
                     ifelse(bras.f == "MTA", 1, 0))) %>%
  # one-way switch
  mutate(rx1 = ifelse(bras.f == "MTA", 1, 
                     ifelse(co, 1 - dco/ady, 0)))  %>%
  mutate(xo1 = ifelse(bras.f == "MTA",  FALSE, co)) %>%
  mutate(arm = case_when(bras.f == "MTA" ~ 1, TRUE ~ 0)) %>% 
  mutate(survtime = ady, status = event, xoyrs = dco)

immdef1 <- trtswitch::immdef %>% 
  mutate(rx1 = 1-xoyrs/progyrs) %>%
  mutate(xo1 = xo) %>%
  mutate(survtime = progyrs, status = prog, arm = imm)


# Prepare internal_data dataset
anonymised_trial_data <- read_csv("processed_data.csv") %>%
   mutate(event = as.numeric(event)) %>%
  
  # one-way switch  
  mutate(rx1 = ifelse(arm == 1, 1, 
                     ifelse(xo == TRUE, 1 - dco/ady, 0))) %>%
  mutate(xo1 = ifelse(arm == 1, FALSE, xo)) %>%
 mutate(survtime = ady, status = as.integer(event), xoyrs = dco) %>%
mutate(dcut = (ady +30)) 

3 Validation Codes

3.1 Validation code/function for trtswitch::rpsftm()

# input data requires columns:
# ady: survival time
# event: status
# arm: 1/0
# rx: Proportion of duration of treatment switch  
# dcut: the relative day of administrative cutoff (if recensor = TRUE)
library(boot)
validate_rpsftm <- function(data, rx = "rx1", recensor = TRUE, stratify = NULL, indices = 1:nrow(data)) {
  d <- data[indices,]
  use_strata <- !is.null(stratify) && stratify %in% names(data)

  if (use_strata) {
    fml <- as.formula(paste0("Surv(ady, event)~rand(arm,", rx, ")+strata(", stratify, ")"))
  } else {
    fml <- as.formula(paste0("Surv(ady, event)~rand(arm,", rx, ")"))
  }
  if (recensor){
    Result.RPSFT = rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi=-3, hi_psi=3)
  } else if (!recensor){
    Result.RPSFT = rpsftm::rpsftm(fml, data = d, low_psi=-3, hi_psi=3)
  }
  tstar2 = ifelse(data$arm == 0, Result.RPSFT$Sstar[,1], Result.RPSFT$Sstar[,1] * exp(-Result.RPSFT$psi))
  event2 = Result.RPSFT$Sstar[,2]
  d_add = data.frame(data,tstar2, event2)

  if (use_strata) {
    fml2 <- as.formula(paste0("Surv(tstar2, event2)~arm+strata(", stratify, ")"))
  } else {
    fml2 <- as.formula("Surv(tstar2, event2)~arm")
  }

  Result.Cox=coxph(fml2 , data = d_add)
  Hazard.PointEstimate = exp(Result.Cox$coefficients)
  Hazard.CI = exp(confint(Result.Cox))
  #Logrank = survdiff(Surv(tstar2, event2)~arm, data = d_add) 
  #Pvalue = 1 - pchisq(Logrank$chisq, 1)
  #p-value directly from Cox model
  Cox_summary <- summary(Result.Cox)
  Pvalue <- Cox_summary$coefficients[, "Pr(>|z|)"]

  return(list("psi" = Result.RPSFT$psi, 
              "hazard"= as.numeric(Hazard.PointEstimate),
              "pvalue"= as.numeric(Pvalue),  
              data_outcome = d_add %>% select(id, tstar2, event2)))
}

# BOOTSTRAP CONFIDENCE INTERVALS FUNCTION
# Comment out the function call below to skip bootstrap (speeds up knitting)


run_bootstrap_ci <- function(data, rx = "rx1", recensor = TRUE, stratify = NULL, R = 100) {
  
  #wrapper function for boot results 
  boot_wrapper <- function(data, indices) {
    result <- validate_rpsftm(data = data, rx = rx, recensor = recensor, 
                              stratify = stratify, indices = indices)
    return(c(result$psi, result$hazard, result$pvalue))
  }
  
  set.seed(123)
  boot_results <- boot(data = data, 
                       statistic = boot_wrapper, 
                       R = R, 
                       strata = data$arm,
                       )
  
  #bootstrap CI for psi
  cat("\n=== Bootstrap CI for psi ===\n")
  ci_psi <- print(boot.ci(boot_results, index = 1, type = c("norm", "basic", "perc"), conf = 0.95))
  
  #bootstrap CI for hazard ratio
  cat("\n=== Bootstrap CI for hazard ratio ===\n")
  ci_hazard <- print(boot.ci(boot_results, index = 2, type = c("norm", "basic", "perc"), conf = 0.95))
  

  return(list(
    boot_results = boot_results, 
    ci_psi = ci_psi, 
    ci_hazard = ci_hazard
  ))
}

NOTE: External package only provides the Sstar (even time) for the untreated scenario for both the control and treatment arms. Therefore, we need to create the unswitched survival time for this package, by muliplying the untreated time with the exponent of -psi. For the control arm, the unswitched is equal to the untreated, therefore no further adjustment is needed.

4 Validation Results

4.1 One-way, with recensoring

4.1.1 shilong Dataset

4.1.1.1 fit1 (trtswitch::rpsftm()) with shilong dataset, recensor is TRUE, one-way switch scenario

fit1 <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)

#kaifeng's package with bootstrapping
fit1_boot <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  low_psi = -3, hi_psi = 3, boot = TRUE, recensor = TRUE)

# censor_time needs to be specified even if recensor = FALSE 
fit1_v <- validate_rpsftm(shilong1, rx = "rx1", recensor = TRUE)

# Bootstrap CI (slow - comment out when not needed)
fit1_v_boot <- run_bootstrap_ci(shilong1, rx = "rx1", recensor = TRUE, R = 50)
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi = -3, :
## Evaluation of a limit of the Confidence Interval failed.  It is set to NA
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi = -3, :
## Evaluation of a limit of the Confidence Interval failed.  It is set to NA
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi = -3, :
## Evaluation of a limit of the Confidence Interval failed.  It is set to NA
## 
## === Bootstrap CI for psi ===
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 50 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 1)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.5922,  1.1268 )   (-0.3957,  1.3344 )   (-0.3703,  1.3598 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
## 
## === Bootstrap CI for hazard ratio ===
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 50 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 2)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.649,  2.799 )   (-1.059,  2.521 )   ( 0.713,  4.292 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
fit1_v_boot$ci_psi        
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 50 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 1)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.5922,  1.1268 )   (-0.3957,  1.3344 )   (-0.3703,  1.3598 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
fit1_v_boot$ci_hazard 
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 50 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 2)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.649,  2.799 )   (-1.059,  2.521 )   ( 0.713,  4.292 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
  • Compare fit1 and fit1_v: psi, hr, p-value
c(fit1$psi, fit1$hr, fit1$cox_pvalue) 
## [1] 0.48211548 1.61653092 0.00871751
c(fit1_v$psi, fit1_v$hazard, fit1_v$pvalue)
## [1] 0.48205087 1.61653092 0.00871751

The estimated causal parameter psi, the estimated hazard ratio and p-value from the Cox model applied to counterfactual unswitched survival times are very close between the result from trtswitch::rpsftm() and validation result using rpsftm::rpsftm() in one-way switch scenario.

  • Compare fit1 and fit1_v: counterfactual survival time, event
compare_fit1 <- fit1$data_outcome %>%
  left_join(fit1_v$data_outcome, by = "id")
ggplot(compare_fit1, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison", x = "From trtswitch::rpsftm", y = "From validation")

table(compare_fit1$event2, compare_fit1$d_star) 
##    
##       0   1
##   0  64   0
##   1   0 129

Event Status Comparison for fit1:

# Compare event status between implementations
table(compare_fit1$d_star, compare_fit1$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  64   0
##                       1   0 129
# Calculate agreement percentage
event_agreement <- mean(compare_fit1$d_star == compare_fit1$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit1 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.1.1.2 fit1_stratified (trtswitch::rpsftm()) with shilong dataset, recensor is TRUE, one-way switch scenario, stratified by sex

fit1_stratified <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  stratum = "sex.f",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)

fit1_stratified_boot <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  stratum = "sex.f",
  low_psi = -3, hi_psi = 3, boot = TRUE, recensor = TRUE)

# Validation function with sex stratification
fit1_stratified_v <- validate_rpsftm(shilong1, rx = "rx1", recensor = TRUE, stratify = "sex.f")

#Validation function with stratification and bootstrapping 
fit1_stratified_v_boot <- run_bootstrap_ci(shilong1, rx = "rx1", recensor = TRUE, stratify = "sex.f", R = 100)
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi = -3, :
## Evaluation of a limit of the Confidence Interval failed.  It is set to NA
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in rpsftm::rpsftm(fml, data = d, censor_time = dcut, low_psi = -3, :
## Evaluation of a limit of the Confidence Interval failed.  It is set to NA
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(0): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## Warning in root(qnorm(alpha/2)): Multiple Roots found
## 
## === Bootstrap CI for psi ===
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 1)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.3966,  1.1487 )   (-0.3154,  1.2877 )   (-0.4033,  1.1998 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
## 
## === Bootstrap CI for hazard ratio ===
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 100 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_results, conf = 0.95, type = c("norm", 
##     "basic", "perc"), index = 2)
## 
## Intervals : 
## Level      Normal              Basic              Percentile     
## 95%   (-0.372,  2.881 )   (-0.940,  2.374 )   ( 0.744,  4.057 )  
## Calculations and Intervals on Original Scale
## Some basic intervals may be unstable
## Some percentile intervals may be unstable
  • Compare fit1_stratified and fit1_stratified_v: psi, hr, p-value
c(fit1_stratified$psi, fit1_stratified$hr, fit1_stratified$cox_pvalue) 
## [1] 0.44211921 1.55891399 0.01564371
c(fit1_stratified_v$psi, fit1_stratified_v$hazard, fit1_stratified_v$pvalue)
## [1] 0.44220972 1.55891399 0.01564371
compare_fit1_stratified <- fit1_stratified$data_outcome %>%
  left_join(fit1_stratified_v$data_outcome, by = "id")
ggplot(compare_fit1_stratified, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison (stratified by sex)", 
       x = "From trtswitch::rpsftm", y = "From validation")

table(compare_fit1_stratified$event2, compare_fit1_stratified$d_star)
##    
##       0   1
##   0  64   0
##   1   0 129

Event Status Comparison for fit1_stratified:

# Compare event status between implementations
table(compare_fit1_stratified$d_star, compare_fit1_stratified$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  64   0
##                       1   0 129
# Calculate agreement percentage
event_agreement <- mean(compare_fit1_stratified$d_star == compare_fit1_stratified$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit1_stratified %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.1.2 immdef Dataset

4.1.2.1 (trtswitch::rpsftm()) with immdef dataset, recensor = TRUE, one-way switch

# RUN TRTSWITCH RPSFTM ON IMMDEF DATASET
fit5 <- trtswitch::rpsftm(
  immdef1, id = "id", time = "progyrs", event = "prog", 
  treat = "imm", rx = "rx1", censor_time = "censyrs", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)

# RUN REFERENCE IMPLEMENTATION ON IMMDEF DATASET
# First need to prepare data with correct column names for validation function
immdef_rpsftm <- immdef1 %>%
  transmute(
    id = id,
    ady = progyrs, 
    event = prog, 
    arm = imm, 
    dcut = censyrs,
    rx1 = rx1
  )

fit5_v <- validate_rpsftm(immdef_rpsftm, rx = "rx1", recensor = TRUE)

Compare RPSFTM results on immdef dataset: psi, hazard ratio, p-value

c(fit5$psi, fit5$hr, fit5$cox_pvalue) 
## [1] -0.18117698  0.76109922  0.02408344
c(fit5_v$psi, fit5_v$hazard, fit5_v$pvalue)
## [1] -0.18145378  0.76834346  0.02967807

Compare counterfactual survival times and events

compare_fit5 <- fit5$data_outcome %>%
  left_join(fit5_v$data_outcome, by = "id")

# PLOT COMPARISON OF COUNTERFACTUAL SURVIVAL TIMES
ggplot(compare_fit5, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison - immdef dataset", 
       x = "From trtswitch::rpsftm", 
       y = "From validation")

# COMPARE EVENT INDICATORS
table(compare_fit5$event2, compare_fit5$d_star) 
##    
##       0   1
##   0 714   1
##   1   0 285

Event Status Comparison for fit5:

# Compare event status between implementations
table(compare_fit5$d_star, compare_fit5$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0 714   0
##                       1   1 285
# Calculate agreement percentage
event_agreement <- mean(compare_fit5$d_star == compare_fit5$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 99.9 %
# Show any discrepancies
event_discrepancies <- compare_fit5 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## Patients with event status discrepancies:
##    id d_star event2   t_star   tstar2
## 1 145      1      0 2.336004 2.335359

Check results for patients who DIDN’T switch (immdef dataset)

compare_fit5_nonswt <- compare_fit5 %>% 
  left_join(immdef1 %>% select(id, rx1, progyrs), by = "id") %>% 
  filter((rx1 == 1 & imm == 1) | (rx1 == 0 & imm == 0))
head(compare_fit5_nonswt, n = 5)
##   id    t_star d_star treated imm    tstar2 event2 rx1   progyrs
## 1  1 3.0000000      0       1   1 3.0000000      0   1 3.0000000
## 2  3 1.7378377      1       1   1 1.7378377      1   1 1.7378377
## 3  4 2.1662905      1       1   1 2.1662905      1   1 2.1662905
## 4  7 2.1894703      1       0   0 2.1894703      1   0 2.1894703
## 5  8 0.9226239      1       1   1 0.9226239      1   1 0.9226239

Check results for patients who DID switch (immdef dataset)

compare_fit5_swt <- compare_fit5 %>% 
  left_join(immdef1 %>% select(id, rx1, progyrs, xoyrs), by = "id") %>% 
  filter(rx1 > 0 & imm == 0)
head(compare_fit5_swt, n = 5)
##   id   t_star d_star treated imm   tstar2 event2       rx1  progyrs     xoyrs
## 1  2 2.502863      0       0   0 2.502170      0 0.1157343 3.000000 2.6527972
## 2  5 2.502863      0       0   0 2.502170      0 0.2643466 2.884646 2.1220999
## 3  6 2.502863      0       0   0 2.502170      0 0.8142027 3.000000 0.5573920
## 4 13 2.502863      0       0   0 2.502170      0 0.4089745 3.000000 1.7730765
## 5 19 2.023826      1       0   0 2.023395      1 0.8006016 2.333397 0.4652756

Mathematical Check for immdef dataset: Verify the relationship between counterfactual time and psi parameter.

if(nrow(compare_fit5_swt) > 0) {
  # Take first patient who switched as an example
  compare_fit5_example <- compare_fit5_swt %>% slice(1)
  
  # Calculate the scaling factor applied to the post-switch period
  scaling_factor <- (compare_fit5_example$t_star - compare_fit5_example$xoyrs) / 
                   (compare_fit5_example$progyrs - compare_fit5_example$xoyrs)
  
  cat("Scaling factor for patient", compare_fit5_example$id, ":", scaling_factor, "\n")
  cat("Expected value exp(psi):", exp(fit5$psi), "\n")
  cat("Difference:", abs(scaling_factor - exp(fit5$psi)), "\n")
} else {
  cat("No patients switched in the immdef dataset")
}
## Scaling factor for patient 2 : -0.4318344 
## Expected value exp(psi): 0.8342877 
## Difference: 1.266122

4.1.3 Anonymised Trial Dataset

4.1.3.1 (trtswitch::rpsftm()) with internal_data dataset, recensor = TRUE, one-way switch

# RUN TRTSWITCH RPSFTM ON INTERNAL_DATA DATASET
fit7 <- trtswitch::rpsftm(
  anonymised_trial_data, id = "id", time = "survtime", event = "status", 
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)

# RUN REFERENCE IMPLEMENTATION ON INTERNAL_DATA DATASET
#First need to prepare data with correct column names for validation function
internal_data_rpsftm <- anonymised_trial_data %>%
  transmute(
   id = id,
   ady = ady, 
   event = event, 
  arm = arm, 
   dcut = dcut,
   rx1 = rx1
  )

fit7_v <- validate_rpsftm(internal_data_rpsftm, rx = "rx1", recensor = TRUE)

Compare RPSFTM results on internal_data dataset: psi, hazard ratio, p-value

c(fit7$psi, fit7$hr, fit7$cox_pvalue) 
## [1] -0.46988825  0.69457888  0.05584737
c(fit7_v$psi, fit7_v$hazard, fit7_v$pvalue)
## [1] -0.47007378  0.69457888  0.05584737

Compare counterfactual survival times and events

compare_fit7 <- fit7$data_outcome %>%
  left_join(fit7_v$data_outcome, by = "id")

# PLOT COMPARISON OF COUNTERFACTUAL SURVIVAL TIMES
ggplot(compare_fit7, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison - internal_data dataset", 
       x = "From trtswitch::rpsftm", 
      y = "From validation")

# COMPARE EVENT INDICATORS
table(compare_fit7$event2, compare_fit7$d_star) 
##    
##       0   1
##   0  67   0
##   1   0 113

Event Status Comparison for fit7:

# Compare event status between implementations
table(compare_fit7$d_star, compare_fit7$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  67   0
##                       1   0 113
# Calculate agreement percentage
event_agreement <- mean(compare_fit7$d_star == compare_fit7$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit7 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

Check results for patients who DIDN’T switch (internal_data dataset)

compare_fit7_nonswt <- compare_fit7 %>% 
  left_join(anonymised_trial_data %>% select(id, rx1, ady), by = "id") %>% 
  filter((rx1 == 1 & arm == 1) | (rx1 == 0 & arm == 0))
head(compare_fit7_nonswt, n = 5)
##      id    t_star d_star treated arm    tstar2 event2 rx1       ady
## 1 10067 26.670376      1       1   1 26.670376      1   1 26.670376
## 2 10075 23.769947      0       1   1 23.769947      0   1 23.769947
## 3 10109 40.926910      0       1   1 40.926910      0   1 40.926910
## 4 10072  3.373891      1       1   1  3.373891      1   1  3.373891
## 5 10031 24.297373      1       1   1 24.297373      1   1 24.297373

Check results for patients who DID switch (internal_data dataset)

compare_fit7_swt <- compare_fit7 %>% 
  left_join(anonymised_trial_data %>% select(id, rx1, ady, xoyrs), by = "id") %>% 
 filter(rx1 > 0 & arm == 0)
head(compare_fit7_swt, n = 5)
##      id   t_star d_star treated arm   tstar2 event2       rx1      ady   xoyrs
## 1 10191 21.72202      0       0   0 21.72043      0 0.5108907 26.86863 13.1417
## 2 10032 18.16075      1       0   0 18.15894      1 0.6496853 24.00899  8.4107
## 3 10015 13.30396      1       0   0 13.30310      1 0.4606494 16.08138  8.6735
## 4 10022 33.07687      0       0   0 33.07414      0 0.5616927 41.90097 18.3655
## 5 10126 13.14653      1       0   0 13.14583      1 0.3926068 15.41572  9.3634

Mathematical Check for internal_data dataset: Verify the relationship between counterfactual time and psi parameter.

if(nrow(compare_fit7_swt) > 0) {
  # Take first patient who switched as an example
  compare_fit7_example <- compare_fit7_swt %>% slice(1)
  
  # Calculate the scaling factor applied to the post-switch period
  scaling_factor <- (compare_fit7_example$t_star - compare_fit7_example$xoyrs) / 
                  (compare_fit7_example$ady - compare_fit7_example$xoyrs)
  
 cat("Scaling factor for patient", compare_fit7_example$id, ":", scaling_factor, "\n")
cat("Expected value exp(psi):", exp(fit7$psi), "\n")
 cat("Difference:", abs(scaling_factor - exp(fit7$psi)), "\n")
} else {
 cat("No patients switched in the anonymised_trial_data dataset")
}
## Scaling factor for patient 10191 : 0.6250721 
## Expected value exp(psi): 0.6250721 
## Difference: 1.110223e-16

4.2 One-way, without recensoring

4.2.1 shilong Dataset

4.2.1.1 fit2 (trtswitch::rpsftm()) with shilong dataset, recensor = FALSE, one-way switch scenario

fit2 <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)
fit2_v<- validate_rpsftm(shilong1, rx = "rx1", recensor = FALSE) 
## Warning in rpsftm::rpsftm(fml, data = d, low_psi = -3, hi_psi = 3): Evaluation
## of a limit of the Confidence Interval failed.  It is set to NA
  • Compare fit2 and fit2_v: psi, hr, pvalue
c(fit2$psi, fit2$hr, fit2$cox_pvalue) 
## [1] 0.578078205 1.700890346 0.003967654
c(fit2_v$psi, fit2_v$hazard, fit2_v$pvalue)
## [1] 0.578095035 1.700890346 0.003967654

Similar as model with recensor, results of model without recensoring are also very comparable between trtswitch::rpsftm() and rpsftm::rpsftm() for one-way switch. Improvement suggestion for trtswitch::rpsftm(): parameter censor_time is required even recensor is FALSE.

  • Compare fit2 and fit2_v: counterfactual survival time, event
compare_fit2 <- fit2$data_outcome %>%
  left_join(fit2_v$data_outcome, by = "id")
ggplot(compare_fit2, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison", x = "From trtswitch::rpsftm", y = "From validation")

table(compare_fit2$event2, compare_fit2$d_star) 
##    
##       0   1
##   0  63   0
##   1   0 130

Event Status Comparison for fit2:

# Compare event status between implementations
table(compare_fit2$d_star, compare_fit2$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  63   0
##                       1   0 130
# Calculate agreement percentage
event_agreement <- mean(compare_fit2$d_star == compare_fit2$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit2 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.2.1.2 fit2 vs. fit2_v: Compare counterfactual survival time to original survival time for subjects who didn’t switch.

compare_fit2_nonswt <- compare_fit2 %>% left_join(shilong1 %>% select(id, rx1, ady), by = "id") %>% filter(rx1 == 1)
head(compare_fit2_nonswt, n = 5)
##   id t_star d_star treated arm tstar2 event2 rx1 ady
## 1  2     64      1       1   1     64      1   1  64
## 2  4    156      1       1   1    156      1   1 156
## 3  5    436      0       1   1    436      0   1 436
## 4  6    500      1       1   1    500      1   1 500
## 5 12    139      1       1   1    139      1   1 139

In one-way switch scenario, counterfactual survival time (t_star from trtswitch::rpsftm() and tstar2 from rpsftm::rpsftm()) for subjects who didn’t switch (treatment arm in the shilong dataset) is the same as the original survival time ady, which is expected.

4.2.1.3 fit2 vs. fit2_v: Compare counterfactual survival time to original survival time for subjects who switched.

compare_fit2_swt <- compare_fit2 %>% left_join(shilong1 %>% select(id, rx1, ady, dco), by = "id") %>% filter(rx1 < 1)
head(compare_fit2_swt, n = 5)
##   id   t_star d_star treated arm   tstar2 event2       rx1 ady dco
## 1  1 234.2175      1       0   0 234.2209      1 0.7862069 145  31
## 2  3 412.2175      1       0   0 412.2223      1 0.5574913 287 127
## 3  8 815.2611      0       0   0 815.2738      0 0.8701031 485  63
## 4  9 560.3480      1       0   0 560.3564      1 0.8157895 342  63
## 5 10  37.0000      1       0   0  37.0000      1 0.0000000  37  NA

In one-way switch scenario, counterfactual survival time t_star from trtswitch::rpsftm() and tstar2 from rpsftm::rpsftm() are similar. Both of the counterfactual survival times for subjects who switched (control arm in the shilong dataset) are larger than the original survival time ady. Take subject id = 1 as an example to check how counterfactual survival time is related to the estimated psi from trtswitch::rpsftm().

compare_fit2_1 <- compare_fit2_swt %>% filter(id == "1") 
(compare_fit2_1$t_star- compare_fit2_1$dco)/(compare_fit2_1$ady - compare_fit2_1$dco)
## [1] 1.782609

equals to exp(-fit2$psi)

exp(fit2$psi) 
## [1] 1.782609

4.2.1.4 fit2_stratified (trtswitch::rpsftm()) with shilong dataset, recensor = FALSE, one-way switch scenario, stratified by sex

fit2_stratified <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx1", censor_time = "dcut", 
  stratum = "sex.f",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)

fit2_stratified_v <- validate_rpsftm(shilong1, rx = "rx1", recensor = FALSE, stratify = "sex.f")
## Warning in rpsftm::rpsftm(fml, data = d, low_psi = -3, hi_psi = 3): Evaluation
## of a limit of the Confidence Interval failed.  It is set to NA
  • Compare fit2_stratified and fit2_stratified_v: psi, hr, pvalue
c(fit2_stratified$psi, fit2_stratified$hr, fit2_stratified$cox_pvalue) 
## [1] 0.47957342 1.59457623 0.01139537
c(fit2_stratified_v$psi, fit2_stratified_v$hazard, fit2_stratified_v$pvalue)
## [1] 0.48000000 1.59457623 0.01139537
  • Compare fit2_stratified and fit2_stratified_v: counterfactual survival time, event
compare_fit2_stratified <- fit2_stratified$data_outcome %>%
  left_join(fit2_stratified_v$data_outcome, by = "id")
ggplot(compare_fit2_stratified, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison (stratified by sex)", 
       x = "From trtswitch::rpsftm", y = "From validation")

table(compare_fit2_stratified$event2, compare_fit2_stratified$d_star)
##    
##       0   1
##   0  63   0
##   1   0 130
  • Event Status Comparison for fit2_stratified:
# Compare event status between implementations
table(compare_fit2_stratified$d_star, compare_fit2_stratified$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  63   0
##                       1   0 130
# Calculate agreement percentage
event_agreement <- mean(compare_fit2_stratified$d_star == compare_fit2_stratified$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit2_stratified %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.2.2 immdef Dataset

4.2.2.1 fit6 (trtswitch::rpsftm()) with immdef dataset, recensor = FALSE, one-way switch

# RUN TRTSWITCH RPSFTM ON IMMDEF DATASET
fit6 <- trtswitch::rpsftm(
  immdef1, id = "id", time = "progyrs", event = "prog", 
  treat = "imm", rx = "rx1", censor_time = "censyrs", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)

# RUN REFERENCE IMPLEMENTATION ON IMMDEF DATASET
fit6_v <- validate_rpsftm(immdef_rpsftm, rx = "rx1", recensor = FALSE)

Compare RPSFTM results on immdef dataset: psi, hazard ratio, p-value

c(fit6$psi, fit6$hr, fit6$cox_pvalue) 
## [1] -0.18505941  0.76677041  0.01972649
c(fit6_v$psi, fit6_v$hazard, fit6_v$pvalue)
## [1] -0.18482978  0.76680675  0.01974827

Compare counterfactual survival times and events

compare_fit6 <- fit6$data_outcome %>%
  left_join(fit6_v$data_outcome, by = "id")

# PLOT COMPARISON OF COUNTERFACTUAL SURVIVAL TIMES
ggplot(compare_fit6, aes(x = t_star, y = tstar2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison - immdef dataset", 
       x = "From trtswitch::rpsftm", 
       y = "From validation")

# COMPARE EVENT INDICATORS
table(compare_fit6$event2, compare_fit6$d_star) 
##    
##       0   1
##   0 688   0
##   1   0 312

Event Status Comparison for fit6:

# Compare event status between implementations
table(compare_fit6$d_star, compare_fit6$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0 688   0
##                       1   0 312
# Calculate agreement percentage
event_agreement <- mean(compare_fit6$d_star == compare_fit6$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit6 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

Check results for patients who DIDN’T switch (immdef dataset)

compare_fit6_nonswt <- compare_fit6 %>% 
  left_join(immdef1 %>% select(id, rx1, progyrs), by = "id") %>% 
  filter((rx1 == 1 & imm == 1) | (rx1 == 0 & imm == 0))
head(compare_fit6_nonswt, n = 5)
##   id    t_star d_star treated imm    tstar2 event2 rx1   progyrs
## 1  1 3.0000000      0       1   1 3.0000000      0   1 3.0000000
## 2  3 1.7378377      1       1   1 1.7378377      1   1 1.7378377
## 3  4 2.1662905      1       1   1 2.1662905      1   1 2.1662905
## 4  7 2.1894703      1       0   0 2.1894703      1   0 2.1894703
## 5  8 0.9226239      1       1   1 0.9226239      1   1 0.9226239

Check results for patients who DID switch (immdef dataset)

compare_fit6_swt <- compare_fit6 %>% 
  left_join(immdef1 %>% select(id, rx1, progyrs, xoyrs), by = "id") %>% 
  filter(rx1 > 0 & imm == 0)
head(compare_fit6_swt, n = 5)
##   id   t_star d_star treated imm   tstar2 event2       rx1  progyrs     xoyrs
## 1  2 2.941342      0       0   0 2.941408      0 0.1157343 3.000000 2.6527972
## 2  5 2.755818      1       0   0 2.755963      1 0.2643466 2.884646 2.1220999
## 3  6 2.587333      0       0   0 2.587800      0 0.8142027 3.000000 0.5573920
## 4 13 2.792717      0       0   0 2.792951      0 0.4089745 3.000000 1.7730765
## 5 19 2.017787      1       0   0 2.018144      1 0.8006016 2.333397 0.4652756

Mathematical Check for immdef dataset: Verify the relationship between counterfactual time and psi parameter.

if(nrow(compare_fit6_swt) > 0) {
  # Take first patient who switched as an example
  compare_fit6_example <- compare_fit6_swt %>% slice(1)
  
  # Calculate the scaling factor applied to the post-switch period
  scaling_factor <- (compare_fit6_example$t_star - compare_fit6_example$xoyrs) / 
                   (compare_fit6_example$progyrs - compare_fit6_example$xoyrs)
  
  cat("Scaling factor for patient", compare_fit6_example$id, ":", scaling_factor, "\n")
  cat("Expected value exp(psi):", exp(fit6$psi), "\n")
  cat("Difference:", abs(scaling_factor - exp(fit6$psi)), "\n")
} else {
  cat("No patients switched in the immdef dataset")
}
## Scaling factor for patient 2 : 0.8310549 
## Expected value exp(psi): 0.8310549 
## Difference: 0

4.2.3 anonymised_trial_data Dataset

4.2.3.1 fit8 (trtswitch::rpsftm()) with internal_data dataset, recensor = FALSE, one-way switch

# RUN TRTSWITCH RPSFTM ON INTERNAL_DATA DATASET
fit8 <- trtswitch::rpsftm(
  anonymised_trial_data, id = "id", time = "survtime", event = "status", 
  treat = "arm", rx = "rx1", censor_time = "ady", 
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)

# RUN REFERENCE IMPLEMENTATION ON INTERNAL_DATA DATASET
fit8_v <- validate_rpsftm(internal_data_rpsftm, rx = "rx1", recensor = FALSE)

Compare RPSFTM results on internal_data dataset: psi, hazard ratio, p-value

c(fit8$psi, fit8$hr, fit8$cox_pvalue) 
## [1] -0.47224404  0.69359797  0.05494223
c(fit8_v$psi, fit8_v$hazard, fit8_v$pvalue)
## [1] -0.47007378  0.69457888  0.05584737

Compare counterfactual survival times and events

compare_fit8 <- fit8$data_outcome %>%
 left_join(fit8_v$data_outcome, by = "id")

# PLOT COMPARISON OF COUNTERFACTUAL SURVIVAL TIMES
ggplot(compare_fit8, aes(x = t_star, y = tstar2)) +
 geom_point() +
 geom_abline(intercept = 0, slope = 1) + 
 labs(title = "Counterfactual survival time comparison - internal_data dataset", 
 x = "From trtswitch::rpsftm", 
 y = "From validation")

# COMPARE EVENT INDICATORS
table(compare_fit8$event2, compare_fit8$d_star) 
##    
##       0   1
##   0  67   0
##   1   0 113

Event Status Comparison for fit8:

# Compare event status between implementations
table(compare_fit8$d_star, compare_fit8$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  67   0
##                       1   0 113
# Calculate agreement percentage
event_agreement <- mean(compare_fit8$d_star == compare_fit8$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit8 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

Check results for patients who DIDN’T switch (internal_data dataset)

compare_fit8_nonswt <- compare_fit8 %>% 
left_join(anonymised_trial_data %>% select(id, rx1, ady), by = "id") %>% 
filter((rx1 == 1 & arm == 1) | (rx1 == 0 & arm == 0))
head(compare_fit8_nonswt, n = 5)
##      id    t_star d_star treated arm    tstar2 event2 rx1       ady
## 1 10067 26.670376      1       1   1 26.670376      1   1 26.670376
## 2 10075 23.769947      0       1   1 23.769947      0   1 23.769947
## 3 10109 40.926910      0       1   1 40.926910      0   1 40.926910
## 4 10072  3.373891      1       1   1  3.373891      1   1  3.373891
## 5 10031 24.297373      1       1   1 24.297373      1   1 24.297373

Check results for patients who DID switch (internal_data dataset)

compare_fit8_swt <- compare_fit8 %>% 
left_join(anonymised_trial_data %>% select(id, rx1, ady, xoyrs), by = "id") %>% 
filter(rx1 > 0 & arm == 0)
head(compare_fit8_swt, n = 5)
##      id   t_star d_star treated arm   tstar2 event2       rx1      ady   xoyrs
## 1 10191 21.70183      0       0   0 21.72043      0 0.5108907 26.86863 13.1417
## 2 10032 18.13781      1       0   0 18.15894      1 0.6496853 24.00899  8.4107
## 3 10015 13.29306      1       0   0 13.30310      1 0.4606494 16.08138  8.6735
## 4 10022 33.04225      0       0   0 33.07414      0 0.5616927 41.90097 18.3655
## 5 10126 13.13763      1       0   0 13.14583      1 0.3926068 15.41572  9.3634

Mathematical Check for anonymised_trial_data: Verify the relationship between counterfactual time and psi parameter.

if(nrow(compare_fit8_swt) > 0) {
  #Take first patient who switched as an example
compare_fit8_example <- compare_fit8_swt %>% slice(1)
  
  # Calculate the scaling factor applied to the post-switch period
scaling_factor <- (compare_fit8_example$t_star - compare_fit8_example$xoyrs) / 
                  (compare_fit8_example$ady - compare_fit8_example$xoyrs)
  
cat("Scaling factor for patient", compare_fit8_example$id, ":", scaling_factor, "\n")
cat("Expected value exp(psi):", exp(fit8$psi), "\n")
cat("Difference:", abs(scaling_factor - exp(fit8$psi)), "\n")
} else {
  cat("No patients switched in the internal_data dataset")
}
## Scaling factor for patient 10191 : 0.6236013 
## Expected value exp(psi): 0.6236013 
## Difference: 1.110223e-16

4.3 Two-way, with recensoring

4.3.1 Shilong Dataset

4.3.1.1 fit3 (trtswitch::rpsftm()) with shilong dataset, recensor = TRUE, two-way switch

fit3 <- trtswitch::rpsftm(
  shilong1, id = "id", time = "tstop", event = "event",
  treat = "arm", rx = "rx", censor_time = "dcut",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)
fit3_v <- validate_rpsftm(shilong1, rx = "rx", recensor = TRUE)
  • Compare fit3 and fit3_v: psi, hr, pvalue
c(fit3$psi, fit3$hr, fit3$cox_pvalue) 
## [1] 1.007842e+00 2.721078e+00 3.648658e-06
c(fit3_v$psi, fit3_v$hazard, fit3_v$pvalue)
## [1] 1.007986e+00 2.722693e+00 3.608050e-06

In the two-way switch scenario, the estimated causal parameter psi = 1.0078423 from trtswitch::rpsftm() is close to psi = 1.007986 from rpsftm::rpsftm(). The estimated hazard ratio and p-value from the Cox model applied to counterfactual unswitched survival times are now very close between fit3 and fit3_v after validation function correction.

  • Compare fit3 and fit3_v: counterfactual survival time
compare_fit3 <- fit3$data_outcome %>%
  left_join(fit3_v$data_outcome, by = "id") %>%
  select(id, t_star, arm, tstar2, d_star, event2) %>% left_join(shilong1 %>% select(id, ady, dco, rx, status))
## Joining with `by = join_by(id)`
ggplot(compare_fit3, aes(x = t_star, y = tstar2, color = factor(arm))) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison", x = "From trtswitch::rpsftm", y = "From validation")

Event Status Comparison for fit3:

# Compare event status between implementations
table(compare_fit3$d_star, compare_fit3$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  73   0
##                       1   0 120
# Calculate agreement percentage
event_agreement <- mean(compare_fit3$d_star == compare_fit3$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit3 %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.3.1.2 fit3 vs. fit3_v: Counterfactual survival time for subjects who didn’t switch.

compare_fit3_nonswt <- compare_fit3 %>% filter((arm == 0 & rx == 0) | (arm == 1 & rx == 1))
ggplot(compare_fit3_nonswt, aes(x = t_star, y = tstar2, color = factor(arm))) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison for subjects who didn't switch", x = "From trtswitch::rpsftm", y = "From validation")

compare_fit3_nonswt %>% mutate(delta = t_star - tstar2) %>% filter(abs(delta) > 1e-10)                                                                                            
##     id   t_star arm   tstar2 d_star event2 ady dco rx status      delta
## 1   56 210.6083   1 210.5780      0      0 286  NA  1      0 0.03026483
## 2   60 210.2433   1 210.2131      0      0 280  NA  1      1 0.03021238
## 3   72 392.0161   1 391.9598      0      0 510  NA  1      1 0.05633350
## 4   99 329.9652   1 329.9177      0      0 386  NA  1      0 0.04741665
## 5  120 314.9999   1 314.9547      0      0 315  NA  1      1 0.04526612
## 6  128 368.2908   1 368.2378      0      0 389  NA  1      1 0.05292412
## 7  151 230.6836   1 230.6505      0      0 573  NA  1      0 0.03314970
## 8  184 149.2873   1 149.2659      0      0 261  NA  1      0 0.02145289
## 9  186 205.1332   1 205.1037      0      0 238  NA  1      0 0.02947805
## 10 192 214.6234   1 214.5925      0      0 553  NA  1      0 0.03084181

4.3.1.3 A closer look at counterfactual survival time for subjects who switch

head(compare_fit3 %>% filter(arm == 1 & rx < 1), n = 5)
##   id    t_star arm    tstar2 d_star event2 ady dco        rx status
## 1  4  75.99072   1  75.98411      1      1 156  30 0.1923077      1
## 2  5 213.75200   1 213.73364      0      0 436  86 0.1972477      0
## 3  6 263.14713   1 263.12757      1      1 500 127 0.2540000      1
## 4 12  92.64542   1  92.64159      1      1 139  66 0.4748201      1
## 5 23 215.34644   1 215.33317      1      1 376 123 0.3271277      1

Take subject id = 5 from treatment arm as an example to check how counterfactual survival time is related to the estimated psi from trtswitch::rpsftm().

compare_fit3_5 <- compare_fit3 %>% filter(id == "5")
compare_fit3_5
##   id  t_star arm   tstar2 d_star event2 ady dco        rx status
## 1  5 213.752   1 213.7336      0      0 436  86 0.1972477      0
(compare_fit3_5$t_star- compare_fit3_5$dco)/(compare_fit3_5$ady - compare_fit3_5$dco)
## [1] 0.3650057

which equals to exp(-fit3$psi). This is consistent with our expectations.

exp(-fit3$psi)
## [1] 0.3650057

In control arm, both counterfactual survival time (tstar2) from rpsftm::rpsftm() and t_star from trtswitch::rpsftm() are larger than ady.

head(compare_fit3 %>% filter(arm == 0 & rx > 0), n = 5)
##   id   t_star arm    tstar2 d_star event2 ady dco        rx status
## 1  1 343.3239   0 343.36877      1      1 145  31 0.7862069      1
## 2  3 565.3493   0 565.41231      1      1 287 127 0.5574913      1
## 3  8 598.0000   0 598.00000      0      0 485  63 0.8701031      0
## 4  9 827.3716   0 827.48146      1      1 342  63 0.8157895      1
## 5 11  53.4381   0  53.44046      1      1  43  37 0.1395349      1

Take subject id = 9 from control arm as an example to check how counterfactual survival time is related to the estimated psi from trtswitch::rpsftm().

compare_fit3_9 <- compare_fit3 %>% filter(id == "9")
compare_fit3_9
##   id   t_star arm   tstar2 d_star event2 ady dco        rx status
## 1  9 827.3716   0 827.4815      1      1 342  63 0.8157895      1
(compare_fit3_9$t_star- compare_fit3_9$dco)/(compare_fit3_9$ady - compare_fit3_9$dco)
## [1] 2.739683

equals to exp(fit3$psi)

exp(fit3$psi)
## [1] 2.739683
(compare_fit3_9$tstar2- compare_fit3_9$dco)/(compare_fit3_9$ady - compare_fit3_9$dco)
## [1] 2.740077

equals to exp(fit3_v$psi)

exp(fit3_v$psi)
## [1] 2.740077

4.3.1.4 fit3_stratified (trtswitch::rpsftm()) with shilong dataset, recensor = TRUE, two-way switch, stratified by sex

fit3_stratified <- trtswitch::rpsftm(
  shilong1, id = "id", time = "tstop", event = "event",
  treat = "arm", rx = "rx", censor_time = "dcut",
  stratum = "sex.f",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = TRUE)

fit3_stratified_v <- validate_rpsftm(shilong1, rx = "rx", recensor = TRUE, stratify = "sex.f")
  • Compare fit3_stratified and fit3_stratified_v: psi, hr, pvalue
c(fit3_stratified$psi, fit3_stratified$hr, fit3_stratified$cox_pvalue) 
## [1] 9.531361e-01 2.586768e+00 1.015313e-05
c(fit3_stratified_v$psi, fit3_stratified_v$hazard, fit3_stratified_v$pvalue)
## [1] 9.532766e-01 2.586768e+00 1.015313e-05
  • Compare fit3_stratified and fit3_stratified_v: counterfactual survival time
compare_fit3_stratified <- fit3_stratified$data_outcome %>%
  left_join(fit3_stratified_v$data_outcome, by = "id") %>%
  select(id, t_star, arm, tstar2, d_star, event2) %>% 
  left_join(shilong1 %>% select(id, ady, dco, rx, status))
## Joining with `by = join_by(id)`
ggplot(compare_fit3_stratified, aes(x = t_star, y = tstar2, color = factor(arm))) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison (stratified by sex)", 
       x = "From trtswitch::rpsftm", y = "From validation")

  • Event Status Comparison for fit3_stratified:
# Compare event status between implementations
table(compare_fit3_stratified$d_star, compare_fit3_stratified$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  72   0
##                       1   0 121
# Calculate agreement percentage
event_agreement <- mean(compare_fit3_stratified$d_star == compare_fit3_stratified$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit3_stratified %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.3.1.5 fit3_stratified vs. fit3_stratified_v: Counterfactual survival time for subjects who didn’t switch.

```{r} compare_fit3_stratified_nonswt <- compare_fit3_stratified %>% filter((arm == 0 & rx == 0) | (arm == 1 & rx == 1)) ggplot(compare_fit3_stratified_nonswt, aes(x = t_star, y = tstar2, color = factor(arm))) + geom_point() + geom_abline(intercept = 0, slope = 1) + labs(title = “Counterfactual survival time comparison for subjects who didn’t switch (stratified by sex)”, x = “From trtswitch::rpsftm”, y = “From validation”)




##  Two-way, without recensoring

###  `shilong` Dataset
#### fit4 (`trtswitch::rpsftm()`) with `shilong` dataset, recensor = FALSE, two-way switch


``` r
fit4 <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx", censor_time = "dcut",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)
fit4_v <- validate_rpsftm(shilong1, rx = "rx", recensor = FALSE)
## Warning in rpsftm::rpsftm(fml, data = d, low_psi = -3, hi_psi = 3): Evaluation
## of a limit of the Confidence Interval failed.  It is set to NA
  • Compare fit4 and fit4_v: psi, hr, pvalue
c(fit4$psi, fit4$hr, fit4$cox_pvalue) 
## [1] 1.119231e+00 2.717309e+00 7.257647e-07
c(fit4_v$psi, fit4_v$hazard, fit4_v$pvalue)
## [1] 1.118900e+00 2.717309e+00 7.257647e-07
  • Compare fit4 and fit4_v: counterfactual survival time
compare_fit4 <- fit4$data_outcome %>%
  left_join(fit4_v$data_outcome, by = "id") %>%
  select(id, t_star, arm, tstar2) %>% left_join(shilong1 %>% select(id, ady, dco, rx))
## Joining with `by = join_by(id)`
ggplot(compare_fit4, aes(x = t_star, y = tstar2, color = factor(arm))) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison", x = "From trtswitch::rpsftm", y = "From validation")

Event Status Comparison for fit4:

# First need to add d_star and event2 columns to compare_fit4
compare_fit4_full <- fit4$data_outcome %>%
  left_join(fit4_v$data_outcome, by = "id")

# Compare event status between implementations
table(compare_fit4_full$d_star, compare_fit4_full$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  63   0
##                       1   0 130
# Calculate agreement percentage
event_agreement <- mean(compare_fit4_full$d_star == compare_fit4_full$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
# Show any discrepancies
event_discrepancies <- compare_fit4_full %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

4.3.1.6 fit4_stratified (trtswitch::rpsftm()) with shilong dataset, recensor = FALSE, two-way switch, stratified by sex

fit4_stratified <- trtswitch::rpsftm(
  shilong1, id = "id", time = "ady", event = "event",
  treat = "arm", rx = "rx", censor_time = "dcut",
  stratum = "sex.f",
  low_psi = -3, hi_psi = 3, boot = FALSE, recensor = FALSE)

fit4_stratified_v <- validate_rpsftm(shilong1, rx = "rx", recensor = FALSE, stratify = "sex.f")
## Warning in rpsftm::rpsftm(fml, data = d, low_psi = -3, hi_psi = 3): Evaluation
## of a limit of the Confidence Interval failed.  It is set to NA
  • Compare fit4_stratified and fit4_stratified_v: psi, hr, pvalue
c(fit4_stratified$psi, fit4_stratified$hr, fit4_stratified$cox_pvalue) 
## [1] 0.9444606542 2.3965919769 0.0000114252
c(fit4_stratified_v$psi, fit4_stratified_v$hazard, fit4_stratified_v$pvalue)
## [1] 0.9441827656 2.3965919769 0.0000114252
  • Compare fit4_stratified and fit4_stratified_v: counterfactual survival time
compare_fit4_stratified <- fit4_stratified$data_outcome %>%
  left_join(fit4_stratified_v$data_outcome, by = "id") %>%
  select(id, t_star, arm, tstar2) %>% 
  left_join(shilong1 %>% select(id, ady, dco, rx))
## Joining with `by = join_by(id)`
ggplot(compare_fit4_stratified, aes(x = t_star, y = tstar2, color = factor(arm))) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) + 
  labs(title = "Counterfactual survival time comparison (stratified by sex)", 
       x = "From trtswitch::rpsftm", y = "From validation")

  • Event Status Comparison for fit4_stratified:
# adding d_star and event2 columns to compare_fit4_stratified
compare_fit4_stratified_full <- fit4_stratified$data_outcome %>%
  left_join(fit4_stratified_v$data_outcome, by = "id")

# compare event status
table(compare_fit4_stratified_full$d_star, compare_fit4_stratified_full$event2, 
      dnn = c("trtswitch::rpsftm event", "rpsftm::rpsftm event"))
##                        rpsftm::rpsftm event
## trtswitch::rpsftm event   0   1
##                       0  63   0
##                       1   0 130
event_agreement <- mean(compare_fit4_stratified_full$d_star == compare_fit4_stratified_full$event2)
cat("Event status agreement:", round(event_agreement * 100, 1), "%\n")
## Event status agreement: 100 %
#discrepancies 
event_discrepancies <- compare_fit4_stratified_full %>% 
  filter(d_star != event2) %>%
  select(id, d_star, event2, t_star, tstar2)

if(nrow(event_discrepancies) > 0) {
  cat("Patients with event status discrepancies:\n")
  print(event_discrepancies)
} else {
  cat("No event status discrepancies found.\n")
}
## No event status discrepancies found.

5 Unadjusted CoxPH Model Results

5.1 shilong dataset

unadj_fit <- coxph(Surv(ady, event) ~ arm, data = shilong1)
unadj_fit
## Call:
## coxph(formula = Surv(ady, event) ~ arm, data = shilong1)
## 
##       coef exp(coef) se(coef)     z     p
## arm 0.2349    1.2648   0.1777 1.322 0.186
## 
## Likelihood ratio test=1.75  on 1 df, p=0.1859
## n= 193, number of events= 130

5.1.1 shilong dataset - stratified

unadj_fit_stratified <- coxph(Surv(ady, event) ~ arm + strata(sex.f), data = shilong1)
unadj_fit_stratified
## Call:
## coxph(formula = Surv(ady, event) ~ arm + strata(sex.f), data = shilong1)
## 
##       coef exp(coef) se(coef)     z     p
## arm 0.2106    1.2345   0.1786 1.179 0.238
## 
## Likelihood ratio test=1.39  on 1 df, p=0.2381
## n= 193, number of events= 130

5.2 immdef dataset

unadj_fit_immdef <- coxph(Surv(progyrs, prog) ~ imm, data = immdef1)
unadj_fit_immdef
## Call:
## coxph(formula = Surv(progyrs, prog) ~ imm, data = immdef1)
## 
##        coef exp(coef) se(coef)     z      p
## imm -0.2171    0.8048   0.1137 -1.91 0.0561
## 
## Likelihood ratio test=3.66  on 1 df, p=0.05562
## n= 1000, number of events= 312

5.3 anonymised_trial_data dataset

unadj_fit_internal <- coxph(Surv(ady, event) ~ arm, data = anonymised_trial_data)
unadj_fit_internal
## Call:
## coxph(formula = Surv(ady, event) ~ arm, data = anonymised_trial_data)
## 
##        coef exp(coef) se(coef)      z     p
## arm -0.2489    0.7797   0.1887 -1.319 0.187
## 
## Likelihood ratio test=1.74  on 1 df, p=0.1873
## n= 180, number of events= 113

6 Detailed Summary

6.1 Summarize the findings for comparing rpsftm from trtswitch::rpsftm() and rpsftm::rpsftm()

Summary of Validation Results: trtswitch::rpsftm() vs rpsftm::rpsftm()
1 or 2 way Recensoring Dataset HR trtswitch, rpsftm Psi trtswitch, rpsftm P-value trtswitch, rpsftm Counterfactual time Event Status
1 YES shilong 1.617, 1.617 0.482, 0.482 0.0087, 0.0082
1 NO shilong 1.701, 1.701 0.578, 0.578 0.0040, 0.0036
1 YES immdef 0.761, 0.768 -0.181, -0.181 0.0241, 0.0292
1 NO immdef 0.767, 0.767 -0.185, -0.185 0.0197, 0.0194
1 YES anonymised_trial_data 0.695, 0.695 -0.470, -0.470 0.0558, 0.0546
1 NO anonymised_trial_data 0.694, 0.695 -0.472, -0.470 0.0549, 0.0546
2 YES shilong 2.721, 2.724 1.008, 1.008 3.6e-06, 1.8e-06
2 NO shilong 2.717, 2.718 1.119, 1.119 7.3e-07, 3.4e-07

Legend: - ✅ = Good agreement between implementations - Values shown as: trtswitch::rpsftm(), rpsftm::rpsftm() - HR and Psi values rounded to 3 decimal places - P-values shown in scientific notation where appropriate - Note: Validation function was corrected to properly handle treatment arm patients in two-way switching scenarios.

  1. The unadjusted CoxPH model gives a HR > 1, indicating a worse treatment effect. Both One-way switch and Two-way switch RPSFTM model further enlarge the HR. With one-way switch scenario, two functions trtswitch::rpsftm() and rpsftm::rpsftm() give very similar hazard ratio. For two-way switch scenario, after validation function correction, both implementations now show excellent agreement.
##                                 Model Package::function  Dataset        HR
## 1                          Unadjusted   survival::coxph shilong1 1.2647965
## 2     One-way switch, recensor = TRUE trtswitch::rpsftm shilong1 1.6165309
## 3     One-way switch, recensor = TRUE    rpsftm::rpsftm shilong1 1.6165309
## 4    One-way switch, recensor = FALSE trtswitch::rpsftm shilong1 1.7008903
## 5    One-way switch, recensor = FALSE    rpsftm::rpsftm shilong1 1.7008903
## 6     Two-way switch, recensor = TRUE trtswitch::rpsftm shilong1 2.7210780
## 7     Two-way switch, recensor = TRUE    rpsftm::rpsftm shilong1 2.7226926
## 8    Two-way switch, recensor = FALSE trtswitch::rpsftm shilong1 2.7173094
## 9    Two-way switch, recensor = FALSE    rpsftm::rpsftm shilong1 2.7173094
## 10  RPSFTM on immdef, recensor = TRUE trtswitch::rpsftm  immdef1 0.7610992
## 11  RPSFTM on immdef, recensor = TRUE    rpsftm::rpsftm  immdef1 0.7683435
## 12 RPSFTM on immdef, recensor = FALSE trtswitch::rpsftm  immdef1 0.7667704
## 13 RPSFTM on immdef, recensor = FALSE    rpsftm::rpsftm  immdef1 0.7668068
  1. In one-way switch scenario, trtswitch::rpsftm() and rpsftm::rpsftm() give comparable numbers for estimated psi, hazard ratio, cox p-value, and event status. The counterfactual survival time for subjects who didn’t switch (treatment arm) are consistent with original survival time, which is expected. The counterfactual survival time for subjects who switched (control arm) are larger than original survival time, which is also expected since the unadjusted HR is greater than 1. This consistency is observed across all datasets: shilong, immdef, and anonymised_trial_data.

  2. RPSFTM validation across multiple datasets: The validation using all three datasets (shilong, immdef, and anonymised_trial_data) shows that trtswitch::rpsftm() and rpsftm::rpsftm() produce consistent results across different data structures and patient populations. This comprehensive cross-dataset validation strengthens confidence in both implementations.

  3. Two-way switching validation success: After correcting the validation function to properly handle treatment arm patients in two-way switching scenarios, trtswitch::rpsftm() and rpsftm::rpsftm() show excellent agreement in psi estimates, hazard ratios, p-values, counterfactual survival times, and event status across both recensoring conditions.

  4. Event status validation: The addition of event status comparison confirms that both implementations handle censoring and event indicators consistently across all scenarios. Event status agreement is excellent (typically 100%) across all fits, indicating robust handling of survival endpoints.

  5. Key methodological insight: The validation process revealed that proper handling of two-way switching requires careful attention to counterfactual time calculations for treatment arm patients who switch. The corrected validation function applies the transformation Result.RPSFT$Sstar[,1] * exp(-Result.RPSFT$psi) to treatment arm patients, ensuring appropriate comparison between implementations.