Visualizing Survival models, part 1

Training key concepts of visualizing Survial analysis models using Surviminer, survival & flexsurv packages

R programming Quarto Analysis Report for Survival analysis training
Author
Affiliation

Fredrick Onduru

Biostatistician-Analyst

Published

August 15, 2023

0.1 NCCTG Lung Cancer

  • inst: Institution code
  • time: Survival time in days
  • status: censoring status 1=censored, 2=dead
  • age: Age in years
  • sex: Male=1 Female=2
  • ph.ecog: ECOG performance score as rated by the physician. 0=asymptomatic, 1= symptomatic but completely ambulatory, 2= in bed <50% of - the day, 3= in bed > 50% of the day but not bedbound, 4 = bedbound
  • ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician
  • pat.karno: Karnofsky performance score as rated by patient
  • meal.cal: Calories consumed at meals
  • wt.loss: Weight loss in last six months (pounds)
Code
lung <- survival::lung %>% filter(ph.karno != 3)
lung %>% head(5) %>% flextable()
Table 1:

NCCTG Lung Cancer Data preview

inst

time

status

age

sex

ph.ecog

ph.karno

pat.karno

meal.cal

wt.loss

3

306

2

74

1

1

90

100

1,175

3

455

2

68

1

0

90

90

1,225

15

3

1,010

1

56

1

0

90

90

15

5

210

2

57

1

1

90

60

1,150

11

1

883

2

60

1

0

100

90

0

Data Description

Survival in patients with advanced lung cancer from the North Central Cancer Treatment Group. Performance scores rate how well the patient can perform usual daily activities.

Note

The use of 1/2 for alive/dead instead of the usual 0/1 is a historical footnote. For data contained on punch cards, IBM 360 Fortran treated blank as a zero, which led to a policy within the section of Biostatistics to never use “0” as a data value since one could not distinguish it from a missing value. The policy became a habit, as is often the case; and the 1/2 coding endured long beyond the demise of punch cards and Fortran.

Source Terry Therneau

0.2 Kaplan Meire Survival probability based on ECOG performance score

Code
m <- survfit(Surv(time,status) ~ ph.ecog, data = lung)

ggsurvplot(m,
           pval = T,risk.table = "abs_pct",
           surv.median.line = "hv",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           palette = c("gold", "hotpink","darkblue","wheat4"), # custom color palette
           conf.int = FALSE, # Add confidence interval
           #risk.table = TRUE, risk.table.y.text.col = TRUE,
           #xlim = c(0, 27),break.x.by = 2,
           ggtheme = theme_bw(),
           legend.labs = c("asymptomatic", "symptomatic but \n completely ambulatory", "in bed <50% of  - the day",  "bedbound"),
           pval.method = TRUE)

Figure 1: Survival probability based on ECOG performance score as rating by the physician

0.3 Post hoc for surviavl analysis

Code
pairwise_survdiff(
  formula = Surv(time,status) ~ ph.ecog, data = lung
  #p.adjust.methods = "fdr"
)

    Pairwise comparisons using Log-Rank test 

data:  lung and ph.ecog 

  0       1       2      
1 0.07559 -       -      
2 0.00039 0.01592 -      
3 0.03138 0.03138 0.23637

P value adjustment method: BH 

0.4 Exponential Survival probability based on ECOG performance score

Code
exp <- flexsurvreg(Surv(time,status) ~ ph.ecog,dist = "expon", 
                   data = lung %>% mutate(ph.ecog = factor(ph.ecog)))

ggsurvplot(exp,risk.table = "abs_pct",
           surv.median.line = "hv",
           title = "Exponential Survival probability based on ECOG performance score among lung cancer patients",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           conf.int = FALSE, # Add confidence interval
           #pval = TRUE,
           #risk.table = TRUE, risk.table.y.text.col = TRUE,
           #xlim = c(0, 27),break.x.by = 2,
           ggtheme = theme_minimal()
           )

Figure 2: Exponential for surviavl analysis

0.5 Cox Proportion hazard model based on ECOG performance score

Code
cox_m <- coxph(Surv(time,status) ~ ph.ecog + age + sex ,
                   data = lung %>% mutate(ph.ecog = factor(ph.ecog)))

#   Model visualization
ggforest(cox_m, data = lung %>% mutate(ph.ecog = factor(ph.ecog)),
         fontsize = 1) 

#   Survival Probability plot
ggsurvplot(survfit(cox_m, data = lung %>% mutate(ph.ecog = factor(ph.ecog))), 
           conf.int = TRUE, palette = "Dark2",ggtheme = theme_light(),
           censor = FALSE, surv.median.line = "hv",
           title = "Genaral Coxph Survival Probability among lung cancer patients",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           legend = "none")

(a) Model visualization,

(b) Survival Probability plot

Figure 3: Cox Proportion hazard model for surviavl probability analysis among lung cancer patients