Visualizing Survival models, part 1

Code

#  theme: yeti, united, flatly, sandstone, pulse, lumen

library(janitor)
library(CleanEasy)
library(flextable)
library(sjPlot)
library(effects)
library(performance)
library(survival)
library(survminer)
library(flexsurv)
library(tidyquant)
library(tidylog)
library(tidyverse)

0.1 NCCTG Lung Cancer

Data Dictionary (clink to show)

inst: Institution code
time: Survival time in days
status: censoring status 1=censored, 2=dead
age: Age in years
sex: Male=1 Female=2
ph.ecog: ECOG performance score as rated by the physician. 0=asymptomatic, 1= symptomatic but completely ambulatory, 2= in bed <50% of - the day, 3= in bed > 50% of the day but not bedbound, 4 = bedbound
ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician
pat.karno: Karnofsky performance score as rated by patient
meal.cal: Calories consumed at meals
wt.loss: Weight loss in last six months (pounds)

Code

lung <- survival::lung %>% filter(ph.karno != 3)
lung %>% head(5) %>% flextable()

Table 1:
NCCTG Lung Cancer Data preview
inst	time	status	age	sex	ph.ecog	ph.karno	pat.karno	meal.cal	wt.loss
3	306	2	74	1	1	90	100	1,175
3	455	2	68	1	0	90	90	1,225	15
3	1,010	1	56	1	0	90	90		15
5	210	2	57	1	1	90	60	1,150	11
1	883	2	60	1	0	100	90		0

Data Description

Survival in patients with advanced lung cancer from the North Central Cancer Treatment Group. Performance scores rate how well the patient can perform usual daily activities.

Note

The use of 1/2 for alive/dead instead of the usual 0/1 is a historical footnote. For data contained on punch cards, IBM 360 Fortran treated blank as a zero, which led to a policy within the section of Biostatistics to never use “0” as a data value since one could not distinguish it from a missing value. The policy became a habit, as is often the case; and the 1/2 coding endured long beyond the demise of punch cards and Fortran.

Source Terry Therneau

0.2 Kaplan Meire Survival probability based on ECOG performance score

Code

m <- survfit(Surv(time,status) ~ ph.ecog, data = lung)

ggsurvplot(m,
           pval = T,risk.table = "abs_pct",
           surv.median.line = "hv",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           palette = c("gold", "hotpink","darkblue","wheat4"), # custom color palette
           conf.int = FALSE, # Add confidence interval
           #risk.table = TRUE, risk.table.y.text.col = TRUE,
           #xlim = c(0, 27),break.x.by = 2,
           ggtheme = theme_bw(),
           legend.labs = c("asymptomatic", "symptomatic but \n completely ambulatory", "in bed <50% of  - the day",  "bedbound"),
           pval.method = TRUE)

Figure 1: Survival probability based on ECOG performance score as rating by the physician

0.3 Post hoc for surviavl analysis

Code

pairwise_survdiff(
  formula = Surv(time,status) ~ ph.ecog, data = lung
  #p.adjust.methods = "fdr"
)


    Pairwise comparisons using Log-Rank test 

data:  lung and ph.ecog 

  0       1       2      
1 0.07559 -       -      
2 0.00039 0.01592 -      
3 0.03138 0.03138 0.23637

P value adjustment method: BH

0.4 Exponential Survival probability based on ECOG performance score

Code

exp <- flexsurvreg(Surv(time,status) ~ ph.ecog,dist = "expon", 
                   data = lung %>% mutate(ph.ecog = factor(ph.ecog)))

ggsurvplot(exp,risk.table = "abs_pct",
           surv.median.line = "hv",
           title = "Exponential Survival probability based on ECOG performance score among lung cancer patients",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           conf.int = FALSE, # Add confidence interval
           #pval = TRUE,
           #risk.table = TRUE, risk.table.y.text.col = TRUE,
           #xlim = c(0, 27),break.x.by = 2,
           ggtheme = theme_minimal()
           )

Figure 2: Exponential for surviavl analysis

0.5 Cox Proportion hazard model based on ECOG performance score

Code

cox_m <- coxph(Surv(time,status) ~ ph.ecog + age + sex ,
                   data = lung %>% mutate(ph.ecog = factor(ph.ecog)))

#   Model visualization
ggforest(cox_m, data = lung %>% mutate(ph.ecog = factor(ph.ecog)),
         fontsize = 1) 

#   Survival Probability plot
ggsurvplot(survfit(cox_m, data = lung %>% mutate(ph.ecog = factor(ph.ecog))), 
           conf.int = TRUE, palette = "Dark2",ggtheme = theme_light(),
           censor = FALSE, surv.median.line = "hv",
           title = "Genaral Coxph Survival Probability among lung cancer patients",
           font.main = 13,
           font.x =  14,
           font.y = 14,
           font.tickslab = 11,
           legend = "none")