1. Table of Contents

This document presents a non-exhaustive list of survival analysis and descriptive modelling methods for a two-group right-censored data with time-independent variables using various helpful packages in R.

1.1 Sample Data

The Leukemia dataset from the book Survival Analysis : A Self-Learning Text was used for this illustrated example.

Preliminary dataset assessment:

[A] 42 Rows (observations composed of leukemia patients enrolled in the study)

[B] 4 Columns (variables)
     [B.1] 1/4 Factor variable = Group (factor)
          [B.1.1] Placebo = placebo drug treatment
          [B.1.2] Treatment = new drug treatment
     [B.2] 1/4 Censoring variable = Status (factor)
          [B.2.1] 1 = event, patient went out of remission before the end of the study
          [B.2.2] 0 = censored or non-event, patient remains in remission until the end of the study, lost to follow-up, experiences a different event that makes further follow-up impossible or withdraws before the end of the study
     [B.3] 1/4 Response variables = Week (numeric)
          [B.3.1] Week = time in weeks a patient is in remission up to the point of the patient’s either going out of remission or being censored
     [B.4] 1/4 Covariate variable = WBC (numeric)
          [B.4.1] WBC = logarithm-transformed white blood cell measurement data gathered before the study which is also considered an important survival predictor in leukemia patients

##################################
# Loading R libraries
##################################
library(moments)
library(car)
library(multcomp)
library(effects)
library(psych)
library(ggplot2)
library(dplyr)
library(ggpubr)
library(rstatix)
library(ggfortify)
library(trend)
library(survival)
library(rms)
library(survminer)
library(Hmisc)
library(finalfit)
library(knitr)
library(gtsummary)


##################################
# Setting the working directory
##################################
setwd("C:/Users/User/Desktop/RMarkdown Projects/SurvivalAnalysis")

##################################
# Reading and creating the complete dataset
##################################
DBP.Complete <- read.csv('Leukemia.csv',
                na.strings=c("NA","NaN"," ",""),
                stringsAsFactors = T)

##################################
# Performing a general exploration of the dataset
##################################
dim(DBP.Complete)

## [1] 42  5

str(DBP.Complete)

## 'data.frame':    42 obs. of  5 variables:
##  $ ID    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Group : Factor w/ 2 levels "Placebo","Treatment": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Week  : int  6 6 6 7 10 13 16 22 23 6 ...
##  $ Status: int  1 1 1 1 1 1 1 1 1 0 ...
##  $ WBC   : num  2.31 4.06 3.28 4.43 2.96 2.88 3.6 2.32 2.57 3.2 ...

summary(DBP.Complete)

##        ID              Group         Week           Status      
##  Min.   : 1.00   Placebo  :21   Min.   : 1.00   Min.   :0.0000  
##  1st Qu.:11.25   Treatment:21   1st Qu.: 6.00   1st Qu.:0.0000  
##  Median :21.50                  Median :10.50   Median :1.0000  
##  Mean   :21.50                  Mean   :12.88   Mean   :0.7143  
##  3rd Qu.:31.75                  3rd Qu.:18.50   3rd Qu.:1.0000  
##  Max.   :42.00                  Max.   :35.00   Max.   :1.0000  
##       WBC       
##  Min.   :1.450  
##  1st Qu.:2.303  
##  Median :2.800  
##  Mean   :2.930  
##  3rd Qu.:3.490  
##  Max.   :5.000

##################################
# Setting the levels for the treatment categories
##################################
DBP.Complete$Group <- factor(DBP.Complete$Group,
                           levels = c("Placebo","Treatment"))

DBP.Complete$StatusLevel <- factor(DBP.Complete$Status,
                            levels = c(0,1))

##################################
# Creating an analysis dataset without the subject information column
##################################
DBP.Analysis <- DBP.Complete[,-1]

##################################
# Creating a response-covariate dataset without the treatment column
##################################
DBP.Response.Covariate <- DBP.Analysis[,-1]

##################################
# Formulating a data type assessment summary
##################################
PDA <- DBP.Response.Covariate
(PDA.Summary <- data.frame(
  Column.Index=c(1:length(names(PDA))),
  Column.Name= names(PDA), 
  Column.Type=sapply(PDA, function(x) class(x)), 
  row.names=NULL)
)

##   Column.Index Column.Name Column.Type
## 1            1        Week     integer
## 2            2      Status     integer
## 3            3         WBC     numeric
## 4            4 StatusLevel      factor

1.2 Data Quality Assessment

Data quality assessment:

[A] No missing observations noted for any response and covariate variable.

[B] No low variance (Unique.Count.Ratio<0.01 or First.Second.Mode.Ratio>5) noted for any response variable.

[C] No high skewness (Skewness>3 or Skewness<(-3)) noted for any response variable.

##################################
# Restoring the working directory
##################################
setwd("C:/Users/User/Desktop/RMarkdown Projects")

##################################
# Reusing the response-covariate dataset
##################################
DQA <- DBP.Response.Covariate

##################################
# Formulating a data subset of response variables (numeric)
##################################
DQA.Variables.Response <- as.data.frame(DQA$Week)

##################################
# Formulating a data subset of covariate variables (numeric)
##################################
DQA.Variables.Covariate.Numeric <- as.data.frame(DQA$WBC)

##################################
# Formulating an overall data quality assessment summary
##################################
(DQA.Summary <- data.frame(
  Column.Index=c(1:length(names(DQA))),
  Column.Name= names(DQA), 
  Column.Type=sapply(DQA, function(x) class(x)), 
  Row.Count=sapply(DQA, function(x) nrow(DQA)),
  NA.Count=sapply(DQA,function(x)sum(is.na(x))),
  Fill.Rate=sapply(DQA,function(x)format(round((sum(!is.na(x))/nrow(DQA)),3),nsmall=3)),
  row.names=NULL)
)

##   Column.Index Column.Name Column.Type Row.Count NA.Count Fill.Rate
## 1            1        Week     integer        42        0     1.000
## 2            2      Status     integer        42        0     1.000
## 3            3         WBC     numeric        42        0     1.000
## 4            4 StatusLevel      factor        42        0     1.000

##################################
# Checking for missing observations
##################################
if ((nrow(DQA.Summary[DQA.Summary$NA.Count>0,]))>0){
  print(paste0("Missing observations noted for ",
               (nrow(DQA.Summary[DQA.Summary$NA.Count>0,])),
               " variable(s) with NA.Count>0 and Fill.Rate<1.0."))
  DQA.Summary[DQA.Summary$NA.Count>0,]
} else {
  print("No missing observations noted.")
}

## [1] "No missing observations noted."

##################################
# Formulating a data quality assessment summary for response variables (numeric)
##################################

if (length(names(DQA.Variables.Response))>0) {
  
  ##################################
  # Formulating a function to determine the first mode
  ##################################
  FirstModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    ux[tab == max(tab)]
  }

  ##################################
  # Formulating a function to determine the second mode
  ##################################
  SecondModes <- function(x) {
    ux <- unique(na.omit(x))
    tab <- tabulate(match(x, ux))
    fm = ux[tab == max(tab)]
    sm = na.omit(x)[!(na.omit(x) %in% fm)]
    usm <- unique(sm)
    tabsm <- tabulate(match(sm, usm))
    usm[tabsm == max(tabsm)]
  }
  
  (DQA.Variables.Response.Summary <- data.frame(
  Column.Name= names(DQA.Variables.Response), 
  Column.Type=sapply(DQA.Variables.Response, function(x) class(x)), 
  Unique.Count=sapply(DQA.Variables.Response, function(x) length(unique(x))),
  Unique.Count.Ratio=sapply(DQA.Variables.Response, function(x) format(round((length(unique(x))/nrow(DQA.Variables.Response)),3), nsmall=3)),
  First.Mode.Value=sapply(DQA.Variables.Response, function(x) format(round((FirstModes(x)[1]),3),nsmall=3)),
  Second.Mode.Value=sapply(DQA.Variables.Response, function(x) format(round((SecondModes(x)[1]),3),nsmall=3)),
  First.Mode.Count=sapply(DQA.Variables.Response, function(x) sum(na.omit(x) == FirstModes(x)[1])),
  Second.Mode.Count=sapply(DQA.Variables.Response, function(x) sum(na.omit(x) == SecondModes(x)[1])),
  First.Second.Mode.Ratio=sapply(DQA.Variables.Response, function(x) format(round((sum(na.omit(x) == FirstModes(x)[1])/sum(na.omit(x) == SecondModes(x)[1])),3), nsmall=3)),
  Minimum=sapply(DQA.Variables.Response, function(x) format(round(min(x,na.rm = TRUE),3), nsmall=3)),
  Mean=sapply(DQA.Variables.Response, function(x) format(round(mean(x,na.rm = TRUE),3), nsmall=3)),
  Median=sapply(DQA.Variables.Response, function(x) format(round(median(x,na.rm = TRUE),3), nsmall=3)),
  Maximum=sapply(DQA.Variables.Response, function(x) format(round(max(x,na.rm = TRUE),3), nsmall=3)),
  Skewness=sapply(DQA.Variables.Response, function(x) format(round(skewness(x,na.rm = TRUE),3), nsmall=3)),
  Kurtosis=sapply(DQA.Variables.Response, function(x) format(round(kurtosis(x,na.rm = TRUE),3), nsmall=3)),
  Percentile25th=sapply(DQA.Variables.Response, function(x) format(round(quantile(x,probs=0.25,na.rm = TRUE),3), nsmall=3)),
  Percentile75th=sapply(DQA.Variables.Response, function(x) format(round(quantile(x,probs=0.75,na.rm = TRUE),3), nsmall=3)),
  row.names=NULL)
  )  
  
}

##   Column.Name Column.Type Unique.Count Unique.Count.Ratio First.Mode.Value
## 1    DQA$Week     integer           24              0.571            6.000
##   Second.Mode.Value First.Mode.Count Second.Mode.Count First.Second.Mode.Ratio
## 1            11.000                4                 3                   1.333
##   Minimum   Mean Median Maximum Skewness Kurtosis Percentile25th Percentile75th
## 1   1.000 12.881 10.500  35.000    0.876    2.870          6.000         18.500

##################################
# Identifying potential data quality issues for response variables (numeric) 
##################################

##################################
# Checking for zero or near-zero variance predictors
##################################
if (length(names(DQA.Variables.Response))==0) {
  print("No numeric predictors noted.")
} else if (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$First.Second.Mode.Ratio))>5,])>0){
  print(paste0("Low variance observed for ",
               (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$First.Second.Mode.Ratio))>5,])),
               " numeric variable(s) with First.Second.Mode.Ratio>5."))
  DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$First.Second.Mode.Ratio))>5,]
} else {
  print("No low variance numeric predictors due to high first-second mode ratio noted.")
}

## [1] "No low variance numeric predictors due to high first-second mode ratio noted."

if (length(names(DQA.Variables.Response))==0) {
  print("No numeric predictors noted.")
} else if (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Unique.Count.Ratio))<0.01,])>0){
  print(paste0("Low variance observed for ",
               (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Unique.Count.Ratio))<0.01,])),
               " numeric variable(s) with Unique.Count.Ratio<0.01."))
  DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Unique.Count.Ratio))<0.01,]
} else {
  print("No low variance numeric predictors due to low unique count ratio noted.")
}

## [1] "No low variance numeric predictors due to low unique count ratio noted."

##################################
# Checking for skewed response variables
##################################
if (length(names(DQA.Variables.Response))==0) {
  print("No response variables noted.")
} else if (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))>3 |
                                               as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))<(-3),])>0){
  print(paste0("High skewness observed for ",
  (nrow(DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))>3 |
                                               as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))<(-3),])),
  " response variable(s) with Skewness>3 or Skewness<(-3)."))
  DQA.Variables.Response.Summary[as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))>3 |
                                 as.numeric(as.character(DQA.Variables.Response.Summary$Skewness))<(-3),]
} else {
  print("No skewed response variables noted.")
}

## [1] "No skewed response variables noted."

1.3 Research Question

Using the Leukemia dataset, survival analysis and modelling will be conducted to investigate the following :

[A] Research Question : What is the probability that a leukemia patient under the Treatment and Placebo groups will survive (remain in remission) after a specified number of weeks?
     [A.1] Factor variable = Group
          [A.1.1] Placebo = placebo drug treatment
          [A.1.2] New = new drug treatment
     [A.2] Response variable = Week (numeric)
     [A.3] Censoring variable = Status (factor)
          [A.3.1] 1 = event, patient went out of remission before the end of the study
          [A.3.2] 0 = censored or non-event, patient remains in remission until the end of the study, lost to follow-up, experiences a different event that makes further follow-up impossible or withdraws before the end of the study

[B] Research Question : Are there differences in survival between leukemia patients under the Treatment and Placebo groups?
     [B.1] Factor variable = Group
          [B.1.1] Placebo = placebo drug treatment
          [B.1.2] Treatment = new drug treatment
     [B.2] Response variable = Week (numeric)
     [B.3] Censoring variable = Status (factor)
          [B.3.1] 1 = event, patient went out of remission before the end of the study
          [B.3.2] 0 = censored or non-event, patient remains in remission until the end of the study, lost to follow-up, experiences a different event that makes further follow-up impossible or withdraws before the end of the study

[C] Research Question : What is the impact of the clinical characteristic WBC on the survival of leukemia patients under the Treatment and Placebo groups?
     [C.1] Factor variable = Group
          [C.1.1] Placebo = placebo drug treatment
          [C.1.2] Treatment = new drug treatment
     [C.2] Response variable = Week (numeric)
     [C.3] Censoring variable = Status (factor)
          [C.3.1] 1 = event, patient went out of remission before the end of the study
          [C.3.2] 0 = censored or non-event, patient remains in remission until the end of the study, lost to follow-up, experiences a different event that makes further follow-up impossible or withdraws before the end of the study
     [C.4] Covariate variable = WBC (numeric) = logarithm-transformed white blood cell measurement data gathered before the study which is also considered an important survival predictor in leukemia patients

1.4 Data Exploration

Exploratory and comparative analyses of the response variable Week by factor variable Group showed that :

[A] Differences in the proportion of censored and uncensored data (p<0.001, Pearson’s Chi-squared Test) observed between Group=Treatment (censored=57%, uncensored=43%) and Group=Placebo (censored=0%, uncensored=100%).

[B] With both censored and uncensored data considered, higher median survival time Week (p=0.004, Wilcoxon Rank Sum Test) was observed for Group=Treatment (16) as compared to Group=Placebo (8). Lower median white blood cell measurement WBC (p=0.047, Wilcoxon Rank Sum Test) was also noted for Group=Treatment (2.57) as compared to Group=Placebo (3.06).

[C] With only the uncensored considered, higher mean survival time Week observed for Group=Treatment (12.11) as compared to Group=Placebo (8.67). Lower mean hazard rate was also noted for Group=Treatment (2.51) as compared to Group=Placebo (11.53).

Kaplan-Meier survival curve analysis of the response variable Week by factor variable Group (with censoring variable Status considered) showed that :

[A] Differences in the Kaplan-Meier survival curves (p<0.0001, Log Rank Test) were noted between Group=Treatment (median survival time=23) and Group=Placebo (median survival time=8).

[B] Higher survival time was consistently observed for leukemia patients under Group=Treatment as compared to those under Group=Placebo at various sampled durations including Week=5 (Treatment=100%, Placebo=57%), Week=10 (Treatment=75%, Placebo=38%), Week=15 (Treatment=69%, Placebo=14%), Week=20 (Treatment=63%, Placebo=10%) and Week=23 (Treatment=45%, Placebo=0%).

1.4.1 Descriptive Statistics

##################################
# Reusing the dataset
##################################
head(DBP.Analysis)

##       Group Week Status  WBC StatusLevel
## 1 Treatment    6      1 2.31           1
## 2 Treatment    6      1 4.06           1
## 3 Treatment    6      1 3.28           1
## 4 Treatment    7      1 4.43           1
## 5 Treatment   10      1 2.96           1
## 6 Treatment   13      1 2.88           1

##################################
# Formulating a preliminary summary
# of descriptive statistics 
##################################

DBP.Analysis %>%
  select(Week,StatusLevel,WBC,Group) %>%
  tbl_summary(
    by = Group,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{N_nonmiss}",
                                     "{mean} ({sd})", 
                                     "{median} ({p25}, {p75})", 
                                     "{min}, {max}"),
    missing = "no") %>% 
  add_overall() %>% 
  add_p()

Characteristic	Overall, N = 42¹	Placebo, N = 21¹	Treatment, N = 21¹	p-value²
Week				0.004
N	42	21	21
Mean (SD)	13 (9)	9 (6)	17 (10)
Median (IQR)	10 (6, 18)	8 (4, 12)	16 (9, 23)
Range	1, 35	1, 23	6, 35
StatusLevel				<0.001
0	12 (29%)	0 (0%)	12 (57%)
1	30 (71%)	21 (100%)	9 (43%)
WBC				0.047
N	42	21	21
Mean (SD)	2.93 (0.92)	3.22 (0.97)	2.64 (0.77)
Median (IQR)	2.80 (2.30, 3.49)	3.06 (2.42, 3.97)	2.57 (2.16, 2.96)
Range	1.45, 5.00	1.50, 5.00	1.45, 4.43
¹ n (%)
² Wilcoxon rank sum test; Pearson's Chi-squared test

##################################
# Computing for the mean uncensored survival time
# and mean hazard rate between treatment and placebo groups
##################################
DBP.Analysis.DescriptiveStatistics <- DBP.Analysis %>%
  group_by(Group) %>%
  summarise(
    Mean.Uncensored.Survival.Time = mean(Week[Status==1]),
    Total.Uncensored.Survival.Time = sum(Status==1),
    Total.Survival.Time = sum(Week),
    Mean.Hazard.Rate = (Total.Uncensored.Survival.Time/Total.Survival.Time)*100)

kable(DBP.Analysis.DescriptiveStatistics)

Group	Mean.Uncensored.Survival.Time	Total.Uncensored.Survival.Time	Total.Survival.Time	Mean.Hazard.Rate
Placebo	8.666667	21	182	11.538462
Treatment	12.111111	9	359	2.506964

1.4.2 Kaplan-Meier Survival Curve

##################################
# Reusing the dataset
##################################
head(DBP.Analysis)

##       Group Week Status  WBC StatusLevel
## 1 Treatment    6      1 2.31           1
## 2 Treatment    6      1 4.06           1
## 3 Treatment    6      1 3.28           1
## 4 Treatment    7      1 4.43           1
## 5 Treatment   10      1 2.96           1
## 6 Treatment   13      1 2.88           1

##################################
# Computing the Kaplan-Meier survival estimates
# using the Baseline data
##################################
DBP.Analysis.Baseline.KaplanMeierEstimates <- survfit(Surv(Week, Status) ~ 1, data = DBP.Analysis)
print(DBP.Analysis.Baseline.KaplanMeierEstimates)

## Call: survfit(formula = Surv(Week, Status) ~ 1, data = DBP.Analysis)
## 
##       n  events  median 0.95LCL 0.95UCL 
##      42      30      12       8      22

tbl_survfit(DBP.Analysis.Baseline.KaplanMeierEstimates,
            times = c(5,10,15,20,23),
            label_header = "**Week {time}**"
)

Characteristic	Week 5	Week 10	Week 15	Week 20	Week 23
Overall	79% (67%, 92%)	57% (43%, 74%)	40% (27%, 59%)	34% (22%, 53%)	19% (9.1%, 39%)

##################################
# Formulating a summary of the Kaplan-Meier survival estimates
# using the Baseline data
##################################
summary(DBP.Analysis.Baseline.KaplanMeierEstimates)

## Call: survfit(formula = Surv(Week, Status) ~ 1, data = DBP.Analysis)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1     42       2    0.952  0.0329       0.8901        1.000
##     2     40       2    0.905  0.0453       0.8202        0.998
##     3     38       1    0.881  0.0500       0.7883        0.985
##     4     37       2    0.833  0.0575       0.7279        0.954
##     5     35       2    0.786  0.0633       0.6709        0.920
##     6     33       3    0.714  0.0697       0.5899        0.865
##     7     29       1    0.690  0.0715       0.5628        0.845
##     8     28       4    0.591  0.0764       0.4588        0.762
##    10     23       1    0.565  0.0773       0.4325        0.739
##    11     21       2    0.512  0.0788       0.3783        0.692
##    12     18       2    0.455  0.0796       0.3227        0.641
##    13     16       1    0.426  0.0795       0.2958        0.615
##    15     15       1    0.398  0.0791       0.2694        0.588
##    16     14       1    0.369  0.0784       0.2437        0.560
##    17     13       1    0.341  0.0774       0.2186        0.532
##    22      9       2    0.265  0.0765       0.1507        0.467
##    23      7       2    0.189  0.0710       0.0909        0.395

(DBP.Analysis.Baseline.KaplanMeierEstimates.Summary <- summary(DBP.Analysis.Baseline.KaplanMeierEstimates)$table)

##    records      n.max    n.start     events     *rmean *se(rmean)     median 
##  42.000000  42.000000  42.000000  30.000000  15.339334   1.860218  12.000000 
##    0.95LCL    0.95UCL 
##   8.000000  22.000000

##################################
# Listing the details of the Kaplan-Meier survival estimates
# using the Baseline data
##################################
(DBP.Analysis.Baseline.KaplanMeierEstimates.Details <- surv_summary(DBP.Analysis.Baseline.KaplanMeierEstimates))

##    time n.risk n.event n.censor      surv    std.err     upper      lower
## 1     1     42       2        0 0.9523810 0.03450328 1.0000000 0.89010544
## 2     2     40       2        0 0.9047619 0.05006262 0.9980394 0.82020220
## 3     3     38       1        0 0.8809524 0.05672304 0.9845441 0.78826037
## 4     4     37       2        0 0.8333333 0.06900656 0.9540195 0.72791433
## 5     5     35       2        0 0.7857143 0.08058230 0.9201453 0.67092330
## 6     6     33       3        1 0.7142857 0.09759001 0.8648499 0.58993369
## 7     7     29       1        0 0.6896552 0.10370794 0.8451005 0.56280201
## 8     8     28       4        0 0.5911330 0.12925834 0.7615705 0.45883899
## 9     9     24       0        1 0.5911330 0.12925834 0.7615705 0.45883899
## 10   10     23       1        1 0.5654316 0.13668944 0.7391461 0.43254350
## 11   11     21       2        1 0.5115809 0.15393678 0.6917443 0.37834076
## 12   12     18       2        0 0.4547386 0.17504565 0.6408567 0.32267306
## 13   13     16       1        0 0.4263175 0.18656807 0.6145258 0.29575091
## 14   15     15       1        0 0.3978963 0.19892096 0.5876134 0.26943131
## 15   16     14       1        0 0.3694751 0.21228296 0.5601196 0.24371913
## 16   17     13       1        1 0.3410540 0.22687951 0.5320388 0.21862656
## 17   19     11       0        1 0.3410540 0.22687951 0.5320388 0.21862656
## 18   20     10       0        1 0.3410540 0.22687951 0.5320388 0.21862656
## 19   22      9       2        0 0.2652642 0.28847936 0.4669095 0.15070392
## 20   23      7       2        0 0.1894744 0.37465077 0.3948698 0.09091746
## 21   25      5       0        1 0.1894744 0.37465077 0.3948698 0.09091746
## 22   32      4       0        2 0.1894744 0.37465077 0.3948698 0.09091746
## 23   34      2       0        1 0.1894744 0.37465077 0.3948698 0.09091746
## 24   35      1       0        1 0.1894744 0.37465077 0.3948698 0.09091746

##################################
# Plotting the Kaplan-Meier survival curves
# using the Baseline data
##################################
SurvivalTimeInWeek.KaplanMeier.Baseline <- ggsurvplot(DBP.Analysis.Baseline.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curve (Baseline)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.Baseline,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Computing the Kaplan-Meier survival estimates
# using the Factor Variable
##################################
DBP.Analysis.KaplanMeierEstimates <- survfit(Surv(Week, Status) ~ Group, data = DBP.Analysis)
print(DBP.Analysis.KaplanMeierEstimates)

## Call: survfit(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##                  n events median 0.95LCL 0.95UCL
## Group=Placebo   21     21      8       4      12
## Group=Treatment 21      9     23      16      NA

tbl_survfit(DBP.Analysis.KaplanMeierEstimates,
            times = c(5,10,15,20,23),
            label_header = "**Week {time}**"
)

Characteristic	Week 5	Week 10	Week 15	Week 20	Week 23
Group
Placebo	57% (39%, 83%)	38% (22%, 66%)	14% (5.0%, 41%)	9.5% (2.5%, 36%)	0% (0.7%, 32%)
Treatment	100% (100%, 100%)	75% (59%, 97%)	69% (51%, 93%)	63% (44%, 90%)	45% (25%, 81%)

##################################
# Formulating a summary of the Kaplan-Meier survival estimates
# using the Factor Variable
##################################
summary(DBP.Analysis.KaplanMeierEstimates)

## Call: survfit(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##                 Group=Placebo 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     1     21       2   0.9048  0.0641      0.78754        1.000
##     2     19       2   0.8095  0.0857      0.65785        0.996
##     3     17       1   0.7619  0.0929      0.59988        0.968
##     4     16       2   0.6667  0.1029      0.49268        0.902
##     5     14       2   0.5714  0.1080      0.39455        0.828
##     8     12       4   0.3810  0.1060      0.22085        0.657
##    11      8       2   0.2857  0.0986      0.14529        0.562
##    12      6       2   0.1905  0.0857      0.07887        0.460
##    15      4       1   0.1429  0.0764      0.05011        0.407
##    17      3       1   0.0952  0.0641      0.02549        0.356
##    22      2       1   0.0476  0.0465      0.00703        0.322
##    23      1       1   0.0000     NaN           NA           NA
## 
##                 Group=Treatment 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.720        1.000
##     7     17       1    0.807  0.0869        0.653        0.996
##    10     15       1    0.753  0.0963        0.586        0.968
##    13     12       1    0.690  0.1068        0.510        0.935
##    16     11       1    0.627  0.1141        0.439        0.896
##    22      7       1    0.538  0.1282        0.337        0.858
##    23      6       1    0.448  0.1346        0.249        0.807

(DBP.Analysis.KaplanMeierEstimates.Summary <- summary(DBP.Analysis.KaplanMeierEstimates)$table)

##                 records n.max n.start events    *rmean *se(rmean) median
## Group=Placebo        21    21      21     21  8.666667   1.377390      8
## Group=Treatment      21    21      21      9 23.287395   2.827468     23
##                 0.95LCL 0.95UCL
## Group=Placebo         4      12
## Group=Treatment      16      NA

##################################
# Listing the details of the Kaplan-Meier survival estimates
# using the Factor Variable
##################################
(DBP.Analysis.KaplanMeierEstimates.Details <- surv_summary(DBP.Analysis.KaplanMeierEstimates))

##    time n.risk n.event n.censor       surv    std.err     upper      lower
## 1     1     21       2        0 0.90476190 0.07079923 1.0000000 0.78753505
## 2     2     19       2        0 0.80952381 0.10585122 0.9961629 0.65785306
## 3     3     17       1        0 0.76190476 0.12198751 0.9676909 0.59988048
## 4     4     16       2        0 0.66666667 0.15430335 0.9020944 0.49268063
## 5     5     14       2        0 0.57142857 0.18898224 0.8276066 0.39454812
## 6     8     12       4        0 0.38095238 0.27817432 0.6571327 0.22084536
## 7    11      8       2        0 0.28571429 0.34503278 0.5618552 0.14529127
## 8    12      6       2        0 0.19047619 0.44986771 0.4600116 0.07887014
## 9    15      4       1        0 0.14285714 0.53452248 0.4072755 0.05010898
## 10   17      3       1        0 0.09523810 0.67259271 0.3558956 0.02548583
## 11   22      2       1        0 0.04761905 0.97590007 0.3224544 0.00703223
## 12   23      1       1        0 0.00000000        Inf        NA         NA
## 13    6     21       3        1 0.85714286 0.08908708 1.0000000 0.71981708
## 14    7     17       1        0 0.80672269 0.10776353 0.9964437 0.65312422
## 15    9     16       0        1 0.80672269 0.10776353 0.9964437 0.65312422
## 16   10     15       1        1 0.75294118 0.12796438 0.9675748 0.58591898
## 17   11     13       0        1 0.75294118 0.12796438 0.9675748 0.58591898
## 18   13     12       1        0 0.69019608 0.15475995 0.9347692 0.50961310
## 19   16     11       1        0 0.62745098 0.18177335 0.8959949 0.43939392
## 20   17     10       0        1 0.62745098 0.18177335 0.8959949 0.43939392
## 21   19      9       0        1 0.62745098 0.18177335 0.8959949 0.43939392
## 22   20      8       0        1 0.62745098 0.18177335 0.8959949 0.43939392
## 23   22      7       1        0 0.53781513 0.23843463 0.8582008 0.33703662
## 24   23      6       1        0 0.44817927 0.30030719 0.8073720 0.24878823
## 25   25      5       0        1 0.44817927 0.30030719 0.8073720 0.24878823
## 26   32      4       0        2 0.44817927 0.30030719 0.8073720 0.24878823
## 27   34      2       0        1 0.44817927 0.30030719 0.8073720 0.24878823
## 28   35      1       0        1 0.44817927 0.30030719 0.8073720 0.24878823
##             strata     Group
## 1    Group=Placebo   Placebo
## 2    Group=Placebo   Placebo
## 3    Group=Placebo   Placebo
## 4    Group=Placebo   Placebo
## 5    Group=Placebo   Placebo
## 6    Group=Placebo   Placebo
## 7    Group=Placebo   Placebo
## 8    Group=Placebo   Placebo
## 9    Group=Placebo   Placebo
## 10   Group=Placebo   Placebo
## 11   Group=Placebo   Placebo
## 12   Group=Placebo   Placebo
## 13 Group=Treatment Treatment
## 14 Group=Treatment Treatment
## 15 Group=Treatment Treatment
## 16 Group=Treatment Treatment
## 17 Group=Treatment Treatment
## 18 Group=Treatment Treatment
## 19 Group=Treatment Treatment
## 20 Group=Treatment Treatment
## 21 Group=Treatment Treatment
## 22 Group=Treatment Treatment
## 23 Group=Treatment Treatment
## 24 Group=Treatment Treatment
## 25 Group=Treatment Treatment
## 26 Group=Treatment Treatment
## 27 Group=Treatment Treatment
## 28 Group=Treatment Treatment

##################################
# Plotting the Kaplan-Meier survival curves
# using the Factor Variable
##################################
SurvivalTimeInWeek.KaplanMeier.ByGroup <- ggsurvplot(DBP.Analysis.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curve (Group)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     legend.labs = c("Placebo", "Treatment"),
                                                     palette = c("#3259A0","#FF5050"))

ggpar(SurvivalTimeInWeek.KaplanMeier.ByGroup,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

1.4.3 Log Rank Test

##################################
# Conducting the Log Rank Test
# between the Kaplan-Meier survival estimates
# using the Factor Variable
##################################
(DBP.Analysis.LogRankTest <- survdiff(Surv(Week, Status) ~ Group, data = DBP.Analysis))

## Call:
## survdiff(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##                  N Observed Expected (O-E)^2/E (O-E)^2/V
## Group=Placebo   21       21     10.7      9.77      16.8
## Group=Treatment 21        9     19.3      5.46      16.8
## 
##  Chisq= 16.8  on 1 degrees of freedom, p= 4e-05

##################################
# Plotting the Kaplan-Meier survival curves
# using the Factor Variable
##################################
SurvivalTimeInWeek.KaplanMeier.ByGroup <- ggsurvplot(DBP.Analysis.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curve (Group)",
                                                     pval = TRUE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     legend.labs = c("Placebo", "Treatment"),
                                                     palette = c("#3259A0","#FF5050"))

ggpar(SurvivalTimeInWeek.KaplanMeier.ByGroup,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

1.5 Model Development

Cox proportional hazards regression analysis involving the response variable Week, factor variable Group and covariate variable WBC showed that :

[A] A cox proportional hazards regression model applied on the response variable Week and factor variable Group which includes the covariate variable WBC was most appropriate as the final survival model since its addition significantly improved model fit (p<0.001, Likelihood Ratio Test). However, the further inclusion of an interaction term Group * WBC into the model did not show any significant improvement to model fit (p=0.549, Likelihood Ratio Test).

[B] Using the final survival model controlled for WBC, association between Group and survival time Week was statistically significant with an estimated hazard ratio of 0.25 (95% CI 0.11, 0.57; p=0.001, Cox Proportional Hazards Model Wald Test) for Group=Treatment (lower hazard and higher survival rate for leukemia patients using the new treatment) relative to Group=Placebo (higher hazard and lower survival rate for leukemia patients using the placebo treatment).

[C] Using the final survival model controlled for Group, association between WBC and survival time Week was statistically significant with an estimated hazard ratio of 5.42 (95% CI 2.81, 10.50; p<0.001, Cox Proportional Hazards Model Wald Test) indicating higher hazard and lower survival rate for leukemia patients with higher white blood count measurement.

[D] The estimated Harrel’s concordance index (proportion of leukemia patients that the model can correctly order in terms of survival times when adjusted for censoring) for the final survival model was determined to be relatively high at 0.85 (optimistic) and 0.84 (optimism-corrected after a 500-cycle internal bootstrap validation).

[E] The scaled Schoenfield residuals showed random patterns when plotted against time indicating no violation of the proportional hazards assumption. The proportional hazards assumption was further supported by the non-significant relationship between the scaled Schoenfield residuals and time based from the global (p=0.9801) and individual (p=0.9927 for Group, 0.8414 for WBC ) statistical test results.

[F] The deviance residuals did not show any severely influential observations when plotted against the observations, the dfbeta residuals were reasonably symmetrical when plotted against the observations, and the fitted line with lowess function of the martingale residuals obtained for the null cox model was reasonably linear for the numeric covariate WBC - all indicating a fairly good model fit for the final survival model.

1.5.1 Univariate Cox Proportional Hazards Model

##################################
# Formulating a Univariate Cox Proportional Hazards Model
# independently for the Factor Variable
##################################
(DBP.Analysis.UnivariateCoxPH.Group <- coxph(Surv(Week, Status) ~ Group, data = DBP.Analysis))

## Call:
## coxph(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##                   coef exp(coef) se(coef)      z        p
## GroupTreatment -1.5721    0.2076   0.4124 -3.812 0.000138
## 
## Likelihood ratio test=16.35  on 1 df, p=5.261e-05
## n= 42, number of events= 30

summary(DBP.Analysis.UnivariateCoxPH.Group)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment -1.5721    0.2076   0.4124 -3.812 0.000138 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment    0.2076      4.817   0.09251    0.4659
## 
## Concordance= 0.69  (se = 0.041 )
## Likelihood ratio test= 16.35  on 1 df,   p=5e-05
## Wald test            = 14.53  on 1 df,   p=1e-04
## Score (logrank) test = 17.25  on 1 df,   p=3e-05

tbl_regression(DBP.Analysis.UnivariateCoxPH.Group, 
               exponentiate = TRUE)

Characteristic	HR¹	95% CI¹	p-value
Group
Placebo	—	—
Treatment	0.21	0.09, 0.47	<0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

(ggforest(DBP.Analysis.UnivariateCoxPH.Group, 
          data=DBP.Analysis,
          main = "",
          fontsize = 1.5))

##################################
# Plotting the Kaplan-Meier survival curves
# for the formulated Univariate Cox Proportional Hazards Model
# involving the Factor Variable
##################################
DBP.Analysis.UnivariateCoxPH.Group.KaplanMeierEstimates <- survfit(DBP.Analysis.UnivariateCoxPH.Group, data = DBP.Analysis)

SurvivalTimeInWeek.KaplanMeier.UnivariateCoxPH.Group <- ggsurvplot(DBP.Analysis.UnivariateCoxPH.Group.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ Group)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.UnivariateCoxPH.Group,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Formulating a Univariate Cox Proportional Hazards Model
# independently for the Covariate Variable
##################################
(DBP.Analysis.UnivariateCoxPH.WBC <- coxph(Surv(Week, Status) ~ WBC, data = DBP.Analysis))

## Call:
## coxph(formula = Surv(Week, Status) ~ WBC, data = DBP.Analysis)
## 
##      coef exp(coef) se(coef)     z        p
## WBC 1.646     5.189    0.298 5.525 3.29e-08
## 
## Likelihood ratio test=34.84  on 1 df, p=3.577e-09
## n= 42, number of events= 30

summary(DBP.Analysis.UnivariateCoxPH.WBC)

## Call:
## coxph(formula = Surv(Week, Status) ~ WBC, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##      coef exp(coef) se(coef)     z Pr(>|z|)    
## WBC 1.646     5.189    0.298 5.525 3.29e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## WBC     5.188     0.1927     2.893     9.304
## 
## Concordance= 0.809  (se = 0.045 )
## Likelihood ratio test= 34.84  on 1 df,   p=4e-09
## Wald test            = 30.53  on 1 df,   p=3e-08
## Score (logrank) test = 36.62  on 1 df,   p=1e-09

tbl_regression(DBP.Analysis.UnivariateCoxPH.WBC, 
               exponentiate = TRUE)

Characteristic	HR¹	95% CI¹	p-value
WBC	5.19	2.89, 9.30	<0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

(ggforest(DBP.Analysis.UnivariateCoxPH.WBC, 
          data=DBP.Analysis,
          main = "",
          fontsize = 1.5))

##################################
# Plotting the Kaplan-Meier survival curves
# for the formulated Univariate Cox Proportional Hazards Model
# involving the Covariate Variable
##################################
DBP.Analysis.UnivariateCoxPH.WBC.KaplanMeierEstimates <- survfit(DBP.Analysis.UnivariateCoxPH.WBC, data = DBP.Analysis)

SurvivalTimeInWeek.KaplanMeier.UnivariateCoxPH.WBC <- ggsurvplot(DBP.Analysis.UnivariateCoxPH.WBC.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ WBC)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.UnivariateCoxPH.WBC,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Formulating a Univariate Cox Proportional Hazards Model
# individually for the Factor and Covariate Variables
##################################
tbl_survfit(
  list(survfit(Surv(Week, Status) ~ 1, data = DBP.Analysis),
       survfit(Surv(Week, Status) ~ Group, data = DBP.Analysis),
       survfit(Surv(Week, Status) ~ WBC, data = DBP.Analysis)),
  probs = 0.5,
  label_header = "**Median Survival**") %>%
  add_n() %>%
  add_nevent()

Characteristic	N	Event N	Median Survival
Overall	42	30	12 (8.0, 22)
Group	42	30
Placebo			8.0 (4.0, 12)
Treatment			23 (16, —)
WBC	42	30
1.45			— (—, —)
1.47			— (—, —)
1.5			12 (—, —)
1.78			— (—, —)
1.97			23 (—, —)
2.01			— (—, —)
2.05			— (—, —)
2.12			11 (—, —)
2.16			— (—, —)
2.2			— (—, —)
2.3			15 (—, —)
2.31			6.0 (—, —)
2.32			15 (8.0, —)
2.42			4.0 (—, —)
2.53			— (—, —)
2.57			23 (—, —)
2.6			— (—, —)
2.7			— (—, —)
2.73			22 (—, —)
2.8			5.0 (1.0, —)
2.88			13 (—, —)
2.95			17 (—, —)
2.96			10 (—, —)
3.05			8.0 (—, —)
3.06			12 (—, —)
3.2			— (—, —)
3.26			8.0 (—, —)
3.28			6.0 (—, —)
3.49			8.0 (5.0, —)
3.52			8.0 (—, —)
3.6			16 (—, —)
3.97			5.0 (—, —)
4.01			3.0 (—, —)
4.06			6.0 (—, —)
4.36			4.0 (—, —)
4.43			7.0 (—, —)
4.48			2.0 (—, —)
4.91			2.0 (—, —)
5			1.0 (—, —)

1.5.2 Multivariate Cox Proportional Hazards Model

##################################
# Formulating a Multivariate Cox Proportional Hazards Model
# with both the Factor and Covariate Variables
##################################
(DBP.Analysis.MultivariateCoxPH.GroupWBC <- coxph(Surv(Week, Status) ~ Group + WBC, data = DBP.Analysis))

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC, data = DBP.Analysis)
## 
##                   coef exp(coef) se(coef)      z       p
## GroupTreatment -1.3861    0.2501   0.4248 -3.263  0.0011
## WBC             1.6909    5.4243   0.3359  5.034 4.8e-07
## 
## Likelihood ratio test=46.71  on 2 df, p=7.187e-11
## n= 42, number of events= 30

summary(DBP.Analysis.MultivariateCoxPH.GroupWBC )

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment -1.3861    0.2501   0.4248 -3.263   0.0011 ** 
## WBC             1.6909    5.4243   0.3359  5.034  4.8e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment    0.2501     3.9991    0.1088    0.5749
## WBC               5.4243     0.1844    2.8082   10.4776
## 
## Concordance= 0.852  (se = 0.04 )
## Likelihood ratio test= 46.71  on 2 df,   p=7e-11
## Wald test            = 33.6  on 2 df,   p=5e-08
## Score (logrank) test = 46.07  on 2 df,   p=1e-10

tbl_regression(DBP.Analysis.MultivariateCoxPH.GroupWBC, 
               exponentiate = TRUE)

Characteristic	HR¹	95% CI¹	p-value
Group
Placebo	—	—
Treatment	0.25	0.11, 0.57	0.001
WBC	5.42	2.81, 10.5	<0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

(DBP.Analysis.MultivariateCoxPH.GroupWBC.ForestPlot <- ggforest(DBP.Analysis.MultivariateCoxPH.GroupWBC , 
          data=DBP.Analysis,
          main = "",
          fontsize = 1.5))

##################################
# Plotting the Kaplan-Meier survival curves
# for the formulated Multivariate Cox Proportional Hazards Model
# involving the Factor and Covariate variables
##################################
DBP.Analysis.MultivariateCoxPH.GroupWBC.KaplanMeierEstimates <- survfit(DBP.Analysis.MultivariateCoxPH.GroupWBC , data = DBP.Analysis)

SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC <- ggsurvplot(DBP.Analysis.MultivariateCoxPH.GroupWBC.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ Group + WBC)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Formulating a Multivariate Cox Proportional Hazards Model
# with both the Factor and Covariate Variables
# including their interaction
##################################
(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction <- coxph(Surv(Week, Status) ~ Group + WBC + Group*WBC, data = DBP.Analysis))

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC + Group * WBC, 
##     data = DBP.Analysis)
## 
##                        coef exp(coef) se(coef)      z        p
## GroupTreatment     -2.37491   0.09302  1.70547 -1.393    0.164
## WBC                 1.55489   4.73459  0.39866  3.900 9.61e-05
## GroupTreatment:WBC  0.31752   1.37372  0.52579  0.604    0.546
## 
## Likelihood ratio test=47.07  on 3 df, p=3.356e-10
## n= 42, number of events= 30

summary(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC + Group * WBC, 
##     data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                        coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment     -2.37491   0.09302  1.70547 -1.393    0.164    
## WBC                 1.55489   4.73459  0.39866  3.900 9.61e-05 ***
## GroupTreatment:WBC  0.31752   1.37372  0.52579  0.604    0.546    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                    exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment       0.09302    10.7500  0.003288     2.632
## WBC                  4.73459     0.2112  2.167413    10.342
## GroupTreatment:WBC   1.37372     0.7280  0.490169     3.850
## 
## Concordance= 0.851  (se = 0.041 )
## Likelihood ratio test= 47.07  on 3 df,   p=3e-10
## Wald test            = 32.39  on 3 df,   p=4e-07
## Score (logrank) test = 49.86  on 3 df,   p=9e-11

tbl_regression(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction, 
               exponentiate = TRUE)

Characteristic	HR¹	95% CI¹	p-value
Group
Placebo	—	—
Treatment	0.09	0.00, 2.63	0.2
WBC	4.73	2.17, 10.3	<0.001
Group * WBC
Treatment * WBC	1.37	0.49, 3.85	0.5
¹ HR = Hazard Ratio, CI = Confidence Interval

(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction.ForestPlot <- ggforest(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction, 
          data=DBP.Analysis,
          main = "",
          fontsize = 1.5))

##################################
# Plotting the Kaplan-Meier survival curves
# for the formulated Multivariate Cox Proportional Hazards Model
# involving the Factor and Covariate variables
# including their interaction
##################################
DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction.KaplanMeierEstimates <- survfit(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction, data = DBP.Analysis)

SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC.WithInteraction <- ggsurvplot(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ Group * WBC)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC.WithInteraction,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Formulating complete and reduced
# Multivariate Cox Proportional Hazards Models
# involving the Factor and Covariate Variables
##################################
DBP.Analysis.SurvivalResponse <- "Surv(Week, Status)"
DBP.Analysis.SurvivalPredictorComplete <- c("Group", "WBC")
DBP.Analysis.SurvivalPredictorReduced <- c("Group", "WBC")

DBP.Analysis.CoxPHModelSummary <- DBP.Analysis %>%
  finalfit(DBP.Analysis.SurvivalResponse, 
           DBP.Analysis.SurvivalPredictorComplete, 
           DBP.Analysis.SurvivalPredictorReduced, 
           keep_models = TRUE,
           add_dependent_label = FALSE) %>%
  rename("Overall Survival" = label) %>%
  rename(" " = levels) %>%
  rename("  " = all)

kable(DBP.Analysis.CoxPHModelSummary)

Overall Survival			HR (univariable)	HR (multivariable)	HR (multivariable reduced)
Group	Placebo	21 (50.0)	-	-	-
	Treatment	21 (50.0)	0.21 (0.09-0.47, p<0.001)	0.25 (0.11-0.57, p=0.001)	0.25 (0.11-0.57, p=0.001)
WBC	Mean (SD)	2.9 (0.9)	5.19 (2.89-9.30, p<0.001)	5.42 (2.81-10.48, p<0.001)	5.42 (2.81-10.48, p<0.001)

1.5.3 Model Selection

##################################
# Reusing formulated models
##################################
summary(DBP.Analysis.UnivariateCoxPH.Group)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment -1.5721    0.2076   0.4124 -3.812 0.000138 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment    0.2076      4.817   0.09251    0.4659
## 
## Concordance= 0.69  (se = 0.041 )
## Likelihood ratio test= 16.35  on 1 df,   p=5e-05
## Wald test            = 14.53  on 1 df,   p=1e-04
## Score (logrank) test = 17.25  on 1 df,   p=3e-05

summary(DBP.Analysis.MultivariateCoxPH.GroupWBC)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment -1.3861    0.2501   0.4248 -3.263   0.0011 ** 
## WBC             1.6909    5.4243   0.3359  5.034  4.8e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment    0.2501     3.9991    0.1088    0.5749
## WBC               5.4243     0.1844    2.8082   10.4776
## 
## Concordance= 0.852  (se = 0.04 )
## Likelihood ratio test= 46.71  on 2 df,   p=7e-11
## Wald test            = 33.6  on 2 df,   p=5e-08
## Score (logrank) test = 46.07  on 2 df,   p=1e-10

summary(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC + Group * WBC, 
##     data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                        coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment     -2.37491   0.09302  1.70547 -1.393    0.164    
## WBC                 1.55489   4.73459  0.39866  3.900 9.61e-05 ***
## GroupTreatment:WBC  0.31752   1.37372  0.52579  0.604    0.546    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                    exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment       0.09302    10.7500  0.003288     2.632
## WBC                  4.73459     0.2112  2.167413    10.342
## GroupTreatment:WBC   1.37372     0.7280  0.490169     3.850
## 
## Concordance= 0.851  (se = 0.041 )
## Likelihood ratio test= 47.07  on 3 df,   p=3e-10
## Wald test            = 32.39  on 3 df,   p=4e-07
## Score (logrank) test = 49.86  on 3 df,   p=9e-11

##################################
# Formulating the linear predictors
# for the formulated Univariate Cox Proportional Hazards Models
##################################
DBP.Analysis$LP.Univariate.Group <- predict(DBP.Analysis.UnivariateCoxPH.Group, type = "lp")

##################################
# Formulating the linear predictors
# for the formulated Multivariate Cox Proportional Hazards Models
##################################
DBP.Analysis$LP.Multivariate.GroupWBC <- predict(DBP.Analysis.MultivariateCoxPH.GroupWBC, type = "lp")
DBP.Analysis$LP.Multivariate.GroupWBC.WithInteraction <- predict(DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction, type = "lp")

##################################
# Conducting a likelihood ratio test between nested models
# including the formulated Multivariate Cox Proportional Hazards Models
# involving the Factor variable only
# and both Factor and Covariate variables
##################################
(LikelihoodRatioTest.CoxPHGroup.CoxPHGroupWBC <- anova(DBP.Analysis.UnivariateCoxPH.Group,
      DBP.Analysis.MultivariateCoxPH.GroupWBC,
      test="LRT"))

## Analysis of Deviance Table
##  Cox model: response is  Surv(Week, Status)
##  Model 1: ~ Group
##  Model 2: ~ Group + WBC
##    loglik  Chisq Df P(>|Chi|)    
## 1 -85.008                        
## 2 -69.828 30.361  1 3.587e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##################################
# Conducting a likelihood ratio test between nested models
# including the formulated Multivariate Cox Proportional Hazards Models
# involving both Factor and Covariate variables only
# and both Factor and Covariate variables including their interaction
##################################
(LikelihoodRatioTest.CoxPHGroupWBC.CoxPHGroupWBCWithInteraction <- anova(DBP.Analysis.MultivariateCoxPH.GroupWBC,
      DBP.Analysis.MultivariateCoxPH.GroupWBC.WithInteraction,
      test="LRT"))

## Analysis of Deviance Table
##  Cox model: response is  Surv(Week, Status)
##  Model 1: ~ Group + WBC
##  Model 2: ~ Group + WBC + Group * WBC
##    loglik  Chisq Df P(>|Chi|)
## 1 -69.828                    
## 2 -69.648 0.3594  1    0.5488

1.5.4 Model Performance Validation

##################################
# Measuring the Harrel's Concordance Index
# for the formulated Univariate Cox Proportional Hazards Model
# involving the Factor variable
##################################
rcorrcens(formula = Surv(Week, Status) ~ I(-1 * LP.Univariate.Group), data = DBP.Analysis)

## 
## Somers' Rank Correlation for Censored Data    Response variable:Surv(Week, Status)
## 
##                                C  Dxy aDxy    SD    Z P  n
## I(-1 * LP.Univariate.Group) 0.69 0.38 0.38 0.082 4.62 0 42

LP.Univariate.Group.Concordance <- survConcordance(Surv(Week, Status) ~ LP.Univariate.Group,
                                               data = DBP.Analysis)

(LP.Univariate.Group.Concordance$concordance)

## concordant 
##  0.6900421

DBP.Analysis.UnivariateCoxPH.Group.Validation <- cph(Surv(Week, Status) ~ Group, 
                                                     x=TRUE, 
                                                     y=TRUE, 
                                                     data = DBP.Analysis)

set.seed (88888888)
DBP.Analysis.UnivariateCoxPH.Group.Validation.Summary <- validate(DBP.Analysis.UnivariateCoxPH.Group.Validation, B =500)

(DBP.Analysis.UnivariateCoxPH.Group.ConcordanceIndex.Optimistic <- (DBP.Analysis.UnivariateCoxPH.Group.Validation.Summary[1,1]/2)+0.50)

## [1] 0.6900421

(DBP.Analysis.UnivariateCoxPH.Group.ConcordanceIndex.OptimismCorrected <- (DBP.Analysis.UnivariateCoxPH.Group.Validation.Summary[1,5]/2)+0.50)

## [1] 0.6894831

##################################
# Measuring the Harrel's Concordance Index
# for the formulated Multivariate Cox Proportional Hazards Model
# involving the Factor and Covariate variables
##################################
rcorrcens(formula = Surv(Week, Status) ~ I(-1 * LP.Multivariate.GroupWBC), data = DBP.Analysis)

## 
## Somers' Rank Correlation for Censored Data    Response variable:Surv(Week, Status)
## 
##                                      C   Dxy  aDxy    SD    Z P  n
## I(-1 * LP.Multivariate.GroupWBC) 0.852 0.704 0.704 0.081 8.73 0 42

LP.Multivariate.GroupWBC.Concordance <- survConcordance(Surv(Week, Status) ~ LP.Multivariate.GroupWBC, 
                                                        data = DBP.Analysis)

(LP.Multivariate.GroupWBC.Concordance$concordance)

## concordant 
##  0.8520337

DBP.Analysis.Multivariate.GroupWBC.Validation <- cph(Surv(Week, Status) ~ Group + WBC, 
                                                     x=TRUE, 
                                                     y=TRUE, 
                                                     data = DBP.Analysis)

set.seed (88888888)
DBP.Analysis.Multivariate.GroupWBC.Validation.Summary <- validate(DBP.Analysis.Multivariate.GroupWBC.Validation, B =500)

(DBP.Analysis.Multivariate.GroupWBC.ConcordanceIndex.Optimistic <- (DBP.Analysis.Multivariate.GroupWBC.Validation.Summary[1,1]/2)+0.50)

## [1] 0.8520337

(DBP.Analysis.Multivariate.GroupWBC.ConcordanceIndex.OptimismCorrected <- (DBP.Analysis.Multivariate.GroupWBC.Validation.Summary[1,5]/2)+0.50)

## [1] 0.8448274

##################################
# Measuring the Harrel's Concordance Index
# for the formulated Multivariate Cox Proportional Hazards Model
# involving the Factor and Covariate variables
# including their interaction
##################################
rcorrcens(formula = Surv(Week, Status) ~ I(-1 * LP.Multivariate.GroupWBC.WithInteraction), data = DBP.Analysis)

## 
## Somers' Rank Correlation for Censored Data    Response variable:Surv(Week, Status)
## 
##                                                      C   Dxy  aDxy    SD    Z P
## I(-1 * LP.Multivariate.GroupWBC.WithInteraction) 0.851 0.701 0.701 0.082 8.57 0
##                                                   n
## I(-1 * LP.Multivariate.GroupWBC.WithInteraction) 42

LP.Multivariate.GroupWBC.WithInteraction.Concordance <- survConcordance(Surv(Week, Status) ~ LP.Multivariate.GroupWBC.WithInteraction, data = DBP.Analysis)

(LP.Multivariate.GroupWBC.WithInteraction.Concordance$concordance)

## concordant 
##  0.8506311

DBP.Analysis.Multivariate.GroupWBC.WithInteraction.Validation <- cph(Surv(Week, Status) ~ Group + WBC + Group*WBC, 
                                                     x=TRUE, 
                                                     y=TRUE, 
                                                     data = DBP.Analysis)

set.seed (88888888)
DBP.Analysis.Multivariate.GroupWBC.WithInteraction.Validation.Summary <- validate(DBP.Analysis.Multivariate.GroupWBC.WithInteraction.Validation, B =500)

(DBP.Analysis.Multivariate.GroupWBC.WithInteraction.ConcordanceIndex.Optimistic <- (DBP.Analysis.Multivariate.GroupWBC.WithInteraction.Validation.Summary[1,1]/2)+0.50)

## [1] 0.8506311

(DBP.Analysis.Multivariate.GroupWBC.WithInteraction.ConcordanceIndex.OptimismCorrected <- (DBP.Analysis.Multivariate.GroupWBC.WithInteraction.Validation.Summary[1,5]/2)+0.50)

## [1] 0.8406148

1.5.5 Model Diagnostics

##################################
# Reusing selected model
##################################
summary(DBP.Analysis.MultivariateCoxPH.GroupWBC)

## Call:
## coxph(formula = Surv(Week, Status) ~ Group + WBC, data = DBP.Analysis)
## 
##   n= 42, number of events= 30 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## GroupTreatment -1.3861    0.2501   0.4248 -3.263   0.0011 ** 
## WBC             1.6909    5.4243   0.3359  5.034  4.8e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                exp(coef) exp(-coef) lower .95 upper .95
## GroupTreatment    0.2501     3.9991    0.1088    0.5749
## WBC               5.4243     0.1844    2.8082   10.4776
## 
## Concordance= 0.852  (se = 0.04 )
## Likelihood ratio test= 46.71  on 2 df,   p=7e-11
## Wald test            = 33.6  on 2 df,   p=5e-08
## Score (logrank) test = 46.07  on 2 df,   p=1e-10

tbl_regression(DBP.Analysis.MultivariateCoxPH.GroupWBC, 
               exponentiate = TRUE)

Characteristic	HR¹	95% CI¹	p-value
Group
Placebo	—	—
Treatment	0.25	0.11, 0.57	0.001
WBC	5.42	2.81, 10.5	<0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

(DBP.Analysis.MultivariateCoxPH.GroupWBC.ForestPlot <- ggforest(DBP.Analysis.MultivariateCoxPH.GroupWBC , 
          data=DBP.Analysis,
          main = "",
          fontsize = 1.5))

##################################
# Plotting the Kaplan-Meier survival curves
# for the formulated Multivariate Cox Proportional Hazards Model
# involving the Factor and Covariate variables
##################################
DBP.Analysis.MultivariateCoxPH.GroupWBC.KaplanMeierEstimates <- survfit(DBP.Analysis.MultivariateCoxPH.GroupWBC , data = DBP.Analysis)

SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC <- ggsurvplot(DBP.Analysis.MultivariateCoxPH.GroupWBC.KaplanMeierEstimates,
                                                     title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ Group + WBC)",
                                                     pval = FALSE,
                                                     conf.int = TRUE,
                                                     conf.int.style = "ribbon",
                                                     xlab = "Time (Weeks)",
                                                     ylab="Estimated Survival Probability",
                                                     break.time.by = 5,
                                                     ggtheme = theme_bw(),
                                                     risk.table = "abs_pct",
                                                     risk.table.title="Number at Risk (Survival Probability)",
                                                     risk.table.y.text.col = FALSE,
                                                     risk.table.y.text = TRUE,
                                                     fontsize = 5,
                                                     ncensor.plot = FALSE,
                                                     surv.median.line = "hv",
                                                     palette = c("#000000"))

ggpar(SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Plotting the Kaplan-Meier survival curves
# of the Factor variable adjusted by the Covariate variable 
# using the formulated Multivariate Cox Proportional Hazards Model
##################################

(DBP.GroupAdjustedByWBC <- with(DBP.Analysis,
                                 data.frame(Group = c("Placebo", "Treatment"),
                                            WBC = rep(mean(WBC, na.rm = TRUE), 2))))

##       Group      WBC
## 1   Placebo 2.930238
## 2 Treatment 2.930238

DBP.Analysis.MultivariateCoxPH.GroupWBC.GroupAdjustedByWBC.KaplanMeierEstimates <- survfit(DBP.Analysis.MultivariateCoxPH.GroupWBC, 
                                                                                             newdata = DBP.GroupAdjustedByWBC, 
                                                                                             data=DBP.Analysis)


SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC.GroupAdjustedByWBC <- ggsurvplot(DBP.Analysis.MultivariateCoxPH.GroupWBC.GroupAdjustedByWBC.KaplanMeierEstimates,
                                                                         title = "Kaplan-Meier Survival Curves (CoxPH Model : Survival ~ Group + WBC)",
                                                                         pval = TRUE,
                                                                         conf.int = TRUE,
                                                                         conf.int.style = "ribbon",
                                                                         xlab = "Time (Weeks)",
                                                                         ylab="Estimated Survival Probability",
                                                                         break.time.by = 5,
                                                                         ggtheme = theme_bw(),
                                                                         risk.table = "abs_pct",
                                                                         risk.table.title="Number at Risk (Survival Probability)",
                                                                         risk.table.y.text.col = FALSE,
                                                                         risk.table.y.text = TRUE,
                                                                         fontsize = 5,
                                                                         ncensor.plot = FALSE,
                                                                         surv.median.line = "hv",
                                                                         legend.labs = c("Placebo", "Treatment"),
                                                                         palette = c("#3259A0","#FF5050"))

ggpar(SurvivalTimeInWeek.KaplanMeier.MultivariateCoxPH.GroupWBC.GroupAdjustedByWBC,
      font.title = c(25,"bold"),
      font.x = c(20,"bold"),
      font.y = c(20,"bold"),
      font.legend=c(20,"bold"),
      font.xtickslab=c(15,"bold","black"),
      font.ytickslab=c(15,"bold","black"))

##################################
# Testing the proportional hazards assumption
# based on the scaled Schoenfield residuals
##################################
(DBP.Analysis.MultivariateCoxPH.GroupWBC.PHAssumptionCheck <- cox.zph(DBP.Analysis.MultivariateCoxPH.GroupWBC))

##           chisq df    p
## Group  8.27e-05  1 0.99
## WBC    4.00e-02  1 0.84
## GLOBAL 4.02e-02  2 0.98

##################################
# Formulating the graphical verification
# of the proportional hazards assumption test results
# using the scaled Schoenfeld residuals against time
##################################
(DBP.Analysis.MultivariateCoxPH.GroupWBC.PHAssumptionPlot <- ggcoxzph(DBP.Analysis.MultivariateCoxPH.GroupWBC.PHAssumptionCheck,
         point.col = "red",
         point.size = 3,
         point.shape = 19,
         point.alpha = 0.80,
  ggtheme = theme_survminer(),
  font.main = c(25,"bold"),
  font.x = c(20,"bold"),
  font.y = c(20,"bold"),
  font.xtickslab=c(15,"bold","black"),
  font.ytickslab=c(15,"bold","black")))

##################################
# Formulating the graphical verification
# to evaluate potential influential observations
# using the deviance residuals against observations
##################################
ggcoxdiagnostics(DBP.Analysis.MultivariateCoxPH.GroupWBC, 
 type = "deviance", 
 ox.scale = "observation.id",
 point.col = "red",
 point.size = 3,
 point.shape = 19,
 point.alpha = 0.80,
 ggtheme = theme_survminer(),
 font.main = c(25,"bold"),
 font.x = c(20,"bold"),
 font.y = c(20,"bold"),
 font.xtickslab=c(15,"bold","black"),
 font.ytickslab=c(15,"bold","black"),
 main = "Deviance Residuals by Observation")

##################################
# Formulating the graphical verification
# to evaluate symmetry
# using the dfbeta residuals against observations
##################################
ggcoxdiagnostics(DBP.Analysis.MultivariateCoxPH.GroupWBC, 
 type = "dfbeta", 
 ox.scale = "observation.id",
 point.col = "red",
 point.size = 3,
 point.shape = 19,
 point.alpha = 0.80,
 ggtheme = theme_survminer(),
 font.main = c(25,"bold"),
 font.x = c(20,"bold"),
 font.y = c(20,"bold"),
 font.xtickslab=c(15,"bold","black"),
 font.ytickslab=c(15,"bold","black"),
 main = "Dfbeta Residuals by Observation")

##################################
# Formulating the graphical verification
# to evaluate linearity of the numeric covariate
# using the martingale residuals of the null cox proportional hazards model
# and the individual values of the numeric covariate 
##################################
ggcoxfunctional(Surv(Week, Status) ~ WBC, 
 data = DBP.Analysis,
 point.col = "red",
 point.size = 3,
 point.shape = 19,
 point.alpha = 0.80,
 ggtheme = theme_survminer(),
 font.main = c(25,"bold"),
 font.x = c(20,"bold"),
 font.y = c(20,"bold"),
 font.xtickslab=c(15,"bold","black"),
 font.ytickslab=c(15,"bold","black"),
 main = "Martingale Residuals by WBC")

1.6 References

[Book] Survival Analysis : A Self-Learning Text by David Kleinbaum and Mitchel Klein
[Book] Applied Survival Analysis Using R by Dirk Moore
[Book] R for Health Data Science by Ewen Harrison and Riinu Pius
[Book] Supervised Machine Learning by Michael Foley
[Book] Data Science for Biological, Medical and Health Research by Thomas Love
[Article] Survival Analysis by Alboukadel Kassambara
[Article] Survival Analysis in R by Emily Zabor
[Article] Survival Analysis with R by Joseph Rickert
[Article] Assessment of Discrimination in Survival Analysis by R-Studio Team
[Article] Cox Model Assumptions by Alboukadel Kassambara
[Publication] Survival Analysis Part I: Basic Concepts and First Analyses by Taane Clark (British Journal of Cancer)
[Publication] Survival Analysis Part II: Multivariate Data Analysis – An Introduction to Concepts and Methods by Mike Bradburn (British Journal of Cancer)
[Publication] Survival Analysis Part III: Multivariate Data Analysis – Choosing a Model and Assessing its Adequacy and Fit by Mike Bradburn (British Journal of Cancer)
[Publication] Survival Analysis Part IV: Further Concepts and Methods in Survival Analysis by Taane Clark (British Journal of Cancer)
[Tutorial] Survival Analysis / Time-To-Event Analysis in R by Heidi Seibold (Datacamp)
[R Package] moments by Lukasz Komsta and Frederick Novomestky
[R Package] car by John Fox and Sanford Weisberg
[R Package] multcomp by Torsten Hothorn, Frank Bretz and Peter Westfall
[R Package] effects by John Fox and Sanford Weisberg
[R Package] psych by William Revelle
[R Package] ggplot2 by Hadley Wickham
[R Package] dplyr by Hadley Wickham
[R Package] ggpubr by Alboukadel Kassambara
[R Package] rstatix by Alboukadel Kassambara
[R Package] ggfortify by Yuan Tang
[R Package] trend by Thorsten Pohlert
[R Package] survival by Terry Therneau
[R Package] rms by Frank Harrell
[R Package] survminer by Alboukadel Kassambara
[R Package] Hmisc by Frank Harrel
[R Package] finalfit by Ewen Harrison
[R Package] knitr by Yihui Xie
[R Package] gtsummary by Daniel Sjoberg

Methods : Survival Analysis and Descriptive Modelling for a Two-Group Right-Censored Data with Time-Independent Variables Using Cox Proportional Hazards Model

John Pauline Pineda

May 5, 2022