Survival Analysis III: Stratified Cox

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.

Stratified Cox Procedure

The “stratifed Cox model” is a modifcation of the Cox proportional hazards (PH) model that allows for control by “stratifcation” of a predictor that does not satisfy the PH assumption. Predictors that are assumed to satisfy the PH assumption are included in the model, whereas the predictor being stratifed is not included.

er<-read.csv("D:\\Users\\EChen13\\OneDrive - JNJ\\Documents\\stat\\Survival data\\remissionData.csv", sep=";")

er$grp <- factor(er$placebo  )
library(survival)
fit1 <- with(er, coxph(Surv(time, event) ~grp+logWBC+female))
fit1

## Call:
## coxph(formula = Surv(time, event) ~ grp + logWBC + female)
## 
##          coef exp(coef) se(coef)     z        p
## grp1   1.5036    4.4978   0.4615 3.258  0.00112
## logWBC 1.6819    5.3760   0.3366 4.997 5.82e-07
## female 0.3147    1.3698   0.4545 0.692  0.48872
## 
## Likelihood ratio test=47.19  on 3 df, p=3.171e-10
## n= 42, number of events= 30

coefficient of female is not significant. use cox.zph to check for PH assumption violation

print(cox.zph(fit1))

##        chisq df    p
## grp    0.036  1 0.85
## logWBC 0.142  1 0.71
## female 5.420  1 0.02
## GLOBAL 5.879  3 0.12

The result shows female varaible does not satisfy PH. The variable that do not satisfy PH assumption are used in stratified COX procedure. In the follow code, the covariate is entered as strata(female).

er$we_cat<-cut(er$logWBC,c(0,2.3,3,8))
fit<-with(er, coxph(Surv(time, event)~grp+logWBC+strata(female)))

print(fit)

## Call:
## coxph(formula = Surv(time, event) ~ grp + logWBC + strata(female))
## 
##          coef exp(coef) se(coef)     z        p
## grp1   0.9981    2.7131   0.4736 2.108   0.0351
## logWBC 1.4537    4.2787   0.3441 4.225 2.39e-05
## 
## Likelihood ratio test=32.06  on 2 df, p=1.092e-07
## n= 42, number of events= 30

The computer results show that the log WBC and grp variables are included in the model listing, whereas the sex variable is not included; rather, the model stratifes on the sex variable, as indicated at the bottom of the output. Note that the sex variable is being adjusted by stratifcation, whereas log WBCis being adjusted by its inclusion in the model along with grp. Another important aspect to note is that sex variable is not included in the model summary because it is controlled by stratifcation.

Another question: If we allow for interaction, then we would expect to obtain diffierent coefficients for each of the (SEX) strata. But are corresponding coeficients statistically diffierent? That is, which model is more appropriate statistically, the no-interaction model or the interaction model? To answer this question, one must fit a model that is an alternate specification of model with interactions

er<-read.csv("D:\\Users\\EChen13\\OneDrive - JNJ\\Documents\\stat\\Survival data\\remissionData.csv", sep=";")
er$grp <- er$placebo  
fit <- with(er,coxph(Surv(time,event)~ grp + logWBC + strata(female)))
print(fit)

## Call:
## coxph(formula = Surv(time, event) ~ grp + logWBC + strata(female))
## 
##          coef exp(coef) se(coef)     z        p
## grp    0.9981    2.7131   0.4736 2.108   0.0351
## logWBC 1.4537    4.2787   0.3441 4.225 2.39e-05
## 
## Likelihood ratio test=32.06  on 2 df, p=1.092e-07
## n= 42, number of events= 30

#should not use factor when there is interaction term

fit_alt <- with(er,coxph(Surv(time,event) ~ grp + logWBC + I(female*grp)+I(female*logWBC)+ strata(female)))
#print(fit_alt)
fit_alt1 <- with(er,coxph(Surv(time,event) ~ grp + logWBC + I(female*grp)+I(female*logWBC)))
print(fit_alt1)

## Call:
## coxph(formula = Surv(time, event) ~ grp + logWBC + I(female * 
##     grp) + I(female * logWBC))
## 
##                       coef exp(coef) se(coef)      z        p
## grp                 0.8212    2.2732   0.5743  1.430   0.1527
## logWBC              1.7750    5.9003   0.3607  4.921 8.61e-07
## I(female * grp)     1.4437    4.2364   0.8250  1.750   0.0801
## I(female * logWBC) -0.1320    0.8763   0.2081 -0.635   0.5257
## 
## Likelihood ratio test=50.9  on 4 df, p=2.341e-10
## n= 42, number of events= 30

chistat <- (-2*fit$loglik[2]) - (-2*fit_alt$loglik[2])
1-pchisq(chistat,2)

## [1] 0.1521439

Thus, it appears that despite the numerical difference between corresponding coefficients in the female and male models, there is no statistically significant difference (P=0.15). We can therefore conclude for these data that the no-interaction model is acceptable