Kaplan-Meier estimator of the survival function


Stratified Kaplan-Meier estimates

The Kaplan-Meier (KM) estimator is a widely used non-parametric method for estimating the survival function, S(t). In this example, we estimate a separate KM survival function for each stratum (sex: 1 = male, 2 = female) and then compare them. The plot suggests that the survival curves differ by sex: females tend to have a higher probability of survival.
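A minimal sketch of how the stratified curves might be fitted and drawn, assuming the `lung` data set from the `survival` package and the same formula as the log-rank call shown later. The object name `km_fit`, the colours, and the axis labels are illustrative choices, not taken from the original page.

```r
library(survival)  # provides Surv(), survfit(), and the lung data set

# Fit one Kaplan-Meier curve per sex (1 = male, 2 = female).
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Number at risk, number of events, and median survival for each stratum.
print(km_fit)

# Step plot of the two estimated survival functions (time is in days).
plot(km_fit, col = c("blue", "red"),
     xlab = "Time (days)", ylab = "Estimated survival probability")
legend("topright", legend = c("Male", "Female"),
       col = c("blue", "red"), lty = 1)
```

Calling `summary(km_fit, times = c(180, 365))` would list the estimated survival probability in each stratum at chosen time points, which can make the visual comparison more concrete.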


The log-rank test compares the survival functions of the two groups formally. It indicates a statistically significant difference between them (p-value = 0.001):
Call:
survdiff(formula = Surv(time, status) ~ sex, data = lung)

        N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 138      112     91.6      4.55      10.3
sex=2  90       53     73.4      5.68      10.3

 Chisq= 10.3  on 1 degrees of freedom, p= 0.001 
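To show where the reported p-value comes from, the sketch below reruns the same `survdiff()` call and converts the chi-square statistic to a p-value by hand. The names `lr`, `df`, and `p_value` are illustrative; the degrees of freedom are taken as the number of groups minus one.

```r
library(survival)

# Same log-rank test as in the output above.
lr <- survdiff(Surv(time, status) ~ sex, data = lung)

# Two groups give 1 degree of freedom.
df <- length(lr$n) - 1

# Upper-tail probability of the chi-square statistic.
p_value <- pchisq(lr$chisq, df = df, lower.tail = FALSE)

round(c(chisq = lr$chisq, df = df, p = p_value), 4)
```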

The Cox proportional hazards model


The Cox model estimates the effects of covariates on the hazard function. It assumes proportional hazards, meaning that the effect of each covariate on the hazard (the hazard ratio) is constant over time; for example, the effect of a treatment does not change over time.

Call:
coxph(formula = Surv(time, status) ~ age + sex + wt.loss, data = lung)

  n= 214, number of events= 152 
   (14 observations deleted due to missingness)

              coef  exp(coef)   se(coef)      z Pr(>|z|)   
age      0.0200882  1.0202913  0.0096644  2.079   0.0377 * 
sex     -0.5210319  0.5939074  0.1743541 -2.988   0.0028 **
wt.loss  0.0007596  1.0007599  0.0061934  0.123   0.9024   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

        exp(coef) exp(-coef) lower .95 upper .95
age        1.0203     0.9801    1.0011    1.0398
sex        0.5939     1.6838    0.4220    0.8359
wt.loss    1.0008     0.9992    0.9887    1.0130

Concordance= 0.612  (se = 0.027 )
Likelihood ratio test= 14.67  on 3 df,   p=0.002
Wald test            = 13.98  on 3 df,   p=0.003
Score (logrank) test = 14.24  on 3 df,   p=0.003

Example interpretation:

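For each additional year of age at baseline, the hazard increases by 2.03% (a factor of 1.0203), holding the other covariates fixed.

Females have about 59% of the hazard of males (hazard ratio 0.5939), a decrease of about 41%, holding the other covariates fixed.

For each additional pound of weight loss, the hazard increases by 0.08%, holding the other covariates fixed. This effect is not statistically significant (p = 0.90), and its 95% confidence interval (0.9887 to 1.0130) includes 1.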
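The percentages in these interpretations follow from exponentiating the coefficients. The sketch below refits the model from the call shown above, stores it as `lung.cox` (the name used in the model assessment section), and converts each hazard ratio to a percent change; `hr` and `pct_change` are illustrative names.

```r
library(survival)

# Same model as in the output above.
lung.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)

# Hazard ratio for a one-unit increase in each covariate.
hr <- exp(coef(lung.cox))

# Percent change in the hazard: positive is an increase, negative a decrease.
pct_change <- 100 * (hr - 1)

round(cbind(hazard_ratio = hr, pct_change = pct_change), 2)
```

The matching confidence limits are available from `exp(confint(lung.cox))`, which should agree with the `lower .95` and `upper .95` columns of the summary.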

Model assessment



A chi-square test evaluates the null hypothesis that each covariate effect is constant (proportional) over time against the alternative that the effect changes over time.

cox.zph(lung.cox)
         chisq df    p
age     0.5077  1 0.48
sex     2.5489  1 0.11
wt.loss 0.0144  1 0.90
GLOBAL  3.0051  3 0.39

There is no strong evidence that any covariate violates the proportional hazards assumption (all p-values exceed 0.10), though the result for sex (p = 0.11) gives some suggestion of a violation.
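The test above can be reproduced roughly as follows. This sketch assumes `lung.cox` is the Cox model fitted earlier on the `lung` data; `ph_test` and `ph_table` are illustrative names. The exact statistics can vary between versions of the `survival` package, so none are hard-coded here.

```r
library(survival)

# Refit the Cox model so that this block is self-contained.
lung.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)

# Test of proportional hazards for each covariate, plus a global test.
ph_test <- cox.zph(lung.cox)

# The results are stored as a matrix with one row per covariate and a GLOBAL row.
ph_table <- ph_test$table

# Order rows by p-value so that the weakest support for the assumption comes first.
ph_table[order(ph_table[, "p"]), ]
```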


Another tool used to assess proportional hazards is a plot of a smoothed curve over the Schoenfeld residuals. Let’s check the residuals for the “sex” covariate. Again, we see some evidence of non-proportional hazards for sex, as the effect of sex seems to increase with time.

plot(cox.zph(lung.cox), var = "sex")