BIO 223 Applied Survival Analysis Chapter 8: Parametric Survival Analysis

References

survival::survreg

The survreg function in R runs parametric accelerated failure time (AFT) models. The key assumption is that survival time accelerates (or decelerates) by a constant factor when comparing different levels of covariates. The Weibull distribution has the desirable property in that if the AFT assumption holds then the PH assumption also holds.

The coefficients have the AFT interpretation, i.e., exp(coef) means the factor by which the survival time is multiplied for that group compared to the baseline. The scale parameter is the reciprocal of the shape parameter. This terminology is consistent with SAS PROC LIFEREG.

To obtain the hazard ratio, the coefficient need further transformation as follows.

To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),

\( \beta_j = -\alpha_j p \)

where \( p \) is the shape parameter.

Then the hazard ratio is exp\( (\beta_j) \)

The shape parameter defines changes in hazard over time.

Prepare data

## Load survival
library(survival)

## Load foreign package for Stata data
library(foreign)

## Load nursing home dataset
nhome <- read.dta("~/statistics/bio223/nursinghome.dta")

## Create survival object
nhome$SurvObj <- with(nhome, Surv(los,cens == 1))

Kaplan-Meier

res.km <- survfit(formula = SurvObj ~ gender, data = nhome)
res.km
Call: survfit(formula = SurvObj ~ gender, data = nhome)

         records n.max n.start events median 0.95LCL 0.95UCL
gender=0    1173  1173    1173    902    144     123     164
gender=1     418   418     418    367     70      54      88

Cox proportional hazards model

res.cox <- coxph(formula = SurvObj ~ gender, data = nhome)
summary(res.cox)
Call:
coxph(formula = SurvObj ~ gender, data = nhome)

  n= 1591, number of events= 1269 

         coef exp(coef) se(coef)    z Pr(>|z|)    
gender 0.3958    1.4855   0.0621 6.37  1.9e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

       exp(coef) exp(-coef) lower .95 upper .95
gender      1.49      0.673      1.32      1.68

Concordance= 0.541  (se = 0.006 )
Rsquare= 0.024   (max possible= 1 )
Likelihood ratio test= 38.3  on 1 df,   p=6.11e-10
Wald test            = 40.6  on 1 df,   p=1.85e-10
Score (logrank) test = 41.1  on 1 df,   p=1.41e-10

Having gender = 1 increase the hazard of events by a factor of 1.49 (1.49 times more hazard compared to the baseline).

Exponential model

survreg() fits accelerated failure models, not proportional hazards models. The coefficients are logarithms of ratios of survival times, so a positive coefficient means longer survival.

res.exp <- survreg(formula = SurvObj ~ gender, data = nhome, dist = "exponential")
summary(res.exp)

Call:
survreg(formula = SurvObj ~ gender, data = nhome, dist = "exponential")
             Value Std. Error      z        p
(Intercept)  5.842     0.0333 175.46 0.00e+00
gender      -0.516     0.0619  -8.34 7.62e-17

Scale fixed at 1 

Exponential distribution
Loglik(model)= -8493   Loglik(intercept only)= -8525
    Chisq= 64.2 on 1 degrees of freedom, p= 1.1e-15 
Number of Newton-Raphson Iterations: 5 
n= 1591 

AFT interpretation

Having gender = 1 accelerates the time to event by a factor of exp(-0.516) = 0.597 (0.597 times shorter survival time compared to the baseline survival).

PH interpretation

To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),

\( \beta_j = -\alpha_j p \)

where \( p \) is the shape parameter.

For this example, the coefficient is multiplied by -1, then multiplied by the shape parameter (1/scale parameter = 1/1 for exponential model).

exp(-0.516 * -1 * 1/1) = 1.68

This is the hazard ratio comparing gender = 1 to gender = 0.

Weibull model

res.weibull <- survreg(formula = SurvObj ~ gender,
                       data      = nhome,
                       dist      = c("weibull","exponential","gaussian","logistic","lognormal","loglogistic")[1])
summary(res.weibull)

Call:
survreg(formula = SurvObj ~ gender, data = nhome, dist = c("weibull", 
    "exponential", "gaussian", "logistic", "lognormal", "loglogistic")[1])
             Value Std. Error      z        p
(Intercept)  5.756     0.0542 106.21 0.00e+00
gender      -0.673     0.1011  -6.66 2.67e-11
Log(scale)   0.487     0.0232  20.99 8.94e-98

Scale= 1.63 

Weibull distribution
Loglik(model)= -8218   Loglik(intercept only)= -8239
    Chisq= 41.73 on 1 degrees of freedom, p= 1e-10 
Number of Newton-Raphson Iterations: 5 
n= 1591 

1/Scale = Weibull shape parameter. The terminology is consistent with SAS PROC LIFEREG.

AFT interpretation

Having gender = 1 accelerates the time to event by a factor of exp(-0.673) = 0.510 (0.510 times shorter survival time compared to the baseline survival).

PH interpretation

To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),

\( \beta_j = -\alpha_j p \)

where \( p \) is the shape parameter.

For this example, the coefficient is multiplied by -1, then multiplied by the shape parameter (1/scale parameter = 1/1.63 for this Weibull model).

exp(-0.673 * -1 * 1/1.63) = 1.51

This is the hazard ratio comparing gender = 1 to gender = 0.

Plotting together

## rms for survplot()
library(rms)
## plyr for List to (none) PLY (l_ply) for looping
library(plyr)

## Plot KM curves
survplot(fit  = res.km,
         conf = c("none","bands","bars")[1],
         xlab = "", ylab = "Survival",
         label.curves = TRUE,                     # label curves directly
         time.inc = 100,                          # time increment
         n.risk   = TRUE,                         # show number at risk
         )

## Plot Cox prediction (use survfit)
lines(survfit(res.cox, newdata = data.frame(gender = 0:1)), col = "green", lty = 1:2, mark.time = FALSE)

## Define a function to plot survreg prediction by gender
survreg.curves <- function(model, col = "black", values = c(0, 1),
                           seq.quantiles = seq(from = 0.00, to = 1.00, by = 0.01)) {

    l_ply(values, function(X) {
        lines(x = predict(model,                    # survreg object to use
                  newdata = data.frame(gender = X), # Dataset to perform prediction for
                  type = "quantile",                # Predict survival time (X-axis values) given event quantile
                  p = seq.quantiles),               # Vector of quantiles (Y-axis values)

              y = (1 - seq.quantiles),              # Change to survival quantile (proportion remaining)

              col = col, lty = X + 1)               # COLor and Line TYpe
    })
}

## Plot exponential model prediction
survreg.curves(res.exp, "red")

## Plot Weibull model prediction
survreg.curves(res.weibull, "blue")

## Add legends
legend(x = "topright",
       legend = c("Kaplan-Meier", "Cox (Efron)", "Exponential", "Weibull"),
       lwd = 2, bty = "n",
       col = c("black", "green", "red", "blue"))

plot of chunk unnamed-chunk-7

The Weibull model appear to fit the data better than the exponential model, as it allows decreasing hazard (shape parameter p = 1/1.63 = 0.613 < 1, thus decreasing).