The survreg function in R runs parametric accelerated failure time (AFT) models. The key assumption is that survival time accelerates (or decelerates) by a constant factor when comparing different levels of covariates. The Weibull distribution has the desirable property in that if the AFT assumption holds then the PH assumption also holds.
The coefficients have the AFT interpretation, i.e., exp(coef) means the factor by which the survival time is multiplied for that group compared to the baseline. The scale parameter is the reciprocal of the shape parameter. This terminology is consistent with SAS PROC LIFEREG.
To obtain the hazard ratio, the coefficient need further transformation as follows.
To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),
\( \beta_j = -\alpha_j p \)
where \( p \) is the shape parameter.
Then the hazard ratio is exp\( (\beta_j) \)
The shape parameter defines changes in hazard over time.
## Load survival
library(survival)
## Load foreign package for Stata data
library(foreign)
## Load nursing home dataset
nhome <- read.dta("~/statistics/bio223/nursinghome.dta")
## Create survival object
nhome$SurvObj <- with(nhome, Surv(los,cens == 1))
res.km <- survfit(formula = SurvObj ~ gender, data = nhome)
res.km
Call: survfit(formula = SurvObj ~ gender, data = nhome)
records n.max n.start events median 0.95LCL 0.95UCL
gender=0 1173 1173 1173 902 144 123 164
gender=1 418 418 418 367 70 54 88
res.cox <- coxph(formula = SurvObj ~ gender, data = nhome)
summary(res.cox)
Call:
coxph(formula = SurvObj ~ gender, data = nhome)
n= 1591, number of events= 1269
coef exp(coef) se(coef) z Pr(>|z|)
gender 0.3958 1.4855 0.0621 6.37 1.9e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
gender 1.49 0.673 1.32 1.68
Concordance= 0.541 (se = 0.006 )
Rsquare= 0.024 (max possible= 1 )
Likelihood ratio test= 38.3 on 1 df, p=6.11e-10
Wald test = 40.6 on 1 df, p=1.85e-10
Score (logrank) test = 41.1 on 1 df, p=1.41e-10
Having gender = 1 increase the hazard of events by a factor of 1.49 (1.49 times more hazard compared to the baseline).
survreg() fits accelerated failure models, not proportional hazards models. The coefficients are logarithms of ratios of survival times, so a positive coefficient means longer survival.
res.exp <- survreg(formula = SurvObj ~ gender, data = nhome, dist = "exponential")
summary(res.exp)
Call:
survreg(formula = SurvObj ~ gender, data = nhome, dist = "exponential")
Value Std. Error z p
(Intercept) 5.842 0.0333 175.46 0.00e+00
gender -0.516 0.0619 -8.34 7.62e-17
Scale fixed at 1
Exponential distribution
Loglik(model)= -8493 Loglik(intercept only)= -8525
Chisq= 64.2 on 1 degrees of freedom, p= 1.1e-15
Number of Newton-Raphson Iterations: 5
n= 1591
AFT interpretation
Having gender = 1 accelerates the time to event by a factor of exp(-0.516) = 0.597 (0.597 times shorter survival time compared to the baseline survival).
PH interpretation
To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),
\( \beta_j = -\alpha_j p \)
where \( p \) is the shape parameter.
For this example, the coefficient is multiplied by -1, then multiplied by the shape parameter (1/scale parameter = 1/1 for exponential model).
exp(-0.516 * -1 * 1/1) = 1.68
This is the hazard ratio comparing gender = 1 to gender = 0.
res.weibull <- survreg(formula = SurvObj ~ gender,
data = nhome,
dist = c("weibull","exponential","gaussian","logistic","lognormal","loglogistic")[1])
summary(res.weibull)
Call:
survreg(formula = SurvObj ~ gender, data = nhome, dist = c("weibull",
"exponential", "gaussian", "logistic", "lognormal", "loglogistic")[1])
Value Std. Error z p
(Intercept) 5.756 0.0542 106.21 0.00e+00
gender -0.673 0.1011 -6.66 2.67e-11
Log(scale) 0.487 0.0232 20.99 8.94e-98
Scale= 1.63
Weibull distribution
Loglik(model)= -8218 Loglik(intercept only)= -8239
Chisq= 41.73 on 1 degrees of freedom, p= 1e-10
Number of Newton-Raphson Iterations: 5
n= 1591
1/Scale = Weibull shape parameter. The terminology is consistent with SAS PROC LIFEREG.
AFT interpretation
Having gender = 1 accelerates the time to event by a factor of exp(-0.673) = 0.510 (0.510 times shorter survival time compared to the baseline survival).
PH interpretation
To translate the coefficient in an AFT model \( \alpha_j \) to that of a PH model \( \beta_j \),
\( \beta_j = -\alpha_j p \)
where \( p \) is the shape parameter.
For this example, the coefficient is multiplied by -1, then multiplied by the shape parameter (1/scale parameter = 1/1.63 for this Weibull model).
exp(-0.673 * -1 * 1/1.63) = 1.51
This is the hazard ratio comparing gender = 1 to gender = 0.
## rms for survplot()
library(rms)
## plyr for List to (none) PLY (l_ply) for looping
library(plyr)
## Plot KM curves
survplot(fit = res.km,
conf = c("none","bands","bars")[1],
xlab = "", ylab = "Survival",
label.curves = TRUE, # label curves directly
time.inc = 100, # time increment
n.risk = TRUE, # show number at risk
)
## Plot Cox prediction (use survfit)
lines(survfit(res.cox, newdata = data.frame(gender = 0:1)), col = "green", lty = 1:2, mark.time = FALSE)
## Define a function to plot survreg prediction by gender
survreg.curves <- function(model, col = "black", values = c(0, 1),
seq.quantiles = seq(from = 0.00, to = 1.00, by = 0.01)) {
l_ply(values, function(X) {
lines(x = predict(model, # survreg object to use
newdata = data.frame(gender = X), # Dataset to perform prediction for
type = "quantile", # Predict survival time (X-axis values) given event quantile
p = seq.quantiles), # Vector of quantiles (Y-axis values)
y = (1 - seq.quantiles), # Change to survival quantile (proportion remaining)
col = col, lty = X + 1) # COLor and Line TYpe
})
}
## Plot exponential model prediction
survreg.curves(res.exp, "red")
## Plot Weibull model prediction
survreg.curves(res.weibull, "blue")
## Add legends
legend(x = "topright",
legend = c("Kaplan-Meier", "Cox (Efron)", "Exponential", "Weibull"),
lwd = 2, bty = "n",
col = c("black", "green", "red", "blue"))
The Weibull model appear to fit the data better than the exponential model, as it allows decreasing hazard (shape parameter p = 1/1.63 = 0.613 < 1, thus decreasing).