We want to estimate the effekt of 3 covariates: ulceration, thickness ogf tumor and sex on death from malignant melanoma This is a competing risk setting because the population can die from malignant melanoma or they can die from other causes. The timescale is days from study entry.
library(data.table)
library(prodlim)
library(foreign)
library(survival)
library(mets)
library(timereg)
library(pec)
library(riskRegression)
library(cmprsk)
melanoma<-fread("C:/Users/juliea/Desktop/Kurser/Survival analysis/Exercises/melanoma.csv")
Kaplan Meier :
setDT(melanoma)
melanoma[,Sex:=factor(sex,
levels=c(0,1),
labels=c("Female","Male"))]
KMsex<-prodlim(Surv(days/365,status==1)~Sex,data=melanoma)
plot(KMsex,
xlab="Years",
logrank = TRUE,
legend.x="bottomleft",
legend.cex=0.9)
title(main = "Kaplan Meier plot: The probability of not dying from malignant melanoma")
The plot illustrates the probability of not dying from malignant melanoma in the first 15 years after study start, for men and women respectively. Men have a higher probability of dying from malignant melanoma compared with women. The logrank test shows that this difference is statistically significant. However after 7-10 year it seems that no one in the population is dying from malignant melanoma, but are instead cencored due to end of followup or death from other causes. This competing risk setting makes it difficult to interpret the results in a meaningfull way. one solution is to use the Nelson-Aalen estimator. Beause it calculates cause specific hazards in a competing risk setting
#Nelson-Aalen
outna <- phreg(Surv(days/365,status==1)~strata(Sex),data=melanoma)
basehazplot.phreg(outna,
se=TRUE,
xlab="Years")
title(main="Death from malignant melanoma")
The Nelson-Aalen estimator calculates the cause specific cummulative hazard for both sexes. Men have a higher cumulative hazard of death from malignant melanoma compared with females.
# logrank and sex
out=survdiff(Surv(days/365,status==1) ~Sex,data=melanoma)
out
## Call:
## survdiff(formula = Surv(days/365, status == 1) ~ Sex, data = melanoma)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## Sex=Female 126 28 37.1 2.25 6.47
## Sex=Male 79 29 19.9 4.21 6.47
##
## Chisq= 6.5 on 1 degrees of freedom, p= 0.011
# stratificeret logrank
out=survdiff(Surv(days/365,status==1) ~Sex+strata(ulc),data=melanoma)
out
## Call:
## survdiff(formula = Surv(days/365, status == 1) ~ Sex + strata(ulc),
## data = melanoma)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## Sex=Female 126 28 34.7 1.28 3.31
## Sex=Male 79 29 22.3 1.99 3.31
##
## Chisq= 3.3 on 1 degrees of freedom, p= 0.0687
Before adjusting for ulceration, the was significant differnces in death from malignant melanoma between the two sexes. After adjusting for ulceration, the sex differences in death from malignant melanoma are now borderline significant.
melanoma[,thickgroup:=cut(thick,
breaks = c(0,125,275,Inf),
labels=c("low","medium","high"))]
summary(melanoma$thickgroup)
## low medium high
## 61 69 75
out1=survdiff(Surv(days/365,status==1) ~thickgroup+strata(Sex),data=melanoma)
out1
## Call:
## survdiff(formula = Surv(days/365, status == 1) ~ thickgroup +
## strata(Sex), data = melanoma)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## thickgroup=low 61 6 19.1 8.970 13.567
## thickgroup=medium 69 17 19.0 0.219 0.338
## thickgroup=high 75 34 18.9 12.120 18.750
##
## Chisq= 21.8 on 2 degrees of freedom, p= 1.87e-05
Tumorsize was grouped in three groups (0-125,125-275,>275) A stratified logrank test showed that there is a significant difference in survival from malignant melanoma according to tumorsize, after adjusting for sex.
#)5Investigate interactions between all variables. Think about how to parametrize the interactions.
cox.nointeraction1 <- coxph(Surv(days, status==1) ~ Sex + thickgroup + ulc, data=melanoma)
summary(cox.nointeraction1)
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + thickgroup +
## ulc, data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexMale 0.4289 1.5355 0.2738 1.567 0.11723
## thickgroupmedium 0.6667 1.9479 0.4959 1.344 0.17879
## thickgrouphigh 1.1527 3.1668 0.4940 2.333 0.01962 *
## ulc 0.9917 2.6957 0.3336 2.973 0.00295 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexMale 1.536 0.6512 0.8979 2.626
## thickgroupmedium 1.948 0.5134 0.7370 5.148
## thickgrouphigh 3.167 0.3158 1.2026 8.339
## ulc 2.696 0.3710 1.4019 5.183
##
## Concordance= 0.749 (se = 0.04 )
## Rsquare= 0.173 (max possible= 0.937 )
## Likelihood ratio test= 38.9 on 4 df, p=7.3e-08
## Wald test = 33.45 on 4 df, p=9.669e-07
## Score (logrank) test = 40.71 on 4 df, p=3.083e-08
cox.interaction1 <- coxph(Surv(days, status==1) ~Sex+ulc+thickgroup+Sex*thickgroup + sex*ulc + thickgroup*ulc, data=melanoma)
## Warning in coxph(Surv(days, status == 1) ~ Sex + ulc + thickgroup + Sex * :
## X matrix deemed to be singular; variable 5
summary(cox.interaction1)
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + ulc + thickgroup +
## Sex * thickgroup + sex * ulc + thickgroup * ulc, data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexMale 0.73865 2.09311 0.84071 0.879 0.380
## ulc 0.97343 2.64701 1.13959 0.854 0.393
## thickgroupmedium 0.45004 1.56838 0.80884 0.556 0.578
## thickgrouphigh 1.16498 3.20587 0.83078 1.402 0.161
## sex NA NA 0.00000 NA NA
## SexMale:thickgroupmedium 0.33037 1.39148 1.02834 0.321 0.748
## SexMale:thickgrouphigh 0.01517 1.01529 1.02611 0.015 0.988
## ulc:sex -0.58365 0.55786 0.67839 -0.860 0.390
## ulc:thickgroupmedium 0.36626 1.44233 1.23540 0.296 0.767
## ulc:thickgrouphigh 0.26359 1.30160 1.22891 0.214 0.830
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexMale 2.0931 0.4778 0.4029 10.874
## ulc 2.6470 0.3778 0.2836 24.705
## thickgroupmedium 1.5684 0.6376 0.3213 7.655
## thickgrouphigh 3.2059 0.3119 0.6292 16.334
## sex NA NA NA NA
## SexMale:thickgroupmedium 1.3915 0.7187 0.1854 10.442
## SexMale:thickgrouphigh 1.0153 0.9849 0.1359 7.586
## ulc:sex 0.5579 1.7926 0.1476 2.109
## ulc:thickgroupmedium 1.4423 0.6933 0.1281 16.242
## ulc:thickgrouphigh 1.3016 0.7683 0.1171 14.472
##
## Concordance= 0.75 (se = 0.04 )
## Rsquare= 0.179 (max possible= 0.937 )
## Likelihood ratio test= 40.36 on 9 df, p=6.536e-06
## Wald test = 32.09 on 9 df, p=0.0001925
## Score (logrank) test = 42.74 on 9 df, p=2.401e-06
anova(cox.nointeraction1,cox.interaction1)
## Analysis of Deviance Table
## Cox model: response is Surv(days, status == 1)
## Model 1: ~ Sex + thickgroup + ulc
## Model 2: ~ Sex + ulc + thickgroup + Sex * thickgroup + sex * ulc + thickgroup * ulc
## loglik Chisq Df P(>|Chi|)
## 1 -263.75
## 2 -263.02 1.4592 5 0.9177
A simple coxregression with the variables: sex, ulc and grouped thickness show thata high tumor thickness is associated with a significantly higher hazard rate of death from malignanat melanoma. Also the presence of ulceration is associated with a significantly higher hazard rate of death from malignanat melanoma. the cox model with all interaction terms show that none of the interaction terms are statistically significant.
Doing a likelihood test testing the no-interaction hypothesis, shows that the simplemodel describes the data as good as the interaction model.
We want to check that our model does not violate the proportional hazard assumption. We evaluate this by using martingale residuals:
cox.a1<-cox.aalen(Surv(days,status==1)~prop(Sex)+prop(ulc)+prop(thickgroup), data=melanoma)
summary(cox.a1)
## Cox-Aalen Model
##
## Test for Aalen terms
## Test not computed, sim=0
##
## Proportional Cox terms :
## Coef. SE Robust SE D2log(L)^-1 z P-val
## prop(Sex)Male 0.429 0.282 0.283 0.274 1.52 0.12900
## prop(ulc) 0.992 0.331 0.323 0.334 3.07 0.00217
## prop(thickgroup)medium 0.667 0.491 0.478 0.496 1.40 0.16300
## prop(thickgroup)high 1.150 0.487 0.483 0.494 2.39 0.01700
## lower2.5% upper97.5%
## prop(Sex)Male -0.124 0.982
## prop(ulc) 0.343 1.640
## prop(thickgroup)medium -0.295 1.630
## prop(thickgroup)high 0.195 2.100
## Test of Proportionality
## sup| hat U(t) | p-value H_0
## prop(Sex)Male 3.16 0.352
## prop(ulc) 4.33 0.046
## prop(thickgroup)medium 4.02 0.100
## prop(thickgroup)high 5.62 0.010
par(mfrow=c(1,2))
plot(cox.a1, score=TRUE)
There seems to be an issue with the proportionality of ulceration and the group with highest tumor thickness values. The plot showing the observed score along with 50 simulated scores, that illustrates what the scores would look like if the cox model was correct shows similar results. there seems to be a proportionality issue with ulceration and tumorthickness.
First, we asses the effect of tumor-thickness as a continuous variable, for linearity on the log scale.
melanoma[,log2thick:=log2(thick)]
summary(coxph(Surv(days, status==1) ~ Sex+ulc+thick+log2thick, data=melanoma))
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + ulc + thick +
## log2thick, data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexMale 0.3735715 1.4529145 0.2728765 1.369 0.17100
## ulc 0.9227652 2.5162388 0.3304717 2.792 0.00523 **
## thick -0.0002151 0.9997849 0.0009139 -0.235 0.81391
## log2thick 0.4553431 1.5767142 0.2704631 1.684 0.09227 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexMale 1.4529 0.6883 0.8511 2.480
## ulc 2.5162 0.3974 1.3166 4.809
## thick 0.9998 1.0002 0.9980 1.002
## log2thick 1.5767 0.6342 0.9280 2.679
##
## Concordance= 0.768 (se = 0.04 )
## Rsquare= 0.188 (max possible= 0.937 )
## Likelihood ratio test= 42.72 on 4 df, p=1.185e-08
## Wald test = 36.33 on 4 df, p=2.474e-07
## Score (logrank) test = 45.27 on 4 df, p=3.501e-09
cox.nointeraction <- coxph(Surv(days, status==1) ~ Sex + log2thick + ulc, data=melanoma)
summary(cox.nointeraction)
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + log2thick + ulc,
## data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexMale 0.3813 1.4641 0.2706 1.409 0.15879
## log2thick 0.3990 1.4903 0.1243 3.209 0.00133 **
## ulc 0.9389 2.5571 0.3243 2.895 0.00379 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexMale 1.464 0.6830 0.8615 2.488
## log2thick 1.490 0.6710 1.1680 1.902
## ulc 2.557 0.3911 1.3542 4.828
##
## Concordance= 0.768 (se = 0.04 )
## Rsquare= 0.188 (max possible= 0.937 )
## Likelihood ratio test= 42.66 on 3 df, p=2.908e-09
## Wald test = 36.99 on 3 df, p=4.625e-08
## Score (logrank) test = 42.81 on 3 df, p=2.697e-09
cox.interaction <- coxph(Surv(days, status==1) ~Sex+ulc+log2thick+Sex*log2thick + Sex*ulc + log2thick*ulc, data=melanoma)
summary(cox.interaction)
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + ulc + log2thick +
## Sex * log2thick + Sex * ulc + log2thick * ulc, data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexMale 0.871130 2.389609 1.973685 0.441 0.659
## ulc 0.532960 1.703968 2.038312 0.261 0.794
## log2thick 0.361774 1.435874 0.237180 1.525 0.127
## SexMale:log2thick -0.008457 0.991578 0.254687 -0.033 0.974
## SexMale:ulc -0.595363 0.551363 0.670327 -0.888 0.374
## ulc:log2thick 0.084744 1.088438 0.254740 0.333 0.739
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexMale 2.3896 0.4185 0.04993 114.375
## ulc 1.7040 0.5869 0.03137 92.571
## log2thick 1.4359 0.6964 0.90205 2.286
## SexMale:log2thick 0.9916 1.0085 0.60192 1.633
## SexMale:ulc 0.5514 1.8137 0.14820 2.051
## ulc:log2thick 1.0884 0.9187 0.66065 1.793
##
## Concordance= 0.768 (se = 0.04 )
## Rsquare= 0.192 (max possible= 0.937 )
## Likelihood ratio test= 43.76 on 6 df, p=8.238e-08
## Wald test = 37.04 on 6 df, p=1.729e-06
## Score (logrank) test = 49.5 on 6 df, p=5.922e-09
anova(cox.nointeraction,cox.interaction)
## Analysis of Deviance Table
## Cox model: response is Surv(days, status == 1)
## Model 1: ~ Sex + log2thick + ulc
## Model 2: ~ Sex + ulc + log2thick + Sex * log2thick + Sex * ulc + log2thick * ulc
## loglik Chisq Df P(>|Chi|)
## 1 -261.87
## 2 -261.32 1.1039 3 0.7761
melanoma$Sex<-relevel(melanoma$Sex,"Male")
summary(coxph(Surv(days, status==1) ~ Sex + log2thick + ulc, data=melanoma))
## Call:
## coxph(formula = Surv(days, status == 1) ~ Sex + log2thick + ulc,
## data = melanoma)
##
## n= 205, number of events= 57
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SexFemale -0.3813 0.6830 0.2706 -1.409 0.15879
## log2thick 0.3990 1.4903 0.1243 3.209 0.00133 **
## ulc 0.9389 2.5571 0.3243 2.895 0.00379 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SexFemale 0.683 1.4641 0.4019 1.161
## log2thick 1.490 0.6710 1.1680 1.902
## ulc 2.557 0.3911 1.3542 4.828
##
## Concordance= 0.768 (se = 0.04 )
## Rsquare= 0.188 (max possible= 0.937 )
## Likelihood ratio test= 42.66 on 3 df, p=2.908e-09
## Wald test = 36.99 on 3 df, p=4.625e-08
## Score (logrank) test = 42.81 on 3 df, p=2.697e-09
A simple coxregression with the variables: sex, ulc and thickness show that a one unit increase in log(tumor thickness) is associated with a significantly higher hazard rate of death from malignanat melanoma. The presence of ulceration is associated with a significantly higher hazard rate of death from malignanat melanoma. the cox model with all interaction terms show that none of the interaction terms are statistically significant.
Doing a likelihood test testing the no-interaction hypothesis, shows that the simplemodel describes the data as good as the interaction model.
Among men : An 1 unit increase in log(tumor thickness) is associated with 49% CI(1.17-1.90) higher hazard rate of death from malignanat melanoma.
Women have a significantly lower hazard rate of death from malignanat melanoma compared with men. HR:0,68 CI95%: (0.4-1.16) given the same levels of tumor thickness and ulc status.
melanoma$thick10<-melanoma$thick/100
out=prodlim(Surv(days, status!=2) ~ Sex + thick10 + ulc, data=melanoma)
#out=coxph(Surv(days, status!=2)???Sex + thick10 + ulc, data=melanoma)
nd=melanoma[Sex=="Male"& ulc==1 & thick10<3 &thick10>=2]
plotPredictSurvProb(out,
sort(unique(melanoma$days)),
newdata=nd)
title("Individualized survival curves from multiple Cox regression: male
tumor tickness= 2mm/100, ulc positive")
Men with ulceration adn tumor thickness of 2 mm/100 have less than 50% probability of surviving after 3000 days
melanoma$status2<-melanoma$status
melanoma$status2[melanoma$status==2]<-0
melanoma$status2[melanoma$status==3]<-2
mdeathsex<-prodlim(Hist((days/365),status2)~Sex, data=melanoma)
mdeathulc<-prodlim(Hist(days/365,status2)~ulc, data=melanoma)
par(mfrow=c(1,2))
# the local slope of the cummulative hazard
na2 <- survfit(Surv((days/365), status2==1)~sex, data=melanoma)
plot(na2, fun="cumhaz", col=1:2, xlab="Years")
legend("topleft",inset=.05,lty=1,col=1:2,c("Male","Female"))
title("Death from malignant melanoma")
na2 <- survfit(Surv((days/365), status2==1)~ulc, data=melanoma)
plot(na2, fun="cumhaz", col=1:2, xlab="Years")
legend("topleft",inset=.05,lty=1,col=1:2,c("No ulc","Ulc"))
title("Death from malignant melanoma")
# the cummulative incidence by aalen johansen
plot(mdeathsex, cause=1, plot.main="Death from melanoma")
plot(mdeathulc, cause=1, plot.main="Death from melanoma")
summary(mdeathulc, cause=1, time=c(1,2,3,4,5,6,7,8,9,10))
##
##
## ----------> Cause: 1
##
## ulc=0 :
## time n.risk n.event n.lost cuminc se.cuminc lower upper
## 1 1 112 0 0 0.0000 0.0000 0.00000 0.0000
## 2 2 112 0 0 0.0000 0.0000 0.00000 0.0000
## 3 3 107 0 0 0.0439 0.0192 0.00627 0.0815
## 4 4 105 0 0 0.0526 0.0209 0.01164 0.0936
## 5 5 78 0 0 0.0908 0.0275 0.03698 0.1446
## 6 6 53 0 0 0.1456 0.0371 0.07294 0.2182
## 7 7 44 0 0 0.1622 0.0399 0.08402 0.2404
## 8 8 39 0 0 0.1817 0.0434 0.09652 0.2668
## 9 9 24 0 0 0.1817 0.0434 0.09652 0.2668
## 10 10 15 0 0 0.1817 0.0434 0.09652 0.2668
##
## ulc=1 :
## time n.risk n.event n.lost cuminc se.cuminc lower upper
## 1 1 81 0 0 0.0667 0.0263 0.0151 0.118
## 2 2 71 0 0 0.1667 0.0393 0.0897 0.244
## 3 3 60 0 0 0.2778 0.0472 0.1852 0.370
## 4 4 55 0 0 0.3333 0.0497 0.2359 0.431
## 5 5 44 0 0 0.3897 0.0515 0.2888 0.491
## 6 6 30 0 0 0.4063 0.0527 0.3030 0.510
## 7 7 20 0 0 0.4447 0.0559 0.3352 0.554
## 8 8 16 0 0 0.4697 0.0586 0.3548 0.585
## 9 9 14 0 0 0.4979 0.0618 0.3768 0.619
## 10 10 8 0 0 0.5331 0.0664 0.4028 0.663
summary(mdeathsex,cause=1,time=c(1,2,3,4,5,6,7,8,9,10))
##
##
## ----------> Cause: 1
##
## Sex=Male :
## time n.risk n.event n.lost cuminc se.cuminc lower upper
## 1 1 72 0 0 0.0513 0.0250 0.00233 0.100
## 2 2 65 0 0 0.1282 0.0379 0.05403 0.202
## 3 3 57 0 0 0.2308 0.0477 0.13732 0.324
## 4 4 54 0 0 0.2565 0.0495 0.15957 0.353
## 5 5 42 0 0 0.3101 0.0527 0.20689 0.413
## 6 6 27 0 0 0.3504 0.0568 0.23913 0.462
## 7 7 22 0 0 0.3737 0.0593 0.25748 0.490
## 8 8 19 0 0 0.4245 0.0644 0.29840 0.551
## 9 9 12 0 0 0.4245 0.0644 0.29840 0.551
## 10 10 7 0 0 0.4245 0.0644 0.29840 0.551
##
## Sex=Female :
## time n.risk n.event n.lost cuminc se.cuminc lower upper
## 1 1 121 0 0 0.0159 0.0111 0.0000 0.0377
## 2 2 118 0 0 0.0397 0.0174 0.0056 0.0738
## 3 3 110 0 0 0.0952 0.0262 0.0440 0.1465
## 4 4 106 0 0 0.1270 0.0297 0.0688 0.1851
## 5 5 80 0 0 0.1701 0.0339 0.1036 0.2366
## 6 6 56 0 0 0.2057 0.0382 0.1308 0.2806
## 7 7 42 0 0 0.2357 0.0423 0.1528 0.3185
## 8 8 36 0 0 0.2357 0.0423 0.1528 0.3185
## 9 9 26 0 0 0.2566 0.0460 0.1665 0.3467
## 10 10 16 0 0 0.2842 0.0519 0.1826 0.3859
The local slope of the cummulative hazard of death looks steeper for women than men, meaning females have a higher hazard of death from malignant melanoma than men. Likewise the local slope looks steeper in patients with ulceration copared with patients without ulceration, meaning the hazard is higher for patients with ulceration.
The cummulative incidence of death from malignanat melanoma is higher for men than for women.After 10 years approximately 42% of men and 28% of women die from malignant melanoma.Likewise the cummulative incidence of death from malignant melanoma is higher for patients with ulceration. approx. 53% of patients with ulceration have died after 10 years while only 18% of patients without ulceration will have died.