Penanganan Asumsi Proportional Hazard
Library
## Loading required package: survival
## Registered S3 method overwritten by 'dplyr':
## method from
## print.src SurvRegCensCov
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: splines
## Loading required package: Epi
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ✔ readr 2.1.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: MASS
##
## Attaching package: 'MASS'
##
## The following object is masked from 'package:dplyr':
##
## select
##
##
## Attaching package: 'biostat3'
##
## The following object is masked from 'package:lubridate':
##
## year
##
## The following object is masked from 'package:survival':
##
## colon
## Loading required package: ggpubr
##
## Attaching package: 'survminer'
##
## The following object is masked from 'package:survival':
##
## myeloma
##
## Attaching package: 'actuar'
##
## The following objects are masked from 'package:stats':
##
## sd, var
##
## The following object is masked from 'package:grDevices':
##
## cm
## Loading required package: Hmisc
##
## Attaching package: 'Hmisc'
##
## The following objects are masked from 'package:dplyr':
##
## src, summarize
##
## The following objects are masked from 'package:base':
##
## format.pval, units
##
## Attaching package: 'flexsurv'
##
## The following objects are masked from 'package:actuar':
##
## dllogis, pllogis, qllogis, rllogis
Import Dataset
Data merupakan hasil penelitian pasien kanker paru di RSUP Dr. Kariadi Semarang sebanyak 50 pasien, dengan 46 pasien teramati atau meninggal dan 4 pasien tersensor. Data dihimpun dalam Tugas Akhir Utami (2015).
Gambaran Umum Data
Data terdiri dari 7 variabel: 1. Y (time), merupakan variabel respon yang menunjukkan waktu hidup pasien kanker paru (dalam hari) 2. X1(umur), merupakan variabel penjelas yang menunjukkan umur pasien kanker paru saat awal masuk studi (dalam tahun).
Eksplorasi Data
Ubah Nama Variabel
time <- Data_UAS$`Y (waktu)`
age <- as.factor(Data_UAS$`X1 (umur)`)
score <- as.factor(Data_UAS$`X2 (skor)`)
month <- as.factor(Data_UAS$`X3 (bulan)`)
cancer <- as.factor(Data_UAS$`X4 (kanker)`)
treatment <- as.factor(Data_UAS$`X5 (metode pengobatan)`)
status <- Data_UAS$Status
age2 <- Data_UAS$`X1 (umur)2`
score2 <- Data_UAS$`X2 (skor)2`
month2 <- Data_UAS$`X3 (bulan)2`
data <- data.frame(time,age,score,month,cancer,treatment,status,age2,month2,score2)
View(data)Statistik Deskriptif
## time age score month cancer treatment status age2 month2 score2
## 1 33 0 0 1 1 1 1 14 3.57 70
## 2 29 1 1 0 1 1 1 64 0.97 50
## 3 12 1 1 0 1 1 1 58 0.40 50
## 4 36 1 1 1 1 2 1 74 2.37 50
## 5 5 1 1 0 2 2 1 57 0.17 30
## 6 29 0 0 0 1 2 1 39 0.97 60
## 'data.frame': 50 obs. of 10 variables:
## $ time : num 33 29 12 36 5 29 31 24 3 62 ...
## $ age : Factor w/ 2 levels "0","1": 1 2 2 2 2 1 2 1 1 1 ...
## $ score : Factor w/ 2 levels "0","1": 1 2 2 2 2 1 2 2 2 1 ...
## $ month : Factor w/ 2 levels "0","1": 2 1 1 2 1 1 2 1 1 2 ...
## $ cancer : Factor w/ 3 levels "1","2","3": 1 1 1 1 2 1 1 1 1 1 ...
## $ treatment: Factor w/ 2 levels "1","2": 1 1 1 2 2 2 1 2 2 2 ...
## $ status : num 1 1 1 1 1 1 1 1 1 1 ...
## $ age2 : num 14 64 58 74 57 39 61 53 43 47 ...
## $ month2 : num 3.57 0.97 0.4 2.37 0.17 0.97 1.5 0.8 0.1 2.4 ...
## $ score2 : num 70 50 50 50 30 60 50 40 50 60 ...
usia <- ggplot(data,aes(x=age2))+
geom_histogram(fill="skyblue",
color="black",
bins=20)+
theme_minimal()+
labs(x="Usia",
y= "Frekuensi",
title="Diagram Batang Usia Pasien")
#png(filename = "usia.png")
usiaDiagram batang usia pasien didominasi pasien dengan usia tua karena histogram tampak menjulur ke kiri.
hidup <- ggplot(data,aes(x=time))+
geom_histogram(fill="lightgreen",
color="black",
bins=20)+
theme_minimal()+
labs(x="Waktu Survival (hari)",
y= "Frekuensi",
title="Histogram Waktu Bertahan Pasien")
#png(filename = "hidup.png")
hidupWaktu bertahan pasien didominasi oleh waktu bertahan yang singkat. Hal ini tampak pada histogram yang menjulur kanan dan waktu bertahan yang mengumpul di awal waktu penelitian.
cancer <- ggplot(data,aes(x=cancer))+
geom_bar(fill="red",color="black")+
theme_minimal()+
labs(x="Jenis Kanker",
y= "Frekuensi",
title="Diagram Batang Jenis Kanker Pasien")
#png(filename = "cancer.png")
cancerJenis kanker pasien didominasi oleh kanker jenis 1.
treatment <- ggplot(data,aes(x=treatment))+
geom_bar(fill="orange", color="black")+
theme_minimal()+
labs(x="Jenis Treatment",
y= "Frekuensi",
title="Diagram Batang Jenis Treatment Pasien")
#png(filename = "treatment.png")
treatmentJenis treatment yang diberikan kepada pasien relatif hampir sama banyak. Tampak pada batang treatment 1 yang hampir sama tinggi dengan batang treatment 2.
bulan <- ggplot(data,aes(x=month2))+
geom_histogram(fill="pink",
color="black",
bins=20)+
theme_minimal()+
labs(x="Lama Bulan Diagnosa",
y= "Frekuensi",
title="Histogram Lama Bulan Diagnosa Pasien")
#png(filename = "bulan.png")
bulanLama bulan diagnosa pasien didominasi oleh diagnosa yang baru terjadi.
Survival Analysis
Life Table
## tstart tstop nsubs nlost nrisk nevent surv pdf hazard
## 0-1 0 1 50 0 50.0 0 1.00 0.0000000000 0.000000000
## 1-2 1 2 50 0 50.0 1 1.00 0.0200000000 0.020202020
## 2-3 2 3 49 0 49.0 1 0.98 0.0200000000 0.020618557
## 3-4 3 4 48 0 48.0 2 0.96 0.0400000000 0.042553191
## 4-5 4 5 46 0 46.0 1 0.92 0.0200000000 0.021978022
## 5-6 5 6 45 0 45.0 2 0.90 0.0400000000 0.045454545
## 6-8 6 8 43 0 43.0 1 0.86 0.0100000000 0.011764706
## 8-9 8 9 42 0 42.0 1 0.84 0.0200000000 0.024096386
## 9-10 9 10 41 0 41.0 1 0.82 0.0200000000 0.024691358
## 10-11 10 11 40 0 40.0 2 0.80 0.0400000000 0.051282051
## 11-12 11 12 38 0 38.0 2 0.76 0.0400000000 0.054054054
## 12-13 12 13 36 0 36.0 1 0.72 0.0200000000 0.028169014
## 13-15 13 15 35 0 35.0 1 0.70 0.0100000000 0.014492754
## 15-19 15 19 34 0 34.0 2 0.68 0.0100000000 0.015151515
## 19-20 19 20 32 0 32.0 1 0.64 0.0200000000 0.031746032
## 20-21 20 21 31 0 31.0 1 0.62 0.0200000000 0.032786885
## 21-22 21 22 30 0 30.0 1 0.60 0.0200000000 0.033898305
## 22-24 22 24 29 0 29.0 2 0.58 0.0200000000 0.035714286
## 24-25 24 25 27 0 27.0 2 0.54 0.0400000000 0.076923077
## 25-26 25 26 25 0 25.0 1 0.50 0.0200000000 0.040816327
## 26-29 26 29 24 0 24.0 1 0.48 0.0066666667 0.014184397
## 29-30 29 30 23 0 23.0 3 0.46 0.0600000000 0.139534884
## 30-31 30 31 20 0 20.0 1 0.40 0.0200000000 0.051282051
## 31-32 31 32 19 0 19.0 1 0.38 0.0200000000 0.054054054
## 32-33 32 33 18 0 18.0 1 0.36 0.0200000000 0.057142857
## 33-34 33 34 17 0 17.0 4 0.34 0.0800000000 0.266666667
## 34-36 34 36 13 0 13.0 1 0.26 0.0100000000 0.040000000
## 36-37 36 37 12 0 12.0 1 0.24 0.0200000000 0.086956522
## 37-41 37 41 11 0 11.0 1 0.22 0.0050000000 0.023809524
## 41-50 41 50 10 0 10.0 1 0.20 0.0022222222 0.011695906
## 50-62 50 62 9 0 9.0 1 0.18 0.0016666667 0.009803922
## 62-67 62 67 8 0 8.0 1 0.16 0.0040000000 0.026666667
## 67-95 67 95 7 0 7.0 1 0.14 0.0007142857 0.005494505
## 95-132 95 132 6 0 6.0 1 0.12 0.0005405405 0.004914005
## 132-162 132 162 5 0 5.0 1 0.10 0.0006666667 0.007407407
## 162-185 162 185 4 1 3.5 0 0.08 0.0000000000 0.000000000
## 185-285 185 285 3 1 2.5 0 0.08 0.0000000000 0.000000000
## 285-703 285 703 2 1 1.5 0 0.08 0.0000000000 0.000000000
## 703-Inf 703 Inf 1 1 0.5 0 0.08 NA NA
## se.surv se.pdf se.hazard
## 0-1 0.00000000 NaN NaN
## 1-2 0.00000000 0.0197989899 0.020200990
## 2-3 0.01979899 0.0197989899 0.020617461
## 3-4 0.02771281 0.0277128129 0.030082839
## 4-5 0.03836665 0.0197989899 0.021976695
## 5-6 0.04242641 0.0277128129 0.032132915
## 6-8 0.04907138 0.0098994949 0.011763892
## 8-9 0.05184593 0.0197989899 0.024094637
## 9-10 0.05433231 0.0197989899 0.024689476
## 10-11 0.05656854 0.0277128129 0.036249964
## 11-12 0.06039868 0.0277128129 0.038208026
## 12-13 0.06349803 0.0197989899 0.028166220
## 13-15 0.06480741 0.0098994949 0.014491232
## 15-19 0.06596969 0.0069282032 0.010708819
## 19-20 0.06788225 0.0197989899 0.031742032
## 20-21 0.06864401 0.0197989899 0.032782479
## 21-22 0.06928203 0.0197989899 0.033893436
## 22-24 0.06979971 0.0138564065 0.025237703
## 24-25 0.07048404 0.0277128129 0.054352583
## 25-26 0.07071068 0.0197989899 0.040807826
## 26-29 0.07065409 0.0065996633 0.014181186
## 29-30 0.07048404 0.0335857112 0.080364200
## 30-31 0.06928203 0.0197989899 0.051265191
## 31-32 0.06864401 0.0197989899 0.054034308
## 32-33 0.06788225 0.0197989899 0.057119529
## 33-34 0.06699254 0.0383666522 0.132142833
## 34-36 0.06203225 0.0098994949 0.039967987
## 36-37 0.06039868 0.0197989899 0.086874293
## 37-41 0.05858327 0.0049497475 0.023782514
## 41-50 0.05656854 0.0021998878 0.011679696
## 50-62 0.05433231 0.0016499158 0.009786945
## 62-67 0.05184593 0.0039597980 0.026607341
## 67-95 0.04907138 0.0007071068 0.005478225
## 95-132 0.04595650 0.0005351078 0.004893657
## 132-162 0.04242641 0.0006599663 0.007361541
## 162-185 0.03836665 NaN NaN
## 185-285 0.03836665 NaN NaN
## 285-703 0.03836665 NaN NaN
## 703-Inf 0.03836665 NA NA
Kaplan Meier
## Call: survfit(formula = surv ~ treatment, data = data)
##
## treatment=1
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 12 23 1 0.957 0.0425 0.8767 1.000
## 13 22 1 0.913 0.0588 0.8049 1.000
## 20 21 1 0.870 0.0702 0.7423 1.000
## 21 20 1 0.826 0.0790 0.6848 0.996
## 22 19 1 0.783 0.0860 0.6310 0.971
## 24 18 1 0.739 0.0916 0.5798 0.942
## 25 17 1 0.696 0.0959 0.5309 0.912
## 26 16 1 0.652 0.0993 0.4839 0.879
## 29 15 2 0.565 0.1034 0.3950 0.809
## 30 13 1 0.522 0.1042 0.3528 0.772
## 31 12 1 0.478 0.1042 0.3121 0.733
## 32 11 1 0.435 0.1034 0.2728 0.693
## 33 10 3 0.304 0.0959 0.1641 0.565
## 37 7 1 0.261 0.0916 0.1311 0.519
## 50 6 1 0.217 0.0860 0.1001 0.472
## 67 5 1 0.174 0.0790 0.0714 0.424
## 95 4 1 0.130 0.0702 0.0454 0.375
## 132 3 1 0.087 0.0588 0.0231 0.327
##
## treatment=2
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1 27 1 0.9630 0.0363 0.8943 1.000
## 2 26 1 0.9259 0.0504 0.8322 1.000
## 3 25 2 0.8519 0.0684 0.7279 0.997
## 4 23 1 0.8148 0.0748 0.6807 0.975
## 5 22 2 0.7407 0.0843 0.5926 0.926
## 6 20 1 0.7037 0.0879 0.5509 0.899
## 8 19 1 0.6667 0.0907 0.5106 0.870
## 9 18 1 0.6296 0.0929 0.4715 0.841
## 10 17 2 0.5556 0.0956 0.3965 0.778
## 11 15 2 0.4815 0.0962 0.3255 0.712
## 15 13 2 0.4074 0.0946 0.2585 0.642
## 19 11 1 0.3704 0.0929 0.2265 0.606
## 22 10 1 0.3333 0.0907 0.1955 0.568
## 24 9 1 0.2963 0.0879 0.1657 0.530
## 29 8 1 0.2593 0.0843 0.1370 0.490
## 33 7 1 0.2222 0.0800 0.1097 0.450
## 34 6 1 0.1852 0.0748 0.0839 0.409
## 36 5 1 0.1481 0.0684 0.0600 0.366
## 41 4 1 0.1111 0.0605 0.0382 0.323
## 62 3 1 0.0741 0.0504 0.0195 0.281
ggsurvplot(fit1,
data=data,
conf.int = T,
risk.table = T,
risk.table.col='strata',
surv.median.line = 'hv',
ggtheme = theme_bw())## Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
Grafik hazard treatment menunjukkan bahwa treatment 1 memiliki peluang bertahan hidup lebih tinggi daripada treatment 2.
## Call: survfit(formula = surv ~ cancer, data = data)
##
## cancer=1
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1 34 1 0.9706 0.0290 0.9154 1.000
## 2 33 1 0.9412 0.0404 0.8653 1.000
## 3 32 2 0.8824 0.0553 0.7804 0.998
## 4 30 1 0.8529 0.0607 0.7418 0.981
## 5 29 1 0.8235 0.0654 0.7049 0.962
## 9 28 1 0.7941 0.0693 0.6692 0.942
## 10 27 1 0.7647 0.0727 0.6346 0.921
## 11 26 2 0.7059 0.0781 0.5682 0.877
## 12 24 1 0.6765 0.0802 0.5362 0.853
## 13 23 1 0.6471 0.0820 0.5048 0.829
## 15 22 2 0.5882 0.0844 0.4440 0.779
## 19 20 1 0.5588 0.0852 0.4145 0.753
## 21 19 1 0.5294 0.0856 0.3856 0.727
## 22 18 1 0.5000 0.0857 0.3573 0.700
## 24 17 2 0.4412 0.0852 0.3022 0.644
## 25 15 1 0.4118 0.0844 0.2755 0.615
## 26 14 1 0.3824 0.0833 0.2494 0.586
## 29 13 3 0.2941 0.0781 0.1747 0.495
## 31 10 1 0.2647 0.0757 0.1512 0.464
## 33 9 2 0.2059 0.0693 0.1064 0.398
## 34 7 1 0.1765 0.0654 0.0854 0.365
## 36 6 1 0.1471 0.0607 0.0655 0.330
## 37 5 1 0.1176 0.0553 0.0469 0.295
## 50 4 1 0.0882 0.0486 0.0299 0.260
## 62 3 1 0.0588 0.0404 0.0153 0.226
##
## cancer=2
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 5 13 1 0.923 0.0739 0.7890 1.000
## 6 12 1 0.846 0.1001 0.6711 1.000
## 8 11 1 0.769 0.1169 0.5711 1.000
## 30 10 1 0.692 0.1280 0.4819 0.995
## 32 9 1 0.615 0.1349 0.4004 0.946
## 33 8 2 0.462 0.1383 0.2566 0.830
## 41 6 1 0.385 0.1349 0.1934 0.765
## 67 5 1 0.308 0.1280 0.1361 0.695
## 95 4 1 0.231 0.1169 0.0855 0.623
## 132 3 1 0.154 0.1001 0.0430 0.550
##
## cancer=3
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 10 3 1 0.667 0.272 0.2995 1
## 20 2 1 0.333 0.272 0.0673 1
## 22 1 1 0.000 NaN NA NA
ggsurvplot(fit2,
data=data,
conf.int = T,
risk.table = T,
risk.table.col='strata',
surv.median.line = 'hv',
ggtheme = theme_bw())## Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
Tampak bahwa penderita cancer jenis 1 memiliki peluang bertahan hidup lebih tinggi daripada penderita cancer jenis lainnya.
Log Rank
## Call:
## survdiff(formula = Surv(time, status) ~ treatment, data = data)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## treatment=1 23 21 27.7 1.64 4.33
## treatment=2 27 25 18.3 2.49 4.33
##
## Chisq= 4.3 on 1 degrees of freedom, p= 0.04
fit_rank1=survfit(Surv(time,status)~treatment,data=data)
ggsurvplot(fit_rank1,
data=data,
pval = T,
conf.int = T,
conf.int.style='step',
break.time.by=25,
xlab='Waktu (hari)',
xlim=c(0,190),
surv.median.line = 'hv',
ggtheme = theme_bw())## Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 2 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
p-value log rank pada peubah treatment <0,05 sehingga tolak H0 atau dapat disimpulkan bahwa kedua strata memiliki perbedaan kondisi daya tahan.
## Call:
## survdiff(formula = Surv(time, status) ~ cancer, data = data)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## cancer=1 34 32 26.59 1.10 2.81
## cancer=2 13 11 18.03 2.74 5.00
## cancer=3 3 3 1.37 1.92 2.08
##
## Chisq= 6.5 on 2 degrees of freedom, p= 0.04
fit_rank2=survfit(Surv(time,status)~cancer,data=data)
ggsurvplot(fit_rank2,
data=data,
pval = T,
conf.int = T,
conf.int.style='step',
break.time.by=25,
xlab='Waktu (hari)',
xlim=c(0,190),
surv.median.line = 'hv',
ggtheme = theme_bw())## Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
## All aesthetics have length 1, but the data has 3 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
## a single row.
p-value log rank pada peubah cancer <0,05 sehingga tolak H0 atau dapat disimpulkan bahwa ketiga strata memiliki perbedaan kondisi daya tahan.
Hazard Function
hazard=bshazard(Surv(time,status)~1,
data=data,
verbose=F)
plot(hazard)
#png(filename="hazard.png")
plot(hazard)Berdasarkan plot function hazard yang telah divisualisasikan, dapat disimpulkan bahwa fungsi bahwa mengikuti sebaran Weibull dengan kondisi Weibull menurun atau Log-Logistik karena kondisi sebaran yang naik turun.
# Uji Kesesuaian untuk beberapa distribusi
fit.weibull <- fitdist(data$time, "weibull")
fit.loglogistic <- fitdist(data$time, "llogis")
fit.exponential <- fitdist(data$time, "exp")
fit.lognormal <- fitdist(data$time, "lnorm")
# Grafik GoF
par(mfrow = c(2, 2))
print ("Plot Weibull =", plot(fit.weibull))## [1] "Plot Weibull ="
## [1] "Plot Log-Logistic ="
## [1] "Plot Exponential ="
## [1] "Plot Log-Normal ="
Log-Logistik memiliki P-P plot yang mengikuti garis diagonal dengan baik.
## [1] 487.5446
## [1] 473.0836
## [1] 495.0648
## [1] 475.0916
Model terbaik menggunakan model Log-Logistik karena AIC terkecil.
Pengujian Asumsi Proportional Hazard Menggunakan Log-Log Plot
Peubah Treatment (Jenis Treatment yang Diberikan pada Pasien)
plot(survfit(Surv(time,status)~treatment, data=data),
col=c("black", "orange"), fun="cloglog",
xlab='log(t)', ylab='-log(-log(S(t)))')Meskipun plot hampir bersinggungan, namun masih dapat dikategorikan paralel sehingga peubah treatment disimpulkan dapat tidak melanggar asumsi PH.
Peubah Cancer (Jenis Kanker)
plot(survfit(Surv(time,status)~cancer, data=data),
col=c("black", "red"), fun="cloglog",
xlab='log(t)', ylab='-log(-log(S(t)))')Plot bertabrakan sehingga dapat disimpulkan bahwa peubah cancer melanggar asumsi PH.
Peubah Score (Skor Karnovsky)
plot(survfit(Surv(time,status)~score, data=data),
col=c("black", "red"), fun="cloglog",
xlab='log(t)', ylab='-log(-log(S(t)))')Skor Karnovsky tidak melanggar asumsi PH karena kurva paralel.
Analisis Parametrik
Model Log-Logistik (Sebaran Terbaik pada Model Parametrik)
Menggunakan kovariat yang tidak melanggar asumsi PH, yaitu treatment, score, month.
logis_ph.aman=survreg(Surv(time,status)~treatment+score+month, data=data, dist='loglogistic')
summary(logis_ph.aman)##
## Call:
## survreg(formula = Surv(time, status) ~ treatment + score + month,
## data = data, dist = "loglogistic")
## Value Std. Error z p
## (Intercept) 3.249 0.339 9.60 < 2e-16
## treatment2 -0.622 0.264 -2.36 0.018
## score1 -0.472 0.281 -1.68 0.092
## month1 1.056 0.271 3.90 9.6e-05
## Log(scale) -0.662 0.126 -5.27 1.4e-07
##
## Scale= 0.516
##
## Log logistic distribution
## Loglik(model)= -198.9 Loglik(intercept only)= -213.4
## Chisq= 29 on 3 degrees of freedom, p= 2.2e-06
## Number of Newton-Raphson Iterations: 4
## n= 50
## treatment2 score1 month1
## 1.8617507 1.6034103 0.3476895
Ingat!! Default R pakai AFT
Berdasarkan hasil tersebut, faktor akselerasi treatment2 diperoleh sebesar 1,8617507 yang berasal dari exp(koefisien treatment2), faktor akselerasi score1 diperoleh sebesar 1,6034103 yang berasal dari exp(koefisien score1), faktor akselerasi month1 diperoleh sebesar 0,3476895 yang berasal dari exp(koefisien month1), shape parameter 1,938 didapat dari (1/scale).
## $vars
## Estimate SE
## lambda 0.001837058 0.001872739
## gamma 1.938929991 0.243713431
## treatment2 1.205078534 0.526636653
## score1 0.915432500 0.553964683
## month1 -2.048373997 0.573164707
##
## $HR
## HR LB UB
## treatment2 3.3370211 1.1887349 9.3676982
## score1 2.4978553 0.8433963 7.3978050
## month1 0.1289444 0.0419299 0.3965347
##
## $ETR
## ETR LB UB
## treatment2 0.5371288 0.3202611 0.9008505
## score1 0.6236707 0.3599053 1.0807429
## month1 2.8761298 1.6916178 4.8900660
Hazard ratio >1 artinya laju bahaya pasien yang menerima treatment2 (3,337 kali) lebih tinggi dari pasien yang menerima treatment1 Sementara, nilai faktor akselerasi <1 berarti daya tahan pasien yang menerima treatment2 lebih kecil (0,537 kali) lebih rendah daripada pasien yang menerima treatment1 Begitupula interpretasi untuk score1 dan month1.
Seleksi Peubah dari Model Parametrik
all <- psm(Surv(time,status)~age+month+treatment+cancer+score, data=data,dist='loglogistic')
anova(all)## Wald Statistics Response: Surv(time, status)
##
## Factor Chi-Square d.f. P
## age 1.17 1 0.2803
## month 11.40 1 0.0007
## treatment 6.31 1 0.0120
## cancer 0.94 2 0.6243
## score 2.81 1 0.0937
## TOTAL 39.91 6 <.0001
Peubah cancer, score, age tidak signifikan sehingga akan dibangun model hanya dengan menggunakan peubah month dan treatment.
logis_seleksi=survreg(Surv(time,status)~month+treatment,data=data,dist="loglogistic")
summary(logis_seleksi)##
## Call:
## survreg(formula = Surv(time, status) ~ month + treatment, data = data,
## dist = "loglogistic")
## Value Std. Error z p
## (Intercept) 2.845 0.240 11.84 < 2e-16
## month1 1.164 0.270 4.31 1.6e-05
## treatment2 -0.581 0.267 -2.17 0.03
## Log(scale) -0.631 0.125 -5.03 5.0e-07
##
## Scale= 0.532
##
## Log logistic distribution
## Loglik(model)= -200.3 Loglik(intercept only)= -213.4
## Chisq= 26.24 on 2 degrees of freedom, p= 2e-06
## Number of Newton-Raphson Iterations: 4
## n= 50
## $vars
## Estimate SE
## lambda 0.004771831 0.003865814
## gamma 1.878681554 0.235643230
## month1 -2.187295800 0.558756072
## treatment2 1.090801449 0.516162995
##
## $HR
## HR LB UB
## month1 0.1122198 0.03753665 0.3354931
## treatment2 2.9766588 1.08235646 8.1863025
##
## $ETR
## ETR LB UB
## month1 3.2035889 1.8873439 5.4377911
## treatment2 0.5595509 0.3313388 0.9449458
Cek AIC dari 2 Model Log-Logistik
AIC.parametrik <- data.frame(
Model = c("Model Parametrik dengan Peubah PH","Model Parametrik dengan Seleksi Peubah"),
Nilai_AIC = c(AIC(logis_ph.aman),AIC(logis_seleksi))
)
AIC.parametrik## Model Nilai_AIC
## 1 Model Parametrik dengan Peubah PH 407.8223
## 2 Model Parametrik dengan Seleksi Peubah 408.5817
Berdasarkan kedua model tersebut, model parametrik yang hanya menggunakan peubah yang memenuhi asumsi proportional hazard dan model parametrik yang melalui proses seleksi peubah, didapatkan bahwa model parametrik yang hanya menggunakan peubah memenuhi asumsi proportional hazard lebih baik digunakan karena memiliki nilai AIC lebih kecil.
Model Cox Proportional Hazard
cox_global<- survreg(Surv(time,status)~age+month+cancer+treatment+score,data=data,dist="loglogistic")
summary(cox_global)##
## Call:
## survreg(formula = Surv(time, status) ~ age + month + cancer +
## treatment + score, data = data, dist = "loglogistic")
## Value Std. Error z p
## (Intercept) 3.333 0.371 8.99 < 2e-16
## age1 -0.277 0.257 -1.08 0.28034
## month1 0.994 0.294 3.38 0.00073
## cancer2 0.235 0.310 0.76 0.44965
## cancer3 0.293 0.452 0.65 0.51600
## treatment2 -0.649 0.259 -2.51 0.01203
## score1 -0.457 0.273 -1.68 0.09365
## Log(scale) -0.690 0.127 -5.45 5.2e-08
##
## Scale= 0.501
##
## Log logistic distribution
## Loglik(model)= -198 Loglik(intercept only)= -213.4
## Chisq= 30.84 on 6 degrees of freedom, p= 2.7e-05
## Number of Newton-Raphson Iterations: 5
## n= 50
Uji Formal Asumsi PH Menggunakan Residual Schoenfeld
model_global <- coxph(Surv(time,status)~age+month+cancer+treatment+score, data=data)
summary(model_global)## Call:
## coxph(formula = Surv(time, status) ~ age + month + cancer + treatment +
## score, data = data)
##
## n= 50, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## age1 0.34371 1.41017 0.32142 1.069 0.2849
## month1 -2.82764 0.05915 0.60723 -4.657 3.21e-06 ***
## cancer2 0.01169 1.01176 0.41164 0.028 0.9773
## cancer3 0.08060 1.08393 0.62979 0.128 0.8982
## treatment2 0.78688 2.19653 0.35000 2.248 0.0246 *
## score1 0.64631 1.90848 0.36986 1.747 0.0806 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## age1 1.41017 0.7091 0.75107 2.6477
## month1 0.05915 16.9056 0.01799 0.1945
## cancer2 1.01176 0.9884 0.45154 2.2671
## cancer3 1.08393 0.9226 0.31545 3.7246
## treatment2 2.19653 0.4553 1.10616 4.3617
## score1 1.90848 0.5240 0.92441 3.9401
##
## Concordance= 0.788 (se = 0.028 )
## Likelihood ratio test= 46.12 on 6 df, p=3e-08
## Wald test = 29.29 on 6 df, p=5e-05
## Score (logrank) test = 43.65 on 6 df, p=9e-08
## chisq df p
## age 2.326 1 0.1272
## month 3.179 1 0.0746
## cancer 3.357 2 0.1867
## treatment 6.112 1 0.0134
## score 0.209 1 0.6472
## GLOBAL 20.716 6 0.0021
## age1 month1 cancer2 cancer3 treatment2 score1
## 1.101747 1.110373 1.116876 1.026059 1.295584 1.196703
Model Global memiliki pelanggaran PH. Peubah yang memiliki pelanggaran PH adalah peubah treatment padah treatment ketika di cek sendiri tidak ada pelanggaran PH. Ada kemungkinan ketika treatment bersama-sama dengan peubah lain baru terjadi pelanggaran (ada interaksi dengan peubah lain).Ada pola tertentu pada sisaan treatment pada model simultan.
Penanganan PH
Seleksi Peubah Backward
## Start: AIC=256.48
## Surv(time, status) ~ age + month + cancer + treatment + score
##
## Df AIC
## - cancer 2 252.50
## - age 1 255.62
## <none> 256.48
## - score 1 257.69
## - treatment 1 259.60
## - month 1 286.59
##
## Step: AIC=252.5
## Surv(time, status) ~ age + month + treatment + score
##
## Df AIC
## - age 1 251.63
## <none> 252.50
## - score 1 253.70
## - treatment 1 255.69
## - month 1 287.18
##
## Step: AIC=251.63
## Surv(time, status) ~ month + treatment + score
##
## Df AIC
## <none> 251.63
## - score 1 252.82
## - treatment 1 253.99
## - month 1 285.38
## Call:
## coxph(formula = Surv(time, status) ~ month + treatment + score,
## data = data)
##
## n= 50, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -2.76263 0.06313 0.57412 -4.812 1.49e-06 ***
## treatment2 0.69408 2.00187 0.33258 2.087 0.0369 *
## score1 0.63940 1.89535 0.36704 1.742 0.0815 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.06313 15.8415 0.02049 0.1945
## treatment2 2.00187 0.4995 1.04315 3.8417
## score1 1.89535 0.5276 0.92313 3.8915
##
## Concordance= 0.782 (se = 0.025 )
## Likelihood ratio test= 44.97 on 3 df, p=9e-10
## Wald test = 28.46 on 3 df, p=3e-06
## Score (logrank) test = 42.95 on 3 df, p=3e-09
Pada seleksi peubah backward, didapatkan bahwa peubah yang signifikan berpengaruh adalah month, treatment, dan score sehingga hanya 3 peubah tersebut yang akan digunakan.
Model Regresi Terbaik
## Call:
## coxph(formula = Surv(time, status) ~ month + treatment + score,
## data = data)
##
## n= 50, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -2.76263 0.06313 0.57412 -4.812 1.49e-06 ***
## treatment2 0.69408 2.00187 0.33258 2.087 0.0369 *
## score1 0.63940 1.89535 0.36704 1.742 0.0815 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.06313 15.8415 0.02049 0.1945
## treatment2 2.00187 0.4995 1.04315 3.8417
## score1 1.89535 0.5276 0.92313 3.8915
##
## Concordance= 0.782 (se = 0.025 )
## Likelihood ratio test= 44.97 on 3 df, p=9e-10
## Wald test = 28.46 on 3 df, p=3e-06
## Score (logrank) test = 42.95 on 3 df, p=3e-09
## chisq df p
## month 2.67 1 0.1022
## treatment 6.16 1 0.0130
## score 0.23 1 0.6316
## GLOBAL 12.24 3 0.0066
## month1 treatment2 score1
## 1.000439 1.181804 1.181783
Pada model yang dibangun dari seleksi peubah backward, peubah treatment tetap mengalami pelanggaran asumsi proportional hazard.
Diagnostik Sisaan
#png(filename = "diagnostik1.png")
ggcoxdiagnostics(model3, type = "dfbeta",
linear.predictions = FALSE, ggtheme = theme_bw())## Warning: `gather_()` was deprecated in tidyr 1.2.0.
## ℹ Please use `gather()` instead.
## ℹ The deprecated feature was likely used in the survminer package.
## Please report the issue at <https://github.com/kassambara/survminer/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `geom_smooth()` using formula = 'y ~ x'
#png(filename = "diagnostik2.png")
ggcoxdiagnostics(model3, type = "deviance",
linear.predictions = FALSE, ggtheme = theme_bw())## `geom_smooth()` using formula = 'y ~ x'
Plot sisaan cenderung simetris terhadap 0 tetapi terdapat pencilan.
Deteksi dan Penghapusan Outlier
# Hitung residu deviance
deviance_res <- residuals(model3, type = "deviance")
# Tentukan ambang batas untuk outlier (misalnya 2 atau 3 standar deviasi)
threshold <- 2 * sd(deviance_res)
# Identifikasi pengamatan yang menjadi outlier
outliers <- which(abs(deviance_res) > threshold)
# Cetak pengamatan yang menjadi outlier
print(outliers)## 21 26
## 21 26
Amatan 21,26 merupakan outlier.
# Hitung residu deviance
deviance_res <- residuals(model3, type = "deviance")
# Tentukan ambang batas untuk leverage tinggi
p <- length(coef(model3)) # jumlah parameter
n <- nrow(data) # jumlah pengamatan
leverage_threshold <- 15 * (p + 1) / n
# Identifikasi pengamatan yang menjadi outlier
leverage <- which(abs(deviance_res) > leverage_threshold)
# Cetak pengamatan yang menjadi outlier
print(leverage)## 8 9 13 15 16 21 26 27 33 36 40 44 45 48
## 8 9 13 15 16 21 26 27 33 36 40 44 45 48
Hasil leverage tidak digunakan karena yang dianggap leverage terdiri dari banyak amatan Penghapusan amatan yang terlalu banyak akan menyebabkan model singular sehingga amatan yang dikeluarkan dari model hanya amatan outlier.
# Buat dataset baru tanpa outlier
no_outliers <- data[-outliers, ]
# Buat model Cox baru tanpa outlier
model_no <- coxph(Surv(time, status) ~ month+treatment+score, data=no_outliers)
# Tampilkan ringkasan model baru
summary(model_no)## Call:
## coxph(formula = Surv(time, status) ~ month + treatment + score,
## data = no_outliers)
##
## n= 48, number of events= 44
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -3.57651 0.02797 0.76901 -4.651 3.31e-06 ***
## treatment2 0.65398 1.92317 0.34524 1.894 0.0582 .
## score1 0.68735 1.98843 0.38480 1.786 0.0741 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.02797 35.7487 0.006197 0.1263
## treatment2 1.92317 0.5200 0.977571 3.7835
## score1 1.98843 0.5029 0.935333 4.2272
##
## Concordance= 0.812 (se = 0.02 )
## Likelihood ratio test= 53.59 on 3 df, p=1e-11
## Wald test = 26.03 on 3 df, p=9e-06
## Score (logrank) test = 49.78 on 3 df, p=9e-11
## chisq df p
## month 0.101 1 0.750
## treatment 4.957 1 0.026
## score 0.392 1 0.531
## GLOBAL 8.054 3 0.045
Model Cox tanpa outlier tetap tidak mampu menangani pelanggaran asumsi proportional hazard peubah treatment namun terdapat sedikit peningkatan, dibuktikan dari nilai p-value peubah treatment yang lebih besar dari sebelumnya.
Model Non Proportional Hazard
Model Stratified Cox Proportional Hazard Tanpa Interaksi
#Model SC tanpa interaksi
model4 <- coxph(Surv(time,status)~month+score, data=no_outliers)
summary(model4)## Call:
## coxph(formula = Surv(time, status) ~ month + score, data = no_outliers)
##
## n= 48, number of events= 44
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -3.57801 0.02793 0.76373 -4.685 2.8e-06 ***
## score1 0.37652 1.45721 0.35035 1.075 0.283
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.02793 35.8021 0.006252 0.1248
## score1 1.45721 0.6862 0.733334 2.8956
##
## Concordance= 0.779 (se = 0.021 )
## Likelihood ratio test= 50.01 on 2 df, p=1e-11
## Wald test = 23.71 on 2 df, p=7e-06
## Score (logrank) test = 48.37 on 2 df, p=3e-11
## chisq df p
## month 0.185 1 0.67
## score 0.317 1 0.57
## GLOBAL 0.486 2 0.78
## month1 score1
## 1.002896 1.002896
Pada model Stratified Cox Proportional Hazard Tanpa Interaksi, semua peubah dan model global terima H0 yang artinya tidak terdapat pelanggaran asumsi proportional hazard.
Model Stratified Cox Proportional Hazard Dengan Interaksi
#Model SC dengan interaksi
cox_models <- list()
# loop melalui setiap kategori treatment
for (treat in 1:2) {
# Subset data berdasarkan kategori umur
subset_data <- subset(no_outliers, data$treatment == treat)
# Pastikan subset data tidak kosong
if (nrow(subset_data) > 0) {
# Buat model Cox PH untuk subset data
cox_model <- coxph(Surv(time, status) ~ month + score, data = subset_data)
# Simpan model ke dalam list
cox_models[[as.character(treat)]] <- cox_model
# Cetak ringkasan model
cat("\nModel untuk kategori treatment:", treat, "\n")
print(summary(cox_model))
print(cox.zph(cox_model))
} else {
cat("\nTidak ada data untuk kategori treatment:", treat, "\n")
}
}##
## Model untuk kategori treatment: 1
## Call:
## coxph(formula = Surv(time, status) ~ month + score, data = subset_data)
##
## n= 23, number of events= 21
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -3.53476 0.02917 1.08687 -3.252 0.00114 **
## score1 0.20939 1.23293 0.59153 0.354 0.72335
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.02917 34.2867 0.003465 0.2455
## score1 1.23293 0.8111 0.386749 3.9305
##
## Concordance= 0.761 (se = 0.031 )
## Likelihood ratio test= 21.1 on 2 df, p=3e-05
## Wald test = 11.16 on 2 df, p=0.004
## Score (logrank) test = 24.74 on 2 df, p=4e-06
##
## chisq df p
## month 0.00621 1 0.94
## score 0.19928 1 0.66
## GLOBAL 0.20664 2 0.90
##
## Model untuk kategori treatment: 2
## Call:
## coxph(formula = Surv(time, status) ~ month + score, data = subset_data)
##
## n= 25, number of events= 23
## (2 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## month1 -3.4770 0.0309 1.0891 -3.193 0.00141 **
## score1 0.7704 2.1607 0.4871 1.582 0.11373
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## month1 0.0309 32.3639 0.003655 0.2612
## score1 2.1607 0.4628 0.831715 5.6130
##
## Concordance= 0.79 (se = 0.033 )
## Likelihood ratio test= 25.66 on 2 df, p=3e-06
## Wald test = 12.02 on 2 df, p=0.002
## Score (logrank) test = 20.85 on 2 df, p=3e-05
##
## chisq df p
## month 0.411 1 0.52
## score 1.800 1 0.18
## GLOBAL 2.239 2 0.33
Pada model Stratified Cox Proportional Hazard Dengan Interaksi, semua peubah dan model global terima H0 yang artinya tidak terdapat pelanggaran asumsi proportional hazard.
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
Pemilihan Model Terbaik
AIC.model <- data.frame(
Model = c("Model Parametrik dengan Peubah PH",
"Model Parametrik dengan Seleksi Peubah",
"Model Cox Global",
"Model Cox Global Lalu Hapus Peubah Non-PH",
"Model Cox Backward",
"Model Cox Backward Hapus Outlier",
"Model Stratified Cox Proportional Hazard Tanpa Interaksi",
"Model Stratified Cox Proportional Hazard Dengan Interaksi"),
Nilai_AIC = c(AIC(logis_ph.aman),
AIC(logis_seleksi),
AIC(model_global),
AIC(model2),
AIC(back_model),
AIC(model_no),
AIC(model4),
AIC(cox_model))
)
AIC.model## Model Nilai_AIC
## 1 Model Parametrik dengan Peubah PH 407.82227
## 2 Model Parametrik dengan Seleksi Peubah 408.58169
## 3 Model Cox Global 256.47948
## 4 Model Cox Global Lalu Hapus Peubah Non-PH 259.60093
## 5 Model Cox Backward 251.62877
## 6 Model Cox Backward Hapus Outlier 227.40128
## 7 Model Stratified Cox Proportional Hazard Tanpa Interaksi 228.97781
## 8 Model Stratified Cox Proportional Hazard Dengan Interaksi 92.96498
Kesimpulan :
Berdasarkan nilai AIC, model terbaik untuk memodelkan daya tahan hidup penderita kanker paru adalah model Stratified Cox Proportional Hazard Dengan Interaksi.