🧩 Introduction

Write a brief background on survival analysis and state the purpose of this assignment.

Survival analysis models the time until an event of interest occurs, allowing for observations that are censored before the event is seen. In this assignment, the Cox Proportional Hazards model is applied to the lung or ovarian dataset to identify which factors affect the risk of the event.
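
For reference, the Cox model named above is usually written in the form below. The symbols $h_0(t)$, $\beta_j$, and $x_j$ are standard textbook notation and are not defined elsewhere in this handout.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Here $h_0(t)$ is an unspecified baseline hazard, and $\exp(\beta_j)$ is the hazard ratio for a one-unit increase in $x_j$ with the other covariates held fixed. "Proportional hazards" means this ratio does not change over time.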

📊 Exploration & Descriptive Statistics

1. Dataset Choice

Use one of the two datasets below.

(i) First option:

library(survival)  # provides the lung and ovarian datasets, Surv(), survfit(), coxph()
data(lung)
head(lung)
##   inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
## 1    3  306      2  74   1       1       90       100     1175      NA
## 2    3  455      2  68   1       0       90        90     1225      15
## 3    3 1010      1  56   1       0       90        90       NA      15
## 4    5  210      2  57   1       1       90        60     1150      11
## 5    1  883      2  60   1       0      100        90       NA       0
## 6   12 1022      1  74   1       1       50        80      513       0

(ii) Second option:

data(ovarian)
head(ovarian)
##   futime fustat     age resid.ds rx ecog.ps
## 1     59      1 72.3315        2  1       1
## 2    115      1 74.4932        2  1       1
## 3    156      1 66.4658        2  1       2
## 4    421      0 53.3644        2  2       1
## 5    431      1 50.3397        2  1       1
## 6    448      0 56.4301        1  1       2
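
The code in the remaining sections is written for lung. If ovarian is chosen instead, the column names differ. A minimal sketch, assuming treatment group (rx) is the comparison of interest, which is an illustrative choice and not a requirement of the assignment:

```r
library(survival)

# ovarian stores follow-up time in futime and the event indicator in fustat
# (0/1 coding, where Surv() treats 1 as the event), in place of lung's
# time and status columns.
km_ov <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)
print(km_ov)

# The same left-hand side carries over to the log-rank test and the Cox model.
lr_ov <- survdiff(Surv(futime, fustat) ~ rx, data = ovarian)
print(lr_ov)
```

The ovarian dataset is small, so estimates from it will have wide confidence intervals.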

2. Data Description

summary(lung)
##       inst            time            status           age       
##  Min.   : 1.00   Min.   :   5.0   Min.   :1.000   Min.   :39.00  
##  1st Qu.: 3.00   1st Qu.: 166.8   1st Qu.:1.000   1st Qu.:56.00  
##  Median :11.00   Median : 255.5   Median :2.000   Median :63.00  
##  Mean   :11.09   Mean   : 305.2   Mean   :1.724   Mean   :62.45  
##  3rd Qu.:16.00   3rd Qu.: 396.5   3rd Qu.:2.000   3rd Qu.:69.00  
##  Max.   :33.00   Max.   :1022.0   Max.   :2.000   Max.   :82.00  
##  NA's   :1                                                       
##       sex           ph.ecog          ph.karno        pat.karno     
##  Min.   :1.000   Min.   :0.0000   Min.   : 50.00   Min.   : 30.00  
##  1st Qu.:1.000   1st Qu.:0.0000   1st Qu.: 75.00   1st Qu.: 70.00  
##  Median :1.000   Median :1.0000   Median : 80.00   Median : 80.00  
##  Mean   :1.395   Mean   :0.9515   Mean   : 81.94   Mean   : 79.96  
##  3rd Qu.:2.000   3rd Qu.:1.0000   3rd Qu.: 90.00   3rd Qu.: 90.00  
##  Max.   :2.000   Max.   :3.0000   Max.   :100.00   Max.   :100.00  
##                  NA's   :1        NA's   :1        NA's   :3       
##     meal.cal         wt.loss       
##  Min.   :  96.0   Min.   :-24.000  
##  1st Qu.: 635.0   1st Qu.:  0.000  
##  Median : 975.0   Median :  7.000  
##  Mean   : 928.8   Mean   :  9.832  
##  3rd Qu.:1150.0   3rd Qu.: 15.750  
##  Max.   :2600.0   Max.   : 68.000  
##  NA's   :47       NA's   :14
table(lung$status)
## 
##   1   2 
##  63 165
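
The status table above matters for every later step: in lung, status is coded 1 = censored and 2 = dead, and Surv() accepts that 1/2 coding directly. The sketch below makes the coding explicit and counts missing values per column. The names lung2 and event are introduced here for illustration only.

```r
library(survival)

lung2 <- lung
# Original coding: 1 = censored, 2 = dead. Convert to a 0/1 event flag.
lung2$event <- as.integer(lung2$status == 2)

n_events   <- sum(lung2$event)        # number of deaths
n_censored <- sum(lung2$event == 0)   # number of censored observations
na_counts  <- colSums(is.na(lung))    # missing values per column

print(c(events = n_events, censored = n_censored))
print(na_counts[na_counts > 0])
```

Columns with many missing values (meal.cal and wt.loss in the summary above) shrink the sample if they are added to a model, because coxph() drops incomplete rows by default.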

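📈 Kaplan–Meier Analysis

3. Kaplan–Meier Curves

library(survminer)  # provides ggsurvplot(), ggcoxzph(), ggadjustedcurves()
# Kaplan–Meier estimate of survival, stratified by sex
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)
ggsurvplot(km_fit, data = lung, pval = TRUE, risk.table = TRUE,
           surv.median.line = "hv",
           title = "Kaplan–Meier Curve by Sex")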
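
The lines drawn by surv.median.line mark the median survival time in each group. As a sketch, the same numbers can be read from the fitted object without plotting, using only the survival package:

```r
library(survival)

km_fit <- survfit(Surv(time, status) ~ sex, data = lung)

# summary()$table has one row per stratum; the "median" column is the time
# at which the estimated survival probability first reaches 0.5 or below.
km_table <- summary(km_fit)$table
medians  <- km_table[, "median"]
print(medians)
```

In lung, sex is coded 1 = male and 2 = female, so the row labelled sex=2 is the female group.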

4. Log-Rank Test

survdiff(Surv(time, status) ~ sex, data = lung)
## Call:
## survdiff(formula = Surv(time, status) ~ sex, data = lung)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## sex=1 138      112     91.6      4.55      10.3
## sex=2  90       53     73.4      5.68      10.3
## 
##  Chisq= 10.3  on 1 degrees of freedom, p= 0.001
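
The printed p-value above is rounded. The sketch below pulls out the test statistic and recomputes an unrounded p-value from the chi-square distribution, with degrees of freedom equal to the number of groups minus one:

```r
library(survival)

lr <- survdiff(Surv(time, status) ~ sex, data = lung)

# lr$n holds the group sizes, so its length is the number of groups.
df      <- length(lr$n) - 1L
p_value <- pchisq(lr$chisq, df = df, lower.tail = FALSE)

cat(sprintf("Chisq = %.1f on %d df, p = %.4f\n", lr$chisq, df, p_value))
```

A small p-value here indicates that the two survival curves differ more than would be expected by chance. It does not quantify the size of the difference, which is what the Cox model's hazard ratio provides.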

⚙️ Cox Proportional Hazards Modeling

5. Model Estimation

fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
summary(fit)
## Call:
## coxph(formula = Surv(time, status) ~ age + sex + ph.ecog, data = lung)
## 
##   n= 227, number of events= 164 
##    (1 observation deleted due to missingness)
## 
##              coef exp(coef)  se(coef)      z Pr(>|z|)    
## age      0.011067  1.011128  0.009267  1.194 0.232416    
## sex     -0.552612  0.575445  0.167739 -3.294 0.000986 ***
## ph.ecog  0.463728  1.589991  0.113577  4.083 4.45e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##         exp(coef) exp(-coef) lower .95 upper .95
## age        1.0111     0.9890    0.9929    1.0297
## sex        0.5754     1.7378    0.4142    0.7994
## ph.ecog    1.5900     0.6289    1.2727    1.9864
## 
## Concordance= 0.637  (se = 0.025 )
## Likelihood ratio test= 30.5  on 3 df,   p=1e-06
## Wald test            = 29.93  on 3 df,   p=1e-06
## Score (logrank) test = 30.5  on 3 df,   p=1e-06
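
Interpreting the summary above mostly comes down to the exp(coef) column. The sketch below collects hazard ratios and 95% confidence intervals in one table, assuming the same model formula as above. The names hr_table and pct_change are illustrative.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Exponentiate coefficients and their confidence limits to get hazard ratios.
hr_table <- exp(cbind(HR = coef(fit), confint(fit)))
print(round(hr_table, 3))

# Percent change in hazard per one-unit increase in each covariate.
pct_change <- (hr_table[, "HR"] - 1) * 100
print(round(pct_change, 1))
```

With sex coded 1 = male and 2 = female, the hazard ratio of about 0.58 means women have roughly 42% lower hazard than men of the same age and ECOG score. Each one-point increase in ph.ecog raises the hazard by roughly 59%. The interval for age includes 1, consistent with its non-significant p-value.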

6. Testing the Proportional Hazards Assumption

ph_test <- cox.zph(fit)
ph_test
##         chisq df    p
## age     0.188  1 0.66
## sex     2.305  1 0.13
## ph.ecog 2.054  1 0.15
## GLOBAL  4.464  3 0.22
ggcoxzph(ph_test)
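
cox.zph() checks, for each covariate, whether the scaled Schoenfeld residuals show a trend over time. A small p-value suggests that the proportional hazards assumption is violated for that covariate. The sketch below reads the result programmatically. The 0.05 cutoff is a conventional choice and is not fixed by this handout.

```r
library(survival)

fit     <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
ph_test <- cox.zph(fit)

# ph_test$table has one row per covariate plus a GLOBAL row.
p_values   <- ph_test$table[, "p"]
violations <- names(p_values)[p_values < 0.05]

if (length(violations) == 0) {
  message("No evidence against proportional hazards at the 0.05 level.")
} else {
  message("Possible PH violation for: ", paste(violations, collapse = ", "))
}
```

A non-significant test does not prove the assumption holds, so the residual plots should still be inspected for visible trends.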

7. Predicted Survival Curves

ggadjustedcurves(fit, data = lung, variable = "sex",
                 legend.title = "Sex",
                 title = "Adjusted Survival Curves by Sex")
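
The plot above shows curves by sex adjusted for the other covariates in the model. A related but different quantity is the predicted curve for one specific covariate profile, which survfit() can compute from the fitted Cox model. A sketch, with the profile (age 60, ph.ecog 1) and the one-year horizon chosen purely for illustration:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Two hypothetical patients who differ only in sex (1 = male, 2 = female).
profiles <- data.frame(age = 60, sex = c(1, 2), ph.ecog = 1)
pred     <- survfit(fit, newdata = profiles)

# Predicted probability of surviving beyond 365 days for each profile.
surv_1yr <- as.numeric(summary(pred, times = 365)$surv)
names(surv_1yr) <- c("male", "female")
print(round(surv_1yr, 3))
```

Because these curves condition on fixed covariate values, they will not match the adjusted curves exactly.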

🧾 Conclusion

Summarize the main findings (which variables are significant and the direction of their effects).

Explain the implications of the results.

State the limitations of the analysis.

🧮 Grading Rubric (100 points)

Component | Description | Points
Exploration & Descriptive Statistics | Summary statistics, Kaplan–Meier plot, log-rank test | 30
Cox PH Model | Model estimation, interpretation of hazard ratios (HR), PH assumption test, residual plots | 50
Report & Reproducibility | Well-organized report, interpretation of results, and code that reruns without errors | 20
Total | | 100

📚 References

Therneau, T. M., & Grambsch, P. M. (2000). Modeling Survival Data: Extending the Cox Model. Springer.