pkgs <- c(
  "tidyverse", "survival", "survminer",
  "kableExtra", "scales", "gridExtra",
  "lubridate", "patchwork"
)
invisible(lapply(pkgs, function(p) {
  if (!requireNamespace(p, quietly = TRUE)) install.packages(p)
  library(p, character.only = TRUE)
}))

Pendahuluan

Latar Belakang

Analisis ketahanan hidup (survival analysis) adalah cabang statistika yang menganalisis time-to-event data — waktu hingga terjadinya suatu peristiwa. Dalam konteks ini, peristiwa yang diamati adalah kesembuhan pasien COVID-19, dan waktu yang dianalisis adalah lama rawat inap di RSI Malahayati Medan.

Pandemi COVID-19 menuntut pemahaman mendalam mengenai pola kesembuhan pasien. Analisis survival memberikan kerangka matematis yang tepat karena:

  • Mampu mengestimasi fungsi peluang belum sembuh hingga waktu ke-\(t\)
  • Mendukung perbandingan antar kelompok (komorbid, jenis kelamin, usia)
  • Berlaku baik untuk data lengkap maupun tersensor

Penelitian ini mengacu pada skripsi Muhammad Chairul Imam (2021)Analisis Ketahanan Hidup pada Laju Kesembuhan Pasien COVID-19 di RSI Malahayati dengan Menggunakan Metode Kaplan-Meier — yang menggunakan data 138 pasien RSI Malahayati Medan, Januari–Mei 2021.

Konsep Dasar Analisis Survival

Terdapat empat fungsi utama dalam analisis survival:

1. Fungsi Survival \(S(t)\)

\[S(t) = P(T > t) = 1 - F(t)\]

Menyatakan probabilitas bahwa waktu event \(T\) melebihi waktu \(t\). Dimulai dari \(S(0) = 1\) dan bersifat monoton menurun.

2. Fungsi Distribusi Kumulatif \(F(t)\)

\[F(t) = P(T \leq t) = 1 - S(t)\]

Menyatakan proporsi subjek yang telah mengalami event pada atau sebelum waktu \(t\).

3. Fungsi Densitas \(f(t)\)

\[f(t) = \frac{dF(t)}{dt} = -\frac{dS(t)}{dt}\]

Laju perubahan probabilitas event sesaat di sekitar waktu \(t\).

4. Fungsi Hazard \(h(t)\)

\[h(t) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t \mid T \geq t)}{\Delta t} = \frac{f(t)}{S(t)}\]

Laju sesaat terjadinya event, diberikan subjek masih belum mengalami event hingga waktu \(t\). Kumulatif hazard: \(H(t) = -\ln S(t)\).

Hubungan antar fungsi:

\[S(t) = \exp\left(-\int_0^t h(u)\,du\right) = \exp(-H(t))\]

Penyensoran (Censoring)

Penyensoran terjadi ketika informasi waktu event tidak lengkap. Jenis utama:

  • Penyensoran Kanan (Right Censoring): subjek belum mengalami event saat periode pengamatan berakhir. Ini adalah bentuk paling umum.
  • Penyensoran Kiri (Left Censoring): event diketahui terjadi sebelum titik pengamatan pertama.
  • Penyensoran Interval (Interval Censoring): event terjadi dalam interval waktu yang diketahui, tetapi waktu pastinya tidak diketahui.

Tipe Penyensoran Kanan:

  • Tipe I: pengamatan berakhir pada waktu tetap yang ditentukan sebelumnya
  • Tipe II: pengamatan berakhir setelah sejumlah event tetap tercapai
  • Acak (Random): waktu penyensoran bersifat acak dan independen dari event

Catatan Metodologis: Status Penyensoran pada Data Penelitian

Penting: Seluruh 138 pasien dalam dataset ini tercatat sembuh (status = 1), sehingga secara teknis tidak terdapat data tersensor kanan. Kondisi ini konsisten dengan desain penelitian Imam (2021), yang secara eksplisit membatasi subjek hanya pada pasien yang mencapai event kesembuhan selama periode pengamatan Januari–Mei 2021.

Konsekuensi statistik dari ketiadaan penyensoran adalah bahwa ketiga metode analisis survival yang digunakan dalam modul ini — Metode Empiris, Life Table, dan Kaplan-Meier — secara matematis menghasilkan estimasi \(\hat{S}(t)\) yang identik. Hal ini bukan kebetulan, melainkan konsekuensi langsung dari dua kondisi: (1) \(c_j = 0\) untuk semua interval \(j\) sehingga koreksi aktuaria pada Life Table menjadi nol, dan (2) lebar interval Life Table \(b = 1\) hari sehingga tidak ada kehilangan informasi akibat pengelompokan.

Mengapa skenario penyensoran artifisial tetap disajikan? Dalam praktik studi klinis nyata, penyensoran kanan terjadi ketika periode pengamatan berakhir sebelum seluruh subjek mengalami event — misalnya karena pasien drop-out, meninggal akibat sebab lain (competing risk), atau studi dihentikan lebih awal. Untuk mendemonstrasikan kemampuan Kaplan-Meier dalam menangani data tidak lengkap, modul ini menyertakan skenario penyensoran Tipe I (Kleinbaum & Klein, 2012) dengan batas waktu pengamatan 31 Maret 2021 sebagai perbandingan. Pasien yang belum keluar pada tanggal tersebut diklasifikasikan sebagai tersensor kanan (status = 0), sesuai definisi penyensoran Tipe I.

Analisis utama tetap menggunakan data asli (semua sembuh), dan skenario penyensoran disajikan secara terpisah pada Bab 9.


Data

Input Data

data_raw <- tribble(
  ~no, ~nama, ~jenis_kelamin, ~usia,
  ~tgl_masuk, ~tgl_keluar, ~lama_rawat,
  ~diagnosa, ~kategori_dx, ~komorbid,
  ~kelompok_usia, ~status,
  1,"Carolina","Perempuan",40,"27.01.2021","04.02.2021",8,"Positif Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  2,"Hikmah","Perempuan",63,"28.01.2021","08.02.2021",11,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  3,"(Tidak Disebutkan)","Laki-laki",30,"28.01.2021","30.01.2021",2,"Susp. Covid tanpa komorbid","Suspect","Komorbid","< 40 Tahun",1,
  4,"Abdul Gani","Laki-laki",58,"29.01.2021","08.02.2021",10,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  5,"Irwanto S","Laki-laki",61,"30.01.2021","02.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  6,"Christina H","Perempuan",70,"01.02.2021","10.02.2021",9,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  7,"Ahmad L","Laki-laki",29,"01.02.2021","04.02.2021",3,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  8,"Hj. Nuraini","Perempuan",60,"01.02.2021","11.02.2021",10,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  9,"Musnita","Perempuan",NA,"01.02.2021","04.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid","Tidak Diketahui",1,
  10,"Dedi Kurniawan","Laki-laki",33,"02.02.2021","08.02.2021",6,"Positif Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  11,"Juli Hafni Rikana","Perempuan",44,"02.02.2021","11.02.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  12,"Eliboe Lingga","Laki-laki",30,"03.02.2021","05.02.2021",2,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  13,"Heru Adi W","Laki-laki",31,"04.02.2021","14.02.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  14,"Hj. Syahrita","Perempuan",59,"05.02.2021","07.02.2021",2,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  15,"Akhmad R","Laki-laki",23,"05.02.2021","08.02.2021",3,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  16,"Diana Harlini","Perempuan",46,"08.02.2021","16.02.2021",8,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  17,"Hj. Hanifah Hanum","Perempuan",76,"08.02.2021","11.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  18,"Eka Dian M","Perempuan",33,"10.02.2021","19.02.2021",9,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  19,"Marijono","Laki-laki",73,"11.02.2021","14.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  20,"Manuntun S","Laki-laki",64,"12.02.2021","14.02.2021",2,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  21,"Mariana","Perempuan",37,"12.02.2021","20.02.2021",8,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  22,"Eva Sartika","Perempuan",31,"12.02.2021","15.02.2021",3,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  23,"T Zahrial Fauza","Laki-laki",52,"13.02.2021","16.02.2021",3,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  24,"Lela Azwani","Perempuan",76,"15.02.2021","26.02.2021",11,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  25,"Yunawati Siregar","Perempuan",42,"17.02.2021","01.03.2021",12,"Positif Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  26,"Sukma Lesmana","Laki-laki",46,"17.02.2021","01.03.2021",12,"Positif Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  27,"Amrie Darussamin","Laki-laki",56,"18.02.2021","21.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  28,"Saddam Husein","Laki-laki",28,"18.02.2021","20.02.2021",2,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  29,"Nining Ekawati","Perempuan",40,"19.02.2021","03.03.2021",12,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  30,"Syifa Annisa","Perempuan",19,"20.02.2021","01.03.2021",9,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  31,"Hasan Salim","Laki-laki",82,"20.02.2021","23.02.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  32,"Srika Aryunita","Perempuan",19,"20.02.2021","02.03.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  33,"Sugianto","Laki-laki",60,"22.02.2021","01.03.2021",7,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  34,"Nadrah Zairina","Perempuan",56,"22.02.2021","26.02.2021",4,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  35,"Sukarsih","Perempuan",59,"23.02.2021","04.03.2021",9,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  36,"Taslim","Laki-laki",69,"24.02.2021","28.02.2021",4,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  37,"Ellyta Puspa","Perempuan",53,"25.02.2021","06.03.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  38,"Bayu Andika S","Laki-laki",36,"28.02.2021","10.03.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  39,"Diyatna Munajat","Laki-laki",30,"01.03.2021","03.03.2021",2,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  40,"Yanto S","Laki-laki",65,"02.03.2021","12.03.2021",10,"Confirm Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  41,"Marahalim Purba","Laki-laki",49,"04.03.2021","16.03.2021",12,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  42,"Irhamul Khairi","Laki-laki",27,"07.03.2021","17.03.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  43,"Sukarto","Laki-laki",43,"08.03.2021","11.03.2021",3,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  44,"Farida Jayasumarta SE","Perempuan",61,"10.03.2021","19.03.2021",9,"Confirm Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  45,"Nursita Elisabeth","Perempuan",58,"10.03.2021","18.03.2021",8,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  46,"Elvina Sari","Perempuan",24,"10.03.2021","18.03.2021",8,"Susp. Covid + komorbid","Suspect","Komorbid","< 40 Tahun",1,
  47,"Eka Syahputra","Laki-laki",48,"11.03.2021","22.03.2021",11,"Confirm Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  48,"Rahmad Syahputra","Laki-laki",46,"11.03.2021","17.03.2021",6,"Confirm Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  49,"Trimo","Laki-laki",57,"11.03.2021","14.03.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  50,"Dewani Harahap","Perempuan",77,"13.03.2021","14.03.2021",1,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  51,"Albadri","Laki-laki",51,"13.03.2021","24.03.2021",11,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  52,"Nurbaini Pasaribu","Perempuan",57,"14.03.2021","23.03.2021",9,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  53,"Zulfikar Riyantoro","Laki-laki",44,"14.03.2021","23.03.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  54,"Piza Faurika","Laki-laki",24,"18.03.2021","26.03.2021",8,"Confirm Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  55,"Rifky Budi Setiawan","Laki-laki",35,"18.03.2021","26.03.2021",8,"Confirm Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  56,"Sri Astuti","Perempuan",51,"18.03.2021","28.03.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  57,"Banun Setiawaty","Perempuan",NA,"22.03.2021","23.03.2021",1,"Susp. Covid + komorbid","Suspect","Komorbid","Tidak Diketahui",1,
  58,"Rosmiati","Perempuan",57,"22.03.2021","31.03.2021",9,"Confirm Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  59,"Edward S.H","Laki-laki",61,"22.03.2021","25.03.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  60,"Nuraini","Perempuan",69,"23.03.2021","02.04.2021",10,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  61,"Risha Rahmadani","Perempuan",21,"23.03.2021","03.04.2021",11,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  62,"Ernawati Siregar","Perempuan",51,"24.03.2021","26.03.2021",2,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  63,"Sri Indriani","Perempuan",37,"24.03.2021","04.04.2021",11,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  64,"Dumasari","Perempuan",58,"25.03.2021","04.04.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  65,"Tria Rahayu","Perempuan",31,"25.03.2021","28.03.2021",3,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  66,"Sri Irmadhani","Perempuan",44,"25.03.2021","05.04.2021",11,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  67,"Rafika Halim","Perempuan",27,"26.03.2021","28.03.2021",2,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  68,"Charles Surbakti","Laki-laki",37,"26.03.2021","31.03.2021",5,"Susp. Covid + komorbid","Suspect","Komorbid","< 40 Tahun",1,
  69,"Erlina","Perempuan",59,"29.03.2021","09.04.2021",11,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  70,"Supriono","Laki-laki",34,"31.03.2021","10.04.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  71,"Dewi Ayu Karina","Perempuan",34,"31.03.2021","06.04.2021",6,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  72,"Arfianti","Perempuan",52,"01.04.2021","15.04.2021",14,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  73,"Dzul Hanief","Laki-laki",32,"02.04.2021","08.04.2021",6,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  74,"Akhramida","Perempuan",61,"05.04.2021","07.04.2021",2,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  75,"Indra Jaya","Laki-laki",51,"05.04.2021","14.04.2021",9,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  76,"Sunaryo","Laki-laki",39,"07.04.2021","18.04.2021",11,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  77,"Tusirin Saleh","Laki-laki",67,"08.04.2021","12.04.2021",4,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  78,"Hari Susanto","Laki-laki",36,"09.04.2021","20.04.2021",11,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  79,"Inka Fitri Rambe","Perempuan",30,"09.04.2021","13.04.2021",4,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  80,"Rubiah","Perempuan",56,"10.04.2021","13.04.2021",3,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  81,"Wulandari","Perempuan",36,"10.04.2021","17.04.2021",7,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  82,"Kurniawati Simarmata","Perempuan",35,"10.04.2021","22.04.2021",12,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  83,"Arbiah Syahrial","Perempuan",60,"11.04.2021","20.04.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  84,"Viodita Rani","Perempuan",26,"13.04.2021","17.04.2021",4,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  85,"Noni Sundari","Perempuan",25,"14.04.2021","19.04.2021",5,"Susp.Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  86,"Irham SH","Laki-laki",59,"15.04.2021","23.04.2021",8,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  87,"Ismet SE","Laki-laki",50,"15.04.2021","19.04.2021",4,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  88,"Rindy Utari","Perempuan",30,"18.04.2021","26.04.2021",8,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  89,"Djalaluddin","Laki-laki",60,"19.04.2021","22.04.2021",3,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  90,"Adam","Laki-laki",26,"19.04.2021","26.04.2021",7,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  91,"Rini","Perempuan",44,"19.04.2021","28.04.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  92,"Juni Anshar Hrp","Laki-laki",50,"20.04.2021","30.04.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  93,"Elfianda Hutagalung","Perempuan",57,"20.04.2021","05.05.2021",15,"Terkonfirmasi Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  94,"M. Salfarizi Akbar Srg","Laki-laki",33,"21.04.2021","01.05.2021",10,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  95,"DT O Fahmi","Laki-laki",54,"21.04.2021","01.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  96,"Mirawati","Perempuan",40,"23.04.2021","02.05.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  97,"Dina","Perempuan",32,"23.04.2021","02.05.2021",9,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  98,"Tiomarni","Perempuan",54,"23.04.2021","03.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  99,"Zubaidah P","Perempuan",55,"26.04.2021","05.05.2021",9,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  100,"Maruli Hamonangan D","Laki-laki",25,"27.04.2021","05.05.2021",8,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  101,"Dessy Laras Putri","Perempuan",31,"29.04.2021","07.05.2021",8,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  102,"Yuni Setiani","Perempuan",55,"30.04.2021","07.05.2021",7,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  103,"Siau Ong","Laki-laki",53,"01.05.2021","09.05.2021",8,"Terkonfirmasi Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  104,"Ratna Farieda","Perempuan",51,"01.05.2021","11.05.2021",10,"Terkonfirmasi Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  105,"Ahmad Sadli","Laki-laki",37,"01.05.2021","05.05.2021",4,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  106,"Heri Pranoto","Laki-laki",49,"02.05.2021","08.05.2021",6,"Terkonfirmasi Covid","Positif","Non-Komorbid",">= 40 Tahun",1,
  107,"Muhammad Irsyad","Laki-laki",38,"03.05.2021","12.05.2021",9,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  108,"Andika Prasetyo S","Laki-laki",31,"04.05.2021","12.05.2021",8,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  109,"Indra Bayu","Laki-laki",37,"05.05.2021","10.05.2021",5,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  110,"Septian Joharis","Laki-laki",32,"06.05.2021","13.05.2021",7,"Konfirm Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  111,"Putri Nazria","Perempuan",31,"06.05.2021","13.05.2021",7,"Konfirm Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  112,"Budi Prahara","Laki-laki",39,"07.05.2021","14.05.2021",7,"Susp. Covid + komorbid","Suspect","Komorbid","< 40 Tahun",1,
  113,"Krisna Dame Siregar","Laki-laki",70,"07.05.2021","16.05.2021",9,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  114,"M Reza Andika","Laki-laki",23,"07.05.2021","17.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  115,"Hj. Rosmaini","Perempuan",62,"08.05.2021","15.05.2021",7,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  116,"Mhd Baiturrahman","Laki-laki",28,"09.05.2021","16.05.2021",7,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1,
  117,"Syahrawardi IR","Laki-laki",64,"10.05.2021","18.05.2021",8,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  118,"Arrie Rahadi","Laki-laki",31,"12.05.2021","21.05.2021",9,"Terkonfirmasi Covid + komorbid","Positif","Komorbid","< 40 Tahun",1,
  119,"Duma Sari Rambe","Perempuan",59,"13.05.2021","23.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  120,"Pijor Nasution","Laki-laki",66,"13.05.2021","23.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  121,"Feaby Novianty G","Perempuan",27,"14.05.2021","18.05.2021",4,"Susp. Covid + komorbid","Suspect","Komorbid","< 40 Tahun",1,
  122,"Dewi Loriana Nst","Perempuan",27,"14.05.2021","20.05.2021",6,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  123,"Arianto","Laki-laki",45,"15.05.2021","25.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  124,"Dinni","Perempuan",43,"15.05.2021","25.05.2021",10,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  125,"Doni S","Laki-laki",32,"15.05.2021","24.05.2021",9,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  126,"Siti Khodiah","Perempuan",30,"17.05.2021","25.05.2021",8,"Terkonfirmasi Covid + komorbid","Positif","Komorbid","< 40 Tahun",1,
  127,"Amartha S","Laki-laki",20,"17.05.2021","23.05.2021",6,"Susp. Covid","Suspect","Non-Komorbid","< 40 Tahun",1,
  128,"Jimmy N","Laki-laki",44,"18.05.2021","24.05.2021",6,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  129,"Lindayati","Perempuan",55,"19.05.2021","26.05.2021",7,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  130,"Basri Idris","Laki-laki",70,"21.05.2021","31.05.2021",10,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  131,"Ahmad Bintang H","Laki-laki",NA,"22.05.2021","01.06.2021",10,"Susp. Covid","Suspect","Non-Komorbid","Tidak Diketahui",1,
  132,"Real D Hsb","Laki-laki",45,"24.05.2021","31.05.2021",7,"Terkonfirmasi Covid + komorbid","Positif","Komorbid",">= 40 Tahun",1,
  133,"Pipit H","Perempuan",43,"24.05.2021","31.05.2021",7,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  134,"Ris Yunita","Perempuan",59,"24.05.2021","29.05.2021",5,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  135,"Ramiono","Laki-laki",63,"24.05.2021","31.05.2021",7,"Susp. Covid + komorbid","Suspect","Komorbid",">= 40 Tahun",1,
  136,"Irni Faizzati","Perempuan",43,"25.05.2021","01.06.2021",7,"Susp.Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  137,"Roswita","Perempuan",53,"25.05.2021","01.06.2021",7,"Susp. Covid","Suspect","Non-Komorbid",">= 40 Tahun",1,
  138,"Cut Ratna M","Perempuan",38,"25.05.2021","31.05.2021",6,"Terkonfirmasi Covid","Positif","Non-Komorbid","< 40 Tahun",1
)

# Pembersihan dan faktorisasi
data_covid <- data_raw |>
  mutate(
    tgl_masuk      = dmy(tgl_masuk),
    tgl_keluar     = dmy(tgl_keluar),
    jenis_kelamin  = factor(jenis_kelamin, levels = c("Laki-laki", "Perempuan")),
    komorbid       = factor(komorbid,      levels = c("Non-Komorbid", "Komorbid")),
    kategori_dx    = factor(kategori_dx,   levels = c("Suspect", "Positif")),
    kelompok_usia  = factor(kelompok_usia, levels = c("< 40 Tahun", ">= 40 Tahun", "Tidak Diketahui"))
  )

cat("Total observasi   :", nrow(data_covid), "\n")
## Total observasi   : 138
cat("Rentang lama rawat:", min(data_covid$lama_rawat), "-", max(data_covid$lama_rawat), "hari\n")
## Rentang lama rawat: 1 - 15 hari
cat("Jumlah event      :", sum(data_covid$status), "(semua sembuh)\n")
## Jumlah event      : 138 (semua sembuh)
cat("Jumlah tersensor  :", sum(data_covid$status == 0), "\n")
## Jumlah tersensor  : 0

Tampilan Data

data_covid |>
  select(no, jenis_kelamin, usia, lama_rawat, kategori_dx, komorbid, kelompok_usia, status) |>
  head(10) |>
  kable(
    caption    = "Tabel 1. Sepuluh Baris Pertama Data Pasien COVID-19 RSI Malahayati",
    col.names  = c("No", "Jenis Kelamin", "Usia", "Lama Rawat (Hari)",
                   "Kategori Dx", "Komorbid", "Kelompok Usia", "Status"),
    align      = "c"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 1. Sepuluh Baris Pertama Data Pasien COVID-19 RSI Malahayati
No Jenis Kelamin Usia Lama Rawat (Hari) Kategori Dx Komorbid Kelompok Usia Status
1 Perempuan 40 8 Positif Komorbid >= 40 Tahun 1
2 Perempuan 63 11 Suspect Non-Komorbid >= 40 Tahun 1
3 Laki-laki 30 2 Suspect Komorbid < 40 Tahun 1
4 Laki-laki 58 10 Suspect Komorbid >= 40 Tahun 1
5 Laki-laki 61 3 Suspect Komorbid >= 40 Tahun 1
6 Perempuan 70 9 Suspect Komorbid >= 40 Tahun 1
7 Laki-laki 29 3 Suspect Non-Komorbid < 40 Tahun 1
8 Perempuan 60 10 Suspect Komorbid >= 40 Tahun 1
9 Perempuan NA 3 Suspect Komorbid Tidak Diketahui 1
10 Laki-laki 33 6 Positif Non-Komorbid < 40 Tahun 1

Distribusi Karakteristik

mk_tbl <- function(data, var, label_var, caption_txt) {
  data |>
    count({{ var }}, name = "Frekuensi") |>
    mutate(Persentase = paste0(round(Frekuensi / sum(Frekuensi) * 100, 1), "%")) |>
    rename(!!label_var := {{ var }}) |>
    kable(caption = caption_txt, align = c("l", "c", "c")) |>
    kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                  full_width = FALSE, font_size = 11)
}

mk_tbl(data_covid, jenis_kelamin, "Jenis Kelamin",       "Tabel 2a. Distribusi Jenis Kelamin")
Tabel 2a. Distribusi Jenis Kelamin
Jenis Kelamin Frekuensi Persentase
Laki-laki 68 49.3%
Perempuan 70 50.7%
mk_tbl(data_covid, komorbid,      "Status Komorbid",     "Tabel 2b. Distribusi Status Komorbid")
Tabel 2b. Distribusi Status Komorbid
Status Komorbid Frekuensi Persentase
Non-Komorbid 90 65.2%
Komorbid 48 34.8%
mk_tbl(data_covid, kategori_dx,   "Kategori Diagnosis",  "Tabel 2c. Distribusi Kategori Diagnosis")
Tabel 2c. Distribusi Kategori Diagnosis
Kategori Diagnosis Frekuensi Persentase
Suspect 105 76.1%
Positif 33 23.9%
mk_tbl(data_covid, kelompok_usia, "Kelompok Usia",       "Tabel 2d. Distribusi Kelompok Usia")
Tabel 2d. Distribusi Kelompok Usia
Kelompok Usia Frekuensi Persentase
< 40 Tahun 55 39.9%
>= 40 Tahun 80 58%
Tidak Diketahui 3 2.2%
data_covid |>
  count(lama_rawat, name = "Frekuensi") |>
  mutate(
    Persentase   = paste0(round(Frekuensi / sum(Frekuensi) * 100, 1), "%"),
    Kumulatif_n  = cumsum(Frekuensi),
    Kumulatif_pct = paste0(round(Kumulatif_n / sum(Frekuensi) * 100, 1), "%")
  ) |>
  rename(`Lama Rawat (Hari)` = lama_rawat) |>
  kable(
    caption   = "Tabel 2e. Distribusi Frekuensi Lama Rawat",
    col.names = c("Lama Rawat (Hari)", "Frekuensi", "Persentase",
                  "Kumulatif (n)", "Kumulatif (%)"),
    align     = rep("c", 5)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 2e. Distribusi Frekuensi Lama Rawat
Lama Rawat (Hari) Frekuensi Persentase Kumulatif (n) Kumulatif (%)
1 2 1.4% 2 1.4%
2 9 6.5% 11 8%
3 16 11.6% 27 19.6%
4 8 5.8% 35 25.4%
5 4 2.9% 39 28.3%
6 9 6.5% 48 34.8%
7 15 10.9% 63 45.7%
8 15 10.9% 78 56.5%
9 20 14.5% 98 71%
10 23 16.7% 121 87.7%
11 10 7.2% 131 94.9%
12 5 3.6% 136 98.6%
14 1 0.7% 137 99.3%
15 1 0.7% 138 100%

Analisis Deskriptif

Statistik Ringkasan

n_total <- nrow(data_covid)

stats_lr <- data_covid |>
  summarise(
    N      = n(),
    Min    = min(lama_rawat),
    Maks   = max(lama_rawat),
    Rerata = round(mean(lama_rawat), 2),
    Median = median(lama_rawat),
    SD     = round(sd(lama_rawat), 2),
    Q1     = quantile(lama_rawat, 0.25),
    Q3     = quantile(lama_rawat, 0.75),
    IQR    = IQR(lama_rawat),
    Skewness = round(
      mean((lama_rawat - mean(lama_rawat))^3) / sd(lama_rawat)^3, 3
    ),
    Kurtosis = round(
      mean((lama_rawat - mean(lama_rawat))^4) / sd(lama_rawat)^4, 3
    )
  )

t(stats_lr) |>
  kable(
    caption   = "Tabel 3. Statistik Deskriptif Lama Rawat Pasien COVID-19",
    col.names = "Nilai",
    align     = "c"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, font_size = 11)
Tabel 3. Statistik Deskriptif Lama Rawat Pasien COVID-19
Nilai
N 138.000
Min 1.000
Maks 15.000
Rerata 7.300
Median 8.000
SD 3.120
Q1 4.250
Q3 10.000
IQR 5.750
Skewness -0.273
Kurtosis 2.135

Nilai skewness positif menunjukkan distribusi sedikit miring ke kanan, konsisten dengan adanya beberapa pasien dengan lama rawat relatif panjang (\(\geq 12\) hari).

Visualisasi Distribusi dan Karakteristik

mk_bar <- function(data, var, fill_col, title_txt) {
  data |>
    count({{ var }}) |>
    mutate(pct = n / sum(n), label = paste0(n, "\n(", round(pct * 100, 1), "%)")) |>
    ggplot(aes(x = {{ var }}, y = n, fill = {{ var }})) +
    geom_col(alpha = 0.85, width = 0.6) +
    geom_text(aes(label = label), vjust = -0.3, size = 3.2, fontface = "bold") +
    scale_fill_manual(values = fill_col) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.18))) +
    labs(title = title_txt, x = NULL, y = "Frekuensi") +
    theme_classic(base_size = 10) +
    theme(legend.position = "none",
          plot.title = element_text(face = "bold", size = 10),
          axis.text.x = element_text(size = 9))
}

p_bar_jk <- mk_bar(data_covid, jenis_kelamin,
                   c("Laki-laki" = "#42A5F5", "Perempuan" = "#EC407A"),
                   "Jenis Kelamin")

p_bar_km <- mk_bar(data_covid, komorbid,
                   c("Non-Komorbid" = "#66BB6A", "Komorbid" = "#FF7043"),
                   "Status Komorbid")

p_bar_dx <- mk_bar(data_covid, kategori_dx,
                   c("Suspect" = "#AB47BC", "Positif" = "#EF5350"),
                   "Kategori Diagnosis")

p_bar_us <- data_covid |>
  count(kelompok_usia) |>
  mutate(pct = n / sum(n), label = paste0(n, "\n(", round(pct * 100, 1), "%)")) |>
  ggplot(aes(x = kelompok_usia, y = n, fill = kelompok_usia)) +
  geom_col(alpha = 0.85, width = 0.6) +
  geom_text(aes(label = label), vjust = -0.3, size = 3.2, fontface = "bold") +
  scale_fill_manual(values = c("< 40 Tahun" = "#00897B",
                                ">= 40 Tahun" = "#F4511E",
                                "Tidak Diketahui" = "#9E9E9E")) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.18))) +
  labs(title = "Kelompok Usia", x = NULL, y = "Frekuensi") +
  theme_classic(base_size = 10) +
  theme(legend.position = "none",
        plot.title = element_text(face = "bold", size = 10),
        axis.text.x = element_text(size = 8))

p_hist <- ggplot(data_covid, aes(x = lama_rawat)) +
  geom_histogram(aes(y = after_stat(density)), bins = 14,
                 fill = "#1976D2", color = "white", alpha = 0.85) +
  geom_density(color = "#C62828", linewidth = 1.1) +
  geom_vline(xintercept = mean(data_covid$lama_rawat),
             linetype = "dashed", color = "#F57F17", linewidth = 0.9) +
  geom_vline(xintercept = median(data_covid$lama_rawat),
             linetype = "dotted", color = "#2E7D32", linewidth = 0.9) +
  annotate("text", x = mean(data_covid$lama_rawat) + 0.4, y = 0.14,
           label = paste0("Rerata = ", round(mean(data_covid$lama_rawat), 1), " hari"),
           hjust = 0, color = "#F57F17", size = 3) +
  annotate("text", x = median(data_covid$lama_rawat) - 0.4, y = 0.12,
           label = paste0("Median = ", median(data_covid$lama_rawat), " hari"),
           hjust = 1, color = "#2E7D32", size = 3) +
  scale_x_continuous(breaks = 1:15) +
  labs(title = "Distribusi Lama Rawat", x = "Lama Rawat (Hari)", y = "Kepadatan") +
  theme_classic(base_size = 10) +
  theme(plot.title = element_text(face = "bold", size = 10))

# Boxplot per kelompok
mk_box <- function(data, var, fill_col, title_txt) {
  ggplot(data, aes(x = {{ var }}, y = lama_rawat, fill = {{ var }})) +
    geom_boxplot(alpha = 0.75, outlier.color = "#C62828") +
    geom_jitter(width = 0.15, alpha = 0.25, size = 1.3) +
    scale_fill_manual(values = fill_col) +
    labs(title = title_txt, x = NULL, y = "Hari") +
    theme_classic(base_size = 10) +
    theme(legend.position = "none",
          plot.title = element_text(face = "bold", size = 10))
}

p_box_jk <- mk_box(data_covid, jenis_kelamin,
                   c("Laki-laki" = "#42A5F5", "Perempuan" = "#EC407A"),
                   "Lama Rawat x Jenis Kelamin")

p_box_km <- mk_box(data_covid, komorbid,
                   c("Non-Komorbid" = "#66BB6A", "Komorbid" = "#FF7043"),
                   "Lama Rawat x Komorbid")

p_box_dx <- mk_box(data_covid, kategori_dx,
                   c("Suspect" = "#AB47BC", "Positif" = "#EF5350"),
                   "Lama Rawat x Kategori Dx")

p_box_us <- data_covid |>
  filter(kelompok_usia != "Tidak Diketahui") |>
  ggplot(aes(x = kelompok_usia, y = lama_rawat, fill = kelompok_usia)) +
  geom_boxplot(alpha = 0.75, outlier.color = "#C62828") +
  geom_jitter(width = 0.15, alpha = 0.25, size = 1.3) +
  scale_fill_manual(values = c("< 40 Tahun" = "#00897B", ">= 40 Tahun" = "#F4511E")) +
  labs(title = "Lama Rawat x Kelompok Usia", x = NULL, y = "Hari") +
  theme_classic(base_size = 10) +
  theme(legend.position = "none",
        plot.title = element_text(face = "bold", size = 10))

gridExtra::grid.arrange(
  p_bar_jk, p_bar_km, p_bar_dx, p_bar_us,
  p_hist, p_box_jk, p_box_km, p_box_dx,
  ncol = 4
)
Gambar 1. Distribusi Karakteristik dan Lama Rawat Pasien COVID-19 RSI Malahayati

Gambar 1. Distribusi Karakteristik dan Lama Rawat Pasien COVID-19 RSI Malahayati


Metode Empiris (Fungsi Survivor Nonparametrik)

Konsep dan Rumusan

Fungsi Survivor Empiris adalah estimator nonparametrik paling sederhana — tidak memerlukan asumsi distribusi. Karena seluruh 138 pasien mengalami event (sembuh), metode empiris memberikan estimasi langsung dari data observasi.

Fungsi Distribusi Kumulatif Empiris (ECDF):

\[\hat{F}(t) = \frac{\#\{i : T_i \leq t\}}{n} = \frac{\text{banyak pasien dengan lama rawat} \leq t}{n}\]

Fungsi Survivor Empiris:

\[\hat{S}(t) = 1 - \hat{F}(t) = \frac{\#\{i : T_i > t\}}{n} = \frac{\text{banyak pasien dengan lama rawat} > t}{n}\]

Sifat-sifat estimator empiris: - Tidak bias: \(E[\hat{S}(t)] = S(t)\) untuk semua \(t\) - Variansi: \(\text{Var}[\hat{S}(t)] = \frac{S(t)[1-S(t)]}{n}\) (berdasarkan distribusi binomial) - Konsisten: \(\hat{S}(t) \xrightarrow{p} S(t)\) saat \(n \to \infty\) (Teorema Glivenko-Cantelli)

Algoritma Perhitungan Manual

Langkah-langkah perhitungan fungsi survivor empiris:

  1. Urutkan data lama rawat: \(t_{(1)} \leq t_{(2)} \leq \cdots \leq t_{(n)}\)
  2. Identifikasi waktu-waktu unik: \(t_1 < t_2 < \cdots < t_k\)
  3. Pada setiap \(t\), hitung jumlah event kumulatif: \(D(t) = \#\{i : T_i \leq t\}\)
  4. Hitung \(\hat{F}(t) = D(t)/n\) dan \(\hat{S}(t) = 1 - \hat{F}(t)\)

Perhitungan Manual Lengkap

t_unik <- sort(unique(data_covid$lama_rawat))

empiris_tbl <- tibble(
  t       = t_unik,
  d_t     = map_int(t_unik, ~ sum(data_covid$lama_rawat == .x)),
  n_kum   = cumsum(map_int(t_unik, ~ sum(data_covid$lama_rawat == .x))),
  Ft      = round(n_kum / n_total, 6),
  St      = round(1 - Ft, 6),
  Var_St  = round(St * (1 - St) / n_total, 8),
  SE_St   = round(sqrt(Var_St), 6),
  CI_lo   = round(pmax(St - 1.96 * SE_St, 0), 6),
  CI_up   = round(pmin(St + 1.96 * SE_St, 1), 6)
)

empiris_display <- bind_rows(
  tibble(t = 0, d_t = 0, n_kum = 0, Ft = 0, St = 1,
         Var_St = 0, SE_St = 0, CI_lo = 1, CI_up = 1),
  empiris_tbl
)

empiris_display |>
  select(t, d_t, n_kum, Ft, St, SE_St, CI_lo, CI_up) |>
  kable(
    caption   = "Tabel 4. Fungsi Survivor Empiris — Pasien COVID-19 RSI Malahayati",
    col.names = c("t (Hari)", "d_t (event)", "Kum. Event",
                  "F-hat(t)", "S-hat(t)", "SE[S-hat(t)]",
                  "CI Bawah 95%", "CI Atas 95%"),
    digits    = 5,
    align     = rep("c", 8)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 4. Fungsi Survivor Empiris — Pasien COVID-19 RSI Malahayati
t (Hari) d_t (event) Kum. Event F-hat(t) S-hat(t) SE[S-hat(t)] CI Bawah 95% CI Atas 95%
0 0 0 0.00000 1.00000 0.00000 1.00000 1.00000
1 2 2 0.01449 0.98551 0.01017 0.96557 1.00000
2 9 11 0.07971 0.92029 0.02306 0.87510 0.96548
3 16 27 0.19565 0.80435 0.03377 0.73816 0.87054
4 8 35 0.25362 0.74638 0.03704 0.67378 0.81897
5 4 39 0.28261 0.71739 0.03833 0.64227 0.79252
6 9 48 0.34783 0.65217 0.04054 0.57271 0.73164
7 15 63 0.45652 0.54348 0.04240 0.46037 0.62659
8 15 78 0.56522 0.43478 0.04220 0.35207 0.51749
9 20 98 0.71015 0.28985 0.03862 0.21416 0.36555
10 23 121 0.87681 0.12319 0.02798 0.06835 0.17802
11 10 131 0.94928 0.05072 0.01868 0.01411 0.08734
12 5 136 0.98551 0.01449 0.01017 0.00000 0.03443
14 1 137 0.99275 0.00725 0.00722 0.00000 0.02140
15 1 138 1.00000 0.00000 0.00000 0.00000 0.00000

Verifikasi Manual pada \(t = 7\) Hari

sebelum7 <- sum(data_covid$lama_rawat < 7)
tepat7   <- sum(data_covid$lama_rawat == 7)
lebih7   <- sum(data_covid$lama_rawat > 7)

verif_df <- tibble(
  Keterangan = c(
    "Pasien lama rawat < 7 hari (sudah sembuh sebelum hari ke-7)",
    "Pasien lama rawat = 7 hari (sembuh tepat hari ke-7)",
    "Pasien lama rawat > 7 hari (belum sembuh pada hari ke-7)",
    "F-hat(7) = jumlah lama rawat ≤ 7 / n",
    "S-hat(7) = jumlah lama rawat > 7 / n"
  ),
  Perhitungan = c(
    paste0(sebelum7, " pasien"),
    paste0(tepat7, " pasien"),
    paste0(lebih7, " pasien"),
    paste0("(", sebelum7 + tepat7, ") / ", n_total, " = ",
           round((sebelum7 + tepat7) / n_total, 5)),
    paste0(lebih7, " / ", n_total, " = ",
           round(lebih7 / n_total, 5))
  )
)

verif_df |>
  kable(caption = "Tabel 4a. Verifikasi Manual S-hat(t) pada t = 7 Hari",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 4a. Verifikasi Manual S-hat(t) pada t = 7 Hari
Keterangan Perhitungan
Pasien lama rawat < 7 hari (sudah sembuh sebelum hari ke-7) 48 pasien
Pasien lama rawat = 7 hari (sembuh tepat hari ke-7) 15 pasien
Pasien lama rawat > 7 hari (belum sembuh pada hari ke-7) 75 pasien
F-hat(7) = jumlah lama rawat ≤ 7 / n
  1. / 138 = 0.45652
S-hat(7) = jumlah lama rawat > 7 / n 75 / 138 = 0.54348

Visualisasi Fungsi Survivor Empiris

ggplot(empiris_display, aes(x = t, y = St)) +
  geom_ribbon(aes(ymin = CI_lo, ymax = CI_up),
              fill = "#BBDEFB", alpha = 0.45) +
  geom_step(color = "#1565C0", linewidth = 1.3) +
  geom_point(color = "#1565C0", size = 2.5) +
  geom_step(aes(y = Ft), color = "#B71C1C", linewidth = 1, linetype = "dashed") +
  geom_point(aes(y = Ft), color = "#B71C1C", size = 2) +
  annotate("text", x = 11, y = 0.72,
           label = "F-hat(t) — Dist. Kumulatif",
           color = "#B71C1C", size = 3.5, fontface = "bold") +
  annotate("text", x = 4, y = 0.52,
           label = "S-hat(t) — Fungsi Survivor",
           color = "#1565C0", size = 3.5, fontface = "bold") +
  annotate("text", x = 7, y = 0.92,
           label = "Area abu: CI 95% S-hat(t)",
           color = "#1565C0", size = 3, fontface = "italic") +
  scale_x_continuous(breaks = 0:15) +
  scale_y_continuous(labels = percent_format(), limits = c(0, 1.05)) +
  labs(
    title    = "Gambar 2. Fungsi Survivor Empiris — Pasien COVID-19 RSI Malahayati",
    subtitle = "Estimasi Nonparametrik Langsung (n = 138, semua sembuh)",
    x        = "Lama Rawat t (Hari)",
    y        = "Probabilitas"
  ) +
  theme_classic(base_size = 11) +
  theme(plot.title    = element_text(face = "bold", size = 12),
        plot.subtitle = element_text(color = "grey50"))
Gambar 2. Fungsi Survivor Empiris dan CDF Kumulatif — Pasien COVID-19 RSI Malahayati

Gambar 2. Fungsi Survivor Empiris dan CDF Kumulatif — Pasien COVID-19 RSI Malahayati

Visualisasi Empiris per Kelompok

calc_ecdf_group <- function(data, group_var) {
  data |>
    group_by({{ group_var }}) |>
    group_modify(~ {
      grp_data <- .x
      t_u <- sort(unique(grp_data$lama_rawat))
      n_g <- nrow(grp_data)
      tibble(
        t  = c(0, t_u),
        St = c(1, map_dbl(t_u, function(tv) sum(grp_data$lama_rawat > tv) / n_g))
      )
    }) |>
    ungroup()
}

ecdf_km  <- calc_ecdf_group(data_covid, komorbid)
ecdf_jk  <- calc_ecdf_group(data_covid, jenis_kelamin)
ecdf_dx  <- calc_ecdf_group(data_covid, kategori_dx)
ecdf_us  <- data_covid |>
  filter(kelompok_usia != "Tidak Diketahui") |>
  calc_ecdf_group(kelompok_usia)

mk_ecdf_plot <- function(ecdf_data, group_col, pal, title_txt) {
  ggplot(ecdf_data, aes(x = t, y = St,
                        color = {{ group_col }},
                        linetype = {{ group_col }})) +
    geom_step(linewidth = 1.2) +
    geom_point(size = 2) +
    scale_color_manual(values = pal) +
    scale_x_continuous(breaks = 0:15) +
    scale_y_continuous(labels = percent_format(), limits = c(0, 1.05)) +
    labs(title = title_txt, x = "Lama Rawat (Hari)", y = "S-hat(t)",
         color = NULL, linetype = NULL) +
    theme_classic(base_size = 10) +
    theme(legend.position = "bottom",
          plot.title = element_text(face = "bold", size = 10))
}

p_ecdf_km <- mk_ecdf_plot(ecdf_km, komorbid,
                          c("Non-Komorbid" = "#43A047", "Komorbid" = "#EF5350"),
                          "Empiris: Komorbid")

p_ecdf_jk <- mk_ecdf_plot(ecdf_jk, jenis_kelamin,
                          c("Laki-laki" = "#1E88E5", "Perempuan" = "#E91E63"),
                          "Empiris: Jenis Kelamin")

p_ecdf_dx <- mk_ecdf_plot(ecdf_dx, kategori_dx,
                          c("Suspect" = "#7B1FA2", "Positif" = "#D32F2F"),
                          "Empiris: Kategori Diagnosis")

p_ecdf_us <- mk_ecdf_plot(ecdf_us, kelompok_usia,
                          c("< 40 Tahun" = "#00897B", ">= 40 Tahun" = "#F4511E"),
                          "Empiris: Kelompok Usia")

gridExtra::grid.arrange(p_ecdf_km, p_ecdf_jk, p_ecdf_dx, p_ecdf_us, ncol = 2)
Gambar 2b. Fungsi Survivor Empiris per Kelompok

Gambar 2b. Fungsi Survivor Empiris per Kelompok


Metode Life Table (Tabel Aktuaria)

Konsep dan Asumsi

Life Table (tabel kehidupan aktuaria) mengelompokkan waktu survival ke dalam interval diskret. Metode ini dikembangkan dari tabel kematian aktuaria dan merupakan pendekatan yang paling tua dalam analisis survival.

Asumsi utama: Individu tersensor diasumsikan keluar di pertengahan interval (koreksi aktuaria), sehingga kontribusinya ke populasi berisiko adalah \(c_j/2\).

Keunggulan Life Table: - Mampu merangkum data besar dalam tabel kompak - Menyediakan estimasi hazard per interval - Sesuai untuk data yang sudah dikelompokkan

Keterbatasan: - Memerlukan penentuan lebar interval yang tepat - Asumsi uniform dalam interval dapat mengurangi presisi

Notasi dan Formulasi

Untuk interval \(j\) yang dimulai pada \(t_{j-1}\) dan berakhir pada \(t_j\):

Langkah Simbol Rumus Keterangan
1 \(n_j\) \(n_j = n_{j-1} - d_{j-1} - c_{j-1}\) Jumlah masuk interval
2 \(n_j^*\) \(n_j^* = n_j - c_j/2\) Efektif berisiko (koreksi aktuaria)
3 \(\hat{q}_j\) \(\hat{q}_j = d_j / n_j^*\) Proporsi event dalam interval
4 \(\hat{p}_j\) \(\hat{p}_j = 1 - \hat{q}_j\) Proporsi survivor interval
5 \(\hat{S}(t_j)\) \(\hat{S}(t_j) = \prod_{i=1}^{j} \hat{p}_i\) Fungsi survivor kumulatif
6 \(\hat{F}(t_j)\) \(\hat{F}(t_j) = 1 - \hat{S}(t_j)\) Distribusi kumulatif
7 \(\hat{h}(t_m)\) \(\hat{h}(t_m) = \dfrac{2\,d_j}{n_j^*\,[\hat{S}(t_{j-1}) + \hat{S}(t_j)]\,b_j}\) Hazard titik tengah
8 \(\widehat{SE}\) \(\widehat{SE} = \hat{S}(t_j)\sqrt{\displaystyle\sum_{i=1}^{j} \dfrac{d_i}{n_i^*(n_i^* - d_i)}}\) Greenwood SE

di mana \(b_j\) adalah lebar interval \(j\), dan \(t_m = (t_{j-1} + t_j)/2\) adalah titik tengah interval.

Perhitungan Manual Life Table

lt_raw <- data_covid |>
  group_by(lama_rawat) |>
  summarise(
    d_j = sum(status == 1),
    c_j = sum(status == 0),
    .groups = "drop"
  ) |>
  arrange(lama_rawat) |>
  rename(t_j = lama_rawat)

lt_calc <- lt_raw |>
  mutate(
    n_j    = n_total - cumsum(lag(d_j, default = 0)) - cumsum(lag(c_j, default = 0)),
    n_j_str = n_j - c_j / 2,             # koreksi aktuaria
    q_j    = round(d_j / n_j_str, 6),    # proporsi event
    p_j    = round(1 - q_j, 6),          # proporsi survivor
    St     = round(cumprod(p_j), 6),     # S(t) kumulatif
    Ft     = round(1 - St, 6)            # F(t) kumulatif
  ) |>
  mutate(
    St_prev = lag(St, default = 1),
    ft      = round((St_prev - St) / 1, 6),          # f(t) lebar = 1
    ht      = round(2 * d_j / (n_j_str * (St_prev + St) * 1), 6),  # hazard
    SE      = round(St * sqrt(cumsum(d_j / (n_j_str * (n_j_str - d_j)))), 6),
    CI_lo   = round(pmax(St - 1.96 * SE, 0), 6),
    CI_up   = round(pmin(St + 1.96 * SE, 1), 6)
  )

lt_display <- bind_rows(
  tibble(t_j = 0, d_j = 0, c_j = 0, n_j = n_total,
         n_j_str = as.double(n_total), q_j = 0, p_j = 1,
         St = 1, Ft = 0, St_prev = NA, ft = NA, ht = NA,
         SE = 0, CI_lo = 1, CI_up = 1),
  lt_calc
)

lt_display |>
  select(t_j, n_j, d_j, c_j, n_j_str, q_j, p_j, St, Ft, SE, CI_lo, CI_up) |>
  kable(
    caption   = "Tabel 5. Life Table Manual — Pasien COVID-19 RSI Malahayati",
    col.names = c("t (Hari)", "n_j", "d_j", "c_j", "n_j*",
                  "q-hat_j", "p-hat_j", "S-hat(t)", "F-hat(t)",
                  "SE[S-hat(t)]", "CI Bawah 95%", "CI Atas 95%"),
    digits    = 5,
    align     = rep("c", 12)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 10)
Tabel 5. Life Table Manual — Pasien COVID-19 RSI Malahayati
t (Hari) n_j d_j c_j n_j* q-hat_j p-hat_j S-hat(t) F-hat(t) SE[S-hat(t)] CI Bawah 95% CI Atas 95%
0 138 0 0 138 0.00000 1.00000 1.00000 0.00000 0.00000 1.00000 1.00000
1 138 2 0 138 0.01449 0.98551 0.98551 0.01449 0.01017 0.96557 1.00000
2 136 9 0 136 0.06618 0.93382 0.92029 0.07971 0.02306 0.87510 0.96548
3 127 16 0 127 0.12598 0.87402 0.80435 0.19565 0.03377 0.73816 0.87054
4 111 8 0 111 0.07207 0.92793 0.74638 0.25362 0.03704 0.67378 0.81897
5 103 4 0 103 0.03884 0.96117 0.71739 0.28261 0.03833 0.64227 0.79252
6 99 9 0 99 0.09091 0.90909 0.65217 0.34783 0.04054 0.57271 0.73164
7 90 15 0 90 0.16667 0.83333 0.54348 0.45652 0.04240 0.46037 0.62659
8 75 15 0 75 0.20000 0.80000 0.43478 0.56522 0.04220 0.35207 0.51749
9 60 20 0 60 0.33333 0.66667 0.28985 0.71015 0.03862 0.21416 0.36555
10 40 23 0 40 0.57500 0.42500 0.12319 0.87681 0.02798 0.06835 0.17802
11 17 10 0 17 0.58823 0.41176 0.05072 0.94928 0.01868 0.01411 0.08734
12 7 5 0 7 0.71429 0.28571 0.01449 0.98551 0.01017 0.00000 0.03443
14 2 1 0 2 0.50000 0.50000 0.00725 0.99275 0.00722 0.00000 0.02140
15 1 1 0 1 1.00000 0.00000 0.00000 1.00000 NaN NaN NaN

Justifikasi Pemilihan Titik Verifikasi

Dalam proses verifikasi manual perhitungan fungsi survival menggunakan metode Life Table, tidak semua titik waktu perlu dihitung kembali secara manual. Oleh karena itu, pemilihan titik verifikasi dilakukan pada waktu yang paling informatif, yaitu waktu ketika terjadi perubahan probabilitas survival yang relatif besar akibat banyaknya kejadian (event) pada interval tersebut.

Berdasarkan hasil perhitungan Life Table, penurunan nilai \(\hat{S}(t)\) yang paling signifikan terjadi pada \(t = 9\) dan \(t = 10\). Pada \(t = 9\) tercatat \(d_9 = 20\) kejadian, menyebabkan \(\hat{S}(t)\) menurun dari \(0{,}435\) menjadi \(0{,}290\) (\(\Delta S \approx 14{,}5\%\)). Pada \(t = 10\) terjadi jumlah kejadian terbesar dalam seluruh dataset, yaitu \(d_{10} = 23\), sehingga \(\hat{S}(t)\) kembali turun tajam dari \(0{,}290\) menjadi \(0{,}123\) (\(\Delta S \approx 16{,}7\%\)). Besarnya jumlah kejadian pada kedua waktu tersebut menjadikannya titik yang representatif untuk memverifikasi kembali perhitungan probabilitas survival secara manual.

Pendekatan ini bertujuan memastikan bahwa proses verifikasi tidak hanya bersifat ilustratif, tetapi juga mampu merepresentasikan interval waktu yang memberikan kontribusi terbesar terhadap perubahan fungsi survival dalam data penelitian.

Verifikasi Manual pada \(t = 9\) dan \(t = 10\)

b8  <- lt_calc |> filter(t_j == 8)
b9  <- lt_calc |> filter(t_j == 9)
b10 <- lt_calc |> filter(t_j == 10)

verif9_df <- tibble(
  Langkah = c(
    "n_9 (individu memasuki interval t=9)",
    "d_9 (event/sembuh dalam interval)",
    "c_9 (tersensor dalam interval)",
    "n_9* = n_9 - c_9/2 (koreksi aktuaria)",
    "q-hat_9 = d_9 / n_9*",
    "p-hat_9 = 1 - q-hat_9",
    "S-hat(8) dari baris sebelumnya",
    "S-hat(9) = p-hat_9 x S-hat(8)",
    "Verifikasi"
  ),
  `Perhitungan Manual` = c(
    as.character(b9$n_j),
    as.character(b9$d_j),
    as.character(b9$c_j),
    paste0(b9$n_j, " - ", b9$c_j, "/2 = ", b9$n_j_str),
    paste0(b9$d_j, " / ", b9$n_j_str, " = ", round(b9$q_j, 6)),
    paste0("1 - ", round(b9$q_j, 6), " = ", round(b9$p_j, 6)),
    round(b8$St, 6),
    paste0(round(b9$p_j, 6), " x ", round(b8$St, 6), " = ",
           round(b9$p_j * b8$St, 6)),
    ifelse(abs(b9$p_j * b8$St - b9$St) < 1e-4, "SESUAI", "TIDAK SESUAI")
  )
)

verif9_df |>
  kable(caption = "Tabel 5a. Verifikasi Manual Life Table pada t = 9 Hari",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 5a. Verifikasi Manual Life Table pada t = 9 Hari
Langkah Perhitungan Manual
n_9 (individu memasuki interval t=9) 60
d_9 (event/sembuh dalam interval) 20
c_9 (tersensor dalam interval) 0
n_9* = n_9 - c_9/2 (koreksi aktuaria) 60 - 0/2 = 60
q-hat_9 = d_9 / n_9* 20 / 60 = 0.333333
p-hat_9 = 1 - q-hat_9 1 - 0.333333 = 0.666667
S-hat(8) dari baris sebelumnya 0.434783
S-hat(9) = p-hat_9 x S-hat(8) 0.666667 x 0.434783 = 0.289855
Verifikasi SESUAI
verif10_df <- tibble(
  Langkah = c(
    "n_10 (individu memasuki interval t=10)",
    "d_10 (event/sembuh dalam interval)",
    "c_10 (tersensor dalam interval)",
    "n_10* = n_10 - c_10/2 (koreksi aktuaria)",
    "q-hat_10 = d_10 / n_10*",
    "p-hat_10 = 1 - q-hat_10",
    "S-hat(9) dari baris sebelumnya",
    "S-hat(10) = p-hat_10 x S-hat(9)",
    "Penurunan S(t): S(9) - S(10)",
    "Verifikasi"
  ),
  `Perhitungan Manual` = c(
    as.character(b10$n_j),
    as.character(b10$d_j),
    as.character(b10$c_j),
    paste0(b10$n_j, " - ", b10$c_j, "/2 = ", b10$n_j_str),
    paste0(b10$d_j, " / ", b10$n_j_str, " = ", round(b10$q_j, 6)),
    paste0("1 - ", round(b10$q_j, 6), " = ", round(b10$p_j, 6)),
    round(b9$St, 6),
    paste0(round(b10$p_j, 6), " x ", round(b9$St, 6), " = ",
           round(b10$p_j * b9$St, 6)),
    paste0(round(b9$St, 6), " - ", round(b10$St, 6), " = ",
           round(b9$St - b10$St, 6)),
    ifelse(abs(b10$p_j * b9$St - b10$St) < 1e-4, "SESUAI", "TIDAK SESUAI")
  )
)

verif10_df |>
  kable(caption = "Tabel 5b. Verifikasi Manual Life Table pada t = 10 Hari",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 5b. Verifikasi Manual Life Table pada t = 10 Hari
Langkah Perhitungan Manual
n_10 (individu memasuki interval t=10) 40
d_10 (event/sembuh dalam interval) 23
c_10 (tersensor dalam interval) 0
n_10* = n_10 - c_10/2 (koreksi aktuaria) 40 - 0/2 = 40
q-hat_10 = d_10 / n_10* 23 / 40 = 0.575
p-hat_10 = 1 - q-hat_10 1 - 0.575 = 0.425
S-hat(9) dari baris sebelumnya 0.289855
S-hat(10) = p-hat_10 x S-hat(9) 0.425 x 0.289855 = 0.123188
Penurunan S(t): S(9) - S(10) 0.289855 - 0.123189 = 0.166666
Verifikasi SESUAI

Visualisasi Life Table

lt_plot <- lt_display |>
  select(t_j, St, Ft) |>
  pivot_longer(cols = c(St, Ft), names_to = "fungsi", values_to = "nilai") |>
  mutate(fungsi = ifelse(fungsi == "St",
                         "Fungsi Survivor S-hat(t)",
                         "Distribusi Kumulatif F-hat(t)"))

p_lt_surv <- ggplot(lt_plot, aes(x = t_j, y = nilai, color = fungsi, linetype = fungsi)) +
  geom_step(linewidth = 1.2) +
  geom_point(size = 2.5) +
  scale_color_manual(values = c("Fungsi Survivor S-hat(t)" = "#1565C0",
                                 "Distribusi Kumulatif F-hat(t)" = "#C62828")) +
  scale_linetype_manual(values = c("Fungsi Survivor S-hat(t)" = "solid",
                                    "Distribusi Kumulatif F-hat(t)" = "dashed")) +
  scale_x_continuous(breaks = 0:15) +
  scale_y_continuous(labels = percent_format(), limits = c(0, 1.05)) +
  labs(title = "Life Table: Fungsi Survivor dan Distribusi Kumulatif",
       x = "Lama Rawat t (Hari)", y = "Probabilitas",
       color = NULL, linetype = NULL) +
  theme_classic(base_size = 11) +
  theme(legend.position = c(0.68, 0.5),
        legend.background = element_rect(fill = "white", color = "grey80"),
        plot.title = element_text(face = "bold"))

p_lt_hazard <- ggplot(lt_calc, aes(x = t_j, y = ht)) +
  geom_col(fill = "#EF5350", alpha = 0.75, width = 0.7) +
  geom_line(color = "#B71C1C", linewidth = 1) +
  geom_point(color = "#B71C1C", size = 2) +
  scale_x_continuous(breaks = 1:15) +
  labs(title = "Fungsi Hazard h-hat(t) — Life Table",
       subtitle = "Laju sesaat terjadinya kesembuhan per hari",
       x = "Lama Rawat t (Hari)", y = "Hazard h-hat(t)") +
  theme_classic(base_size = 11) +
  theme(plot.title    = element_text(face = "bold"),
        plot.subtitle = element_text(color = "grey50"))

gridExtra::grid.arrange(p_lt_surv, p_lt_hazard, nrow = 2)
Gambar 3. Kurva Life Table — Pasien COVID-19 RSI Malahayati

Gambar 3. Kurva Life Table — Pasien COVID-19 RSI Malahayati


Metode Kaplan-Meier (Product-Limit Estimator)

Konsep dan Latar Belakang

Estimator Kaplan-Meier (Kaplan & Meier, 1958) adalah metode nonparametrik product-limit yang menggunakan setiap waktu event sebagai titik estimasi. Berbeda dari Life Table yang mengelompokkan waktu, KM bekerja pada waktu-waktu event yang sesungguhnya.

Keunggulan Kaplan-Meier: - Tidak memerlukan pengelompokan interval - Memanfaatkan semua informasi dari data tersensor - Merupakan estimator Maximum Likelihood Nonparametrik (NPMLE) untuk \(S(t)\)

Formulasi Matematis

Estimator Product-Limit:

\[\hat{S}(t) = \prod_{j:\,t_{(j)} \leq t} \left(1 - \frac{d_j}{n_j}\right) = \prod_{j:\,t_{(j)} \leq t} \frac{n_j - d_j}{n_j}\]

di mana: - \(t_{(j)}\): waktu event ke-\(j\) yang diurutkan - \(d_j\): jumlah event pada \(t_{(j)}\) - \(n_j\): jumlah individu berisiko sesaat sebelum \(t_{(j)}\) (yang belum mengalami event dan belum tersensor)

Standar Error Greenwood (1926):

\[\widehat{SE}[\hat{S}(t)] = \hat{S}(t)\,\sqrt{\sum_{j:\,t_{(j)} \leq t} \frac{d_j}{n_j(n_j - d_j)}}\]

Interval Kepercayaan 95% (Metode Linear):

\[\hat{S}(t) \pm 1{,}96 \cdot \widehat{SE}[\hat{S}(t)]\]

Median Survival: waktu \(t_{0.5}\) terkecil sehingga \(\hat{S}(t_{0.5}) \leq 0{,}5\).

Perhitungan Manual Kaplan-Meier

km_raw <- data_covid |>
  filter(status == 1) |>
  group_by(lama_rawat) |>
  summarise(d_j = n(), .groups = "drop") |>
  arrange(lama_rawat) |>
  rename(t_j = lama_rawat)

km_calc <- km_raw |>
  mutate(
    n_j    = n_total - cumsum(lag(d_j, default = 0)),
    rasio  = round((n_j - d_j) / n_j, 6),
    St     = round(cumprod(rasio), 6),
    SE     = round(St * sqrt(cumsum(d_j / (n_j * (n_j - d_j)))), 6),
    CI_lo  = round(pmax(St - 1.96 * SE, 0), 6),
    CI_up  = round(pmin(St + 1.96 * SE, 1), 6)
  )

km_display <- bind_rows(
  tibble(t_j = 0, d_j = 0, n_j = n_total, rasio = 1,
         St = 1, SE = 0, CI_lo = 1, CI_up = 1),
  km_calc
)

km_display |>
  kable(
    caption   = "Tabel 6. Estimasi Kaplan-Meier Manual — Pasien COVID-19 RSI Malahayati",
    col.names = c("t (Hari)", "d_j", "n_j", "(n_j - d_j)/n_j",
                  "S-hat(t)", "SE[S-hat(t)]", "CI Bawah 95%", "CI Atas 95%"),
    digits    = 6,
    align     = rep("c", 8)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 6. Estimasi Kaplan-Meier Manual — Pasien COVID-19 RSI Malahayati
t (Hari) d_j n_j (n_j - d_j)/n_j S-hat(t) SE[S-hat(t)] CI Bawah 95% CI Atas 95%
0 0 138 1.000000 1.000000 0.000000 1.000000 1.000000
1 2 138 0.985507 0.985507 0.010173 0.965568 1.000000
2 9 136 0.933824 0.920290 0.023056 0.875100 0.965480
3 16 127 0.874016 0.804348 0.033770 0.738159 0.870537
4 8 111 0.927928 0.746377 0.037037 0.673784 0.818970
5 4 103 0.961165 0.717392 0.038329 0.642267 0.792517
6 9 99 0.909091 0.652174 0.040544 0.572708 0.731640
7 15 90 0.833333 0.543478 0.042402 0.460370 0.626586
8 15 75 0.800000 0.434783 0.042199 0.352073 0.517493
9 20 60 0.666667 0.289855 0.038621 0.214158 0.365552
10 23 40 0.425000 0.123189 0.027977 0.068354 0.178024
11 10 17 0.411765 0.050725 0.018680 0.014112 0.087338
12 5 7 0.285714 0.014493 0.010174 0.000000 0.034434
14 1 2 0.500000 0.007246 0.007220 0.000000 0.021397
15 1 1 0.000000 0.000000 NaN NaN NaN

Justifikasi Pemilihan Titik Verifikasi Kaplan-Meier

Pada metode Kaplan-Meier, proses verifikasi manual difokuskan pada \(t = 10\), karena waktu tersebut memiliki jumlah kejadian paling tinggi dalam seluruh dataset (\(d_{10} = 23\)). Dalam estimasi Kaplan-Meier, setiap kejadian secara langsung memengaruhi nilai fungsi survival melalui faktor pengali \((n_j - d_j)/n_j\), sehingga interval waktu dengan jumlah kejadian terbesar akan memberikan kontribusi perubahan paling signifikan terhadap bentuk kurva survival.

Dengan demikian, pemilihan titik verifikasi pada \(t = 10\) didasarkan pada pertimbangan besarnya jumlah kejadian serta besarnya perubahan \(\hat{S}(t)\) yang dihasilkan, sehingga verifikasi yang dilakukan dapat merepresentasikan dinamika utama yang terjadi dalam data. Pendekatan ini memastikan bahwa verifikasi tidak hanya bersifat ilustratif, tetapi mampu menguji keakuratan perhitungan pada titik yang paling berpengaruh terhadap estimasi fungsi survival secara keseluruhan.

Verifikasi Manual Kaplan-Meier pada \(t = 10\)

bkm10 <- km_calc |> filter(t_j == 10)
bkm9  <- km_calc |> filter(t_j == 9)

# Hitung Greenwood sum
gw10 <- km_calc |>
  filter(t_j <= 10) |>
  summarise(g = sum(d_j / (n_j * (n_j - d_j)))) |>
  pull(g)

se10_calc <- bkm10$St * sqrt(gw10)

verif_km_df <- tibble(
  Langkah = c(
    "n_10 (berisiko sesaat sebelum t=10)",
    "d_10 (event pada t=10)",
    "Rasio = (n_10 - d_10) / n_10",
    "S-hat(9) dari langkah sebelumnya",
    "S-hat(10) = S-hat(9) x Rasio",
    "Verifikasi S-hat(10)",
    "Penurunan: S(9) - S(10)",
    "Sum Greenwood hingga t=10",
    "SE(S-hat(10)) = S-hat(10) x sqrt(Sum Greenwood)",
    "CI 95% Bawah = max(0, S-hat(10) - 1.96 x SE)",
    "CI 95% Atas  = min(1, S-hat(10) + 1.96 x SE)"
  ),
  `Perhitungan Detail` = c(
    as.character(bkm10$n_j),
    as.character(bkm10$d_j),
    paste0("(", bkm10$n_j, " - ", bkm10$d_j, ") / ", bkm10$n_j,
           " = ", round(bkm10$rasio, 6)),
    round(bkm9$St, 6),
    paste0(round(bkm9$St, 6), " x ", round(bkm10$rasio, 6),
           " = ", round(bkm9$St * bkm10$rasio, 6)),
    ifelse(abs(bkm9$St * bkm10$rasio - bkm10$St) < 1e-4,
           "SESUAI", "TIDAK SESUAI"),
    paste0(round(bkm9$St, 6), " - ", round(bkm10$St, 6),
           " = ", round(bkm9$St - bkm10$St, 6)),
    round(gw10, 8),
    paste0(round(bkm10$St, 6), " x sqrt(", round(gw10, 8), ") = ",
           round(se10_calc, 6)),
    paste0("max(0, ", round(bkm10$St, 6), " - 1.96 x ",
           round(se10_calc, 6), ") = ",
           round(max(bkm10$St - 1.96 * se10_calc, 0), 6)),
    paste0("min(1, ", round(bkm10$St, 6), " + 1.96 x ",
           round(se10_calc, 6), ") = ",
           round(min(bkm10$St + 1.96 * se10_calc, 1), 6))
  )
)

verif_km_df |>
  kable(caption = "Tabel 6a. Verifikasi Manual Kaplan-Meier pada t = 10 Hari (d = 23, Event Terbanyak)",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 6a. Verifikasi Manual Kaplan-Meier pada t = 10 Hari (d = 23, Event Terbanyak)
Langkah Perhitungan Detail
n_10 (berisiko sesaat sebelum t=10) 40
d_10 (event pada t=10) 23
Rasio = (n_10 - d_10) / n_10 (40 - 23) / 40 = 0.425
S-hat(9) dari langkah sebelumnya 0.289855
S-hat(10) = S-hat(9) x Rasio 0.289855 x 0.425 = 0.123188
Verifikasi S-hat(10) SESUAI
Penurunan: S(9) - S(10) 0.289855 - 0.123189 = 0.166666
Sum Greenwood hingga t=10 0.05157715
SE(S-hat(10)) = S-hat(10) x sqrt(Sum Greenwood) 0.123189 x sqrt(0.05157715) = 0.027977
CI 95% Bawah = max(0, S-hat(10) - 1.96 x SE) max(0, 0.123189 - 1.96 x 0.027977) = 0.068354
CI 95% Atas = min(1, S-hat(10) + 1.96 x SE) min(1, 0.123189 + 1.96 x 0.027977) = 0.178024

Implementasi dengan survfit() di R

surv_obj <- Surv(time = data_covid$lama_rawat, event = data_covid$status)
km_fit   <- survfit(surv_obj ~ 1, data = data_covid)
km_sum   <- summary(km_fit)

tibble(
  `t (Hari)` = km_sum$time,
  n.risk      = km_sum$n.risk,
  n.event     = km_sum$n.event,
  `S-hat(t)`  = round(km_sum$surv,   5),
  SE          = round(km_sum$std.err, 5),
  `CI Bawah`  = round(km_sum$lower,  5),
  `CI Atas`   = round(km_sum$upper,  5)
) |>
  kable(caption = "Tabel 7. Estimasi Kaplan-Meier via survfit() — R",
        align = rep("c", 7)) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 7. Estimasi Kaplan-Meier via survfit() — R
t (Hari) n.risk n.event S-hat(t) SE CI Bawah CI Atas
1 138 2 0.98551 0.01017 0.96577 1.00000
2 136 9 0.92029 0.02306 0.87619 0.96661
3 127 16 0.80435 0.03377 0.74081 0.87333
4 111 8 0.74638 0.03704 0.67720 0.82261
5 103 4 0.71739 0.03833 0.64607 0.79659
6 99 9 0.65217 0.04054 0.57736 0.73668
7 90 15 0.54348 0.04240 0.46641 0.63327
8 75 15 0.43478 0.04220 0.35946 0.52588
9 60 20 0.28986 0.03862 0.22324 0.37635
10 40 23 0.12319 0.02798 0.07893 0.19226
11 17 10 0.05072 0.01868 0.02465 0.10439
12 7 5 0.01449 0.01017 0.00366 0.05737
14 2 1 0.00725 0.00722 0.00103 0.05108
15 1 1 0.00000 NaN NA NA

Kurva Kaplan-Meier Keseluruhan

ggsurvplot(
  km_fit,
  data           = data_covid,
  conf.int       = TRUE,
  conf.int.fill  = "#BBDEFB",
  conf.int.alpha = 0.35,
  surv.median.line = "hv",
  palette        = "#1565C0",
  title          = "Kurva Kaplan-Meier — Pasien COVID-19 RSI Malahayati",
  subtitle       = "Fungsi Survivor dengan Interval Kepercayaan 95%",
  xlab           = "Lama Rawat (Hari)",
  ylab           = "Probabilitas Belum Sembuh S-hat(t)",
  break.time.by  = 1,
  risk.table     = TRUE,
  risk.table.col = "strata",
  risk.table.height = 0.28,
  risk.table.title  = "Jumlah Berisiko",
  ggtheme        = theme_classic(base_size = 11),
  font.main      = c(12, "bold"),
  font.submain   = c(10, "plain", "grey50")
)
Gambar 4. Kurva Kaplan-Meier Keseluruhan — Pasien COVID-19 RSI Malahayati

Gambar 4. Kurva Kaplan-Meier Keseluruhan — Pasien COVID-19 RSI Malahayati

Interpretasi Kurva Kaplan-Meier

Membaca Kurva Survival Secara Umum

Kurva Kaplan-Meier pada Gambar 4 menampilkan estimasi fungsi survival \(\hat{S}(t)\), yaitu probabilitas seorang pasien belum sembuh hingga hari ke-\(t\). Kurva berbentuk tangga (step function) yang turun setiap kali terjadi satu atau lebih kejadian (kesembuhan). Lebar pita abu-abu merepresentasikan interval kepercayaan 95%, yang semakin melebar di ujung kurva seiring berkurangnya jumlah individu berisiko (risk set).

Fase Penurunan Survival

Berdasarkan kurva dan tabel estimasi, pola penurunan \(\hat{S}(t)\) dapat dibagi menjadi tiga fase:

Fase awal (\(t = 1\) hingga \(t = 6\) hari): Penurunan survival berlangsung bertahap. Pada hari ke-1 hingga ke-6, probabilitas belum sembuh turun dari 1,000 menjadi 0,652. Rata-rata penurunan per hari sekitar 5–7%. Hal ini mengindikasikan bahwa sebagian kecil pasien sembuh lebih cepat, kemungkinan karena kondisi klinis yang lebih ringan atau respons imun yang lebih baik.

Fase tengah (\(t = 7\) hingga \(t = 10\) hari): Terjadi penurunan paling tajam. Pada rentang ini, \(\hat{S}(t)\) turun dari 0,543 menjadi 0,123 — penurunan kumulatif sebesar \(\approx 42\%\) hanya dalam 4 hari. Puncak kejadian terjadi pada \(t = 10\) (\(d = 23\)), menjadikannya hari dengan laju kesembuhan tertinggi. Secara epidemiologis, fase ini merepresentasikan periode kritis pemulihan pasien COVID-19 — mayoritas pasien mencapai kriteria pemulangan klinis pada hari ke-7 hingga ke-10.

Fase akhir (\(t = 11\) hingga \(t = 15\) hari): Penurunan semakin melambat karena jumlah pasien yang masih dirawat semakin sedikit. Pada \(t = 11\) tersisa 17 pasien berisiko, dan pita kepercayaan melebar secara substansial, mencerminkan ketidakpastian estimasi yang lebih besar.

Makna Median Survival

Garis median yang ditunjukkan pada kurva (\(\hat{S}(t_{0,5}) = 0{,}5\)) berada pada 8 hari, dengan interval kepercayaan 95% sebesar [8, 9] hari. Artinya, setengah dari seluruh pasien sembuh dalam waktu 8 hari atau kurang sejak masuk rumah sakit. Nilai ini konsisten dengan median lama rawat COVID-19 ringan-sedang yang dilaporkan dalam literatur internasional (WHO, 2021), yaitu sekitar 7–14 hari untuk kasus yang tidak memerlukan perawatan intensif.

Implikasi Epidemiologis

Dari perspektif manajemen rumah sakit, estimasi ini memiliki implikasi praktis: perencanaan kapasitas tempat tidur perlu mempertimbangkan bahwa \(\approx 88\%\) pasien akan sembuh dalam 10 hari pertama. Hanya sekitar \(5\%\) pasien (\(\hat{S}(11) = 0{,}051\)) yang masih memerlukan rawat inap hingga hari ke-11 atau lebih, menandakan kasus-kasus dengan penyulit klinis.

Catatan: Identitas Ketiga Estimator pada Data Ini

Perlu dicatat bahwa pada dataset ini, ketiga metode — Empiris, Life Table, dan Kaplan-Meier — menghasilkan nilai \(\hat{S}(t)\) yang identik. Hal ini merupakan konsekuensi logis dari dua kondisi:

  1. Tidak ada data tersensor (\(c_j = 0\) untuk semua \(j\)): Koreksi aktuaria pada Life Table menjadi \(c_j/2 = 0\), sehingga \(n_j^* = n_j\), dan rumus Life Table menyederhanakan tepat ke rumus Kaplan-Meier.
  2. Lebar interval Life Table \(b = 1\) hari: Dengan interval unit, Life Table beroperasi pada resolusi yang sama dengan KM, sehingga tidak ada kehilangan informasi akibat pengelompokan.

Identitas ketiga estimator ini justru menjadi validasi silang (cross-validation) yang memperkuat keandalan hasil analisis. Perbedaan antar metode akan muncul bila terdapat data tersensor atau bila Life Table menggunakan lebar interval \(b > 1\).

Kaplan-Meier per Kelompok

Berdasarkan Status Komorbid

km_km <- survfit(surv_obj ~ komorbid, data = data_covid)

surv_median(km_km) |>
  mutate(strata = gsub(".*=", "", strata)) |>
  kable(
    caption   = "Tabel 8. Median Waktu Rawat — Status Komorbid",
    col.names = c("Kelompok", "Median (Hari)", "CI Bawah 95%", "CI Atas 95%"),
    align     = c("l", "c", "c", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 8. Median Waktu Rawat — Status Komorbid
Kelompok Median (Hari) CI Bawah 95% CI Atas 95%
Non-Komorbid 8 7 9
Komorbid 7 6 8
ggsurvplot(
  km_km,
  data              = data_covid,
  conf.int          = TRUE,
  conf.int.alpha    = 0.2,
  surv.median.line  = "hv",
  palette           = c("#43A047", "#EF5350"),
  legend.labs       = c("Non-Komorbid", "Komorbid"),
  legend.title      = "Status Komorbid",
  title             = "Kurva KM — Status Komorbid",
  xlab              = "Lama Rawat (Hari)",
  ylab              = "S-hat(t)",
  break.time.by     = 1,
  risk.table        = TRUE,
  risk.table.col    = "strata",
  risk.table.height = 0.30,
  pval              = TRUE,
  pval.method       = TRUE,
  ggtheme           = theme_classic(base_size = 11),
  font.main         = c(12, "bold")
)
Gambar 5. Kurva KM — Status Komorbid

Gambar 5. Kurva KM — Status Komorbid

Berdasarkan Jenis Kelamin

km_jk <- survfit(surv_obj ~ jenis_kelamin, data = data_covid)

surv_median(km_jk) |>
  mutate(strata = gsub(".*=", "", strata)) |>
  kable(
    caption   = "Tabel 9. Median Waktu Rawat — Jenis Kelamin",
    col.names = c("Kelompok", "Median (Hari)", "CI Bawah 95%", "CI Atas 95%"),
    align     = c("l", "c", "c", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 9. Median Waktu Rawat — Jenis Kelamin
Kelompok Median (Hari) CI Bawah 95% CI Atas 95%
Laki-laki 7 6 9
Perempuan 8 7 9
ggsurvplot(
  km_jk,
  data              = data_covid,
  conf.int          = TRUE,
  conf.int.alpha    = 0.2,
  surv.median.line  = "hv",
  palette           = c("#1E88E5", "#E91E63"),
  legend.labs       = c("Laki-laki", "Perempuan"),
  legend.title      = "Jenis Kelamin",
  title             = "Kurva KM — Jenis Kelamin",
  xlab              = "Lama Rawat (Hari)",
  ylab              = "S-hat(t)",
  break.time.by     = 1,
  risk.table        = TRUE,
  risk.table.col    = "strata",
  risk.table.height = 0.30,
  pval              = TRUE,
  pval.method       = TRUE,
  ggtheme           = theme_classic(base_size = 11),
  font.main         = c(12, "bold")
)
Gambar 6. Kurva KM — Jenis Kelamin

Gambar 6. Kurva KM — Jenis Kelamin

Berdasarkan Kategori Diagnosis

km_dx <- survfit(surv_obj ~ kategori_dx, data = data_covid)

surv_median(km_dx) |>
  mutate(strata = gsub(".*=", "", strata)) |>
  kable(
    caption   = "Tabel 10. Median Waktu Rawat — Kategori Diagnosis",
    col.names = c("Kelompok", "Median (Hari)", "CI Bawah 95%", "CI Atas 95%"),
    align     = c("l", "c", "c", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 10. Median Waktu Rawat — Kategori Diagnosis
Kelompok Median (Hari) CI Bawah 95% CI Atas 95%
Suspect 8 7 9
Positif 8 7 9
ggsurvplot(
  km_dx,
  data              = data_covid,
  conf.int          = TRUE,
  conf.int.alpha    = 0.2,
  surv.median.line  = "hv",
  palette           = c("#7B1FA2", "#D32F2F"),
  legend.labs       = c("Suspect", "Positif"),
  legend.title      = "Kategori Diagnosis",
  title             = "Kurva KM — Kategori Diagnosis",
  xlab              = "Lama Rawat (Hari)",
  ylab              = "S-hat(t)",
  break.time.by     = 1,
  risk.table        = TRUE,
  risk.table.col    = "strata",
  risk.table.height = 0.30,
  pval              = TRUE,
  pval.method       = TRUE,
  ggtheme           = theme_classic(base_size = 11),
  font.main         = c(12, "bold")
)
Gambar 7. Kurva KM — Kategori Diagnosis

Gambar 7. Kurva KM — Kategori Diagnosis

Berdasarkan Kelompok Usia

data_usia <- data_covid |>
  filter(kelompok_usia != "Tidak Diketahui") |>
  mutate(usia_grp = factor(
    ifelse(kelompok_usia == "< 40 Tahun", "Di bawah 40 Tahun", "40 Tahun ke atas"),
    levels = c("Di bawah 40 Tahun", "40 Tahun ke atas")
  ))

surv_usia <- Surv(data_usia$lama_rawat, data_usia$status)
km_usia   <- survfit(surv_usia ~ usia_grp, data = data_usia)

surv_median(km_usia) |>
  mutate(strata = gsub(".*=", "", strata)) |>
  kable(
    caption   = "Tabel 11. Median Waktu Rawat — Kelompok Usia",
    col.names = c("Kelompok", "Median (Hari)", "CI Bawah 95%", "CI Atas 95%"),
    align     = c("l", "c", "c", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 11. Median Waktu Rawat — Kelompok Usia
Kelompok Median (Hari) CI Bawah 95% CI Atas 95%
Di bawah 40 Tahun 7 6 8
40 Tahun ke atas 9 7 9
ggsurvplot(
  km_usia,
  data              = data_usia,
  conf.int          = TRUE,
  conf.int.alpha    = 0.2,
  surv.median.line  = "hv",
  palette           = c("#00897B", "#F4511E"),
  legend.labs       = c("Di bawah 40 Tahun", "40 Tahun ke atas"),
  legend.title      = "Kelompok Usia",
  title             = "Kurva KM — Kelompok Usia",
  xlab              = "Lama Rawat (Hari)",
  ylab              = "S-hat(t)",
  break.time.by     = 1,
  risk.table        = TRUE,
  risk.table.col    = "strata",
  risk.table.height = 0.30,
  pval              = TRUE,
  pval.method       = TRUE,
  ggtheme           = theme_classic(base_size = 11),
  font.main         = c(12, "bold")
)
Gambar 8. Kurva KM — Kelompok Usia

Gambar 8. Kurva KM — Kelompok Usia


Uji Log-Rank

Dasar Teori

Uji Log-Rank (Mantel, 1966) adalah uji nonparametrik untuk membandingkan dua atau lebih kurva survival. Uji ini memberikan bobot yang sama pada seluruh waktu event.

Hipotesis:

\[H_0: S_1(t) = S_2(t) = \cdots = S_g(t) \quad \text{untuk semua } t\] \[H_1: \text{minimal satu } S_i(t) \neq S_j(t)\]

Statistik Uji:

Pada setiap waktu event ke-\(k\) dengan \(d_k\) total event dan \(n_{ik}\) individu berisiko dari kelompok \(i\):

\[E_{ik} = \frac{n_{ik}}{n_k} \cdot d_k\]

Statistik uji keseluruhan:

\[\chi^2_{\text{hit}} = \sum_{i=1}^{g} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{g-1}\]

di mana \(O_i = \sum_k d_{ik}\) dan \(E_i = \sum_k E_{ik}\).

Keputusan: Tolak \(H_0\) jika \(\chi^2_{\text{hit}} > \chi^2_{\alpha, g-1}\) atau \(p\text{-value} < \alpha\).

Hasil Uji Log-Rank

lr_km <- survdiff(surv_obj ~ komorbid,      data = data_covid)
lr_jk <- survdiff(surv_obj ~ jenis_kelamin, data = data_covid)
lr_dx <- survdiff(surv_obj ~ kategori_dx,   data = data_covid)
lr_us <- survdiff(surv_usia ~ usia_grp,     data = data_usia)
data.frame(
  Kelompok      = c("Non-Komorbid", "Komorbid"),
  n             = c(sum(data_covid$komorbid == "Non-Komorbid"),
                    sum(data_covid$komorbid == "Komorbid")),
  O             = lr_km$obs,
  E             = round(lr_km$exp, 4),
  `(O-E)^2/E`  = round((lr_km$obs - lr_km$exp)^2 / lr_km$exp, 4),
  check.names   = FALSE
) |>
  kable(caption = "Tabel 12a. Detail Uji Log-Rank — Status Komorbid",
        align = c("l", "c", "c", "c", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 12a. Detail Uji Log-Rank — Status Komorbid
Kelompok n O E (O-E)^2/E
Non-Komorbid 90 90 97.5361 0.5823
Komorbid 48 48 40.4639 1.4035
pv_km <- pchisq(lr_km$chisq, 1, lower.tail = FALSE)
pv_jk <- pchisq(lr_jk$chisq, 1, lower.tail = FALSE)
pv_dx <- pchisq(lr_dx$chisq, 1, lower.tail = FALSE)
pv_us <- pchisq(lr_us$chisq, 1, lower.tail = FALSE)

chi_krit <- round(qchisq(0.95, df = 1), 4)

rekap <- data.frame(
  Faktor        = c("Status Komorbid", "Jenis Kelamin",
                    "Kategori Diagnosis", "Kelompok Usia"),
  `chi2_hit`    = round(c(lr_km$chisq, lr_jk$chisq, lr_dx$chisq, lr_us$chisq), 4),
  `chi2_krit`   = chi_krit,
  df            = c(1, 1, 1, 1),
  `p-value`     = round(c(pv_km, pv_jk, pv_dx, pv_us), 4),
  Keputusan     = ifelse(c(pv_km, pv_jk, pv_dx, pv_us) < 0.05,
                         "Tolak H0", "Gagal Tolak H0"),
  Kesimpulan    = ifelse(c(pv_km, pv_jk, pv_dx, pv_us) < 0.05,
                         "Kurva berbeda signifikan",
                         "Kurva tidak berbeda signifikan"),
  check.names   = FALSE
)

rekap |>
  kable(
    caption = "Tabel 12b. Rekapitulasi Hasil Uji Log-Rank (alpha = 0.05)",
    align   = c("l", "c", "c", "c", "c", "c", "l")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 12b. Rekapitulasi Hasil Uji Log-Rank (alpha = 0.05)
Faktor chi2_hit chi2_krit df p-value Keputusan Kesimpulan
Status Komorbid 2.7453 3.8415 1 0.0975 Gagal Tolak H0 Kurva tidak berbeda signifikan
Jenis Kelamin 1.6805 3.8415 1 0.1949 Gagal Tolak H0 Kurva tidak berbeda signifikan
Kategori Diagnosis 0.3460 3.8415 1 0.5564 Gagal Tolak H0 Kurva tidak berbeda signifikan
Kelompok Usia 3.4853 3.8415 1 0.0619 Gagal Tolak H0 Kurva tidak berbeda signifikan

Skenario Penyensoran: Struktur Data Survival dengan Data Tersensor

Mekanisme Penyensoran dalam Analisis Survival

Dalam analisis survival, tidak semua individu yang diamati selalu mengalami peristiwa (event) selama periode penelitian berlangsung. Beberapa individu mungkin belum mengalami peristiwa hingga penelitian berakhir, sehingga waktu kejadian sebenarnya tidak dapat diamati secara lengkap. Kondisi tersebut dikenal sebagai data tersensor (censored data). Menurut teori analisis survival, data tersensor umumnya muncul ketika:

  • pengamatan dihentikan pada waktu tertentu yang telah ditetapkan,
  • individu keluar dari penelitian sebelum mengalami peristiwa (dropout), atau
  • individu belum mengalami peristiwa hingga akhir periode observasi.

Pada skenario ini, mekanisme penyensoran yang digunakan adalah penyensoran sebelah kanan tipe I (Type I right censoring), yaitu kondisi ketika penelitian dihentikan pada waktu pengamatan yang telah ditentukan sebelumnya. Dengan kata lain, waktu kejadian hanya dapat diamati sampai batas akhir periode observasi. Individu yang belum mengalami peristiwa hingga batas waktu tersebut dikategorikan sebagai data tersensor. Pendekatan ini umum digunakan dalam penelitian medis maupun epidemiologi, terutama ketika periode pengamatan dibatasi oleh waktu penelitian (Kleinbaum & Klein, 2012).

Konstruksi Data Tersensor (Cut-off: 31 Maret 2021)

Dalam konteks penelitian ini, data pasien COVID-19 mencakup periode Januari hingga Mei 2021. Untuk keperluan demonstrasi analisis survival dengan data tersensor, batas akhir periode observasi ditetapkan pada 31 Maret 2021. Oleh karena itu:

  • Pasien yang telah keluar atau sembuh sebelum atau pada tanggal tersebut dicatat sebagai individu yang mengalami peristiwa (status = 1).
  • Pasien yang tanggal keluarnya melewati batas akhir pengamatan dikategorikan sebagai data tersensor kanan (status = 0) — waktu observasinya dipotong pada tanggal 31 Maret 2021.

Struktur data yang dihasilkan terdiri dari dua kondisi: individu yang mengalami peristiwa kesembuhan selama periode pengamatan, dan individu yang tidak mengalami peristiwa hingga akhir periode observasi. Struktur ini memungkinkan penerapan metode analisis survival nonparametrik — Life Table, Kaplan-Meier, serta Uji Log-Rank — yang secara khusus dirancang untuk mengakomodasi keberadaan data tersensor dalam proses estimasi fungsi survival.

“Pada penyensoran sebelah kanan tipe I, penelitian berakhir apabila waktu pengamatan yang ditentukan telah tercapai.” — Imam (2021), BAB II

Penyensoran kanan tipe I merupakan salah satu mekanisme penyensoran yang paling umum digunakan dalam analisis survival ketika periode penelitian dibatasi oleh waktu pengamatan tertentu.

Konstruksi Data Tersensor

cutoff_date <- ymd("2021-03-31")

data_sensor <- data_covid |>
  mutate(
    tgl_keluar_obs  = pmin(tgl_keluar, cutoff_date),
    lama_rawat_obs  = as.integer(tgl_keluar_obs - tgl_masuk),
    lama_rawat_obs  = pmax(lama_rawat_obs, 1),
    status_obs      = as.integer(tgl_keluar <= cutoff_date)
  )

stat_sensor <- tibble(
  Keterangan    = c("Total pasien", "Event (sembuh sebelum cut-off)",
                    "Tersensor kanan", "Persentase tersensor"),
  Nilai         = c(
    nrow(data_sensor),
    sum(data_sensor$status_obs == 1),
    sum(data_sensor$status_obs == 0),
    paste0(round(mean(data_sensor$status_obs == 0) * 100, 1), "%")
  )
)

stat_sensor |>
  kable(caption = "Tabel 13. Statistik Data dengan Penyensoran Tipe I (Cut-off: 31 Maret 2021)",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 11)
Tabel 13. Statistik Data dengan Penyensoran Tipe I (Cut-off: 31 Maret 2021)
Keterangan Nilai
Total pasien 138
Event (sembuh sebelum cut-off) 63
Tersensor kanan 75
Persentase tersensor 54.3%

Kaplan-Meier dengan Penyensoran

surv_sensor <- Surv(data_sensor$lama_rawat_obs, data_sensor$status_obs)
km_sensor   <- survfit(surv_sensor ~ 1, data = data_sensor)

ggsurvplot(
  km_sensor,
  data              = data_sensor,
  conf.int          = TRUE,
  conf.int.fill     = "#FFCCBC",
  conf.int.alpha    = 0.35,
  palette           = "#E64A19",
  surv.median.line  = "hv",
  title             = "Kurva KM: Penyensoran Tipe I (Cut-off 31 Maret 2021)",
  subtitle          = "Tanda '+' menunjukkan observasi tersensor kanan",
  xlab              = "Lama Rawat (Hari)",
  ylab              = "S-hat(t) — Probabilitas Belum Sembuh",
  break.time.by     = 1,
  censor            = TRUE,
  censor.shape      = "+",
  censor.size       = 4,
  risk.table        = TRUE,
  risk.table.col    = "strata",
  risk.table.height = 0.28,
  ggtheme           = theme_classic(base_size = 11),
  font.main         = c(12, "bold"),
  font.submain      = c(10, "plain", "grey50")
)
Gambar 9. Kurva KM dengan Penyensoran Tipe I — Cut-off 31 Maret 2021

Gambar 9. Kurva KM dengan Penyensoran Tipe I — Cut-off 31 Maret 2021

Perbandingan: Data Asli vs. Data Tersensor

df_asli   <- tibble(t = c(0, km_sum$time),    St = c(1, km_sum$surv),    Sumber = "Data Asli (n=138, 0 sensor)")
df_sensor <- tibble(t = c(0, summary(km_sensor)$time),
                    St = c(1, summary(km_sensor)$surv),
                    Sumber = "Data Tersensor Tipe I (cut-off 31 Mar)")

bind_rows(df_asli, df_sensor) |>
  ggplot(aes(x = t, y = St, color = Sumber, linetype = Sumber)) +
  geom_step(linewidth = 1.2) +
  scale_color_manual(values = c("Data Asli (n=138, 0 sensor)" = "#1565C0",
                                 "Data Tersensor Tipe I (cut-off 31 Mar)" = "#E64A19")) +
  scale_x_continuous(breaks = 0:15) +
  scale_y_continuous(labels = percent_format(), limits = c(0, 1.05)) +
  labs(title = "Perbandingan KM: Data Asli vs. Data Tersensor",
       x = "Lama Rawat (Hari)", y = "S-hat(t)",
       color = NULL, linetype = NULL) +
  theme_classic(base_size = 11) +
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold", size = 12))
Gambar 9b. Perbandingan Kurva KM: Data Asli vs. Data Tersensor

Gambar 9b. Perbandingan Kurva KM: Data Asli vs. Data Tersensor


Perbandingan Tiga Metode

Visualisasi Bersama

df_empiris  <- empiris_display |> select(t, St) |> rename(t_j = t) |> mutate(Metode = "Empiris")
df_lt       <- lt_display       |> select(t_j, St) |>                   mutate(Metode = "Life Table")
df_km_plot  <- tibble(t_j = c(0, km_sum$time), St = c(1, km_sum$surv), Metode = "Kaplan-Meier")

df_all <- bind_rows(df_empiris, df_lt, df_km_plot)

ggplot(df_all, aes(x = t_j, y = St, color = Metode, linetype = Metode)) +
  geom_step(linewidth = 1.1) +
  scale_color_manual(values = c("Empiris" = "#F57F17",
                                 "Life Table" = "#C62828",
                                 "Kaplan-Meier" = "#1565C0")) +
  scale_linetype_manual(values = c("Empiris" = "dotted",
                                    "Life Table" = "dashed",
                                    "Kaplan-Meier" = "solid")) +
  scale_x_continuous(breaks = 0:15) +
  scale_y_continuous(labels = percent_format(), limits = c(0, 1.05)) +
  labs(
    title    = "Gambar 10. Perbandingan Estimasi S-hat(t): Empiris vs Life Table vs Kaplan-Meier",
    subtitle = "Pasien COVID-19 RSI Malahayati (n = 138)",
    x        = "Lama Rawat t (Hari)",
    y        = "Probabilitas S-hat(t)"
  ) +
  theme_classic(base_size = 11) +
  theme(legend.position = c(0.82, 0.65),
        legend.background = element_rect(fill = "white", color = "grey80"),
        legend.title = element_text(face = "bold"),
        plot.title   = element_text(face = "bold", size = 12),
        plot.subtitle = element_text(color = "grey50"))
Gambar 10. Perbandingan Estimasi S-hat(t): Empiris vs Life Table vs Kaplan-Meier

Gambar 10. Perbandingan Estimasi S-hat(t): Empiris vs Life Table vs Kaplan-Meier

Tabel Perbandingan Numerik

get_St <- function(df, t_col, s_col, t) {
  d   <- df |> rename(t_ = {{ t_col }}, S_ = {{ s_col }})
  idx <- which(d$t_ <= t)
  if (length(idx) == 0) return(1)
  d$S_[max(idx)]
}

t_bersama <- sort(unique(c(empiris_display$t, lt_display$t_j, 0, km_sum$time)))

tbl_perb <- tibble(
  `t (Hari)`     = t_bersama,
  `Empiris`      = map_dbl(t_bersama, ~ get_St(empiris_display, t, St, .x)),
  `Life Table`   = map_dbl(t_bersama, ~ get_St(lt_display, t_j, St, .x)),
  `Kaplan-Meier` = map_dbl(t_bersama, ~ {
    km_df <- tibble(t_ = c(0, km_sum$time), S_ = c(1, km_sum$surv))
    get_St(km_df, t_, S_, .x)
  })
) |>
  mutate(
    `Selisih LT-KM`  = round(abs(`Life Table`   - `Kaplan-Meier`), 6),
    `Selisih Emp-KM` = round(abs(`Empiris`       - `Kaplan-Meier`), 6)
  ) |>
  mutate(across(where(is.double), ~ round(.x, 5)))

tbl_perb |>
  kable(
    caption = "Tabel 14. Perbandingan Nilai S-hat(t) dari Tiga Metode",
    align   = rep("c", 6)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 14. Perbandingan Nilai S-hat(t) dari Tiga Metode
t (Hari) Empiris Life Table Kaplan-Meier Selisih LT-KM Selisih Emp-KM
0 1.00000 1.00000 1.00000 0 0
1 0.98551 0.98551 0.98551 0 0
2 0.92029 0.92029 0.92029 0 0
3 0.80435 0.80435 0.80435 0 0
4 0.74638 0.74638 0.74638 0 0
5 0.71739 0.71739 0.71739 0 0
6 0.65217 0.65217 0.65217 0 0
7 0.54348 0.54348 0.54348 0 0
8 0.43478 0.43478 0.43478 0 0
9 0.28985 0.28985 0.28986 0 0
10 0.12319 0.12319 0.12319 0 0
11 0.05072 0.05072 0.05072 0 0
12 0.01449 0.01449 0.01449 0 0
14 0.00725 0.00725 0.00725 0 0
15 0.00000 0.00000 0.00000 0 0

Perbandingan Konseptual Tiga Metode

tbl_konsep <- tibble(
  Aspek = c(
    "Dasar Teori", "Pengelompokan Waktu",
    "Koreksi Tersensor", "Estimasi Hazard",
    "Standar Error", "Kecocokan Data Ini",
    "Referensi Utama"
  ),
  `Metode Empiris` = c(
    "ECDF (empiris langsung)",
    "Tidak — setiap waktu unik",
    "Tidak diperlukan (semua event)",
    "Tidak eksplisit",
    "Var binomial: S(t)(1-S(t))/n",
    "Sangat sesuai (0 sensor)",
    "Statistik nonparametrik dasar"
  ),
  `Life Table` = c(
    "Aktuaria (interval diskret)",
    "Ya — interval lebar b",
    "Koreksi aktuaria c_j/2",
    "h(t_m) pada titik tengah",
    "Formula Greenwood (varian)",
    "Sesuai; identik KM karena c_j=0",
    "Reed & Merrell (1939)"
  ),
  `Kaplan-Meier` = c(
    "Product-limit (NPMLE)",
    "Tidak — setiap waktu event",
    "n_j dikurangi sebelum estimasi",
    "Dari H(t) = -log S(t)",
    "Formula Greenwood (1926)",
    "Sangat sesuai; metode standar",
    "Kaplan & Meier (1958)"
  )
)

tbl_konsep |>
  kable(caption = "Tabel 15. Perbandingan Konseptual Tiga Metode Analisis Survival",
        align = c("l", "l", "l", "l")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 15. Perbandingan Konseptual Tiga Metode Analisis Survival
Aspek Metode Empiris Life Table Kaplan-Meier
Dasar Teori ECDF (empiris langsung) Aktuaria (interval diskret) Product-limit (NPMLE)
Pengelompokan Waktu Tidak — setiap waktu unik Ya — interval lebar b Tidak — setiap waktu event
Koreksi Tersensor Tidak diperlukan (semua event) Koreksi aktuaria c_j/2 n_j dikurangi sebelum estimasi
Estimasi Hazard Tidak eksplisit h(t_m) pada titik tengah Dari H(t) = -log S(t)
Standar Error Var binomial: S(t)(1-S(t))/n Formula Greenwood (varian) Formula Greenwood (1926)
Kecocokan Data Ini Sangat sesuai (0 sensor) Sesuai; identik KM karena c_j=0 Sangat sesuai; metode standar
Referensi Utama Statistik nonparametrik dasar Reed & Merrell (1939) Kaplan & Meier (1958)

Hasil dan Pembahasan

Karakteristik Data

Data yang dianalisis terdiri dari 138 pasien COVID-19 yang dirawat di RSI Malahayati Medan, Januari–Mei 2021. Rerata lama rawat adalah 7,30 hari (rentang: 1–15 hari, median: 8 hari). Distribusi relatif simetris dengan sedikit kemiringan positif (skewness = -0.273).

Tabel Gabungan Hasil Analisis Survival

t_all <- sort(unique(data_covid$lama_rawat))

step_lookup <- function(t_vec, s_vec, t_query) {
  map_dbl(t_query, function(t) {
    idx <- max(which(t_vec <= t), na.rm = TRUE)
    if (is.na(idx) || is.infinite(idx)) 1 else s_vec[idx]
  })
}

tbl_gabungan <- tibble(
  `t (Hari)`     = c(0, t_all),
  `d (Event)`    = c(0, map_int(t_all, ~ sum(data_covid$lama_rawat == .x))),
  `n (Berisiko)` = c(n_total, map_int(t_all, ~ sum(data_covid$lama_rawat >= .x))),
) |>
  mutate(
    `Empiris S(t)`      = c(1, step_lookup(empiris_tbl$t, empiris_tbl$St, t_all)),
    `Life Table S(t)`   = c(1, step_lookup(lt_calc$t_j, lt_calc$St, t_all)),
    `Kaplan-Meier S(t)` = c(1, step_lookup(km_calc$t_j, km_calc$St, t_all)),
    `F(t) KM`           = round(1 - `Kaplan-Meier S(t)`, 5),
  ) |>
  mutate(across(where(is.double), ~ round(.x, 5)))

tbl_gabungan |>
  kable(
    caption = "Tabel 16. Rekapitulasi Hasil Analisis Survival — Ketiga Metode",
    align   = rep("c", 7)
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = TRUE, font_size = 11)
Tabel 16. Rekapitulasi Hasil Analisis Survival — Ketiga Metode
t (Hari) d (Event) n (Berisiko) Empiris S(t) Life Table S(t) Kaplan-Meier S(t) F(t) KM
0 0 138 1.00000 1.00000 1.00000 0.00000
1 2 138 0.98551 0.98551 0.98551 0.01449
2 9 136 0.92029 0.92029 0.92029 0.07971
3 16 127 0.80435 0.80435 0.80435 0.19565
4 8 111 0.74638 0.74638 0.74638 0.25362
5 4 103 0.71739 0.71739 0.71739 0.28261
6 9 99 0.65217 0.65217 0.65217 0.34783
7 15 90 0.54348 0.54348 0.54348 0.45652
8 15 75 0.43478 0.43478 0.43478 0.56522
9 20 60 0.28985 0.28985 0.28985 0.71015
10 23 40 0.12319 0.12319 0.12319 0.87681
11 10 17 0.05072 0.05072 0.05072 0.94928
12 5 7 0.01449 0.01449 0.01449 0.98551
14 1 2 0.00725 0.00725 0.00725 0.99275
15 1 1 0.00000 0.00000 0.00000 1.00000

Median Survival per Kelompok

med_km_km <- surv_median(survfit(surv_obj ~ komorbid,      data = data_covid))
med_km_jk <- surv_median(survfit(surv_obj ~ jenis_kelamin, data = data_covid))
med_km_dx <- surv_median(survfit(surv_obj ~ kategori_dx,   data = data_covid))
med_km_us <- surv_median(km_usia)

bind_rows(
  med_km_jk |> mutate(Faktor = "Jenis Kelamin"),
  med_km_km |> mutate(Faktor = "Status Komorbid"),
  med_km_dx |> mutate(Faktor = "Kategori Diagnosis"),
  med_km_us |> mutate(Faktor = "Kelompok Usia")
) |>
  select(Faktor, strata, median, lower, upper) |>
  mutate(strata = gsub(".*=", "", strata)) |>
  rename(
    Kelompok        = strata,
    `Median (Hari)` = median,
    `CI Bawah 95%`  = lower,
    `CI Atas 95%`   = upper
  ) |>
  kable(
    caption = "Tabel 17. Estimasi Median Waktu Rawat per Kelompok (Kaplan-Meier)",
    align   = c("l", "l", "c", "c", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 12) |>
  collapse_rows(columns = 1, valign = "middle")
Tabel 17. Estimasi Median Waktu Rawat per Kelompok (Kaplan-Meier)
Faktor Kelompok Median (Hari) CI Bawah 95% CI Atas 95%
Jenis Kelamin Laki-laki 7 6 9
Perempuan 8 7 9
Status Komorbid Non-Komorbid 8 7 9
Komorbid 7 6 8
Kategori Diagnosis Suspect 8 7 9
Positif 8 7 9
Kelompok Usia Di bawah 40 Tahun 7 6 8
40 Tahun ke atas 9 7 9

Ringkasan Interpretasi

tibble(
  Aspek = c(
    "Median lama rawat (keseluruhan)",
    "S-hat(7): Probabilitas belum sembuh pada hari ke-7",
    "S-hat(8): Probabilitas belum sembuh pada hari ke-8",
    "Probabilitas sembuh dalam 10 hari pertama",
    "Median — Non-Komorbid",
    "Median — Komorbid",
    "Median — Laki-laki",
    "Median — Perempuan",
    "Median — Usia < 40 Tahun",
    "Median — Usia >= 40 Tahun"
  ),
  Nilai = c(
    "8 hari",
    paste0(round(approx(km_calc$t_j, km_calc$St, xout = 7)$y * 100, 1), "%"),
    paste0(round(approx(km_calc$t_j, km_calc$St, xout = 8)$y * 100, 1), "%"),
    paste0(round((1 - approx(km_calc$t_j, km_calc$St, xout = 10)$y) * 100, 1), "%"),
    paste0(surv_median(km_km)$median[1], " hari"),
    paste0(surv_median(km_km)$median[2], " hari"),
    paste0(surv_median(km_jk)$median[1], " hari"),
    paste0(surv_median(km_jk)$median[2], " hari"),
    paste0(surv_median(km_usia)$median[1], " hari"),
    paste0(surv_median(km_usia)$median[2], " hari")
  )
) |>
  kable(caption = "Tabel 18. Ringkasan Hasil Analisis Survival COVID-19 RSI Malahayati",
        align = c("l", "c")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE, font_size = 12)
Tabel 18. Ringkasan Hasil Analisis Survival COVID-19 RSI Malahayati
Aspek Nilai
Median lama rawat (keseluruhan) 8 hari
S-hat(7): Probabilitas belum sembuh pada hari ke-7 54.3%
S-hat(8): Probabilitas belum sembuh pada hari ke-8 43.5%
Probabilitas sembuh dalam 10 hari pertama 87.7%
Median — Non-Komorbid 8 hari
Median — Komorbid 7 hari
Median — Laki-laki 7 hari
Median — Perempuan 8 hari
Median — Usia < 40 Tahun 7 hari
Median — Usia >= 40 Tahun 9 hari

Uji Log-Rank — Interpretasi

Uji Log-Rank dilakukan untuk menguji apakah terdapat perbedaan yang signifikan secara statistik pada fungsi survival \(S(t)\) antar kelompok. Pengujian dilakukan pada taraf nyata \(\alpha = 0{,}05\) dengan hipotesis \(H_0: S_1(t) = S_2(t)\) untuk semua \(t\).

Status Komorbid (\(\chi^2 = 2.7453\), \(p = 0.0975\)): Gagal tolak \(H_0\). Tidak terdapat perbedaan yang signifikan secara statistik pada fungsi survival antara kelompok Non-Komorbid dan Komorbid. Meskipun estimasi median lama rawat kelompok Non-Komorbid (8 hari) lebih panjang satu hari dibandingkan Komorbid (7 hari), perbedaan ini tidak cukup besar untuk ditangkap secara statistik pada ukuran sampel yang tersedia. Secara epidemiologis, hal ini dapat diinterpretasikan bahwa komorbiditas pada populasi penelitian ini tidak secara substansial memperpanjang waktu pemulihan, kemungkinan karena pasien komorbid yang masuk ke RSI sudah mendapatkan penanganan medis yang terstandar sehingga laju kesembuhannya tidak berbeda jauh.

Jenis Kelamin (\(\chi^2 = 1.6805\), \(p = 0.1949\)): Gagal tolak \(H_0\). Tidak terdapat perbedaan signifikan fungsi survival antara pasien laki-laki (median 7 hari) dan perempuan (median 8 hari). Hasil ini konsisten dengan beberapa studi COVID-19 yang menunjukkan bahwa jenis kelamin bukan prediktor independen yang kuat terhadap lama rawat pada kasus tanpa komplikasi berat.

Kategori Diagnosis (\(\chi^2 = 0.346\), \(p = 0.5564\)): Gagal tolak \(H_0\). Tidak terdapat perbedaan signifikan antara kurva survival pasien Suspect dan Positif COVID-19. Kedua kelompok memiliki median lama rawat yang sama (8 hari), mengindikasikan bahwa protokol perawatan yang diterapkan di RSI Malahayati pada periode tersebut tidak menghasilkan perbedaan lama rawat yang berarti berdasarkan status konfirmasi diagnostik.

Kelompok Usia (\(\chi^2 = 3.4853\), \(p = 0.0619\)): Gagal tolak \(H_0\). Tidak terdapat perbedaan signifikan antara kelompok usia di bawah 40 tahun (median 7 hari) dan 40 tahun ke atas (median 9 hari). Walaupun arah perbedaan mediannya konsisten dengan ekspektasi klinis — bahwa pasien yang lebih tua cenderung memerlukan waktu pemulihan lebih lama — perbedaan tersebut tidak mencapai signifikansi statistik.

Catatan kritis: Uji Log-Rank memberikan bobot yang sama pada seluruh waktu event dan paling powerful ketika rasio hazard antar kelompok bersifat proporsional sepanjang waktu (proportional hazards assumption). Pada dataset ini dengan \(n = 138\) dan rentang waktu yang sempit (1–15 hari), statistical power uji mungkin tidak cukup untuk mendeteksi perbedaan kecil. Tidak adanya perbedaan signifikan bukan berarti fungsi survival kedua kelompok identik secara populasi (absence of evidence is not evidence of absence).

Temuan ini konsisten dengan kesimpulan Imam (2021) yang menyatakan tidak terdapat perbedaan signifikan pada laju kesembuhan berdasarkan faktor-faktor tersebut.


Simpulan

Berdasarkan analisis ketahanan hidup pada data 138 pasien COVID-19 RSI Malahayati Medan periode Januari–Mei 2021, dapat disimpulkan hal-hal berikut.

  1. Metode Empiris menghasilkan estimasi fungsi survival \(\hat{S}(t)\) secara langsung dari data observasi tanpa asumsi distribusi parametrik. Estimator ini tidak bias dengan variansi binomial \(S(t)[1-S(t)]/n\), dan konsisten secara asimtotik berdasarkan Teorema Glivenko-Cantelli. Karena seluruh subjek mengalami event, metode ini berlaku penuh tanpa pembatasan.

  2. Life Table dengan interval harian (\(b = 1\) hari) dan koreksi aktuaria \(c_j/2\) menghasilkan estimasi yang identik dengan Kaplan-Meier. Identitas ini merupakan konsekuensi matematis dari dua kondisi: tidak adanya data tersensor (\(c_j = 0\) untuk semua \(j\)) dan lebar interval unit. Fungsi hazard \(\hat{h}(t)\) mencapai puncak pada \(t = 10\) hari (\(\hat{h} = 2.7842\)) dan \(t = 11\) hari, mengindikasikan periode kritis pemulihan.

  3. Kaplan-Meier menghasilkan estimator product-limit yang merupakan Nonparametric Maximum Likelihood Estimator (NPMLE) untuk \(S(t)\) — konsisten, efisien, dan merupakan standar analisis survival modern. Median lama rawat keseluruhan adalah 8 hari (CI 95%: 8–9 hari), artinya 50% pasien sembuh dalam 8 hari atau kurang sejak masuk rumah sakit. Sekitar 87,7% pasien sembuh dalam 10 hari pertama.

  4. Uji Log-Rank pada \(\alpha = 0{,}05\) gagal menolak \(H_0\) untuk seluruh faktor yang diuji — komorbiditas, jenis kelamin, kategori diagnosis, dan kelompok usia. Tidak adanya perbedaan signifikan perlu diinterpretasikan secara hati-hati: dengan \(n = 138\) dan rentang waktu yang sempit, statistical power mungkin tidak cukup untuk mendeteksi efek kecil. Temuan ini konsisten dengan kesimpulan Imam (2021).

  5. Skenario penyensoran Tipe I dengan cut-off 31 Maret 2021 menghasilkan 75 observasi tersensor (\(\approx 54{,}3\%\)) dan 63 event. Skenario ini mendemonstrasikan secara metodologis bahwa Kaplan-Meier mampu mengakomodasi data tersensor secara valid melalui reduksi risk set \(n_j\) sebelum setiap titik estimasi, tanpa mengabaikan kontribusi informasi dari subjek tersensor.

  6. Validasi silang antar metode: Identitas numerik antara estimasi Empiris, Life Table, dan Kaplan-Meier pada dataset ini berfungsi sebagai validasi silang (cross-validation) yang memperkuat keandalan hasil analisis. Perbedaan antar metode hanya akan muncul bila terdapat data tersensor (\(c_j > 0\)) atau bila Life Table menggunakan lebar interval \(b > 1\) hari.