Using the AML data set, calculate the KM estimate for survival and draw the KM curves for both groups on the same graph. Compare the two curves using Log Rank Test.
Below we can see AML dataset. It contains a clinical trial in patients with Acute Myelogenous Leukemia (AML) comparing two groups of patients having maintenance chemotherapy vs no maintenance chemotherapy.
library(readxl)
library(knitr)
aml <- read_excel("AML.xlsx")
kable(aml)
| Time | Status | Group |
|---|---|---|
| 9 | 1 | Maintained |
| 13 | 1 | Maintained |
| 13 | 0 | Maintained |
| 18 | 1 | Maintained |
| 23 | 1 | Maintained |
| 28 | 0 | Maintained |
| 31 | 1 | Maintained |
| 34 | 1 | Maintained |
| 45 | 0 | Maintained |
| 48 | 1 | Maintained |
| 161 | 0 | Maintained |
| 5 | 1 | NonMaintained |
| 5 | 1 | NonMaintained |
| 8 | 1 | NonMaintained |
| 8 | 1 | NonMaintained |
| 12 | 1 | NonMaintained |
| 16 | 0 | NonMaintained |
| 23 | 1 | NonMaintained |
| 27 | 1 | NonMaintained |
| 30 | 0 | NonMaintained |
| 33 | 1 | NonMaintained |
| 43 | 1 | NonMaintained |
| 45 | 1 | NonMaintained |
attach(aml)
In these data sets, Status = 1 means event occur(die) and Status = 0 means lost to follow(censored) data.
library(survival)
##
## Attaching package: 'survival'
## The following object is masked _by_ '.GlobalEnv':
##
## aml
aml.surv <- Surv(time = Time, event = Status)
aml.surv
## [1] 9 13 13+ 18 23 28+ 31 34 45+ 48 161+ 5 5 8 8
## [16] 12 16+ 23 27 30+ 33 43 45
aml.fit <- survfit(Surv(time = Time, event = Status) ~ Group, data = aml)
summary(aml.fit)
## Call: survfit(formula = Surv(time = Time, event = Status) ~ Group,
## data = aml)
##
## Group=Maintained
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 9 11 1 0.909 0.0867 0.7541 1.000
## 13 10 1 0.818 0.1163 0.6192 1.000
## 18 8 1 0.716 0.1397 0.4884 1.000
## 23 7 1 0.614 0.1526 0.3769 0.999
## 31 5 1 0.491 0.1642 0.2549 0.946
## 34 4 1 0.368 0.1627 0.1549 0.875
## 48 2 1 0.184 0.1535 0.0359 0.944
##
## Group=NonMaintained
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 5 12 2 0.833 0.108 0.6470 1.000
## 8 10 2 0.667 0.136 0.4468 0.995
## 12 8 1 0.583 0.142 0.3616 0.941
## 23 6 1 0.486 0.148 0.2675 0.883
## 27 5 1 0.389 0.147 0.1854 0.816
## 33 3 1 0.259 0.144 0.0871 0.771
## 43 2 1 0.130 0.117 0.0222 0.756
## 45 1 1 0.000 NaN NA NA
plot(aml.fit, mark.time = T,
col = c("red","blue"), lwd = 1:1, lty = 1:1, cex = 2,main="KM Curves",
xlab = "Time(t in weeks)", ylab = "Survival Probability = S(t)")
legend("topright", legend = c("Maintained Chemotherapy", "NonMaintained Chemotherapy"),
fill = c("red","blue"), lty = 1:1)
aml.fit
## Call: survfit(formula = Surv(time = Time, event = Status) ~ Group,
## data = aml)
##
## n events median 0.95LCL 0.95UCL
## Group=Maintained 11 7 31 18 NA
## Group=NonMaintained 12 10 23 8 NA
To compare survival between gropus we can use the log rank test. Hypotheis,
\(H_0\) = The two survival curves are identicall (i.e. \(S_{1t} = S_{2t}\)) vs \(H_1\) = The two survival curves are not identicall (i.e. \(S_{1t} \neq S_{2t}\))
survdiff(Surv(time = Time, event = Status) ~ Group, data = aml)
## Call:
## survdiff(formula = Surv(time = Time, event = Status) ~ Group,
## data = aml)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## Group=Maintained 11 7 10.13 0.969 2.61
## Group=NonMaintained 12 10 6.87 1.430 2.61
##
## Chisq= 2.6 on 1 degrees of freedom, p= 0.1
Is the \(\alpha = 0.05\) since \(p-value > \alpha\) we failed to reject \(H_{0}\)(null hypothesis).
In fact calculated Chi-Squre value is 2.6. And table value of Chi-Squre \(\alpha=0.05\) df \((k-1) = 2-1 = 1\) where \(k\) is number of groups. So table value is 3.34.
Therefore calculated value is less than critical value. Then we failed to reject \(H_0\).
That means we have enough evidence to tell that the two Survival curves are identical for 5% significance level.