Survival analysis is a set of statistical methods for analyzing the occurrence of events over time. The two key functions in survival analysis are the survival function and the hazard function. The survival function, conventionally denoted by S, gives for each time t the probability S(t) that the event has not yet occurred by time t.
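For reference, the two functions can be written down formally. The notation here is an assumption, since the text above only names S: T stands for the random time until the event and h for the hazard function.

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}, \qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

The last identity holds for a continuous T and shows that either function determines the other.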
A popular estimator of the survival function S(t) is the Kaplan–Meier estimate.
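In its usual textbook form the Kaplan-Meier estimate is a product over the distinct event times. The symbols are assumptions introduced for illustration: t_j are the distinct observed event times, d_j is the number of events at t_j, and n_j is the number of subjects still at risk just before t_j.

```latex
\hat{S}(t) = \prod_{j:\, t_j \le t} \left( 1 - \frac{d_j}{n_j} \right)
```

Censored subjects contribute to n_j up to their censoring time and never to d_j, so the estimate drops only at observed event times.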
Loading the packages required
library(OIsurv)
## Loading required package: survival
## Loading required package: KMsurv
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggfortify)
Setting up the credentials to publish the graphs on plotly
Sys.setenv("plotly_username"="gupta.ruch")
Sys.setenv("plotly_api_key"="kuaxzjdqnl")
Loading the dataset tongue
data(tongue)
Printing the summary of the dataset
summary(tongue)
## type time delta
## Min. :1.00 Min. : 1.00 Min. :0.0000
## 1st Qu.:1.00 1st Qu.: 23.75 1st Qu.:0.0000
## Median :1.00 Median : 69.50 Median :1.0000
## Mean :1.35 Mean : 73.83 Mean :0.6625
## 3rd Qu.:2.00 3rd Qu.:101.75 3rd Qu.:1.0000
## Max. :2.00 Max. :400.00 Max. :1.0000
Attaching the dataset to the R search path. R will then search the dataset when evaluating a variable, so objects in the dataset can be accessed by simply giving their names.
Creating a survival object with the Surv() function; it is usually used as the response variable in a model formula. For right-censored data, only two arguments are needed in Surv(): a vector of times and a vector indicating which times are observed events and which are censored. Here the object is built only for the observations with type == 1.
attach(tongue)
tongue.surv <- Surv(time[type==1], delta[type==1])
tongue.surv
## [1] 1 3 3 4 10 13 13 16 16 24 26 27 28 30
## [15] 30 32 41 51 65 67 70 72 73 77 91 93 96 100
## [29] 104 157 167 61+ 74+ 79+ 80+ 81+ 87+ 87+ 88+ 89+ 93+ 97+
## [43] 101+ 104+ 108+ 109+ 120+ 131+ 150+ 231+ 240+ 400+
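In the output above, times followed by "+" are right-censored. A minimal sketch of the same behaviour on a small made-up dataset (toy.time and toy.event are hypothetical values, not taken from tongue):

```r
library(survival)

# Hypothetical toy data: follow-up times and event indicators
# (1 = event observed, 0 = right-censored)
toy.time  <- c(5, 8, 8, 12, 15)
toy.event <- c(1, 0, 1, 1, 0)

# Two arguments are enough for right-censored data
toy.surv <- Surv(toy.time, toy.event)

# Censored observations (the 2nd and 5th here) print with a trailing "+"
print(toy.surv)
```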
Kaplan-Meier estimate and pointwise bounds: the Kaplan-Meier estimate is fit in R using the function survfit(). The simplest fit takes as input a formula with the survival object on the left-hand side and only an intercept (~ 1) on the right, which yields a single curve for the whole sample.
surv.fit <- survfit(tongue.surv~1)
surv.fit
## Call: survfit(formula = tongue.surv ~ 1)
##
## n events median 0.95LCL 0.95UCL
## 52 31 93 67 NA
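The median reported above appears to follow the usual rule: it is the first time at which the estimated survival is 0.5 or less, and its confidence limits are the times at which the lower and upper pointwise bounds reach 0.5. In the summary printed further down, the estimate first falls below 0.5 at time 93 (0.478), the lower bound at time 67 (0.495), and the upper bound never does (its smallest value is 0.518), which would explain the NA for 0.95UCL. The rule can be sketched on hypothetical toy data (not from tongue):

```r
library(survival)

# Hypothetical toy data: 1 = event observed, 0 = right-censored
toy.time  <- c(5, 8, 8, 12, 15)
toy.event <- c(1, 0, 1, 1, 0)

toy.fit <- survfit(Surv(toy.time, toy.event) ~ 1)
toy.sum <- summary(toy.fit)   # one row per distinct event time

# Kaplan-Meier estimate at each event time
data.frame(time = toy.sum$time, n.risk = toy.sum$n.risk,
           n.event = toy.sum$n.event, survival = toy.sum$surv)

# Median survival: first event time at which the estimate is <= 0.5
toy.median <- min(toy.sum$time[toy.sum$surv <= 0.5])
toy.median
```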
Calling summary() on the fit returns a list; printing it shows the estimate, its standard error and the 95% confidence bounds at each event time
summary(surv.fit)
## Call: survfit(formula = tongue.surv ~ 1)
##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1 52 1 0.981 0.0190 0.944 1.000
## 3 51 2 0.942 0.0323 0.881 1.000
## 4 49 1 0.923 0.0370 0.853 0.998
## 10 48 1 0.904 0.0409 0.827 0.988
## 13 47 2 0.865 0.0473 0.777 0.963
## 16 45 2 0.827 0.0525 0.730 0.936
## 24 43 1 0.808 0.0547 0.707 0.922
## 26 42 1 0.788 0.0566 0.685 0.908
## 27 41 1 0.769 0.0584 0.663 0.893
## 28 40 1 0.750 0.0600 0.641 0.877
## 30 39 2 0.712 0.0628 0.598 0.846
## 32 37 1 0.692 0.0640 0.578 0.830
## 41 36 1 0.673 0.0651 0.557 0.813
## 51 35 1 0.654 0.0660 0.537 0.797
## 65 33 1 0.634 0.0669 0.516 0.780
## 67 32 1 0.614 0.0677 0.495 0.762
## 70 31 1 0.594 0.0683 0.475 0.745
## 72 30 1 0.575 0.0689 0.454 0.727
## 73 29 1 0.555 0.0693 0.434 0.709
## 77 27 1 0.534 0.0697 0.414 0.690
## 91 19 1 0.506 0.0715 0.384 0.667
## 93 18 1 0.478 0.0728 0.355 0.644
## 96 16 1 0.448 0.0741 0.324 0.620
## 100 14 1 0.416 0.0754 0.292 0.594
## 104 12 1 0.381 0.0767 0.257 0.566
## 157 5 1 0.305 0.0918 0.169 0.550
## 167 4 1 0.229 0.0954 0.101 0.518
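The columns of this table can be reproduced by hand in base R, which makes the calculation explicit. This is a sketch under two assumptions: the times and censoring marks are transcribed from the printed tongue.surv object above, and the standard error and bounds use Greenwood's formula with log-transformed bounds, which matches the printed values. All variable names are introduced here for illustration.

```r
# Times transcribed from the printed Surv object (type == 1 only)
event.times <- c(1, 3, 3, 4, 10, 13, 13, 16, 16, 24, 26, 27, 28, 30,
                 30, 32, 41, 51, 65, 67, 70, 72, 73, 77, 91, 93, 96, 100,
                 104, 157, 167)
cens.times  <- c(61, 74, 79, 80, 81, 87, 87, 88, 89, 93, 97,
                 101, 104, 108, 109, 120, 131, 150, 231, 240, 400)
obs.time <- c(event.times, cens.times)
status   <- rep(c(1, 0), c(length(event.times), length(cens.times)))

t.j <- sort(unique(obs.time[status == 1]))                        # distinct event times
n.j <- sapply(t.j, function(t) sum(obs.time >= t))                # at risk just before t
d.j <- sapply(t.j, function(t) sum(obs.time == t & status == 1))  # events at t

km      <- cumprod(1 - d.j / n.j)                   # Kaplan-Meier estimate
se.log  <- sqrt(cumsum(d.j / (n.j * (n.j - d.j))))  # Greenwood, on the log scale
std.err <- km * se.log
z       <- qnorm(0.975)
lower   <- km * exp(-z * se.log)
upper   <- pmin(1, km * exp(z * se.log))

head(round(data.frame(time = t.j, n.risk = n.j, n.event = d.j,
                      survival = km, std.err, lower, upper), 4))
```

The first row, for example, is 1 - 1/52 = 0.981, and the second is 0.981 * (1 - 2/51) = 0.942.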
The Kaplan-Meier estimate may be plotted using plot(surv.fit).
plot(surv.fit, main='Kaplan-Meier estimate with 95% confidence bounds', xlab='time', ylab='survival function')
Plotting the same graph with ggplot2, using autoplot() from ggfortify
ggplot <- autoplot(surv.fit,data = tongue, main='Kaplan-Meier estimate with 95% confidence bounds', xlab='time', ylab='survival function',surv.colour = 'orange', censor.colour = 'red')
ggplot
Plotting the same graph using plotly
plotly <- ggplotly(ggplot)
## Warning in geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_GeomConfint() has yet to be implemented in plotly.
## If you'd like to see this geom implemented,
## Please open an issue with your example code at
## https://github.com/ropensci/plotly/issues
plotly
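The warning above suggests that the confidence band drawn by ggfortify's own geom is not carried over to the plotly version. One possible workaround, sketched here on hypothetical toy data rather than on surv.fit, is to build the curve and its bounds from standard step geoms, which ggplotly is generally able to convert. The names km.df and km.plot are assumptions introduced for this example, and the plotting part runs only if ggplot2 is installed.

```r
library(survival)

# Hypothetical toy data: 1 = event observed, 0 = right-censored
toy.fit <- survfit(Surv(c(5, 8, 8, 12, 15), c(1, 0, 1, 1, 0)) ~ 1)

# Collect the curve and its pointwise bounds, starting at S(0) = 1
km.df <- rbind(
  data.frame(time = 0, surv = 1, lower = 1, upper = 1),
  data.frame(time = toy.fit$time, surv = toy.fit$surv,
             lower = toy.fit$lower, upper = toy.fit$upper)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Step functions for the estimate (solid) and the bounds (dashed)
  km.plot <- ggplot(km.df, aes(x = time)) +
    geom_step(aes(y = surv), colour = "orange") +
    geom_step(aes(y = lower), linetype = "dashed") +
    geom_step(aes(y = upper), linetype = "dashed") +
    labs(x = "time", y = "survival function")
  # print(km.plot) draws it; plotly::ggplotly(km.plot) would convert it
}
```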
Publishing graphs to your online plotly account
plotly_POST(plotly, "Survival Analysis using Kaplan–Meier estimate")
## No encoding supplied: defaulting to UTF-8.
## Success! Modified your plotly here -> https://plot.ly/~gupta.ruch/2