R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

library(survival)
library(survminer)
## Loading required package: ggplot2
## Loading required package: ggpubr
## 
## Attaching package: 'survminer'
## The following object is masked from 'package:survival':
## 
##     myeloma
mydata<-lung
head(lung)
##   inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
## 1    3  306      2  74   1       1       90       100     1175      NA
## 2    3  455      2  68   1       0       90        90     1225      15
## 3    3 1010      1  56   1       0       90        90       NA      15
## 4    5  210      2  57   1       1       90        60     1150      11
## 5    1  883      2  60   1       0      100        90       NA       0
## 6   12 1022      1  74   1       1       50        80      513       0
#create Survival object
surv_obj<-Surv(time=lung$time,event=lung$status)
#Fit Kaplan-Meier model
km_fit<-survfit(surv_obj~1,data=lung)
#plot the survival curve
ggsurvplot(km_fit,data=lung,
           conf.int=TRUE,#confidence interval
           pval=TRUE,#log rank test p-value
           risk.table=TRUE,#show risk table
           title="Kaplan-Meier survival curve")
## Warning in .pvalue(fit, data = data, method = method, pval = pval, pval.coord = pval.coord, : There are no survival curves to be compared. 
##  This is a null model.

The survival curve shows the probability of survival over time. The confidence interval indicates uncertainity. The risk table shows the number of subjects still at risk at different time points.

#fit Nelson Aalen model
na_fit<-survfit(coxph(surv_obj~1,data=lung),type="aalen")
#plot cumulative hazard
ggsurvplot(na_fit,data=lung,fun="cumhaz",title="Nelson-Aalen Cumulative Hazard Function")

This plot shows how the risk of death accumulates over time. A steeper plot suggests a higher hazard rate.

#Log-Rank test
km_fit_grouped<-survfit(surv_obj~sex,data=lung)
survdiff(surv_obj~sex,data=lung)
## Call:
## survdiff(formula = surv_obj ~ sex, data = lung)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## sex=1 138      112     91.6      4.55      10.3
## sex=2  90       53     73.4      5.68      10.3
## 
##  Chisq= 10.3  on 1 degrees of freedom, p= 0.001

The p-value is less than 0.05 which suggests a statistically significant difference in survival between groups hence sex is an important predictor of survival.

#Cox Proportional Hazards Model
cox_model<-coxph(surv_obj~age+sex+ph.ecog,data=lung)
summary(cox_model)
## Call:
## coxph(formula = surv_obj ~ age + sex + ph.ecog, data = lung)
## 
##   n= 227, number of events= 164 
##    (1 observation deleted due to missingness)
## 
##              coef exp(coef)  se(coef)      z Pr(>|z|)    
## age      0.011067  1.011128  0.009267  1.194 0.232416    
## sex     -0.552612  0.575445  0.167739 -3.294 0.000986 ***
## ph.ecog  0.463728  1.589991  0.113577  4.083 4.45e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##         exp(coef) exp(-coef) lower .95 upper .95
## age        1.0111     0.9890    0.9929    1.0297
## sex        0.5754     1.7378    0.4142    0.7994
## ph.ecog    1.5900     0.6289    1.2727    1.9864
## 
## Concordance= 0.637  (se = 0.025 )
## Likelihood ratio test= 30.5  on 3 df,   p=1e-06
## Wald test            = 29.93  on 3 df,   p=1e-06
## Score (logrank) test = 30.5  on 3 df,   p=1e-06