class: center, middle, inverse, title-slide # Statistics with R ## R for Actuarial Students --- ### Survival Analysis Load the dataset “ovarian.csv” and save this as a data frame called ‘ovarian’. The columns in the dataset can be defined as: * ***futime***: survival or censoring time * ***fustat***: censoring status * ***age***: in years * ***resid.ds***: residual disease present (1=no,2=yes) * ***rx***: treatment group * ***ecog.ps***: ECOG performance status Carry out the following analysis: ```r library(tidyverse) library(survival) ovarian <- read_csv("ovarian.csv") ``` --- ```r ovarian %>% filter(rx==1) %>% arrange(futime) ``` ``` ## # A tibble: 13 x 6 ## futime fustat age resid.ds rx ecog.ps ## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 59 1 72.3 2 1 1 ## 2 115 1 74.5 2 1 1 ## 3 156 1 66.5 2 1 2 ## 4 268 1 74.5 2 1 2 ## 5 329 1 43.1 2 1 1 ## 6 431 1 50.3 2 1 1 ## 7 448 0 56.4 1 1 2 ## 8 477 0 64.2 2 1 1 ## 9 638 1 56.8 1 1 2 ## 10 803 0 39.3 1 1 1 ## 11 855 0 43.1 1 1 2 ## 12 1040 0 38.9 2 1 2 ## 13 1106 0 44.6 1 1 1 ``` --- ```r ovarian %>% filter(rx==2) %>% arrange(futime) ``` ``` ## # A tibble: 13 x 6 ## futime fustat age resid.ds rx ecog.ps ## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 353 1 63.2 1 2 2 ## 2 365 1 64.4 2 2 1 ## 3 377 0 58.3 1 2 1 ## 4 421 0 53.4 2 2 1 ## 5 464 1 56.9 2 2 2 ## 6 475 1 59.9 2 2 2 ## 7 563 1 55.2 1 2 2 ## 8 744 0 50.1 1 2 1 ## 9 769 0 59.6 2 2 2 ## 10 770 0 57.1 2 2 1 ## 11 1129 0 53.9 1 2 1 ## 12 1206 0 44.2 2 2 1 ## 13 1227 0 59.6 1 2 2 ``` --- ### Exercises 1. Compute the Cox regression using the column <tt>rx</tt> as the covariate. 2. Summarise the results of the regression. 3. Comment on the hazard ratio and statistical significance of the <tt>rx</tt> 4. Compute the Kaplan-Meier estimators for the two groups of <tt>rx</tt> and save the results to an object called ‘KM’. 5. Summarise the Kaplan-Meier estimator results. 6. Plot the ‘KM’ results and set the colour for group 1 of <tt>rx</tt> to ‘red’ and group 2 to ‘green’. Add a plot label and x and y axes labels to your plot. 7. Comment on the KM analysis using the summary and plot results. --- ### Part 1 ```r library(survival) cph1r <- coxph(Surv(futime,fustat)~rx, data=ovarian) ``` --- ### Part 2 ```r summary(cph1r) ``` ``` ## Call: ## coxph(formula = Surv(futime, fustat) ~ rx, data = ovarian) ## ## n= 26, number of events= 12 ## ## coef exp(coef) se(coef) z Pr(>|z|) ## rx -0.5964 0.5508 0.5870 -1.016 0.31 ## ## exp(coef) exp(-coef) lower .95 upper .95 ## rx 0.5508 1.816 0.1743 1.74 ## ## Concordance= 0.608 (se = 0.07 ) ## Likelihood ratio test= 1.05 on 1 df, p=0.3 ## Wald test = 1.03 on 1 df, p=0.3 ## Score (logrank) test = 1.06 on 1 df, p=0.3 ``` --- ### Part 3 * The p-value of rx is 0.31 which is greater than a significance level of 0.05 and hence is not a siginificant predictor * The hazard ratio indicates that an rx value of 2 has a 0.5508 or ~50% lower hazard rate than rx value of 1' --- ### Part 4 ```r KM <- survfit(Surv(futime,fustat)~rx,ovarian) KM ``` ``` ## Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian) ## ## n events median 0.95LCL 0.95UCL ## rx=1 13 7 638 268 NA ## rx=2 13 5 NA 475 NA ``` --- ### Part 5 ```r summary(KM) ``` ``` ## Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian) ## ## rx=1 ## time n.risk n.event survival std.err lower 95% CI upper 95% CI ## 59 13 1 0.923 0.0739 0.789 1.000 ## 115 12 1 0.846 0.1001 0.671 1.000 ## 156 11 1 0.769 0.1169 0.571 1.000 ## 268 10 1 0.692 0.1280 0.482 0.995 ## 329 9 1 0.615 0.1349 0.400 0.946 ## 431 8 1 0.538 0.1383 0.326 0.891 ## 638 5 1 0.431 0.1467 0.221 0.840 ## ## rx=2 ## time n.risk n.event survival std.err lower 95% CI upper 95% CI ## 353 13 1 0.923 0.0739 0.789 1.000 ## 365 12 1 0.846 0.1001 0.671 1.000 ## 464 9 1 0.752 0.1256 0.542 1.000 ## 475 8 1 0.658 0.1407 0.433 1.000 ## 563 7 1 0.564 0.1488 0.336 0.946 ``` --- ### Part 6 ```r plot(KM,col=c("red","green"), main="Kaplan-Meier Survival Curve", xlab="Time",ylab="Survival Probability") ``` --- ### Part 6 <!-- --> --- ### Part 7 * <tt>rx 2</tt> seems to perform better than <tt>rx 1</tt> initially upto time 353 post which there is a decline in the Survival Probability. * There is a steep decline in the numbers at risk between times 365 and 464 for rx at 2' ---