## Introduction

In this analysis we perform a survival analysis using the ovarian cancer data set. The analysis includes the following methods:

- Kaplan-Meier estimate, to estimate the survival probability of ovarian cancer patients.
- Nelson-Aalen cumulative hazard, to estimate the cumulative hazard over time.
- Log-rank test, to compare survival distributions between groups, such as age groups or sex (female or male).
- Cox proportional hazards model, to model the effect of covariates on survival in ovarian cancer patients.

These methods help us understand the survival patterns of ovarian cancer patients and identify factors influencing survival.
    # Load dataset
    library(survival)
    data(cancer, package = "survival")
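Before fitting anything, it can help to check how the variables are coded. The sketch below assumes the `cancer` data frame that ships with the `survival` package and the coding given in the package documentation (`status`: 1 = censored, 2 = dead; `sex`: 1 = male, 2 = female). If your copy of the data is coded differently, the comments will not apply.

```r
library(survival)
data(cancer, package = "survival")

# Structure of the data frame used in the rest of the analysis
str(cancer)

# Event indicator (assumed coding: 1 = censored, 2 = dead)
print(table(cancer$status))

# Group sizes by sex (assumed coding: 1 = male, 2 = female)
print(table(cancer$sex))

# Number of missing values in each column
print(colSums(is.na(cancer)))
```

Knowing the coding matters later: the sign of a Cox coefficient for `sex` only makes sense once you know which level is the reference.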
## Kaplan-Meier Estimate

    library(survival)
    data(cancer, package = "survival")

    # Kaplan-Meier estimate for all patients combined (no grouping variable)
    km_fit <- survfit(Surv(time, status) ~ 1, data = cancer)
    plot(km_fit, xlab = "Time", ylab = "Survival Probability",
         main = "Kaplan-Meier Survival Curve")
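The curve is fitted to all patients together (`~ 1`), so it shows overall survival and not a comparison between females and males. The estimated survival probability falls over time, and a steeper drop marks a period in which more deaths occur. To compare groups, a separate curve must be fitted for each group; a clear gap between the curves then indicates that one group has a lower survival probability over time.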
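A comparison by sex needs one curve per group. The following is a minimal sketch of how that could look, assuming the `cancer` data frame from the `survival` package; the object name `km_by_sex`, the colours, and the legend labels are illustrative choices, not part of the original analysis.

```r
library(survival)
data(cancer, package = "survival")

# One Kaplan-Meier curve per level of sex
km_by_sex <- survfit(Surv(time, status) ~ sex, data = cancer)

# Printing the fit shows n, number of events and median survival per group
print(km_by_sex)

# Plot both curves on the same axes, distinguished by colour and line type
plot(km_by_sex, col = c("blue", "red"), lty = 1:2,
     xlab = "Time", ylab = "Survival Probability",
     main = "Kaplan-Meier Survival Curves by Sex")
legend("topright", legend = c("sex = 1", "sex = 2"),
       col = c("blue", "red"), lty = 1:2)
```

The curve that lies lower at a given time belongs to the group with the lower estimated survival probability at that time.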
## Nelson-Aalen Cumulative Hazard

The Nelson-Aalen estimator helps us understand how the risk of an event, such as death, accumulates over time.
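In symbols, the estimator is usually written as below. The notation is an assumption added here for illustration: $t_i$ are the distinct event times, $d_i$ is the number of events at $t_i$, and $n_i$ is the number of patients still at risk just before $t_i$.

$$
\hat{H}(t) = \sum_{t_i \le t} \frac{d_i}{n_i}
$$

A survival estimate can be derived from it as $\hat{S}(t) = \exp\{-\hat{H}(t)\}$, which is the Fleming-Harrington estimator requested by `type = "fleming-harrington"` in the code.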
    library(survival)
    data(cancer, package = "survival")

    surv_data <- Surv(cancer$time, cancer$status)

    # Fleming-Harrington fit; its cumulative hazard is the Nelson-Aalen estimate
    na_fit <- survfit(surv_data ~ 1, type = "fleming-harrington")

    # fun = "cumhaz" plots the cumulative hazard instead of the survival curve
    plot(na_fit, fun = "cumhaz", xlab = "Time", ylab = "Cumulative Hazard",
         main = "Nelson-Aalen Cumulative Hazard")
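The cumulative hazard increases over time, and the slope of the curve reflects the rate at which deaths occur: the steeper the curve, the higher the hazard in that period. Because the estimate is fitted to all patients together (`~ 1`), it describes the overall accumulated risk of death and says nothing yet about differences by sex.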
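To make the estimate less of a black box, the cumulative hazard can be rebuilt by hand as a running sum of events divided by the number at risk. This is a sketch that assumes the default `survfit` settings, under which the stored `cumhaz` component is the Nelson-Aalen estimate; the names `fit` and `na_manual` are illustrative.

```r
library(survival)
data(cancer, package = "survival")

fit <- survfit(Surv(time, status) ~ 1, data = cancer)

# Nelson-Aalen by hand: running sum of (events / number at risk)
na_manual <- cumsum(fit$n.event / fit$n.risk)

# Compare with the cumulative hazard stored in the fitted object
print(all.equal(na_manual, fit$cumhaz))

# First few rows of the estimate
print(head(data.frame(time = fit$time,
                      n.risk = fit$n.risk,
                      n.event = fit$n.event,
                      cumhaz = fit$cumhaz)))
```

Times with only censored observations contribute zero to the sum, so the curve stays flat there and jumps only at event times.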
## Log-rank test

The log-rank test checks whether the survival curves of the two groups are significantly different.
    library(survival)
    data(cancer, package = "survival")

    surv_data <- Surv(cancer$time, cancer$status)
    logrank_test <- survdiff(surv_data ~ cancer$sex)
    logrank_test
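The p-value is less than 0.005, hence the difference is statistically significant. This suggests that survival in group 1 differs from survival in group 2; the direction can be read from the observed and expected event counts in the output, since the group with more observed than expected events has the worse survival. The test statistic reported by `survdiff` is a chi-square statistic, not a t statistic, and a large value is what produces the small p-value, so it points to a difference between the groups, not to the absence of one.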
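The pieces behind that conclusion can be pulled out of the `survdiff` object directly. The sketch below assumes the `cancer` data frame from the `survival` package and recomputes the p-value from the chi-square statistic; the variable names `chisq`, `df` and `p_value` are illustrative.

```r
library(survival)
data(cancer, package = "survival")

logrank_test <- survdiff(Surv(time, status) ~ sex, data = cancer)

# Observed vs expected numbers of events in each group
print(data.frame(group = names(logrank_test$n),
                 observed = logrank_test$obs,
                 expected = logrank_test$exp))

# Chi-square statistic and its p-value (degrees of freedom = groups - 1)
chisq <- logrank_test$chisq
df <- length(logrank_test$n) - 1
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)
print(c(chisq = chisq, df = df, p_value = p_value))
```

A group whose observed count exceeds its expected count experienced more events than it would have if both groups shared the same survival distribution.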
## Cox Proportional Hazards Model

The Cox model assesses the effect of one or more covariates on survival.
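For reference, the model is commonly written as follows. The notation is an assumption added here: $h_0(t)$ is an unspecified baseline hazard, $x_1, \dots, x_p$ are the covariates, and $\beta_1, \dots, \beta_p$ are the coefficients to be estimated.

$$
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p)
$$

Under this form, $\exp(\beta_j)$ is the hazard ratio for a one-unit increase in $x_j$ with the other covariates held fixed. A value above one means a higher hazard and a value below one means a lower hazard.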
    library(survival)
    data(cancer, package = "survival")

    surv_data <- Surv(cancer$time, cancer$status)
    cox_model <- coxph(surv_data ~ cancer$sex)
    summary(cox_model)
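The value 0.588 in the summary output is the hazard ratio (the `exp(coef)` column), not an increase in risk. A hazard ratio below one corresponds to a negative coefficient (log(0.588) ≈ -0.53). With sex coded 1 for male and 2 for female in this data set, it means that among the 228 patients the hazard of death for females is about 0.59 times that for males, or roughly 41% lower. Because sex is a two-level category, this is a comparison between the two groups and not an effect of sex "increasing". The p-value is less than 0.05, hence sex is significantly associated with survival in these patients.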
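The hazard ratio and its confidence interval can also be extracted from the fitted model, without reading them off the printed summary. This is a minimal sketch assuming the `cancer` data frame from the `survival` package; `hr` and `hr_ci` are illustrative names.

```r
library(survival)
data(cancer, package = "survival")

cox_model <- coxph(Surv(time, status) ~ sex, data = cancer)

# Coefficient on the log-hazard scale
print(coef(cox_model))

# Hazard ratio and its 95% confidence interval on the hazard scale
hr <- exp(coef(cox_model))
hr_ci <- exp(confint(cox_model))
print(cbind(hazard_ratio = hr, hr_ci))
```

If the whole confidence interval lies on one side of 1, the conclusion agrees with a p-value below 0.05 for that covariate.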