library(dplyr)    # provides %>% and select()
library(ggplot2)  # provides ggplot() and geom_line()

data = read.csv("/Users/rudymartinez/Desktop/MSDA/Data-Analytics-Practicum-II-Spring-22/Exercises/Exercise 3/Question 1_Data.csv", header = TRUE)
print(data[1:6])  # first six columns; R indexes from 1
##   Time At_Risk Num_Died Risk_of_Dying Prob_not_Dying              Survival
## 1    0      10        0          0/10          10/10                     1
## 2    1      10        1          1/10           9/10            9/10 = 0.9
## 3    4       9        1           1/9            8/9       0.9 * 8/9 = 0.8
## 4    5       8        1           1/8            7/8       0.8 * 7/8 = 0.7
## 5    7       6        1           1/6            5/6     0.7 * 5/6 = 0.583
## 6   12       3        1           1/3            2/3  0.583 * 2/3 = 0.3886
## 7   14       2        1           1/2            1/2 0.3886 * 1/2 = 0.1943
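The Survival column above is the Kaplan-Meier (product-limit) estimate: each value is the previous survival multiplied by the fraction of the at-risk group that did not die at that time. A minimal sketch of that arithmetic follows, with the Time, At_Risk, and Num_Died values typed in from the printed table. The variable names are illustrative and are not part of the original script.

```r
# Values copied from the printed table (Time, At_Risk, Num_Died)
time    <- c(0, 1, 4, 5, 7, 12, 14)
at_risk <- c(10, 10, 9, 8, 6, 3, 2)
died    <- c(0, 1, 1, 1, 1, 1, 1)

# Conditional probability of surviving each event time
prob_not_dying <- 1 - died / at_risk

# Product-limit estimate: running product of the conditional probabilities
survival <- cumprod(prob_not_dying)

km_table <- data.frame(time, at_risk, died, survival = round(survival, 4))
print(km_table)
```

At full precision the last two rows come out as 7/18 (about 0.3889) and 7/36 (about 0.1944). The table's 0.3886 and 0.1943 differ in the last digit because each step there reuses the already rounded value from the previous row.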
x_axis = 0:6
y_axis = data %>% select(Survival.Total)
krusty_chart = data.frame(x_axis, y_axis)
ggplot(krusty_chart, aes(x=x_axis, y=Survival.Total)) + geom_line()
survival_data = read.csv("/Users/rudymartinez/Desktop/MSDA/Data-Analytics-Practicum-II-Spring-22/Exercises/Exercise 3/Question 2_Data.csv", header = TRUE)
head(survival_data)
##   Group Time Event
## 1     1  681     0
## 2     1  602     0
## 3     1  996     0
## 4     1 1162     0
## 5     1  833     0
## 6     1  477     0
library(survival)   # provides Surv(), survfit(), and survdiff()
library(survminer)  # provides ggsurvplot()

survival_model = survfit(Surv(Time,Event)~Group, data=survival_data)
ggsurvplot(survival_model,
           conf.int=FALSE,
           pval=FALSE,
           risk.table=FALSE,
           legend.labs=c("Group 1", "Group 2", "Group 3"),
           legend.title="Groups:",
           palette=c("steelblue", "grey", "black"),
           title="Kaplan-Meier Curves",
           risk.table.height=.20)
# Log-rank test comparing survival across the three groups
survdiff(Surv(Time, Event) ~ Group, data = survival_data)
## Call:
## survdiff(formula = Surv(Time, Event) ~ Group, data = survival_data)
##
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## Group=1 38       24     12.3     11.07     13.66
## Group=2 54       25     46.0      9.58     22.50
## Group=3 45       34     24.7      3.51      5.04
##
##  Chisq= 25.7  on 2 degrees of freedom, p= 3e-06
Null Hypothesis: There is no significant difference in survival among the three drug groups.
Alternative Hypothesis: There is a significant difference in survival among the three drug groups (at least one group's survival differs from the others).
The Chi-Squared test statistic is 25.7 with 2 degrees of freedom, and the corresponding p-value is less than .05 (p = 3e-06). Therefore we reject the null hypothesis and conclude that there is a significant difference in survival among the three drug groups.
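The reported p-value can be checked directly from the test statistic. The sketch below assumes the statistic of 25.7 and the 2 degrees of freedom printed in the survdiff output, and the .05 significance level used in the text. The variable names are illustrative.

```r
# Values taken from the survdiff output above
chisq_stat <- 25.7
deg_free   <- 2
alpha      <- 0.05  # significance level used in the write-up

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(chisq_stat, df = deg_free, lower.tail = FALSE)

# Decision rule: reject the null hypothesis when p < alpha
reject_null <- p_value < alpha

print(signif(p_value, 1))
print(reject_null)
```

Because the printed statistic is rounded to one decimal place, the p-value recomputed this way matches the survdiff output only to the single significant digit shown there.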
Group 1 and 2: Group 1 shows a drastic decrease in survival probability within the first 500 units of time, and its curve stops abruptly just after unit 1000. In contrast, Group 2 maintains a higher survival probability than Group 1 for the full duration of the study, with a controlled and gradual decrease over this period.
Group 2 and 3: Although both groups maintain a nonzero survival probability for the duration of the study, Group 3 shows a much more abrupt and less controlled decrease within the first 1000 units of time. Group 2 also maintains a higher survival probability than Group 3 throughout.
Group 3 and 1: Both groups exhibit an abrupt decrease in survival probability within the first 500 units of time; however, Group 3 maintains the higher probability during this first segment. Between 500 and 1000 units of time, the decline in Group 1's survival probability slows (a Kaplan-Meier estimate can level off but never rises), but the curve is cut short when Group 1 appears to have no participants left at risk. From that point on, Group 3 maintains its survival probability for the remainder of the study.
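The pairwise comparisons above are read off the curves by eye. One way to back them with a test is to run a log-rank test on each pair of groups and adjust the p-values for multiple comparisons. The sketch below is an illustration only: it uses simulated stand-in data (the `toy` data frame and its parameters are hypothetical, not the study data) with the same Group, Time, and Event columns, and the Holm adjustment is an assumed choice rather than something the original analysis used.

```r
library(survival)

set.seed(1)
# Hypothetical stand-in data with the same columns as survival_data
toy <- data.frame(
  Group = rep(1:3, each = 40),
  Time  = ceiling(c(rexp(40, 1 / 400), rexp(40, 1 / 1500), rexp(40, 1 / 800))),
  Event = rbinom(120, 1, 0.7)
)

# Every pair of groups: (1,2), (1,3), (2,3)
pairs <- combn(sort(unique(toy$Group)), 2)

# Log-rank test on each pair; two groups give 1 degree of freedom
raw_p <- apply(pairs, 2, function(g) {
  fit <- survdiff(Surv(Time, Event) ~ Group, data = toy[toy$Group %in% g, ])
  pchisq(fit$chisq, df = 1, lower.tail = FALSE)
})

pairwise <- data.frame(
  group_a = pairs[1, ],
  group_b = pairs[2, ],
  p_raw   = raw_p,
  p_holm  = p.adjust(raw_p, method = "holm")  # assumed adjustment method
)
print(pairwise)
```

To apply this to the real data, replace `toy` with `survival_data`. The p-values printed for the simulated data say nothing about the three drug groups.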