Part1 -
a. Fit a Kaplan-Meier survival model for the pbc2.id data for the
covariate “sex” and show the K-M plot
km.model1 = survfit(Surv(pbc2.id$years, pbc2.id$status2 ) ~ pbc2.id$sex)
summary(km.model1)
## Call: survfit(formula = Surv(pbc2.id$years, pbc2.id$status2) ~ pbc2.id$sex)
##
## pbc2.id$sex=male
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 0.383 36 1 0.972 0.0274 0.9200 1.000
## 0.523 35 1 0.944 0.0382 0.8725 1.000
## 1.511 33 1 0.916 0.0465 0.8290 1.000
## 1.698 32 1 0.887 0.0532 0.7889 0.998
## 2.086 31 1 0.859 0.0586 0.7510 0.982
## 2.188 30 1 0.830 0.0633 0.7148 0.964
## 2.437 28 1 0.800 0.0676 0.6782 0.944
## 2.735 27 1 0.771 0.0713 0.6429 0.924
## 2.771 26 1 0.741 0.0745 0.6086 0.902
## 2.946 25 1 0.711 0.0772 0.5752 0.880
## 3.154 24 1 0.682 0.0794 0.5426 0.857
## 3.332 23 1 0.652 0.0813 0.5107 0.833
## 3.551 22 1 0.622 0.0829 0.4795 0.808
## 3.724 21 1 0.593 0.0840 0.4490 0.783
## 4.205 20 1 0.563 0.0849 0.4191 0.757
## 4.583 19 1 0.534 0.0855 0.3898 0.730
## 4.611 18 1 0.504 0.0857 0.3611 0.703
## 4.887 17 1 0.474 0.0856 0.3329 0.676
## 6.533 14 1 0.440 0.0860 0.3004 0.646
## 7.362 12 1 0.404 0.0863 0.2655 0.614
## 7.655 11 1 0.367 0.0859 0.2320 0.581
## 9.295 9 1 0.326 0.0855 0.1952 0.545
## 10.013 8 1 0.285 0.0840 0.1604 0.508
## 10.686 7 1 0.245 0.0813 0.1276 0.469
## 11.168 6 1 0.204 0.0773 0.0970 0.429
## 11.475 5 1 0.163 0.0718 0.0688 0.386
##
## pbc2.id$sex=female
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 0.112 276 1 0.996 0.00362 0.989 1.000
## 0.140 275 1 0.993 0.00511 0.983 1.000
## 0.194 274 1 0.989 0.00624 0.977 1.000
## 0.211 273 1 0.986 0.00719 0.972 1.000
## 0.301 272 1 0.982 0.00803 0.966 0.998
## 0.361 271 1 0.978 0.00878 0.961 0.996
## 0.375 270 1 0.975 0.00946 0.956 0.993
## 0.490 269 1 0.971 0.01010 0.951 0.991
## 0.509 268 1 0.967 0.01069 0.947 0.989
## 0.542 267 1 0.964 0.01125 0.942 0.986
## 0.567 266 1 0.960 0.01177 0.937 0.984
## 0.591 265 1 0.957 0.01228 0.933 0.981
## 0.611 264 1 0.953 0.01275 0.928 0.978
## 0.723 263 1 0.949 0.01321 0.924 0.976
## 0.756 262 1 0.946 0.01365 0.919 0.973
## 0.832 261 1 0.942 0.01407 0.915 0.970
## 0.879 260 1 0.938 0.01447 0.910 0.967
## 0.893 259 1 0.935 0.01486 0.906 0.964
## 0.914 258 1 0.931 0.01524 0.902 0.962
## 0.953 257 1 0.928 0.01561 0.897 0.959
## 1.065 256 1 0.924 0.01596 0.893 0.956
## 1.095 255 1 0.920 0.01630 0.889 0.953
## 1.259 254 1 0.917 0.01664 0.885 0.950
## 1.410 253 1 0.913 0.01696 0.880 0.947
## 1.503 252 1 0.909 0.01728 0.876 0.944
## 1.635 251 1 0.906 0.01758 0.872 0.941
## 1.843 250 1 0.902 0.01788 0.868 0.938
## 1.900 249 1 0.899 0.01817 0.864 0.935
## 1.938 248 1 0.895 0.01846 0.859 0.932
## 2.007 247 1 0.891 0.01874 0.855 0.929
## 2.053 245 1 0.888 0.01901 0.851 0.926
## 2.105 244 1 0.884 0.01928 0.847 0.923
## 2.152 243 1 0.880 0.01954 0.843 0.920
## 2.163 242 1 0.877 0.01979 0.839 0.916
## 2.182 241 1 0.873 0.02004 0.835 0.913
## 2.256 240 1 0.869 0.02029 0.831 0.910
## 2.327 238 1 0.866 0.02053 0.827 0.907
## 2.335 237 1 0.862 0.02076 0.822 0.904
## 2.352 236 1 0.859 0.02099 0.818 0.901
## 2.475 234 1 0.855 0.02122 0.814 0.897
## 2.546 233 1 0.851 0.02145 0.810 0.894
## 2.587 231 1 0.847 0.02167 0.806 0.891
## 2.604 230 1 0.844 0.02188 0.802 0.888
## 2.659 229 1 0.840 0.02210 0.798 0.885
## 2.667 228 1 0.836 0.02230 0.794 0.881
## 2.738 227 1 0.833 0.02251 0.790 0.878
## 2.839 226 1 0.829 0.02271 0.786 0.875
## 2.957 224 1 0.825 0.02291 0.782 0.872
## 2.965 223 1 0.822 0.02310 0.778 0.868
## 3.190 221 1 0.818 0.02329 0.774 0.865
## 3.203 220 1 0.814 0.02348 0.769 0.862
## 3.261 219 2 0.807 0.02385 0.761 0.855
## 3.318 217 1 0.803 0.02403 0.757 0.852
## 3.381 216 1 0.799 0.02420 0.753 0.848
## 3.696 214 1 0.796 0.02438 0.749 0.845
## 3.713 213 1 0.792 0.02455 0.745 0.841
## 3.863 212 1 0.788 0.02471 0.741 0.838
## 3.907 211 1 0.784 0.02488 0.737 0.835
## 3.926 210 1 0.781 0.02504 0.733 0.831
## 3.954 208 1 0.777 0.02520 0.729 0.828
## 4.071 205 1 0.773 0.02536 0.725 0.824
## 4.085 204 1 0.769 0.02551 0.721 0.821
## 4.315 199 1 0.765 0.02568 0.717 0.818
## 4.372 198 1 0.762 0.02584 0.713 0.814
## 4.537 195 1 0.758 0.02600 0.708 0.810
## 4.627 192 2 0.750 0.02632 0.700 0.803
## 4.767 190 1 0.746 0.02648 0.696 0.800
## 4.890 189 1 0.742 0.02663 0.692 0.796
## 5.002 186 1 0.738 0.02678 0.687 0.792
## 5.057 185 1 0.734 0.02693 0.683 0.789
## 5.131 183 1 0.730 0.02708 0.679 0.785
## 5.197 180 1 0.726 0.02723 0.674 0.781
## 5.271 178 1 0.722 0.02738 0.670 0.778
## 5.596 166 1 0.717 0.02756 0.665 0.774
## 5.626 165 1 0.713 0.02774 0.661 0.770
## 5.698 162 1 0.709 0.02791 0.656 0.766
## 5.722 159 1 0.704 0.02809 0.651 0.762
## 5.826 155 1 0.700 0.02827 0.646 0.757
## 6.089 151 1 0.695 0.02846 0.641 0.753
## 6.177 148 1 0.690 0.02866 0.636 0.749
## 6.264 144 1 0.686 0.02886 0.631 0.745
## 6.289 143 1 0.681 0.02905 0.626 0.740
## 6.571 133 1 0.676 0.02928 0.621 0.736
## 6.623 131 1 0.671 0.02951 0.615 0.731
## 6.752 125 1 0.665 0.02975 0.609 0.726
## 6.853 120 1 0.660 0.03002 0.603 0.721
## 6.954 119 1 0.654 0.03027 0.597 0.716
## 6.995 118 1 0.649 0.03052 0.591 0.711
## 7.072 116 1 0.643 0.03076 0.585 0.706
## 7.113 115 1 0.637 0.03100 0.579 0.701
## 7.581 105 1 0.631 0.03129 0.573 0.696
## 7.795 100 1 0.625 0.03161 0.566 0.690
## 7.891 98 1 0.619 0.03193 0.559 0.684
## 8.422 86 1 0.611 0.03235 0.551 0.678
## 8.449 85 2 0.597 0.03315 0.535 0.666
## 8.460 83 1 0.590 0.03352 0.528 0.659
## 8.679 78 1 0.582 0.03394 0.519 0.653
## 8.822 75 1 0.574 0.03436 0.511 0.646
## 8.882 69 1 0.566 0.03486 0.502 0.639
## 8.986 65 1 0.557 0.03539 0.492 0.631
## 9.194 61 1 0.548 0.03597 0.482 0.624
## 9.260 59 1 0.539 0.03654 0.472 0.616
## 9.386 56 1 0.529 0.03714 0.461 0.607
## 9.432 54 1 0.520 0.03772 0.451 0.599
## 9.785 47 1 0.509 0.03850 0.438 0.590
## 9.813 46 1 0.497 0.03922 0.426 0.581
## 10.084 42 1 0.486 0.04004 0.413 0.571
## 10.300 39 1 0.473 0.04090 0.399 0.561
## 10.511 34 1 0.459 0.04200 0.384 0.549
## 10.549 33 1 0.445 0.04297 0.369 0.538
## 13.892 6 1 0.371 0.07664 0.248 0.556
ggsurvplot(km.model1, pbc2.id, conf.int = F)
b. Do you see any right censored data. Please explain based on
the K-M plot.
Based on the K-M plot, there is right censored
data. Right censored data is more prominent in the females, based on the
K-M model. There are a few right censored data for males
c. Is the model significant? Perform the statistical test and
answer if the survival time is same between sex groups.
1. Null
hypothesis: H0: The survival for the male group is same as the survival
for the female group.
Alternative hypothesis: The survival for the
two groups are not the same.
2. The statistical test used was
log-rank test
3.If the p-value is less than α = 0.05, we will reject
the null-hypothesis and infer that there is significant proof that not
all groups are similar
4. Perform calculations:
survdiff(Surv(pbc2.id$years, pbc2.id$status2 ) ~ pbc2.id$sex, data = pbc2.id)
## Call:
## survdiff(formula = Surv(pbc2.id$years, pbc2.id$status2) ~ pbc2.id$sex,
## data = pbc2.id)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## pbc2.id$sex=male 36 26 15 8.158 9.19
## pbc2.id$sex=female 276 114 125 0.976 9.19
##
## Chisq= 9.2 on 1 degrees of freedom, p= 0.002
d. Fit a Cox proportional hazards model using covariates age,
sex, drug, hepatomegaly, serChol and platelets.
res.cox1 = coxph(Surv(years, status2) ~ age + sex + drug + hepatomegaly + serChol + platelets, data = pbc2.id)
e. Report the summary of the cox regression and your
interpretation.
The likelihood ratio test, Wald test, and Score test all have p-values less than alpha which add further proof to the model’s significance.
summary(res.cox1)
## Call:
## coxph(formula = Surv(years, status2) ~ age + sex + drug + hepatomegaly +
## serChol + platelets, data = pbc2.id)
##
## n= 280, number of events= 126
## (32 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## age 0.0526430 1.0540533 0.0096570 5.451 5.00e-08 ***
## sexfemale -0.1939188 0.8237248 0.2388347 -0.812 0.416827
## drugD-penicil -0.0890058 0.9148403 0.1848501 -0.482 0.630159
## hepatomegalyYes 0.8075345 2.2423726 0.1994883 4.048 5.17e-05 ***
## serChol 0.0011769 1.0011776 0.0003099 3.798 0.000146 ***
## platelets -0.0016638 0.9983376 0.0010347 -1.608 0.107835
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## age 1.0541 0.9487 1.0343 1.074
## sexfemale 0.8237 1.2140 0.5158 1.315
## drugD-penicil 0.9148 1.0931 0.6368 1.314
## hepatomegalyYes 2.2424 0.4460 1.5167 3.315
## serChol 1.0012 0.9988 1.0006 1.002
## platelets 0.9983 1.0017 0.9963 1.000
##
## Concordance= 0.734 (se = 0.02 )
## Likelihood ratio test= 72.68 on 6 df, p=1e-13
## Wald test = 72.4 on 6 df, p=1e-13
## Score (logrank) test = 76.46 on 6 df, p=2e-14