Part1 -

a. Fit a Kaplan-Meier survival model for the pbc2.id data for the covariate “sex” and show the K-M plot

km.model1 = survfit(Surv(pbc2.id$years, pbc2.id$status2 ) ~ pbc2.id$sex)
summary(km.model1)
## Call: survfit(formula = Surv(pbc2.id$years, pbc2.id$status2) ~ pbc2.id$sex)
## 
##                 pbc2.id$sex=male 
##    time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   0.383     36       1    0.972  0.0274       0.9200        1.000
##   0.523     35       1    0.944  0.0382       0.8725        1.000
##   1.511     33       1    0.916  0.0465       0.8290        1.000
##   1.698     32       1    0.887  0.0532       0.7889        0.998
##   2.086     31       1    0.859  0.0586       0.7510        0.982
##   2.188     30       1    0.830  0.0633       0.7148        0.964
##   2.437     28       1    0.800  0.0676       0.6782        0.944
##   2.735     27       1    0.771  0.0713       0.6429        0.924
##   2.771     26       1    0.741  0.0745       0.6086        0.902
##   2.946     25       1    0.711  0.0772       0.5752        0.880
##   3.154     24       1    0.682  0.0794       0.5426        0.857
##   3.332     23       1    0.652  0.0813       0.5107        0.833
##   3.551     22       1    0.622  0.0829       0.4795        0.808
##   3.724     21       1    0.593  0.0840       0.4490        0.783
##   4.205     20       1    0.563  0.0849       0.4191        0.757
##   4.583     19       1    0.534  0.0855       0.3898        0.730
##   4.611     18       1    0.504  0.0857       0.3611        0.703
##   4.887     17       1    0.474  0.0856       0.3329        0.676
##   6.533     14       1    0.440  0.0860       0.3004        0.646
##   7.362     12       1    0.404  0.0863       0.2655        0.614
##   7.655     11       1    0.367  0.0859       0.2320        0.581
##   9.295      9       1    0.326  0.0855       0.1952        0.545
##  10.013      8       1    0.285  0.0840       0.1604        0.508
##  10.686      7       1    0.245  0.0813       0.1276        0.469
##  11.168      6       1    0.204  0.0773       0.0970        0.429
##  11.475      5       1    0.163  0.0718       0.0688        0.386
## 
##                 pbc2.id$sex=female 
##    time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   0.112    276       1    0.996 0.00362        0.989        1.000
##   0.140    275       1    0.993 0.00511        0.983        1.000
##   0.194    274       1    0.989 0.00624        0.977        1.000
##   0.211    273       1    0.986 0.00719        0.972        1.000
##   0.301    272       1    0.982 0.00803        0.966        0.998
##   0.361    271       1    0.978 0.00878        0.961        0.996
##   0.375    270       1    0.975 0.00946        0.956        0.993
##   0.490    269       1    0.971 0.01010        0.951        0.991
##   0.509    268       1    0.967 0.01069        0.947        0.989
##   0.542    267       1    0.964 0.01125        0.942        0.986
##   0.567    266       1    0.960 0.01177        0.937        0.984
##   0.591    265       1    0.957 0.01228        0.933        0.981
##   0.611    264       1    0.953 0.01275        0.928        0.978
##   0.723    263       1    0.949 0.01321        0.924        0.976
##   0.756    262       1    0.946 0.01365        0.919        0.973
##   0.832    261       1    0.942 0.01407        0.915        0.970
##   0.879    260       1    0.938 0.01447        0.910        0.967
##   0.893    259       1    0.935 0.01486        0.906        0.964
##   0.914    258       1    0.931 0.01524        0.902        0.962
##   0.953    257       1    0.928 0.01561        0.897        0.959
##   1.065    256       1    0.924 0.01596        0.893        0.956
##   1.095    255       1    0.920 0.01630        0.889        0.953
##   1.259    254       1    0.917 0.01664        0.885        0.950
##   1.410    253       1    0.913 0.01696        0.880        0.947
##   1.503    252       1    0.909 0.01728        0.876        0.944
##   1.635    251       1    0.906 0.01758        0.872        0.941
##   1.843    250       1    0.902 0.01788        0.868        0.938
##   1.900    249       1    0.899 0.01817        0.864        0.935
##   1.938    248       1    0.895 0.01846        0.859        0.932
##   2.007    247       1    0.891 0.01874        0.855        0.929
##   2.053    245       1    0.888 0.01901        0.851        0.926
##   2.105    244       1    0.884 0.01928        0.847        0.923
##   2.152    243       1    0.880 0.01954        0.843        0.920
##   2.163    242       1    0.877 0.01979        0.839        0.916
##   2.182    241       1    0.873 0.02004        0.835        0.913
##   2.256    240       1    0.869 0.02029        0.831        0.910
##   2.327    238       1    0.866 0.02053        0.827        0.907
##   2.335    237       1    0.862 0.02076        0.822        0.904
##   2.352    236       1    0.859 0.02099        0.818        0.901
##   2.475    234       1    0.855 0.02122        0.814        0.897
##   2.546    233       1    0.851 0.02145        0.810        0.894
##   2.587    231       1    0.847 0.02167        0.806        0.891
##   2.604    230       1    0.844 0.02188        0.802        0.888
##   2.659    229       1    0.840 0.02210        0.798        0.885
##   2.667    228       1    0.836 0.02230        0.794        0.881
##   2.738    227       1    0.833 0.02251        0.790        0.878
##   2.839    226       1    0.829 0.02271        0.786        0.875
##   2.957    224       1    0.825 0.02291        0.782        0.872
##   2.965    223       1    0.822 0.02310        0.778        0.868
##   3.190    221       1    0.818 0.02329        0.774        0.865
##   3.203    220       1    0.814 0.02348        0.769        0.862
##   3.261    219       2    0.807 0.02385        0.761        0.855
##   3.318    217       1    0.803 0.02403        0.757        0.852
##   3.381    216       1    0.799 0.02420        0.753        0.848
##   3.696    214       1    0.796 0.02438        0.749        0.845
##   3.713    213       1    0.792 0.02455        0.745        0.841
##   3.863    212       1    0.788 0.02471        0.741        0.838
##   3.907    211       1    0.784 0.02488        0.737        0.835
##   3.926    210       1    0.781 0.02504        0.733        0.831
##   3.954    208       1    0.777 0.02520        0.729        0.828
##   4.071    205       1    0.773 0.02536        0.725        0.824
##   4.085    204       1    0.769 0.02551        0.721        0.821
##   4.315    199       1    0.765 0.02568        0.717        0.818
##   4.372    198       1    0.762 0.02584        0.713        0.814
##   4.537    195       1    0.758 0.02600        0.708        0.810
##   4.627    192       2    0.750 0.02632        0.700        0.803
##   4.767    190       1    0.746 0.02648        0.696        0.800
##   4.890    189       1    0.742 0.02663        0.692        0.796
##   5.002    186       1    0.738 0.02678        0.687        0.792
##   5.057    185       1    0.734 0.02693        0.683        0.789
##   5.131    183       1    0.730 0.02708        0.679        0.785
##   5.197    180       1    0.726 0.02723        0.674        0.781
##   5.271    178       1    0.722 0.02738        0.670        0.778
##   5.596    166       1    0.717 0.02756        0.665        0.774
##   5.626    165       1    0.713 0.02774        0.661        0.770
##   5.698    162       1    0.709 0.02791        0.656        0.766
##   5.722    159       1    0.704 0.02809        0.651        0.762
##   5.826    155       1    0.700 0.02827        0.646        0.757
##   6.089    151       1    0.695 0.02846        0.641        0.753
##   6.177    148       1    0.690 0.02866        0.636        0.749
##   6.264    144       1    0.686 0.02886        0.631        0.745
##   6.289    143       1    0.681 0.02905        0.626        0.740
##   6.571    133       1    0.676 0.02928        0.621        0.736
##   6.623    131       1    0.671 0.02951        0.615        0.731
##   6.752    125       1    0.665 0.02975        0.609        0.726
##   6.853    120       1    0.660 0.03002        0.603        0.721
##   6.954    119       1    0.654 0.03027        0.597        0.716
##   6.995    118       1    0.649 0.03052        0.591        0.711
##   7.072    116       1    0.643 0.03076        0.585        0.706
##   7.113    115       1    0.637 0.03100        0.579        0.701
##   7.581    105       1    0.631 0.03129        0.573        0.696
##   7.795    100       1    0.625 0.03161        0.566        0.690
##   7.891     98       1    0.619 0.03193        0.559        0.684
##   8.422     86       1    0.611 0.03235        0.551        0.678
##   8.449     85       2    0.597 0.03315        0.535        0.666
##   8.460     83       1    0.590 0.03352        0.528        0.659
##   8.679     78       1    0.582 0.03394        0.519        0.653
##   8.822     75       1    0.574 0.03436        0.511        0.646
##   8.882     69       1    0.566 0.03486        0.502        0.639
##   8.986     65       1    0.557 0.03539        0.492        0.631
##   9.194     61       1    0.548 0.03597        0.482        0.624
##   9.260     59       1    0.539 0.03654        0.472        0.616
##   9.386     56       1    0.529 0.03714        0.461        0.607
##   9.432     54       1    0.520 0.03772        0.451        0.599
##   9.785     47       1    0.509 0.03850        0.438        0.590
##   9.813     46       1    0.497 0.03922        0.426        0.581
##  10.084     42       1    0.486 0.04004        0.413        0.571
##  10.300     39       1    0.473 0.04090        0.399        0.561
##  10.511     34       1    0.459 0.04200        0.384        0.549
##  10.549     33       1    0.445 0.04297        0.369        0.538
##  13.892      6       1    0.371 0.07664        0.248        0.556
ggsurvplot(km.model1, pbc2.id, conf.int = F)

b. Do you see any right censored data. Please explain based on the K-M plot.
Based on the K-M plot, there is right censored data. Right censored data is more prominent in the females, based on the K-M model. There are a few right censored data for males

c. Is the model significant? Perform the statistical test and answer if the survival time is same between sex groups.
1. Null hypothesis: H0: The survival for the male group is same as the survival for the female group.
Alternative hypothesis: The survival for the two groups are not the same.
2. The statistical test used was log-rank test
3.If the p-value is less than α = 0.05, we will reject the null-hypothesis and infer that there is significant proof that not all groups are similar
4. Perform calculations:

survdiff(Surv(pbc2.id$years, pbc2.id$status2 ) ~ pbc2.id$sex, data = pbc2.id)
## Call:
## survdiff(formula = Surv(pbc2.id$years, pbc2.id$status2) ~ pbc2.id$sex, 
##     data = pbc2.id)
## 
##                      N Observed Expected (O-E)^2/E (O-E)^2/V
## pbc2.id$sex=male    36       26       15     8.158      9.19
## pbc2.id$sex=female 276      114      125     0.976      9.19
## 
##  Chisq= 9.2  on 1 degrees of freedom, p= 0.002
  1. Conclusion: Based on the p-value of p =0.002, we can reject the null hypothesis at alpha = 0.05. The survival between the groups are not similar.
Part2 –

d. Fit a Cox proportional hazards model using covariates age, sex, drug, hepatomegaly, serChol and platelets.

res.cox1 = coxph(Surv(years, status2) ~ age + sex + drug + hepatomegaly + serChol + platelets, data = pbc2.id)
e. Report the summary of the cox regression and your interpretation.
Age has a p-value of 5.00e-08. Hepatomegaly has a p-value of 5.17e-05. SerChol has a p-value of 0.000146. Based on these p-values, we can conclude that age, hepatomegaly, and serChol are all significant and all impact the hazard rate.

The likelihood ratio test, Wald test, and Score test all have p-values less than alpha which add further proof to the model’s significance.

summary(res.cox1)
## Call:
## coxph(formula = Surv(years, status2) ~ age + sex + drug + hepatomegaly + 
##     serChol + platelets, data = pbc2.id)
## 
##   n= 280, number of events= 126 
##    (32 observations deleted due to missingness)
## 
##                       coef  exp(coef)   se(coef)      z Pr(>|z|)    
## age              0.0526430  1.0540533  0.0096570  5.451 5.00e-08 ***
## sexfemale       -0.1939188  0.8237248  0.2388347 -0.812 0.416827    
## drugD-penicil   -0.0890058  0.9148403  0.1848501 -0.482 0.630159    
## hepatomegalyYes  0.8075345  2.2423726  0.1994883  4.048 5.17e-05 ***
## serChol          0.0011769  1.0011776  0.0003099  3.798 0.000146 ***
## platelets       -0.0016638  0.9983376  0.0010347 -1.608 0.107835    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                 exp(coef) exp(-coef) lower .95 upper .95
## age                1.0541     0.9487    1.0343     1.074
## sexfemale          0.8237     1.2140    0.5158     1.315
## drugD-penicil      0.9148     1.0931    0.6368     1.314
## hepatomegalyYes    2.2424     0.4460    1.5167     3.315
## serChol            1.0012     0.9988    1.0006     1.002
## platelets          0.9983     1.0017    0.9963     1.000
## 
## Concordance= 0.734  (se = 0.02 )
## Likelihood ratio test= 72.68  on 6 df,   p=1e-13
## Wald test            = 72.4  on 6 df,   p=1e-13
## Score (logrank) test = 76.46  on 6 df,   p=2e-14