class: center, middle, inverse, title-slide # Statistics with R ## Statistical Modelling with R for Actuarial Students --- <style type="text/css"> pre { background: #ADD8E6; max-width: 100%; overflow-x: scroll; } </style> ***CS2B – Risk Modelling and Survival Analysis *** * The emphasis is placed on being able to apply statistical methods to actuarial problems using real data sets and the open-source software environment R. * Time Series. Probability Distributions, Survival Analysis --- This is the second part of a demonstration on Cox Proportional Hazard Models with R ### R Package ```r library(survival) ``` ### Data used in this example ```r mortalitydata <- read.csv(file = "CS2B_Apr_21_Qu_3_Data.csv", head = TRUE) ``` --- ### Part 6 Estimate a Cox proportional hazards model with death as the event of interest using the two covariates, drug and gender, with no interaction term, pasting your results into your answer script. You should use the Breslow method for tie handling. ```r m1 <- coxph( Surv(mortalitydata$Time, mortalitydata$Status) ~ Drug + Gender, data = mortalitydata, ties = "breslow") ``` --- ```r summary(m1) ``` ``` ## Call: ## coxph(formula = Surv(mortalitydata$Time, mortalitydata$Status) ~ ## Drug + Gender, data = mortalitydata, ties = "breslow") ## ## n= 4400, number of events= 3441 ## ## coef exp(coef) se(coef) z Pr(>|z|) ## Drug -0.35174 0.70347 0.03439 -10.23 <2e-16 *** ## Gender -0.93263 0.39352 0.03575 -26.09 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## exp(coef) exp(-coef) lower .95 upper .95 ## Drug 0.7035 1.422 0.6576 0.7525 ## Gender 0.3935 2.541 0.3669 0.4221 ## ## Concordance= 0.638 (se = 0.005 ) ## Likelihood ratio test= 820.3 on 2 df, p=<2e-16 ## Wald test = 813.4 on 2 df, p=<2e-16 ## Score (logrank) test = 859.4 on 2 df, p=<2e-16 ``` --- ```r ## Check for violation of proportional hazard (constant HR over time) (res.zph1 <- cox.zph(m1)) ``` ``` ## chisq df p ## Drug 4.32 1 0.038 ## Gender 1.63 1 0.202 ## GLOBAL 6.37 2 0.041 ``` --- ```r ## Displays a graph of the scaled Schoenfeld residuals, along with a smooth curve. par(mfrow=c(1,2)) plot(res.zph1,pch=18,col="red") ``` <!-- --> ```r par(mfrow=c(1,1)) ``` --- ### Part 7 Comment on the results produced in part (6) with reference to the effects of the two covariates, drug and gender, on the mortality rate. #### Exercise 7 * The results suggest that MediCo reduces the mortality rate of patients by around 30% * The results suggest that female lives’ mortality rates are around 40% of that of males * However, the graphs from part 4 suggest that an interaction term may be required in the Cox model to fully analyse the effects of MediCo and Gender on the mortality rate, and hence the above analysis is potentially misleading * The results suggest that the Gender covariate appears to have a greater effect on the mortality rate than the Drug covariate * Both p-values indicate that the coefficients are unlikely to be 0 --- ### Exercise 8 Update the Cox proportional hazards model in part 6. to include an interaction term between drug and gender, pasting your results into your answer script. ```r m2 <- coxph( Surv(mortalitydata$Time, mortalitydata$Status) ~ Drug*Gender, data = mortalitydata, ties = "breslow" ) ``` --- ```r summary(m2) ``` ``` ## Call: ## coxph(formula = Surv(mortalitydata$Time, mortalitydata$Status) ~ ## Drug * Gender, data = mortalitydata, ties = "breslow") ## ## n= 4400, number of events= 3441 ## ## coef exp(coef) se(coef) z Pr(>|z|) ## Drug -0.07143 0.93106 0.04515 -1.582 0.114 ## Gender -0.63564 0.52960 0.04711 -13.494 <2e-16 *** ## Drug:Gender -0.65874 0.51750 0.06977 -9.442 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## exp(coef) exp(-coef) lower .95 upper .95 ## Drug 0.9311 1.074 0.8522 1.0172 ## Gender 0.5296 1.888 0.4829 0.5808 ## Drug:Gender 0.5175 1.932 0.4514 0.5933 ## ## Concordance= 0.638 (se = 0.005 ) ## Likelihood ratio test= 910.2 on 3 df, p=<2e-16 ## Wald test = 811.2 on 3 df, p=<2e-16 ## Score (logrank) test = 888.8 on 3 df, p=<2e-16 ``` --- ### Part 9 Analyse the effectiveness of MediCo, commenting on any differences between males and females. #### Solution * MediCo reduces male mortality rates by 1 – 0.93106 = 6.9% which is not statistically significant * and reduces female mortality rates by 1 – 0.93106 * 0.51750 = 51.8% which is statistically significant * The results from the Cox analysis are consistent with the Kaplan–Meier plots in part 4. --- The likelihood ratio test statistic for the model with the interaction term compared with the model without the interaction term is: $910.2 – 820.3 = 89.9 $ OR ```r L1 = m1$loglik L2 = m2$loglik -2 *(L1 - L2) ``` ``` ## [1] 0.0000 89.8897 ``` (where m1 = Cox model fitted in part (6) and m2 = Cox model fitted in part 8) which is highly significant under the chi-squared distribution with one degree of freedom. --- ---