DEM 7223 - Event History Analysis - Cox Regression Model

Author

Samson A. Olowolaju, MPH

Published

October 6, 2022

Import the Dataset

Code
tracts.texas.pharmacy <- import("tracts.texas.pharmacy.rds") %>% 
  mutate(phy_type=ifelse(phy_type %in%c("Community Indep","Community Independent"), "Community Indep",
                         ifelse(phy_type=="", "Unknown",
  ifelse(phy_type=="Comm Multi 5","Community Multi", 
         ifelse(phy_type%in%c("Hospital (Independent)"),"Indep Hospital",
                ifelse(phy_type=="Hospital (Multiple/Chain)", "Multi Hospital",
                       ifelse(phy_type=="Ambulatory Surgical Center","Amb Surg Center",phy_type))))))) %>% 
mutate(days=(as.numeric(status_date - lic_orig_date)), rurality=as.factor(rurality)) %>% 
 mutate(Otherpop=totPop19 - rowSums(.[39:41]),
        majoritypop= pmax(whitepop,blackpop,Hispop,Otherpop, na.rm = TRUE),
        pctminority=round((totPop19-majoritypop)/totPop19*100,1),
        phy_type=as.factor(ifelse(phy_type %in%c("Community Indep","Community Multi"),phy_type, NA))) %>% 
  filter(complete.cases(.))

Age at Pharmacy Closure in Texas

This analysis uses pharmacies’ age-at-closure as the outcome variable. I calculated age at pharmacy closure as the difference between the Pharmacy date of closure (Status_date) and their first date of operation(lic_orig_date). However, in the case of active pharmacies, I calculated their age using the censoring date of pharmacy closure follow-up. In this case, the latest follow-up date was December 31, 2021

Code
tracts.texas.pharmacy <- tracts.texas.pharmacy %>%
  mutate(death_age =age_months,
         d.event = ifelse(status == "Closed", 1, 0)) %>% 
  filter(complete.cases(.))
  
head(tracts.texas.pharmacy)

Since i am mainly interested in knowing if the hazard rate of pharmacy closure will increase or decrease giving the types of pharmacy and the location of the pharmacy(metro or non-metro), i will fit a cox proportional hazard model. For these models, the covariates are the types of pharmacy, residential characteristics of the pharmacy(metro or non metro, pctminority), and market features(pct. uninsured)

Fitting the Cox proportional Harzard Model

Code
fit <- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over,
             data=tracts.texas.pharmacy)

summary(fit)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality + 
    pctminority + pctpop65over, data = tracts.texas.pharmacy)

  n= 6754, number of events= 1114 

                             coef exp(coef)  se(coef)       z Pr(>|z|)    
phy_typeCommunity Multi -1.899517  0.149641  0.072800 -26.092  < 2e-16 ***
ruralityNon-Metro       -0.798242  0.450120  0.122221  -6.531 6.53e-11 ***
pctminority              0.005632  1.005647  0.001879   2.997  0.00272 ** 
pctpop65over            -0.010382  0.989672  0.005285  -1.965  0.04947 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                        exp(coef) exp(-coef) lower .95 upper .95
phy_typeCommunity Multi    0.1496     6.6827    0.1297    0.1726
ruralityNon-Metro          0.4501     2.2216    0.3542    0.5720
pctminority                1.0056     0.9944    1.0020    1.0094
pctpop65over               0.9897     1.0104    0.9795    1.0000

Concordance= 0.737  (se = 0.008 )
Likelihood ratio test= 901.8  on 4 df,   p=<2e-16
Wald test            = 730.9  on 4 df,   p=<2e-16
Score (logrank) test = 924.2  on 4 df,   p=<2e-16

Fitting the Cox proportional Harzard Model with Interactions

Code
fit_interaction <- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over+phy_type:rurality,
             data=tracts.texas.pharmacy)

summary(fit_interaction)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality + 
    pctminority + pctpop65over + phy_type:rurality, data = tracts.texas.pharmacy)

  n= 6754, number of events= 1114 

                                               coef exp(coef)  se(coef)       z
phy_typeCommunity Multi                   -1.963300  0.140394  0.076261 -25.744
ruralityNon-Metro                         -1.013839  0.362824  0.146874  -6.903
pctminority                                0.005594  1.005610  0.001878   2.979
pctpop65over                              -0.010143  0.989908  0.005278  -1.922
phy_typeCommunity Multi:ruralityNon-Metro  0.867042  2.379860  0.255381   3.395
                                          Pr(>|z|)    
phy_typeCommunity Multi                    < 2e-16 ***
ruralityNon-Metro                          5.1e-12 ***
pctminority                               0.002887 ** 
pctpop65over                              0.054639 .  
phy_typeCommunity Multi:ruralityNon-Metro 0.000686 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                          exp(coef) exp(-coef) lower .95
phy_typeCommunity Multi                      0.1404     7.1228    0.1209
ruralityNon-Metro                            0.3628     2.7562    0.2721
pctminority                                  1.0056     0.9944    1.0019
pctpop65over                                 0.9899     1.0102    0.9797
phy_typeCommunity Multi:ruralityNon-Metro    2.3799     0.4202    1.4427
                                          upper .95
phy_typeCommunity Multi                      0.1630
ruralityNon-Metro                            0.4839
pctminority                                  1.0093
pctpop65over                                 1.0002
phy_typeCommunity Multi:ruralityNon-Metro    3.9258

Concordance= 0.737  (se = 0.008 )
Likelihood ratio test= 912.2  on 5 df,   p=<2e-16
Wald test            = 759.8  on 5 df,   p=<2e-16
Score (logrank) test = 991.4  on 5 df,   p=<2e-16

Plotting Survival Function for Different Types of Community Pharmacies

Code
plot(survfit(fit, conf.int = F),
xlim=c(0, 90),
ylab="S(t)",
xlab="Months")
title (main = "Survival Plots for Community Pharmacy closure")

lines(survfit(fit,
newdata=data.frame(phy_type="Community Multi", rurality="Metro", pctminority=50,pctpop65over=10 ),
conf.int = F), col="red")

lines(survfit(fit,
newdata=data.frame(phy_type="Community Indep", rurality="Non-Metro", pctminority=10,pctpop65over=50),
conf.int = F), col="green")

legend("bottomleft",
legend=c("Means of Covariates",
"Community Multi, Metro, %minor 50,%pop65+ 10",
"Community Indep, Non-Metro, %minor 10,%pop65+ 50"),
lty=1, col=c(1,"red", "green"), cex=.8)

Plotting Cumulative Harzard Function for Different Types of Community Pharmacies

Code
sf1 <- survfit(fit)

sf2 <- survfit(fit,
newdata=data.frame(phy_type="Community Multi", 
                   rurality="Metro", 
                   pctminority=50,
                   pctpop65over=10 ))
sf3 <- survfit(fit,
newdata=data.frame(phy_type="Community Indep", 
                   rurality="Non-Metro",
                   pctminority=10,
                   pctpop65over=50))

H1<--log(sf1$surv)
H2<--log(sf2$surv)
H3<--log(sf3$surv)

times <- sf1$time
hs1<-loess(diff(c(0,H1))~times, degree=1, span=.25)
hs2<-loess(diff(c(0,H2))~times, degree=1, span=.25)
hs3<-loess(diff(c(0,H3))~times, degree=1, span=.25)

plot(predict(hs1)~times,type="l", ylab="smoothed h(t)", xlab="Months",
ylim=c(0, .06))
title(main="Smoothed hazard plots")
lines(predict(hs2)~times, type="l", col="red")
lines(predict(hs3)~times, type="l", col="green")

legend("topright",
legend=c("Means of Covariates",
"Community Multi, Metro, %minor 50,%pop65+ 10",
"Community Indep, Non-Metro, %minor 10,%pop65+ 50"),
lty=1, col=c(1,"red", "green"), cex=.8)

Checking Proportionality of Hazards in the model

Code
fit.check <- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over,
             data=tracts.texas.pharmacy)

summary(fit.check)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality + 
    pctminority + pctpop65over, data = tracts.texas.pharmacy)

  n= 6754, number of events= 1114 

                             coef exp(coef)  se(coef)       z Pr(>|z|)    
phy_typeCommunity Multi -1.899517  0.149641  0.072800 -26.092  < 2e-16 ***
ruralityNon-Metro       -0.798242  0.450120  0.122221  -6.531 6.53e-11 ***
pctminority              0.005632  1.005647  0.001879   2.997  0.00272 ** 
pctpop65over            -0.010382  0.989672  0.005285  -1.965  0.04947 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                        exp(coef) exp(-coef) lower .95 upper .95
phy_typeCommunity Multi    0.1496     6.6827    0.1297    0.1726
ruralityNon-Metro          0.4501     2.2216    0.3542    0.5720
pctminority                1.0056     0.9944    1.0020    1.0094
pctpop65over               0.9897     1.0104    0.9795    1.0000

Concordance= 0.737  (se = 0.008 )
Likelihood ratio test= 901.8  on 4 df,   p=<2e-16
Wald test            = 730.9  on 4 df,   p=<2e-16
Score (logrank) test = 924.2  on 4 df,   p=<2e-16
Code
#testing proportional Hazard
cox.zph(fit.check)
               chisq df      p
phy_type     12.1066  1 0.0005
rurality      1.2528  1 0.2630
pctminority   0.0153  1 0.9015
pctpop65over  2.7894  1 0.0949
GLOBAL       15.1401  4 0.0044
Code
par(mfrow=c(2, 2)) 
plot(cox.zph(fit.check))

Evidence from both the formal test( using the weighted residuals) and the graphical representation suggested strong evidence of non-proportional hazards for types of pharmacy.

Checking for Linearity of the continuous Covariates Using Martingale Residuals

Code
res.mar<-resid(fit.check, type="martingale")
#plot vs pctminority
scatter.smooth(tracts.texas.pharmacy$pctminority, res.mar,degree = 2,
span = 1, ylab="Martingale Residual",
col=1, cex=.5, lpars=list(col = "red", lwd = 3))
title(main="Martingale Residuals for Percent of Minority in Phamarcy Tracts")

Code
#plot vs pctminority
scatter.smooth(tracts.texas.pharmacy$pctpop65over, res.mar,degree = 2,
span = 1, ylab="Martingale Residual",
col=1, cex=.5, lpars=list(col = "red", lwd = 3))
title(main="Martingale Residuals for Percent of Pop. over 65 in Phamarcy Tracts")

Given that the lines on these graphs fit the residual in a straight line, we can conclude that these covariates(pctminority and pctpop65over) have been modeled effectively in the model.