DEM 7223 - Event History Analysis - Cox Regression Model
Import the Dataset
Code
<- import("tracts.texas.pharmacy.rds") %>%
tracts.texas.pharmacy mutate(phy_type=ifelse(phy_type %in%c("Community Indep","Community Independent"), "Community Indep",
ifelse(phy_type=="", "Unknown",
ifelse(phy_type=="Comm Multi 5","Community Multi",
ifelse(phy_type%in%c("Hospital (Independent)"),"Indep Hospital",
ifelse(phy_type=="Hospital (Multiple/Chain)", "Multi Hospital",
ifelse(phy_type=="Ambulatory Surgical Center","Amb Surg Center",phy_type))))))) %>%
mutate(days=(as.numeric(status_date - lic_orig_date)), rurality=as.factor(rurality)) %>%
mutate(Otherpop=totPop19 - rowSums(.[39:41]),
majoritypop= pmax(whitepop,blackpop,Hispop,Otherpop, na.rm = TRUE),
pctminority=round((totPop19-majoritypop)/totPop19*100,1),
phy_type=as.factor(ifelse(phy_type %in%c("Community Indep","Community Multi"),phy_type, NA))) %>%
filter(complete.cases(.))
Age at Pharmacy Closure in Texas
This analysis uses pharmacies’ age-at-closure as the outcome variable. I calculated age at pharmacy closure as the difference between the Pharmacy date of closure (Status_date) and their first date of operation(lic_orig_date). However, in the case of active pharmacies, I calculated their age using the censoring date of pharmacy closure follow-up. In this case, the latest follow-up date was December 31, 2021
Code
<- tracts.texas.pharmacy %>%
tracts.texas.pharmacy mutate(death_age =age_months,
d.event = ifelse(status == "Closed", 1, 0)) %>%
filter(complete.cases(.))
head(tracts.texas.pharmacy)
Since i am mainly interested in knowing if the hazard rate of pharmacy closure will increase or decrease giving the types of pharmacy and the location of the pharmacy(metro or non-metro), i will fit a cox proportional hazard model. For these models, the covariates are the types of pharmacy, residential characteristics of the pharmacy(metro or non metro, pctminority), and market features(pct. uninsured)
Fitting the Cox proportional Harzard Model
Code
<- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over,
fit data=tracts.texas.pharmacy)
summary(fit)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality +
pctminority + pctpop65over, data = tracts.texas.pharmacy)
n= 6754, number of events= 1114
coef exp(coef) se(coef) z Pr(>|z|)
phy_typeCommunity Multi -1.899517 0.149641 0.072800 -26.092 < 2e-16 ***
ruralityNon-Metro -0.798242 0.450120 0.122221 -6.531 6.53e-11 ***
pctminority 0.005632 1.005647 0.001879 2.997 0.00272 **
pctpop65over -0.010382 0.989672 0.005285 -1.965 0.04947 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
phy_typeCommunity Multi 0.1496 6.6827 0.1297 0.1726
ruralityNon-Metro 0.4501 2.2216 0.3542 0.5720
pctminority 1.0056 0.9944 1.0020 1.0094
pctpop65over 0.9897 1.0104 0.9795 1.0000
Concordance= 0.737 (se = 0.008 )
Likelihood ratio test= 901.8 on 4 df, p=<2e-16
Wald test = 730.9 on 4 df, p=<2e-16
Score (logrank) test = 924.2 on 4 df, p=<2e-16
Fitting the Cox proportional Harzard Model with Interactions
Code
<- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over+phy_type:rurality,
fit_interaction data=tracts.texas.pharmacy)
summary(fit_interaction)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality +
pctminority + pctpop65over + phy_type:rurality, data = tracts.texas.pharmacy)
n= 6754, number of events= 1114
coef exp(coef) se(coef) z
phy_typeCommunity Multi -1.963300 0.140394 0.076261 -25.744
ruralityNon-Metro -1.013839 0.362824 0.146874 -6.903
pctminority 0.005594 1.005610 0.001878 2.979
pctpop65over -0.010143 0.989908 0.005278 -1.922
phy_typeCommunity Multi:ruralityNon-Metro 0.867042 2.379860 0.255381 3.395
Pr(>|z|)
phy_typeCommunity Multi < 2e-16 ***
ruralityNon-Metro 5.1e-12 ***
pctminority 0.002887 **
pctpop65over 0.054639 .
phy_typeCommunity Multi:ruralityNon-Metro 0.000686 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95
phy_typeCommunity Multi 0.1404 7.1228 0.1209
ruralityNon-Metro 0.3628 2.7562 0.2721
pctminority 1.0056 0.9944 1.0019
pctpop65over 0.9899 1.0102 0.9797
phy_typeCommunity Multi:ruralityNon-Metro 2.3799 0.4202 1.4427
upper .95
phy_typeCommunity Multi 0.1630
ruralityNon-Metro 0.4839
pctminority 1.0093
pctpop65over 1.0002
phy_typeCommunity Multi:ruralityNon-Metro 3.9258
Concordance= 0.737 (se = 0.008 )
Likelihood ratio test= 912.2 on 5 df, p=<2e-16
Wald test = 759.8 on 5 df, p=<2e-16
Score (logrank) test = 991.4 on 5 df, p=<2e-16
Plotting Survival Function for Different Types of Community Pharmacies
Code
plot(survfit(fit, conf.int = F),
xlim=c(0, 90),
ylab="S(t)",
xlab="Months")
title (main = "Survival Plots for Community Pharmacy closure")
lines(survfit(fit,
newdata=data.frame(phy_type="Community Multi", rurality="Metro", pctminority=50,pctpop65over=10 ),
conf.int = F), col="red")
lines(survfit(fit,
newdata=data.frame(phy_type="Community Indep", rurality="Non-Metro", pctminority=10,pctpop65over=50),
conf.int = F), col="green")
legend("bottomleft",
legend=c("Means of Covariates",
"Community Multi, Metro, %minor 50,%pop65+ 10",
"Community Indep, Non-Metro, %minor 10,%pop65+ 50"),
lty=1, col=c(1,"red", "green"), cex=.8)
Plotting Cumulative Harzard Function for Different Types of Community Pharmacies
Code
<- survfit(fit)
sf1
<- survfit(fit,
sf2 newdata=data.frame(phy_type="Community Multi",
rurality="Metro",
pctminority=50,
pctpop65over=10 ))
<- survfit(fit,
sf3 newdata=data.frame(phy_type="Community Indep",
rurality="Non-Metro",
pctminority=10,
pctpop65over=50))
<--log(sf1$surv)
H1<--log(sf2$surv)
H2<--log(sf3$surv)
H3
<- sf1$time
times <-loess(diff(c(0,H1))~times, degree=1, span=.25)
hs1<-loess(diff(c(0,H2))~times, degree=1, span=.25)
hs2<-loess(diff(c(0,H3))~times, degree=1, span=.25)
hs3
plot(predict(hs1)~times,type="l", ylab="smoothed h(t)", xlab="Months",
ylim=c(0, .06))
title(main="Smoothed hazard plots")
lines(predict(hs2)~times, type="l", col="red")
lines(predict(hs3)~times, type="l", col="green")
legend("topright",
legend=c("Means of Covariates",
"Community Multi, Metro, %minor 50,%pop65+ 10",
"Community Indep, Non-Metro, %minor 10,%pop65+ 50"),
lty=1, col=c(1,"red", "green"), cex=.8)
Checking Proportionality of Hazards in the model
Code
<- coxph(Surv(death_age,d.event)~ phy_type+rurality+pctminority+pctpop65over,
fit.check data=tracts.texas.pharmacy)
summary(fit.check)
Call:
coxph(formula = Surv(death_age, d.event) ~ phy_type + rurality +
pctminority + pctpop65over, data = tracts.texas.pharmacy)
n= 6754, number of events= 1114
coef exp(coef) se(coef) z Pr(>|z|)
phy_typeCommunity Multi -1.899517 0.149641 0.072800 -26.092 < 2e-16 ***
ruralityNon-Metro -0.798242 0.450120 0.122221 -6.531 6.53e-11 ***
pctminority 0.005632 1.005647 0.001879 2.997 0.00272 **
pctpop65over -0.010382 0.989672 0.005285 -1.965 0.04947 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
phy_typeCommunity Multi 0.1496 6.6827 0.1297 0.1726
ruralityNon-Metro 0.4501 2.2216 0.3542 0.5720
pctminority 1.0056 0.9944 1.0020 1.0094
pctpop65over 0.9897 1.0104 0.9795 1.0000
Concordance= 0.737 (se = 0.008 )
Likelihood ratio test= 901.8 on 4 df, p=<2e-16
Wald test = 730.9 on 4 df, p=<2e-16
Score (logrank) test = 924.2 on 4 df, p=<2e-16
Code
#testing proportional Hazard
cox.zph(fit.check)
chisq df p
phy_type 12.1066 1 0.0005
rurality 1.2528 1 0.2630
pctminority 0.0153 1 0.9015
pctpop65over 2.7894 1 0.0949
GLOBAL 15.1401 4 0.0044
Code
par(mfrow=c(2, 2))
plot(cox.zph(fit.check))
Evidence from both the formal test( using the weighted residuals) and the graphical representation suggested strong evidence of non-proportional hazards for types of pharmacy.
Checking for Linearity of the continuous Covariates Using Martingale Residuals
Code
<-resid(fit.check, type="martingale")
res.mar#plot vs pctminority
scatter.smooth(tracts.texas.pharmacy$pctminority, res.mar,degree = 2,
span = 1, ylab="Martingale Residual",
col=1, cex=.5, lpars=list(col = "red", lwd = 3))
title(main="Martingale Residuals for Percent of Minority in Phamarcy Tracts")
Code
#plot vs pctminority
scatter.smooth(tracts.texas.pharmacy$pctpop65over, res.mar,degree = 2,
span = 1, ylab="Martingale Residual",
col=1, cex=.5, lpars=list(col = "red", lwd = 3))
title(main="Martingale Residuals for Percent of Pop. over 65 in Phamarcy Tracts")
Given that the lines on these graphs fit the residual in a straight line, we can conclude that these covariates(pctminority and pctpop65over) have been modeled effectively in the model.