Question 5

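We load the required packages and import the data set.

library(survival)  # Surv(), coxph(), tmerge()
library(dplyr)     # %>%
library(knitr)     # kable()
TdapVaccine <- read.csv("TdapVaccine.csv")
TdapVaccine <- TdapVaccine[, 2:4]
TdapVaccine %>% head() %>% kable()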
| Delivery | Vac | tVac |
|---------:|----:|-----:|
|     36.1 |   1 | 35.3 |
|     37.0 |   1 | 36.8 |
|     38.6 |   1 | 33.8 |
|     41.3 |   0 |   NA |
|     35.6 |   0 |   NA |
|     41.2 |   0 |   NA |

The data set contains 1000 observations. The variables are as follows:

Delivery is the time of delivery (in weeks since conception);

Vac indicates whether the woman received the Tdap vaccine during pregnancy (1 = yes, 0 = no);

tVac is the time at which the woman received the Tdap vaccine, on the same weekly scale as Delivery (NA if no vaccine was received).

Question 5.1

According to the question, we construct the time and status columns as follows:

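# Follow-up is censored at 37 weeks: a delivery at or after 37 weeks is not preterm
time <- ifelse(TdapVaccine$Delivery >= 37, 37, TdapVaccine$Delivery)
# status = 1 for a preterm birth (delivery before 37 weeks), 0 if censored at 37 weeks
status <- ifelse(TdapVaccine$Delivery >= 37, 0, 1)
TdapVaccine <- cbind(time, status, TdapVaccine)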

Then, we fit the Cox proportional hazards model with Vac as a time-independent covariate. The model is \[\lambda(t \mid \text{Vac})=\lambda_0(t)\exp\{\beta\cdot I(\text{Vac}=1)\}\] where \(\lambda_0(t)\) is the baseline hazard of preterm birth at time \(t\).

model1 <- coxph(Surv(time, status) ~ Vac, data = TdapVaccine)
summary(model1)
## Call:
## coxph(formula = Surv(time, status) ~ Vac, data = TdapVaccine)
## 
##   n= 1000, number of events= 364 
## 
##        coef exp(coef) se(coef)      z Pr(>|z|)    
## Vac -0.4447    0.6410   0.1145 -3.885 0.000102 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## Vac     0.641       1.56    0.5122    0.8022
## 
## Concordance= 0.553  (se = 0.012 )
## Likelihood ratio test= 15.91  on 1 df,   p=7e-05
## Wald test            = 15.09  on 1 df,   p=1e-04
## Score (logrank) test = 15.34  on 1 df,   p=9e-05

According to the output above, \(\beta\) is significantly different from zero in this model (Wald test, \(p \approx 0.0001\)). The estimated hazard ratio of preterm birth for vaccinated versus unvaccinated women is \(\exp(\hat{\beta})=0.641\) (95% confidence interval 0.512 to 0.802). Under this model, receiving the Tdap vaccine during pregnancy appears to significantly reduce the hazard of preterm birth.
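As a check on how the hazard ratio and its confidence interval follow from the coefficient table, the short sketch below recomputes them from the reported `coef` and `se(coef)`. The variable names (`beta_hat`, `se_beta`, and so on) are introduced here for illustration; the only inputs are the two numbers printed by `summary(model1)`, and a Wald interval on the log-hazard scale is assumed.

```r
# Values copied from the summary(model1) output above
beta_hat <- -0.4447   # coef for Vac
se_beta  <- 0.1145    # se(coef) for Vac

# Wald statistic and two-sided p-value
z <- beta_hat / se_beta
p <- 2 * pnorm(-abs(z))

# Hazard ratio and 95% Wald interval, built on the log scale and exponentiated
hr <- exp(beta_hat)
ci <- exp(beta_hat + c(-1, 1) * qnorm(0.975) * se_beta)

round(c(z = z, p = p, HR = hr, lower = ci[1], upper = ci[2]), 4)
```

Up to rounding of the inputs, these values should agree with the `exp(coef)`, `lower .95`, and `upper .95` columns of the output.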

Question 5.2

We create the time-dependent variable as follows:

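ID <- 1:nrow(TdapVaccine)
TdapVaccine <- cbind(ID, TdapVaccine)
# Split each woman's follow-up at tVac: Vac is 0 before vaccination and 1 afterward
TdapVaccine2 <- tmerge(TdapVaccine, TdapVaccine, id = ID,
                       status = event(time, status), Vac = tdc(tVac))
TdapVaccine2 <- TdapVaccine2[, c("ID", "tstart", "tstop", "status", "Vac")]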
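To make the restructuring concrete, here is a minimal sketch of the same `tmerge()` call on three hypothetical women. The data frame `toy` and the column names `preterm` and `vac_td` are made up for illustration and are not part of the assignment data.

```r
library(survival)

# Three hypothetical women (values chosen for illustration only)
toy <- data.frame(
  ID     = 1:3,
  time   = c(36.1, 37.0, 35.6),  # follow-up time, censored at 37 weeks
  status = c(1, 0, 1),           # 1 = preterm birth, 0 = censored
  tVac   = c(35.3, 36.8, NA)     # vaccination time, NA if never vaccinated
)

# event() marks the end of follow-up; tdc() switches the covariate on at tVac
toy_td <- tmerge(toy, toy, id = ID,
                 preterm = event(time, status),
                 vac_td  = tdc(tVac))

toy_td[, c("ID", "tstart", "tstop", "preterm", "vac_td")]
```

In this sketch, women 1 and 2 are each split into an unvaccinated interval ending at tVac and a vaccinated interval starting there, while woman 3 (never vaccinated) keeps a single row. This is the same mechanism that turns the 1000 original rows into the 1326 rows used in the fit.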

Then, we fit the Cox model with the time-dependent covariate as follows:

model2 <- coxph(Surv(tstart, tstop, status) ~ Vac, data = TdapVaccine2)
summary(model2)
## Call:
## coxph(formula = Surv(tstart, tstop, status) ~ Vac, data = TdapVaccine2)
## 
##   n= 1326, number of events= 364 
## 
##       coef exp(coef) se(coef)     z Pr(>|z|)
## Vac 0.1209    1.1285   0.1149 1.053    0.293
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## Vac     1.129     0.8861     0.901     1.413
## 
## Concordance= 0.512  (se = 0.012 )
## Likelihood ratio test= 1.09  on 1 df,   p=0.3
## Wald test            = 1.11  on 1 df,   p=0.3
## Score (logrank) test = 1.11  on 1 df,   p=0.3

In this model, the estimated hazard ratio for Vac is 1.129 (95% confidence interval 0.901 to 1.413, \(p = 0.293\)). There is therefore not enough evidence to show that receiving the Tdap vaccine during pregnancy influences the hazard of preterm birth.
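For reference, the model behind this fit can be sketched in the same notation as the first Cox model. The symbol \(\text{Vac}(t)\) is introduced here for illustration; it is assumed to switch from 0 to 1 at the vaccination time tVac and to remain 0 for women with no recorded vaccination:

\[\lambda(t \mid \text{Vac}(t))=\lambda_0(t)\exp\{\beta\cdot \text{Vac}(t)\}, \qquad \text{Vac}(t)=I(\text{tVac}<t)\]

Under this formulation, \(\exp(\beta)\) compares women who are already vaccinated at time \(t\) with women who are still unvaccinated at the same time \(t\).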

Question 5.3

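Model 1 treats Vac as a time-independent covariate, so a woman vaccinated late in pregnancy is counted as vaccinated from time zero. This uses information from the future to model the hazard of preterm birth: a woman must still be pregnant at tVac in order to be vaccinated, so the event-free time before vaccination is wrongly credited to the vaccinated group (immortal time bias), which makes the vaccine look protective. Model 2 treats Vac as a time-dependent covariate, so each woman contributes unvaccinated follow-up time until tVac and vaccinated follow-up time afterward. Therefore, we accept the conclusion of model 2: there is not enough evidence that receiving the Tdap vaccine during pregnancy affects the hazard of preterm birth.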
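The direction of this bias can be illustrated with a small simulation in which the vaccine has no effect at all. This is only a sketch: the uniform distributions, the sample size, and the names `delivery`, `planned`, `sim`, `preterm`, and `vac_td` are assumptions made for illustration, not features of the assignment data.

```r
library(survival)

set.seed(1)
n <- 2000

# Hypothetical data-generating process with NO vaccine effect:
# delivery time and planned vaccination time are independent
delivery <- runif(n, 30, 42)
planned  <- runif(n, 27, 40)
# A woman is vaccinated only if she is still pregnant at the planned time
tVac <- ifelse(planned < delivery, planned, NA)

sim <- data.frame(
  ID     = seq_len(n),
  time   = pmin(delivery, 37),        # censor at 37 weeks
  status = as.numeric(delivery < 37), # 1 = preterm birth
  Vac    = as.numeric(!is.na(tVac)),  # ever vaccinated during pregnancy
  tVac   = tVac
)

# Naive analysis: Vac treated as known from time zero
naive <- coxph(Surv(time, status) ~ Vac, data = sim)

# Time-dependent analysis: follow-up split at tVac
sim_td <- tmerge(sim, sim, id = ID,
                 preterm = event(time, status),
                 vac_td  = tdc(tVac))
td <- coxph(Surv(tstart, tstop, preterm) ~ vac_td, data = sim_td)

c(naive = unname(coef(naive)), time_dependent = unname(coef(td)))
```

In this setup, the naive coefficient should come out clearly negative even though the true effect is zero, while the time-dependent coefficient should be close to zero. This is the same pattern as the contrast between model 1 and model 2.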