R Markdown

library(haven)     # provides read_sav()
library(survival)  # provides Surv() and coxph()
s_data <- read_sav("/Users/anna/Downloads/HW1Leukemia.sav")
View(s_data)
cox_model <- coxph(Surv(Time, Status) ~ Group, data = s_data)
summary(cox_model)
## Call:
## coxph(formula = Surv(Time, Status) ~ Group, data = s_data)
## 
##   n= 42, number of events= 30 
## 
##          coef exp(coef) se(coef)      z Pr(>|z|)    
## Group -1.5721    0.2076   0.4124 -3.812 0.000138 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##       exp(coef) exp(-coef) lower .95 upper .95
## Group    0.2076      4.817   0.09251    0.4659
## 
## Concordance= 0.691  (se = 0.042 )
## Likelihood ratio test= 16.35  on 1 df,   p=5e-05
## Wald test            = 14.53  on 1 df,   p=1e-04
## Score (logrank) test = 17.25  on 1 df,   p=3e-05
  1. The estimated log hazard ratio for Group (6-MP vs. placebo) is -1.57. Exponentiating gives a hazard ratio of 0.21, meaning that patients receiving 6-MP have a hazard of death about 79% lower than patients receiving the placebo. Equivalently, the hazard in the placebo group is about 4.8 times that in the 6-MP group (exp(-coef) = 4.817). The 95% confidence interval for the hazard ratio is (0.09251, 0.4659), which excludes 1 (p = 0.000138).
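
The hazard ratio and its confidence interval can be reproduced by hand from the reported coefficient and standard error. The sketch below assumes the usual Wald interval, exp(coef ± z × se), and copies the two inputs from the output above; the variable names are illustrative only.

```r
# Inputs copied from the summary(cox_model) output above
coef_group <- -1.5721   # log hazard ratio for Group
se_group   <- 0.4124    # its standard error

z_crit <- qnorm(0.975)  # two-sided 95% critical value (about 1.96)

hr       <- exp(coef_group)                      # hazard ratio for Group
ci_lower <- exp(coef_group - z_crit * se_group)  # lower 95% limit
ci_upper <- exp(coef_group + z_crit * se_group)  # upper 95% limit

pct_lower_hazard <- (1 - hr) * 100   # percent reduction in hazard
hr_reverse       <- exp(-coef_group) # same comparison in the opposite direction

round(c(hr = hr, lower = ci_lower, upper = ci_upper), 4)
round(c(pct_lower_hazard = pct_lower_hazard, hr_reverse = hr_reverse), 2)
```

The rounded results should agree with the exp(coef), lower .95, upper .95 and exp(-coef) columns printed by summary().
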
table_data <- table(s_data$Group, s_data$Remission)

dimnames(table_data) <- list(
  G = c("Placebo", "6-MP"),                  
  R = c("Partial Remission", "Complete Remission") 
)
table_data
##          R
## G         Partial Remission Complete Remission
##   Placebo                 5                 16
##   6-MP                    5                 16
  2. Since this is a randomized clinical trial, we would not expect a systematic difference in the proportion of patients in complete remission at baseline between the placebo and 6-MP groups. Randomization tends to balance baseline characteristics, such as remission status, across the groups. The table above confirms this: in each group, 5 of 21 patients (23.8%) were in partial remission and 16 of 21 (76.2%) were in complete remission.
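
As a quick check of the balance claim, the counts from the table can be re-entered by hand and compared. This is a minimal sketch: the choice of Fisher's exact test is an assumption on my part (the original analysis only inspects the table), and `tab` is an illustrative name.

```r
# Counts copied from the printed table above
tab <- matrix(c(5, 16,
                5, 16),
              nrow = 2, byrow = TRUE,
              dimnames = list(G = c("Placebo", "6-MP"),
                              R = c("Partial Remission", "Complete Remission")))

# Row-wise proportions: share of each remission status within each group
row_props <- prop.table(tab, margin = 1)
round(row_props, 3)

# Fisher's exact test of association between group and baseline remission status
fisher_res <- fisher.test(tab)
fisher_res$p.value
```

Because the two rows are identical, the row proportions match exactly and the test gives no evidence of imbalance.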

3. Yes, remission status at baseline may predict survival, since it reflects the initial severity of the disease. Patients in complete remission at baseline would be expected to have better outcomes and a higher chance of survival than those in partial remission. However, this relationship needs to be tested statistically, for example with a Cox proportional hazards model, to assess its strength and significance.

cox_model1 <- coxph(Surv(Time, Status) ~ Group + Remission, data = s_data)
summary(cox_model1)
## Call:
## coxph(formula = Surv(Time, Status) ~ Group + Remission, data = s_data)
## 
##   n= 42, number of events= 30 
## 
##              coef exp(coef) se(coef)      z Pr(>|z|)    
## Group     -1.5772    0.2066   0.4199 -3.756 0.000172 ***
## Remission  0.0284    1.0288   0.4320  0.066 0.947578    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## Group        0.2066      4.841   0.09071    0.4704
## Remission    1.0288      0.972   0.44122    2.3989
## 
## Concordance= 0.668  (se = 0.049 )
## Likelihood ratio test= 16.36  on 2 df,   p=3e-04
## Wald test            = 14.51  on 2 df,   p=7e-04
## Score (logrank) test = 17.26  on 2 df,   p=2e-04
  4. Remission status at baseline is not a statistically significant predictor of survival (HR = 1.03, 95% CI 0.44 to 2.40, p = 0.95). Adjusting for it also leaves the treatment effect essentially unchanged: the Group coefficient is -1.5772 with Remission in the model, compared with -1.5721 without it.
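
The same conclusion can be reached by comparing the two nested models with a likelihood ratio test. The sketch below works only from the rounded statistics printed above, so the result is approximate; with the data loaded, `anova(cox_model, cox_model1)` would give the exact version.

```r
# Likelihood ratio statistics copied from the two summary() outputs (rounded)
lrt_group     <- 16.35  # Group only, 1 df
lrt_group_rem <- 16.36  # Group + Remission, 2 df

# The difference is approximately chi-square with 1 df under the null
# hypothesis that Remission adds nothing once Group is in the model
lrt_diff <- lrt_group_rem - lrt_group
p_lrt    <- pchisq(lrt_diff, df = 2 - 1, lower.tail = FALSE)

# Relative change in the Group coefficient after adding Remission
coef_unadj <- -1.5721
coef_adj   <- -1.5772
pct_change <- abs(coef_adj - coef_unadj) / abs(coef_unadj) * 100

round(c(lrt_diff = lrt_diff, p_lrt = p_lrt, pct_change = pct_change), 3)
```

A large p-value together with a coefficient change well under 1% supports leaving Remission out of the model.
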
library(readxl)  # provides read_excel()
uis_data <- read_excel("/Users/anna/Downloads/UIS_part.xlsx")
View(uis_data)
cox_model_uis <- coxph(Surv(time, censor) ~ ivyn , data = uis_data)
summary(cox_model_uis)
## Call:
## coxph(formula = Surv(time, censor) ~ ivyn, data = uis_data)
## 
##   n= 209, number of events= 177 
## 
##        coef exp(coef) se(coef)     z Pr(>|z|)  
## ivyn 0.3185    1.3750   0.1674 1.902   0.0572 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      exp(coef) exp(-coef) lower .95 upper .95
## ivyn     1.375     0.7273    0.9903     1.909
## 
## Concordance= 0.527  (se = 0.019 )
## Likelihood ratio test= 3.78  on 1 df,   p=0.05
## Wald test            = 3.62  on 1 df,   p=0.06
## Score (logrank) test = 3.65  on 1 df,   p=0.06

5.1 In the unadjusted analysis, a history of IV drug use (ivyn) is not a statistically significant predictor of returning to drug use at the 0.05 level (HR = 1.375, 95% CI 0.99 to 1.91, p = 0.057). However, the p-value is very close to 0.05, so the effect may be considered marginally significant.
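
To see why the result is borderline, the Wald test can be recomputed from the coefficient and its standard error. This is a sketch using the values printed above; the variable names are illustrative.

```r
# Inputs copied from the summary(cox_model_uis) output above
coef_ivyn <- 0.3185
se_ivyn   <- 0.1674

# Wald z statistic and two-sided p-value
z_ivyn <- coef_ivyn / se_ivyn
p_ivyn <- 2 * pnorm(-abs(z_ivyn))

# 95% Wald confidence interval on the hazard ratio scale
ci_ivyn <- exp(coef_ivyn + c(-1, 1) * qnorm(0.975) * se_ivyn)

round(c(z = z_ivyn, p = p_ivyn, lower = ci_ivyn[1], upper = ci_ivyn[2]), 4)

# The interval contains 1 exactly when the two-sided Wald p-value exceeds 0.05
ci_ivyn[1] < 1 && ci_ivyn[2] > 1
```

The lower limit falls just below 1, which is the same information as a p-value just above 0.05.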

cox_model_uis_adj <- coxph(Surv(time, censor) ~ ivyn + age + race , data = uis_data)
summary(cox_model_uis_adj)
## Call:
## coxph(formula = Surv(time, censor) ~ ivyn + age + race, data = uis_data)
## 
##   n= 209, number of events= 177 
## 
##          coef exp(coef) se(coef)      z Pr(>|z|)   
## ivyn  0.28798   1.33373  0.17701  1.627  0.10376   
## age  -0.02914   0.97128  0.01431 -2.036  0.04173 * 
## race -0.57494   0.56274  0.19463 -2.954  0.00314 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      exp(coef) exp(-coef) lower .95 upper .95
## ivyn    1.3337     0.7498    0.9427    1.8869
## age     0.9713     1.0296    0.9444    0.9989
## race    0.5627     1.7770    0.3843    0.8241
## 
## Concordance= 0.606  (se = 0.023 )
## Likelihood ratio test= 18.11  on 3 df,   p=4e-04
## Wald test            = 16.93  on 3 df,   p=7e-04
## Score (logrank) test = 17.4  on 3 df,   p=6e-04

5.2 After adjusting for age and race, a history of IV drug use is still not a significant predictor of returning to drug use (HR = 1.33, 95% CI 0.94 to 1.89, p = 0.10). This suggests that any effect of ever using IV drugs on the outcome is at least partly explained by other factors (such as age and race), or that ivyn itself does not strongly predict returning to drug use.
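
The effect of the adjustment is easier to judge with the two ivyn estimates side by side. The sketch below rebuilds both from the printed coefficients and standard errors; the data frame and the "attenuation" summary are my own illustrative additions, not part of the original output.

```r
# ivyn coefficient and standard error from the two models above
est <- data.frame(
  model = c("unadjusted", "adjusted for age and race"),
  coef  = c(0.3185, 0.28798),
  se    = c(0.1674, 0.17701)
)

# Hazard ratios with 95% Wald confidence limits
z_crit    <- qnorm(0.975)
est$hr    <- exp(est$coef)
est$lower <- exp(est$coef - z_crit * est$se)
est$upper <- exp(est$coef + z_crit * est$se)

# Percent reduction in the log hazard ratio after adjustment
attenuation_pct <- (est$coef[1] - est$coef[2]) / est$coef[1] * 100

print(est, digits = 4)
round(attenuation_pct, 2)
```

The coefficient shrinks by a little under 10% and both intervals include 1, which matches the reading that age and race account for only part of the crude association.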

cox_model_uis_adj2 <- coxph(Surv(time, censor) ~ ivyn*age + race , data = uis_data)
summary(cox_model_uis_adj2)
## Call:
## coxph(formula = Surv(time, censor) ~ ivyn * age + race, data = uis_data)
## 
##   n= 209, number of events= 177 
## 
##              coef exp(coef) se(coef)      z Pr(>|z|)   
## ivyn     -1.83224   0.16005  1.20197 -1.524  0.12742   
## age      -0.08514   0.91838  0.03524 -2.416  0.01569 * 
## race     -0.61014   0.54328  0.19598 -3.113  0.00185 **
## ivyn:age  0.06809   1.07046  0.03866  1.761  0.07819 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##          exp(coef) exp(-coef) lower .95 upper .95
## ivyn        0.1601     6.2479   0.01518    1.6881
## age         0.9184     1.0889   0.85709    0.9841
## race        0.5433     1.8407   0.37000    0.7977
## ivyn:age    1.0705     0.9342   0.99235    1.1547
## 
## Concordance= 0.613  (se = 0.023 )
## Likelihood ratio test= 21.31  on 4 df,   p=3e-04
## Wald test            = 18.93  on 4 df,   p=8e-04
## Score (logrank) test = 19.5  on 4 df,   p=6e-04
  1. The interaction term between ivyn and age is not statistically significant at the 0.05 level (p = 0.078), so there is no strong evidence that age modifies the relationship between IV drug use and returning to drug use.

Among patients who never used IV drugs (ivyn = 0), the hazard is multiplied by 0.918 for each one-year increase in age, a decrease of about 8.2% per year. Among patients with a history of IV drug use (ivyn = 1), the log hazard ratio per year of age is -0.08514 + 0.06809 = -0.01705; exponentiating gives 0.983, a decrease of about 1.7% per year. The coefficient 0.06809 belongs to the interaction term between IV drug use (ivyn) and age. It tells us how the relationship between age and the hazard differs for drug users (ivyn = 1) compared with non-users (ivyn = 0).
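
The group-specific age effects follow directly from the interaction model's coefficients. The sketch below reproduces them and also shows the mirror-image view, the ivyn hazard ratio at a few ages. The ages 25, 35 and 45 are hypothetical values chosen for illustration and may not reflect the range in the data.

```r
# Coefficients copied from the summary(cox_model_uis_adj2) output above
b_ivyn <- -1.83224
b_age  <- -0.08514
b_int  <-  0.06809   # ivyn:age interaction

# Hazard ratio per one-year increase in age, within each ivyn group
hr_age_nonusers <- exp(b_age)          # ivyn = 0
hr_age_users    <- exp(b_age + b_int)  # ivyn = 1

# Percent decrease in hazard per year of age
pct_nonusers <- (1 - hr_age_nonusers) * 100
pct_users    <- (1 - hr_age_users) * 100

round(c(hr_age_nonusers = hr_age_nonusers, hr_age_users = hr_age_users), 4)
round(c(pct_nonusers = pct_nonusers, pct_users = pct_users), 2)

# Hazard ratio for ivyn = 1 vs. ivyn = 0 at selected (hypothetical) ages
ages <- c(25, 35, 45)
hr_ivyn_at_age <- exp(b_ivyn + b_int * ages)
round(setNames(hr_ivyn_at_age, paste0("age_", ages)), 3)
```

Confidence intervals for these combined effects would need the covariance between the age and ivyn:age estimates (available from `vcov(cox_model_uis_adj2)`), which is not shown in the printed summary.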