According to the source, a large HMO wished to evaluate the survival time of its HIV+ members using a follow-up study. Subjects were enrolled in the study from January 1, 1989 to December 31, 1991. The study ended on December 31, 1995. After a confirmed diagnosis of HIV, members were followed until death due to AIDS-related complications, until the end of the study, or until the subject was lost to follow-up. We assume that there were no deaths due to other causes. The primary outcome variable of interest is survival time after a confirmed diagnosis of HIV. Since subjects entered the study at different times over a 3-year period, the maximum possible follow-up time differs for each study participant. Possible predictors of survival time were collected at enrollment into the study.
The variables are:
TIME = follow-up time, the number of months between the entry date and the end date
AGE = age of the subject at the start of follow-up (in years)
DRUG = history of prior IV drug use (1 = Yes, 0 = No)
CENSOR = vital status at the end of the study (1 = death due to AIDS, 0 = lost to follow-up or alive)
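Despite its name, CENSOR = 1 marks an observed death (the event) and CENSOR = 0 marks a censored observation. As a minimal sketch of how that coding maps onto a `Surv` object from the survival package (the three subjects below are made up for illustration and are not rows of the HMO-HIV+ data):

```r
library(survival)

# Hypothetical follow-up times (months) and CENSOR codes, for illustration only
time_months <- c(5, 6, 22)
censor      <- c(1, 0, 1)   # 1 = death due to AIDS, 0 = lost to follow-up or alive

# Surv() treats status 1 as the event and status 0 as right-censored
s <- Surv(time_months, censor)
print(s)                    # censored times are printed with a trailing "+"

# Count events and censored observations from the status column
n_events   <- sum(s[, "status"] == 1)
n_censored <- sum(s[, "status"] == 0)
```

Passing the indicator as the second argument is what lets the later Kaplan-Meier fits distinguish deaths from censored follow-up.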
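# Column names and classes for the CSV (date columns are read as factors and parsed below)
columname <- c("ID", "StartDate","EndDate","Age","Drug","Censor")
type_col <- c("integer", rep("factor",2),rep("integer",3))
hmohiv <- read.csv("/Users/azizur/Desktop/stat 755/others work/hmohiv .csv",
col.names = columname, colClasses = type_col)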
head(hmohiv)
## ID StartDate EndDate Age Drug Censor
## 1 1 5/15/90 12:00 AM 10/14/90 12:00 AM 46 0 1
## 2 2 9/19/89 12:00 AM 3/20/90 12:00 AM 35 1 0
## 3 3 4/21/91 12:00 AM 12/20/91 12:00 AM 30 1 1
## 4 4 1/3/91 12:00 AM 4/4/91 12:00 AM 30 1 1
## 5 5 9/18/89 12:00 AM 7/19/91 12:00 AM 36 0 1
## 6 6 3/18/91 12:00 AM 4/17/91 12:00 AM 32 1 0
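The date columns use a two-digit year followed by a time of day, so the follow-up time has to be derived by parsing them. A minimal sketch using the first row of the output above; the variable names are illustrative, and the 30.5 days-per-month divisor is the approximation this report uses for months:

```r
# First row of the head() output above
start_txt <- "5/15/90 12:00 AM"
end_txt   <- "10/14/90 12:00 AM"

# %y reads a two-digit year (69-99 map to 19xx); the trailing time of day is ignored
start_d <- as.Date(start_txt, format = "%m/%d/%y")
end_d   <- as.Date(end_txt,   format = "%m/%d/%y")

# Follow-up time as a plain number of days, then as approximate months
days   <- as.numeric(difftime(end_d, start_d, units = "days"))
months <- days / 30.5

days     # 152
```

For this subject the follow-up is 152 days, or roughly 5 months.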
a) Kaplan-Meier Survival Curve with 95% Confidence Interval
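library(survival)
# Dates look like "5/15/90 12:00 AM": the year has two digits, so use %y rather than %Y.
# as.Date() ignores the trailing time of day.
Start.d <- as.Date(hmohiv$StartDate, format = "%m/%d/%y")
End.d <- as.Date(hmohiv$EndDate, format = "%m/%d/%y")
# Follow-up time in days, stored as a plain numeric vector
Survival_days <- as.numeric(difftime(End.d, Start.d, units = "days"))
# Kaplan-Meier estimate with log-log 95% confidence limits
hmohiv.km <- survfit(Surv(Survival_days,hmohiv$Censor)~1, conf.type="log-log")
plot(hmohiv.km, xlab="time (days)", ylab="survival probability",
main="KME of survival probability for HMO-HIV+ data",
conf.int=TRUE)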
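To make explicit what `survfit` computes here: at each observed death time, the Kaplan-Meier estimate multiplies the running survival probability by (1 - deaths / number at risk). The sketch below checks a hand computation against `survfit` on a small made-up sample (the five observations are hypothetical, not taken from the HMO-HIV+ data):

```r
library(survival)

# Hypothetical toy sample: follow-up times and event indicators (1 = death, 0 = censored)
time   <- c(2, 3, 3, 5, 8)
status <- c(1, 1, 0, 1, 0)

# Distinct times at which at least one death occurs
event_times <- sort(unique(time[status == 1]))

# Number still at risk just before each death time, and deaths at that time
at_risk <- sapply(event_times, function(t) sum(time >= t))
deaths  <- sapply(event_times, function(t) sum(time == t & status == 1))

# Product-limit (Kaplan-Meier) estimate: 0.8, 0.6, 0.3
km_manual <- cumprod(1 - deaths / at_risk)

# The same quantity from survfit(), evaluated at the death times
fit        <- survfit(Surv(time, status) ~ 1)
km_survfit <- summary(fit, times = event_times)$surv
```

The censored observation at time 3 contributes to the risk set at time 3 but never triggers a drop in the curve, which is how censoring enters the estimate.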
b) Median Survival Time and 95% Confidence Interval
hmohiv.km
## Call: survfit(formula = Surv(Survival_days, hmohiv$Censor) ~ 1, conf.type = "log-log")
##
## n events median 0.95LCL 0.95UCL
## 100 80 212 152 273
The median survival time is 212 days, with a 95% confidence interval of 152 to 273 days.
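The median reported by `survfit` is the first time at which the Kaplan-Meier curve falls to 0.5 or below. A small sketch on made-up data (six hypothetical subjects, not the HMO-HIV+ sample) shows how that value can be read off by hand and compared with the fitted object:

```r
library(survival)

# Hypothetical toy sample (1 = death, 0 = censored)
time   <- c(1, 4, 6, 7, 9, 10)
status <- c(1, 1, 0, 1, 1, 0)

fit <- survfit(Surv(time, status) ~ 1)

# Kaplan-Meier estimate at each death time
s <- summary(fit)
km_table <- data.frame(time = s$time, surv = s$surv)

# Median by hand: first death time where the estimate is at or below 0.5
median_manual <- min(km_table$time[km_table$surv <= 0.5])

# Median as reported by survfit
median_survfit <- unname(summary(fit)$table["median"])
```

In this toy sample the curve drops from about 0.67 to about 0.44 at time 7, so both approaches give a median of 7.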
A patient’s history of IV drug use may be a risk factor for survival. Compare the estimated survival curves for the two groups: users (DRUG=1) and non-users (DRUG=0).
Survival_month <- as.numeric(Survival_days) / 30.5  # approximate months (30.5 days per month)
drug <- factor(hmohiv$Drug)
levels(drug) <- list(No_Drug = as.character(0),Drug = as.character(1))
library(tibble)
hmohiv1 <- add_column(hmohiv,drug, .after = 6)
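# Include the event indicator; Surv(time) alone would treat every subject as a death
plot(survfit(Surv(Survival_month, hmohiv$Censor) ~ drug), xlab="Time in months",
ylab="Survival probability", col=c("blue", "red"), lwd=2)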
# Curves are drawn in factor-level order: No_Drug (blue) first, then Drug (red)
legend("topright", legend=c("No Drug", "Drug"),
col=c("blue","red"), lwd=2)
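Yes. The estimated survival curves for non-users and users of IV drugs differ visibly. The plot alone does not establish statistical significance; a formal comparison such as a log-rank test would be needed to support that claim.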
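One common way to compare two survival curves formally is the log-rank test, available as `survdiff` in the survival package. The sketch below runs it on a made-up two-group sample (all values are hypothetical); for the data in this report, the analogous call would presumably be `survdiff(Surv(Survival_month, hmohiv$Censor) ~ drug)`, whose result is not shown here.

```r
library(survival)

# Hypothetical toy data: six subjects per group (1 = death, 0 = censored)
time   <- c(2, 4, 5, 7, 9, 12,   1, 2, 3, 3, 5, 6)
status <- c(1, 0, 1, 1, 0, 1,    1, 1, 1, 1, 1, 0)
group  <- factor(rep(c("No_Drug", "Drug"), each = 6),
                 levels = c("No_Drug", "Drug"))

# Log-rank test of equal survival in the two groups
lr <- survdiff(Surv(time, status) ~ group)
print(lr)

# Chi-square statistic and p-value, with (number of groups - 1) degrees of freedom
chisq   <- lr$chisq
df      <- length(lr$n) - 1
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)
```

A small p-value would indicate that the difference between the curves is unlikely to be due to chance alone.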