Problem 1

Consider the survival data given in the following table:

Year of follow-up	Number alive at the beginning of interval	Number dying in the interval
0-1	1100	240
1-2	860	180
2-3	680	185
3-4	495	137
4-5	358	118
5-6	240	60
6-7	180	52
7-8	128	44
8-9	84	32
more than 9	52	28

i.) Compute the estimated survival function, the probability density function and the hazard function.

ii.) Also compute acturial estimate of hazard rate and compare with previous one.

iii.) Estimate median survival time from the graph.

Solution:

Given data:

b_y = 0:9
e_y = c(1:9,Inf)
n_b = c(1100,860,680,495,358,240,180,128,84,52)
n_d = c(240,180,185,137,118,60,52,44,32,28)
df = data.frame(year_beg = b_y, year_e = e_y, alive_beg = n_b, die_int = n_d )
df

##    year_beg year_e alive_beg die_int
## 1         0      1      1100     240
## 2         1      2       860     180
## 3         2      3       680     185
## 4         3      4       495     137
## 5         4      5       358     118
## 6         5      6       240      60
## 7         6      7       180      52
## 8         7      8       128      44
## 9         8      9        84      32
## 10        9    Inf        52      28

df$Surv = df$alive_beg/1100
df$Pdf = c(-diff(df$Surv),NA)
df$Haz =  df$Pdf/df$Surv
df$Act_Haz = df$die_int/(df$alive_beg - df$die_int/2)

Survival Curve, PDF CURV

plot(df$year_beg,df$Surv,main = "Survival Curve", xlab = "Survival Times(t)",ylab = "Survival Function S(t)")
segments(df$year_beg,df$Surv,df$year_e,df$Surv)
segments(df$year_e,df$Surv,df$year_beg[-1],df$Surv[-1], lty = 4)
abline(h = 0.5, lty = 2, col = "red")

Problem 2

Suppose in a research design one particular treatment was administered to the patients and enty or exit of patients at any time throughout the course of the study was permissible. The experimenter stopped the study after a period of 500 days. Their survival times (in days) and event status of patients (i.e. censored/ uncensored) were recorded in the following table.

Patients	Suvival Times (in days)	Event status of patients
1	50	event occured
2	100	event occured
3	150	event occured
4	200	lost to follow-up
5	250	event occured
6	300	lost to follow-up
7	350	event occured
8	400	event occured
9	450	event not occured at the end of the study
10	500	event not occured at the end of the study

Usind Kaplan-Meier method, estimate the survival function for the given data.
Plot the survival function based on the estimates obtained in (i)
Assuming the survival time follows exponential distribution with survival rate $\lambda$, estimate the mean survival time.

Solution:

data = data.frame(patients = 1:10)
data$surv_times = seq(50,500,50)
data$delta = rep(c(1,0,1,0,1,0),c(3,1,1,1,2,2))
m = 1
for(i in 1:10){
  if(data$delta[i] == 1){
    data$r[i] = data$patients[i]
    data$p[i] = round((10-data$r[i])/(10-data$r[i] + 1),4)
    m = data$p[i] * m 
    #print(c(i,data$p[i],m))
    data$St[i] = m
  } else{
    data$r[i] = NA
    data$p[i] = NA
    data$St[i] = NA
  }
}
data

##    patients surv_times delta  r      p        St
## 1         1         50     1  1 0.9000 0.9000000
## 2         2        100     1  2 0.8889 0.8000100
## 3         3        150     1  3 0.8750 0.7000087
## 4         4        200     0 NA     NA        NA
## 5         5        250     1  5 0.8333 0.5833173
## 6         6        300     0 NA     NA        NA
## 7         7        350     1  7 0.7500 0.4374880
## 8         8        400     1  8 0.6667 0.2916732
## 9         9        450     0 NA     NA        NA
## 10       10        500     0 NA     NA        NA

Plot Survival Curve

val = data[!is.na(data$St),c(2,6)]
val = rbind(c(0,1),val)
val = cbind(val,end = c(val$surv_times[-1],NA))
plot(val$surv_times,val$St, main = "Survival Curve",xlab = "Time(t)",ylab="Survival Function S(t)")
segments(val$surv_times,val$St,val$end,val$St)
segments(val$end,val$St,val$surv_times[-1],val$St[-1],lty=3)

When we assume that the given survival times(T) follows an exponential distribution then the corresponding likelihood is given by:

\[ L \propto \prod_{i=1}^{10} \left( f(t_i) \right)^{\delta_i} \left( S(t_i) \right)^{1-\delta_i} \\ = \prod_{i=1}^{10} \left( \lambda*e^{-\lambda t_i} \right)^{\delta_i} \left( e^{-\lambda t_i} \right)^{1-\delta_i} \]

The log-likelihood can then be written as:

\[ log(L) \propto \sum_{exact} log(\lambda) - \lambda \sum t_i \]

The mle can thus be calculated as:

\[ \lambda_{mle} = \frac{r}{\sum t_i} \]

r = sum(data$delta)
sum = sum(data$surv_times)
mle = r/sum
mean = 1/mle
mle

## [1] 0.002181818

mean

## [1] 458.3333

DSE-A2-P(2024)

Dunken

2024-07-29

Problem 1

Solution:

Given data:

Survival Curve, PDF CURV

Problem 2

Solution:

Plot Survival Curve