Lista 03 - Analise de Sobrevivência
Libraries
library(survival)
library(flexsurv)## Warning: package 'flexsurv' was built under R version 4.1.3
Questão 02
Sabemos que \(X \sim EXP (\lambda)\), então temos que \(E(X) = \theta = \frac{1}{\lambda}\) logo, o EMV para \(\theta\) é \(\hat{\theta}=\frac{1}{\hat{\lambda}}\), aonde sabemos que \(\hat{\lambda} = \frac{r}{\sum_{i=1}^{n}t_{i}}\). Então, sabendo que:
\[ Var(g(\hat{\lambda})) = Var(\hat{\lambda}) \cdot \bigg(\frac{d g(\hat{\lambda}{})}{d \lambda}\bigg)^{2} \rightarrow \]
\[ DP(g(\hat{\lambda})) = DP(\hat{\lambda}) \cdot \bigg|\frac{d g(\hat{\lambda})}{d \lambda}\bigg| \rightarrow \]
Então, temos:
\[ DP(\hat{\theta}) = DP(\hat{\lambda})\cdot \bigg| \frac{d\hat{\theta}}{d\lambda}\bigg| \]
onde temos \(\frac{d\hat{\theta}}{\lambda} = -\frac{1}{\lambda^{2}}\)
substituindo, temos então:
\[ DP(\hat{\theta}) = DP(\hat{\lambda})\cdot\frac{1}{\lambda^{2}} \]
Como queremos a estimativa do desvio padrão, então teremos:
\[ \hat{DP(\hat{\theta})} = \hat{DP}(\hat{\lambda})\cdot\frac{1}{\hat{\lambda}^{2}} \]
Questão 03
lambda=0.13
gama=1.5A)
- S(T)
\[ S(t) = e^{-\big(\alpha\cdot t\big)^{\gamma}} \]
logo, a formula para o \(t_p\) percentil da mediana, é dada por:
\[ ln(1 - p) = -\bigg(t\cdot \alpha\bigg)^{\gamma} \rightarrow (ln( (1 - p)^{-1} ))^{\frac{1}{\gamma}}\cdot \frac{1}{\alpha} = t \]
qtp <- function(p, alpha, gama){
( (log((1 - p)**(-1)))**(1/gama))*(1/alpha)
}S <- function(t, alpha, gama){
exp(-((t*alpha)**(gama)))
}- h(t)
\[ h(t) = \gamma\cdot \alpha\cdot \bigg(t\cdot\alpha\bigg)^{\gamma-1} \]
h <- function(t, alpha, gama){
(gama*alpha)*((t*alpha)**(gama-1))
}- H(T)
\[ H(t) = \bigg(t\cdot \alpha\bigg)^{\gamma} \]
H <- function(t, alpha, gama){
((t*alpha)**(gama))
}B)
alpha<-0.13
gama<-1.5
par(mfrow=c(2,2))
curve(S(x, alpha, gama), from=0, to=20)
grid()
curve(h(x, alpha, gama), from=0, to=20)
grid()
curve(H(x, alpha, gama), from=0, to=20)
grid()C)
qtp(0.5, 0.13, 1.5)## [1] 6.024767
D)
p1 = 0.8; p2 = 0.1
qtp(1 - p1, alpha, gama);qtp(1 - p2, alpha, gama)## [1] 2.829955
## [1] 13.41324
E)
- S(t)
lambda = 0.13
gama = seq(0, 1.5, by=0.25)
n = 20
t = seq(1, n, by=1)
for (i in 1:length(gama)){
if (i == 1){
plot(1:length(t), S(t, lambda, gama[1]),
type="o", col=1, pch=1, lty=1,
xlab='tempo',ylab="S(t)", ylim=c(0,1),
main='Função de Sobrevivência')
}else{
points(1:length(t), S(t, lambda, gama[i]), col=i, pch=i)
lines(1:length(t), S(t, lambda, gama[i]), col=i,lty=i)
}
}
v = NULL
for (i in 1:length(gama)){
v[i] = parse(text = paste0(' ~ gamma == ', gama[i]))
}
legend("topright",
legend=v,
pch=seq(1,length(gama)),lty=1, ncol=1,col=1:length(gama))
grid() * h(t)
lambda = 0.13
gama = seq(0, 1.5, by=0.25)
n = 20
t = seq(1, n, by=1)
for (i in 1:length(gama)){
if (i == 1){
plot(1:length(t), h(t, lambda, gama[1]),
type="o", col=1, pch=1, lty=1,
xlab='tempo',ylab="h(t)", ylim=c(0,0.5),
main='Função de Risco')
}else{
points(1:length(t), h(t, lambda, gama[i]), col=i, pch=i)
lines(1:length(t), h(t, lambda, gama[i]), col=i,lty=i)
}
}
v = NULL
for (i in 1:length(gama)){
v[i] = parse(text = paste0(' ~ gamma == ', gama[i]))
}
legend("topright",
legend=v,
pch=seq(1,length(gama)),lty=1, ncol=1,col=1:length(gama))
grid()Questão 04
\[ X \sim LogNormal(\mu, (\sigma)^{2}) \] \[ F(t) = 0.08 \]
e
\[ S(t) = 0.92 \]
então, sabendo que \(S(t) = 1 - \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg)\), então, substituindo temos:
\[ \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg) = 0.08 \]
tomando \(t = 500\) e \(\sigma = 0.5\), teremos:
\[ \phi\bigg(\frac{ln(500) - \mu}{0.5}\bigg) = 0.08 \]
Achando então o quantil da \(Normal(0,1)\) temos que \(\frac{ln(500) - \mu}{\sigma} = -1.4051\), logo isolnando \(\mu\), temos: \(\mu = 6.9171\)
Questão 05 (Duvida)
n = 12
tempo = sort(c(48,35,91,62,59,77,53,84))
r = length(tempo)
cens = c(rep(1,length(tempo)), rep(0, n - length(tempo)))
tempo = c(tempo, rep(91,n - length(tempo)))
data = data.frame(tempo,cens)
knitr::kable(data)| tempo | cens |
|---|---|
| 35 | 1 |
| 48 | 1 |
| 53 | 1 |
| 59 | 1 |
| 62 | 1 |
| 77 | 1 |
| 84 | 1 |
| 91 | 1 |
| 91 | 0 |
| 91 | 0 |
| 91 | 0 |
| 91 | 0 |
A)
- S(t) do modelo Exponencial
\[ S(t) = e^{-t\cdot \lambda} \]
S <- function(t, lambda){
exp(-t*lambda)
}library(survival)
ajust1<-survreg(Surv(tempo,cens)~1,dist="exponential")
lambda<-exp(ajust1$coefficients[1])
lambdae<-as.vector(1/lambda)
lambdae## [1] 0.009163803
plot(S(data$tempo, lambdae),pch=16, ylab = "S(t): Kaplan-Meier",
xlab="S(t): exponencial",type="l")B)
\[ T \sim EXP(\lambda) \]
Sabemos que
\[ IC[\lambda | (1 -\alpha) \%] = [\hat{\lambda} \pm z_{\frac{\alpha}{2}}\cdot \sqrt{\hat{Var}(\hat{\lambda})}] \]
Onde, podemos definir:
\[ Var(\hat{\lambda}) = - \frac{1}{E\bigg[\frac{\partial^{2}l(\lambda)}{\partial\lambda^{2}}\bigg]} \]
por fim, sabendo que:
\[ \frac{\partial l(\lambda)}{\partial\lambda} = \frac{1}{\lambda}\cdot\sum_{i=1}^{n} - \sum_{i=1}^{n}t_i \rightarrow \]
\[ \rightarrow \frac{\partial^{2} l(\lambda)}{\partial\lambda^{2}} = -\frac{r}{\lambda^{2}} \]
Então temos:
\[ Var(\hat{\lambda}) = - \frac{1}{-\frac{r}{\lambda^{2}}} = \frac{\lambda^{2}}{r} \]
Logo a sua estimativa é dada por:
\[ \hat{Var}(\hat{\lambda}) = \frac{\hat{\lambda}^{2}}{r} \]
Logo temos a seguinte forma para o intervalo de confiança:
\[ IC[\lambda | (1 -\alpha) \%] = [\hat{\lambda} \pm z_{\frac{\alpha}{2}}\cdot \frac{\hat{\lambda}}{\sqrt{r}}] \]
Vamos calcular agora Computacionalmente:
alpha = 0.05
dp = lambdae/sqrt(r)
quantil = qnorm(1 - (alpha/2))
ic = c(
'Lower' = lambdae - quantil*dp,
'Upper' = lambdae + quantil*dp
)
ic## Lower Upper
## 0.002813728 0.015513878
C)
- Tempo mediano pontual
Para tanto, basta acharmos \(S(t_p) = 0.5\), logo temos a seguinte espressão definida:
\[ S(t_p) = e^{-\lambda\cdot t_p} = 0.5 \rightarrow \]
\[ \rightarrow -\lambda t_p = ln(0.5) \rightarrow \]
\[ \rightarrow t_p = -\frac{ln(0.5)}{\lambda} \]
Calculando computacionalmente, temos:
(tp = -log(0.5)/lambdae)## [1] 75.63969
- Tempo mediano intervalar
Considerando a seguinte expressão para o tempo mediano:
\[ S(t_p) = 1 - F(t_p) = 1 - p \rightarrow \] \[ \rightarrow e^{-\lambda\cdot t_p} = 1 - p \rightarrow \lambda\cdot t_p = ln[ (1-p)^{-1} ]\rightarrow \]
\[ t_p = \frac{ln\bigg(\frac{1}{1-p}\bigg)}{\lambda} \rightarrow \hat{t}_p = \frac{ln\bigg(\frac{1}{1-p}\bigg)}{\hat{\lambda}} \] Sabendo também que a variância do \(p-\)ésimo percentil é dada por:
\[ Var(\hat{t}_p) = Var(\hat{\lambda})\cdot \bigg(\frac{\partial t_p}{\partial \lambda}\bigg)^{2} \]
Apos todas as manipulaões, temos que:
\[ Var(\hat{t}_p) = \frac{\lambda^{2}}{r}\cdot \frac{[ln( (1-p)^{-1} )]^{2}}{\lambda^{4}} \rightarrow \]
\[ \rightarrow Var(\hat{t}_p) = \frac{[ln( (1-p)^{-1} )]^{2}}{r\cdot \lambda^{2}} = \frac{1}{r}\cdot \bigg[\frac{ln( (1 - p)^{-1} )}{\lambda}\bigg]^{2} \] note que
\[ Var(\hat{t}_p) = \frac{t_p^{2}}{r}\rightarrow DP(\hat{t}_{p}) = \frac{t_p}{\sqrt{r}} \] Então, substituindo na formula, teremos:
\[ IC[t_p | 1 - \alpha \%] = [\hat{t}_p \pm z_{\frac{\alpha}{2}}\cdot \frac{\hat{t}_p}{\sqrt{r}}] \]
Aplicando computacionalmente, temos:
alpha = 0.05
quantil = qnorm(1 - (alpha/2))
p = 0.5
tphat = log( (1 - p)**(-1) )/lambdae
dp = tphat/sqrt(r)
(ic = c(
'Lower' = tphat - quantil*dp,
'Upper' = tphat + quantil*dp
))## Lower Upper
## 23.22502 128.05435
D)
Utilizaremos a seguinte formula:
\[ DP(g(\theta)) = DP(\theta)\cdot \bigg|\frac{\partial g(\theta)}{\partial\theta}\bigg| \] Seja \(g(\theta) = S(t)\), então:
\[ DP(S(t)) = DP(\lambda)\cdot \bigg| \frac{\partial S(t)}{\partial \lambda} \bigg| \]
calculando a derivada de \(S(t)\), temos que \(\frac{\partial S(t)}{\partial \lambda} = -t\cdot e^{-\lambda\cdot t}\), então temos a seguinte forma:
\[ DP(S(t)) =\frac{\lambda}{\sqrt{r}}\cdot t\cdot e^{-\lambda\cdot t} \]
Logo, substituindo no IC, teremos:
\[ IC[S(t) | 1 - \alpha \%] = [\hat{S}(t) \pm z_{\frac{\alpha}{2}}\cdot \frac{\lambda\cdot t \cdot e^{-\lambda \cdot t}}{\sqrt{r}}] \]
Substituindo de forma computacional, e considerando \(t = 80\), teremos:
- Estimando de forma pontual
t = 80
(shat = exp(-lambdae*t))## [1] 0.4804153
- Estimando de forma intervalar
t = 80
alpha = 0.05
shat = exp(-lambdae*t)
quantil = qnorm(1 - (alpha/2))
dp = lambdae*t*exp(-lambdae*t)/sqrt(r)
(
ic = c(
'Lower' = shat - quantil*dp,
'Upper' = shat + quantil*dp
)
)## Lower Upper
## 0.2363615 0.7244692
Questão 06
sabemos que \(n=100\), \(t_{c}=1000\) e \(r=20\). Além disso, é dito que:
\[ X \sim EXP(\theta) \]
tal que:
\[ f(x | \theta) = \frac{1}{\theta} \cdot e^{\big(-\frac{t_i}{\theta}\big)} \]
Então, conhecendo \(f(t_i | \theta)\), temos que \(S(t_i | \theta) = e^{-\frac{t_i}{\theta}}\). Aplicando a verossimilhança, temos:
\[ L(\theta) = \Pi_{i=1}^{n}\bigg[\frac{1}{\theta}\cdot e^{-\frac{t_i}{\theta}}\bigg]^{\delta_i}\cdot \bigg[e^{-\frac{t_i}{\theta}}\bigg]^{1 - \delta_i} = \Pi_{i=1}^{n}\bigg[\frac{1}{\theta}\bigg]^{\delta_i}\cdot e^{-\frac{t_i}{\theta}} \]
então, aplicando o \(log\), teremos:
\[ l(\theta) = \sum_{i=1}^{n}\bigg[\delta_i\cdot ln\bigg(\frac{1}{\theta}\bigg)- \frac{t_i}{\theta}\bigg] \]
agora, derivando com respeito a \(\theta\), temos:
\[ \frac{dl(\theta)}{d\theta} = -\frac{1}{\theta}\cdot \sum_{i=1}^{n}\delta_i + \frac{1}{\theta^{2}}\cdot \sum_{i=1}^{n}t_i \] igualando agora a zero, temos:
\[ -\frac{1}{\hat{\theta}}\cdot \sum_{i=1}^{n}\delta_i + \frac{1}{\hat{\theta}^{2}}\cdot \sum_{i=1}^{n}t_i = 0 \rightarrow \]
\[ \rightarrow -\sum_{i=1}^{n}\delta_i + \frac{1}{\hat{\theta}}\cdot\sum_{i=1}^{n}t_i = 0 \rightarrow \]
\[ \rightarrow \frac{1}{\hat{\theta}} = \frac{\sum_{i=1}^{n}\delta_i}{\sum_{i=1}^{n}t_i} \rightarrow \]
\[ \rightarrow \hat{\theta} = \frac{\sum_{i=1}^{n}t_i}{\sum_{i=1}^{n}\delta_i} \]
sabendo que \(\sum_{i=1}^{n}\delta_i = r\), então temos por fim:
\[ \rightarrow \hat{\theta} = \frac{\sum_{i=1}^{n}t_i}{r} \]
Questão 07
Sendo \(T \sim LogNormal(\mu,\sigma^{2})\), com \(S(t) = 1 - \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg)\), vamos achar os quantis para \(S(t) = 0.5\) e \(S(t) = 0.2\)
- Para \(S(t) = 0.5\)
sendo \(S(t) = 1 - \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg)\), logo:
\[ 1 - \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg) = 0.5 \rightarrow \phi\bigg(\frac{ln(t) - \mu}{\sigma}\bigg) = 0.5 \]
\[ \rightarrow \frac{ln(t) - \mu}{\sigma} = \phi^{-1}(0.5) \rightarrow \]
\[ \rightarrow ln(t) -\mu = \sigma\cdot \phi^{-1}(0.5) \rightarrow ln(t)=\sigma\cdot \phi^{-1}(0.5) + \mu \rightarrow \]
Sabendo que \(\phi^{-1}(t)\) é o quantil de uma \(Normal(0,1)\) acumulada, temos que \(\phi^{-1}(t) = 0\). Logo:
\[ ln(t) = \mu \rightarrow t = e^{\mu} \]
- Para \(S(t) = 0.2\)
round(qnorm(0.2), 4)## [1] -0.8416
Replicamos os passos anteriores, porém agora sabendo que \(\phi^{-1}(0.2) = -0.8416\), então temos a seguinte expressão:
\[ ln(t) = \sigma\cdot\phi^{-1}(0.2) + \mu = \sigma \cdot (-0.8416) + \mu \rightarrow \] \[ \rightarrow t = e^{\sigma\cdot (-0.8416) + \mu} \]
Questão 08
A)
- \(\frac{t_{1p}}{t_{2p}} = \frac{\alpha_{1}}{\alpha_{2}}\)
então, sabendo que:
\[ S_{1p}(t_{1p}) = e^{-\big(\frac{t_{1p}}{\alpha_{1}}\big)^{\beta}} \rightarrow \]
\[ \rightarrow -ln(S_{1p}(t_{1p})) = \bigg(\frac{t_{1p}}{\alpha_1}\bigg)^{\beta}\rightarrow \]
\[ \rightarrow ln(-ln(S_{1p}(t_{1p}))) = \beta\cdot [ln(t_{1p}) - ln(\alpha_1)]\rightarrow \]
\[ \rightarrow ln(t_{1p}) = \frac{1}{\beta}\cdot ln(-ln(S_{1p}(t_{1p}))) + ln(\alpha_1) \]
por fim, temos :
\[ t_{1p} = exp\bigg\{\frac{1}{\beta}\cdot ln(-ln(S_{1p}(t_{1p}))) \bigg\} \cdot \alpha_1 \]
Analogamente podemos obter \(t_{2p}\), com a seguinte forma:
\[ t_{2p} = exp\bigg\{\frac{1}{\beta}\cdot ln(-ln(S_{1p}(t_{2p}))) \bigg\} \cdot \alpha_2 \]
então, teremos:
\[ \frac{t_{1p}}{t_{2p}} = exp\bigg\{ \frac{1}{\beta} \cdot \bigg[ ln(-ln(S_{1}(t_{1p}))) - ln(-ln(S_{2}(t_{2p}))) \bigg]\bigg\}\cdot \frac{\alpha_1}{\alpha_2} \]
logo, como \(S_{1}(t_{1p}) = S_{2}(t_{2p})\), temos que:
\[ \frac{t_{1p}}{t_{2p}} = 1\cdot \frac{\alpha_1}{\alpha_2} = \frac{\alpha_1}{\alpha_2} \]
B)
- \(\frac{h_{1}(t)}{h_{2}(t)} = \delta\) onde \(\delta=\bigg(\frac{\alpha_2}{\alpha_1}\bigg)^{\beta}\).
Sabendo que \(h_{i}(t) = \frac{\beta_i}{\alpha_i}\cdot \bigg(\frac{t}{\alpha_i}\bigg)^{\beta_i - 1}\), então:
\[ \frac{h_{1}(t)}{h_{2}(t)} = \frac{\bigg(\frac{\beta}{\alpha_1}\bigg)\bigg( \frac{t}{\alpha_1}\bigg)^{\beta-1}}{\bigg(\frac{\beta}{\alpha_2}\bigg)\bigg( \frac{t}{\alpha_2}\bigg)^{\beta-1}} = \bigg(\frac{\alpha_2}{\alpha_1}\bigg) \cdot \bigg(\frac{t}{\alpha_1}\bigg)^{\beta - 1} \cdot \bigg(\frac{\alpha_2}{t}\bigg)^{\beta-1} \rightarrow \]
\[ \frac{h_{1}(t)}{h_{2}(t)} = \frac{\alpha_2}{\alpha_1} \cdot \frac{\bigg(\frac{t}{\alpha_1}\bigg)^{\beta-1}}{\bigg(\frac{t}{\alpha_2}\bigg)^{\beta-1}}\rightarrow \]
\[ \frac{h_{1}(t)}{h_{2}(t)} = \frac{\alpha_2}{\alpha_1}\cdot \bigg[\bigg(\frac{\alpha_2}{\alpha_1}\bigg)\bigg]^{\beta - 1} = \bigg(\frac{\alpha_2}{\alpha_1}\bigg)^{\beta} \]
Questão 10
tempo = c(1,1,2,4,4,6,6,6,7,8,9,9,10,12,13,14,18,19,24,26,29,31,42,45,50,57,60,71,85,91)
cens = c(rep(1,21),0,1,0,0,1,1,0,0,1)
r = sum(cens==1)
data = data.frame(tempo,cens)
knitr::kable(data)| tempo | cens |
|---|---|
| 1 | 1 |
| 1 | 1 |
| 2 | 1 |
| 4 | 1 |
| 4 | 1 |
| 6 | 1 |
| 6 | 1 |
| 6 | 1 |
| 7 | 1 |
| 8 | 1 |
| 9 | 1 |
| 9 | 1 |
| 10 | 1 |
| 12 | 1 |
| 13 | 1 |
| 14 | 1 |
| 18 | 1 |
| 19 | 1 |
| 24 | 1 |
| 26 | 1 |
| 29 | 1 |
| 31 | 0 |
| 42 | 1 |
| 45 | 0 |
| 50 | 0 |
| 57 | 1 |
| 60 | 1 |
| 71 | 0 |
| 85 | 0 |
| 91 | 1 |
A)
ajust10<-survreg(Surv(tempo,cens)~1,dist="exponential")
lambda<-exp(as.vector(ajust10$coefficients[1]))
lambdae<-1/lambda
lambdae## [1] 0.03289474
ajust10## Call:
## survreg(formula = Surv(tempo, cens) ~ 1, dist = "exponential")
##
## Coefficients:
## (Intercept)
## 3.414443
##
## Scale fixed at 1
##
## Loglik(model)= -110.4 Loglik(intercept only)= -110.4
## n= 30
#lambda<-exp(ajust1$coefficients[1])
c('Tempo médio' = as.vector(ajust10$coefficients[1]))## Tempo médio
## 3.414443
B)
\[ IC[\lambda | (1 - \alpha)\%] = \bigg[\hat{\lambda}\pm z_{\frac{\alpha}{2}}\cdot\frac{\hat{\lambda}}{\sqrt{r}}\bigg] \]
Como ja conhecemos a formula do IC, vamos aplica-la:
alpha = 0.05
dp = lambdae/sqrt(r)
quantil = qnorm(1 - (alpha/2))
ic = c(
'Lower' = lambdae - quantil*dp,
'Upper' = lambdae + quantil*dp
)
ic## Lower Upper
## 0.02000024 0.04578924
C)
\[ IC[S(t) | 1 - \alpha \%] = [\hat{S}(t) \pm z_{\frac{\alpha}{2}}\cdot \frac{\lambda\cdot t \cdot e^{-\lambda \cdot t}}{\sqrt{r}}] \]
Conhecendo a formula, so precisamos aplica-la, então:
t = tempo
sthat = exp(-lambdae*t)
quantil = qnorm(1 - (alpha/2))
ic = data.frame(
'Lower' = sthat - quantil*(lambdae*t*exp(-lambdae*t)/sqrt(r)),
'Upper' = sthat + quantil*(lambdae*t*exp(-lambdae*t)/sqrt(r))
)
ic## Lower Upper
## 1 0.955163172 0.9801177
## 2 0.955163172 0.9801177
## 3 0.912181003 0.9604749
## 4 0.831491107 0.9219290
## 5 0.831491107 0.9219290
## 6 0.757378492 0.8843978
## 7 0.757378492 0.8843978
## 8 0.757378492 0.8843978
## 9 0.722627618 0.8660215
## 10 0.689332708 0.8479083
## 11 0.657435922 0.8300606
## 12 0.657435922 0.8300606
## 13 0.626881641 0.8124801
## 14 0.569588704 0.7781260
## 15 0.542749169 0.7613540
## 16 0.517050225 0.7448527
## 17 0.424772169 0.6815508
## 18 0.404124787 0.6663981
## 19 0.313559342 0.5946081
## 20 0.282629639 0.5677129
## 21 0.241169028 0.5292658
## 22 0.216511475 0.5048682
## 23 0.115149450 0.3872138
## 24 0.095525210 0.3596304
## 25 0.068590587 0.3175361
## 26 0.040641137 0.2660688
## 27 0.031447199 0.2464408
## 28 0.008175263 0.1853438
## 29 -0.005862839 0.1279640
## 30 -0.008690036 0.1089214
plot(1:30, ic$Lower,
type="o", col=1, pch=1, lty=1,
xlab='tempo',ylab="Shat(t)", ylim=c(0,1),
main='Função de Sobrevivência (IC)')
points(1:30, sthat, col=2, pch=2, type='l')
points(1:30, ic$Upper, col=3, pch=3)
lines(1:30, ic$Upper, col=3,lty=3)D)
sabendo que:
\[ h(t) = \frac{f(t)}{S(t)} = \frac{\lambda\cdot e^{-\lambda\cdot t}}{e^{-\lambda\cdot t}} = \lambda \] logo,
\[ \bigg(\frac{d h(t)}{d\lambda}\bigg)^{2} = 1 \]
então o \(ic\) é dado por:
\[ IC[h(t) | (1 - \alpha)\%] = \bigg[\hat{h}(t) \pm z_{\frac{\alpha}{2}}\cdot 1\cdot \frac{\lambda}{\sqrt{r}}\bigg] \]
então computacionalmente temos:
hhat <- lambdae
alpha = 0.05
quantil = qnorm(1 - (alpha/2))
dp = (lambdae/sqrt(r))*1
(
ic = c(
'Lower' = hhat - quantil*dp,
'Upper' = hhat + quantil*dp
)
)## Lower Upper
## 0.02000024 0.04578924
E)
S <- function(t, lambda){
exp(-lambda*t)
}
S(26, lambdae)## [1] 0.4251713
F)
Sabendo que a formula para o tempo mediano da exponencial é dado por:
\[ t = -\frac{1}{\lambda}\cdot ln(0.5) \]
então:
tmedhat = (-1/lambdae)*log(0.5)
dp = (1/(sqrt(r)*lambdae))*abs(log(0.5))
r= sum(cens==0)
quantil = qnorm(1 - (alpha/2))
ic = c(
'Lower' = tmedhat - quantil*dp,
'Upper' = tmedhat + quantil*dp
)
ic## Lower Upper
## 12.81173 29.33162
lambdae## [1] 0.03289474
G)
Sabendo que
\[ S(t_p) = 1 - p \rightarrow e^{-\lambda\cdot t_p} = 1 - p \rightarrow \]
\[ \rightarrow -\lambda t_p = ln(1 - p)\rightarrow t_p = \frac{1}{\lambda}\cdot ln\bigg(\frac{1}{1 - p}\bigg) \]
Questão 11
tempos = c(151,164,336,365,403,454,455,473,538,577,592,628,632,647,675,727,785,801,811,816,867,893,930,937,976,1008,1040,1051,1060,1183,1329,1334,1379,1380,1633,1769,1827,1831,1849,2016,2282,2415,2430,2686,2729,rep(2729,15))
cens = c(rep(1,45),rep(0,15))
data = data.frame(tempos, cens)A)
ajust1<-survreg(Surv(tempos,cens)~1,dist="exponential")
ajust1## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "exponential")
##
## Coefficients:
## (Intercept)
## 7.609741
##
## Scale fixed at 1
##
## Loglik(model)= -387.4 Loglik(intercept only)= -387.4
## n= 60
#Estimativas do par?metro no modelo Exponencial
lambda<-exp(ajust1$coefficients[1])
lambdae<-1/lambda
lambdae## (Intercept)
## 0.0004956002
#Ajustando modelo Weibull
ajust2<-survreg(Surv(tempos,cens)~1,dist='weibull')
ajust2## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "weibull")
##
## Coefficients:
## (Intercept)
## 7.597504
##
## Scale= 0.7804515
##
## Loglik(model)= -385.7 Loglik(intercept only)= -385.7
## n= 60
#Estimativas do par?metro no modelo Weibull
lambda1<-exp(ajust2$coefficients[1])
lambdaw<-1/lambda1
gama<-1/ajust2$scale
cbind(gama, lambdaw)## gama lambdaw
## (Intercept) 1.28131 0.000501702
#Ajustando modelo Lognormal
ajust3<-survreg(Surv(tempos,cens)~1,dist='lognormal')
ajust3## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "lognormal")
##
## Coefficients:
## (Intercept)
## 7.224766
##
## Scale= 0.9505452
##
## Loglik(model)= -382.7 Loglik(intercept only)= -382.7
## n= 60
#Estimativas do par?metro no modelo Lognormal
mu<- ajust3$coefficients
sigma<- ajust3$scale
cbind(mu,sigma)## mu sigma
## (Intercept) 7.224766 0.9505452
B)
#Obtendo as estimativas das fun??es de sobreviv?ncia
ekm<-survfit(Surv(tempos,cens)~1) #estimador de kaplan-meier
time<-ekm$time #tempos
st<-ekm$surv #sobreviv?ncia kaplan meier
ste<- exp(-time*lambdae) #sobreviv?ncia exponencial
stw<- exp(-(time*lambdaw)^gama) #sobreviv?ncia weibull
stln<- pnorm((-log(time)+ mu)/sigma) #sobreviv?ncia lognormal
#cbind(time,st,ste,stw,stln)
#Procedimentos para escolha da distribui??o
#M?todo Gr?fico 1
par(mfrow=c(1,3))
plot(ste,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): exponencial")
lines(c(0,1), c(0,1), type="l", lty=1)
plot(stw,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): Weibull")
lines(c(0,1), c(0,1), type="l", lty=1)
plot(stln,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): log-normal")
lines(c(0,1), c(0,1), type="l", lty=1)par(mfrow=c(1,3))
plot(ekm, conf.int=F, xlab="Tempos", ylab="S(t)")
lines(c(0,time),c(1,ste), lty=2)
legend("bottomleft",lty=c(1,2),c("Kaplan-Meier", "Exponencial"),bty="n",cex=0.8)
plot(ekm, conf.int=F, xlab="Tempos", ylab="S(t)")
lines(c(0,time),c(1,stw), lty=2)
legend("bottomleft",lty=c(1,2),c("Kaplan-Meier", "Weibull"),bty="n",cex=0.8)
plot(ekm, conf.int=F, xlab="Tempos", ylab="S(t)")
lines(c(0,time),c(1,stln), lty=2)
legend("bottomleft",lty=c(1,2),c("Kaplan-Meier", "Log-normal"),bty="n",cex=0.8)Podemos notar que aparentemente, o melhor modelo, pelo qq-plot, por apresentar os valores da sobrevivencia estimada muito proximos ao estimado pelo kaplan-meier, é o modelo log-normal. Vamos considera-lo como o modelo para seguir nas proximas questões
C)
t = 1380
(stln11<- pnorm((-log(t)+ mu)/sigma))## (Intercept)
## 0.4978712
D)
exp(mu + (sigma**2)/2)## (Intercept)
## 2157.131
E)
p = 0.5
(tp = exp((sigma*qnorm(p)) + mu))## (Intercept)
## 1373.018
F)
1 - pnorm((log(500) - mu)/sigma)## (Intercept)
## 0.8560443
Questão 12
tempos = seq(1,300,by=1)
cens = c(rep(1,300-10),rep(0,10))
data = data.frame(tempos, cens)ajust1<-survreg(Surv(tempos,cens)~1,dist="exponential")
lambda<-exp(ajust1$coefficients[1])
lambdae<-1/lambda
lambdae## (Intercept)
## 0.006423034
alpha = 0.05
t = 325
r = sum(cens==0)
quantil = qnorm(1 - (alpha/2))
dp = (lambdae/sqrt(r))*t*exp(-lambdae*t)
sthat = exp(-lambdae*t)
c(
'Lower' = sthat - quantil*dp,
'Upper' = sthat + quantil*dp
)## Lower.(Intercept) Upper.(Intercept)
## -0.03643243 0.28442934
Questão 15
- \(f(t) = \frac{\theta}{t^{\theta+1}}\cdot \lambda^{\theta}\), com \(\theta,\lambda > 0\) e \(t> \lambda\)
Primeiro vamos achar \(S(t)\), então:
\[ F(t) = \int_{\lambda}^{t}\frac{\theta}{u^{\theta+1}}\cdot \lambda^{\theta}du = \]
\[ = \theta\cdot\lambda^{\theta}\int_{\lambda}^{t}\frac{1}{u^{\theta+1}}du = \theta\lambda^{\theta}\int_{\lambda}^{t}u^{-(\theta+1)}du = \] \[ = \theta\cdot \lambda^{\theta} \cdot \bigg[\frac{u^{-(\theta+1) + 1}}{-(\theta+1) + 1}\bigg]_{\lambda}^{t} = \]
\[ = \theta\lambda^{\theta}\bigg[\frac{\lambda^{-\theta}}{\theta} - \frac{t^{-\theta}}{\theta}\bigg] = 1 - \lambda^{\theta}\cdot t^{-\theta} \]
\[ \therefore S(t) = \lambda^{\theta}\cdot t^{-\theta} \] Agora, substituindo na verossimilhança, temos:
\[ L(\theta) = \Pi_{i=1}^{n}\bigg(\frac{\theta\cdot\lambda^{\theta}}{t_i^{\theta+1}}\bigg)^{\delta_i}\cdot\bigg(\frac{\lambda^{\theta}}{t_i^{\theta}}\bigg)^{1 - \delta_i} = \Pi_{i=1}^{n}\bigg(\frac{\theta}{t_i}\bigg)^{\delta_i}\cdot \bigg(\frac{\lambda}{t_i}\bigg)^{\theta} \]
Logo, derivando, temos:
\[ l(\theta) = \sum_{i=1}^{n}\bigg\{\delta_i\cdot\bigg[ln(\theta) - ln(t_i)\bigg] + \theta\cdot\bigg[ln(\lambda) - ln(t_i)\bigg]\bigg\} = \]
\[ = ln(\theta)\sum_{i=1}^{n}\delta_i - \sum_{i=1}^{n}\delta_i\cdot ln(t_i) + n\cdot\theta\cdot ln(\lambda) - \theta\sum_{i=1}^{n}ln(t_i) \]
Por fim, derivando a expressão acima, temos:
\[ \frac{\partial l(\theta)}{\partial\theta} = \frac{1}{\theta}\cdot \sum_{i=1}^{n}\delta_i + n\cdot ln(\lambda) - \sum_{i=1}^{n}ln(t_i) \]
igualando a zero agora, temos:
\[ \frac{1}{\hat{\theta}}\cdot \sum_{i=1}^{n}\delta_i + n\cdot ln(\lambda) - \sum_{i=1}^{n}ln(t_i) = 0 \rightarrow \]
\[ \rightarrow \frac{1}{\hat{\theta}}\cdot \sum_{i=1}^{n}\delta_i = \sum_{i=1}^{n}ln(t_i) - n\cdot ln(\lambda) = \sum_{i=1}^{n}[ln(t_i) - ln(\lambda)] \rightarrow \]
por fim, como \(\sum_{i=1}^{n}\delta_i = r\), temos:
\[ \frac{1}{\hat{\theta}} = \frac{\sum_{i=1}^{n}ln(t_i) - n\cdot ln(\lambda) = \sum_{i=1}^{n}[ln(t_i) - ln(\lambda)]}{r} \]
\[ \therefore \hat{\theta} = \frac{r}{\sum_{i=1}^{n}[ln(t_i) - ln(\lambda)]} \]
Questão 19
tempos <- c(7,8,10,12,13,14,19,23,25,26,27,31,31,49,59,64,87,89,107,117,119,230,233,130,148,153,156,159,191,222,200,203,210,220,220,228,235,240,240,240,241,245,247,248,250)
cens <- c(1,1,1,1,1,0,1,1,0,1,1,1,0,1,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0)
data =data.frame(tempos, cens)A)
#Ajustando modelo exponencial
ajust1<-survreg(Surv(tempos,cens)~1,dist="exponential")
ajust1## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "exponential")
##
## Coefficients:
## (Intercept)
## 5.484963
##
## Scale fixed at 1
##
## Loglik(model)= -162.1 Loglik(intercept only)= -162.1
## n= 45
#Estimativas do par?metro no modelo Exponencial
lambda<-exp(ajust1$coefficients[1])
lambdae<-1/lambda
lambdae## (Intercept)
## 0.004148689
#Ajustando modelo Weibull
ajust2<-survreg(Surv(tempos,cens)~1,dist='weibull')
ajust2## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "weibull")
##
## Coefficients:
## (Intercept)
## 5.518501
##
## Scale= 1.103829
##
## Loglik(model)= -162 Loglik(intercept only)= -162
## n= 45
#Estimativas do par?metro no modelo Weibull
lambda1<-exp(ajust2$coefficients[1])
lambdaw<-1/lambda1
gama<-1/ajust2$scale
cbind(gama, lambdaw)## gama lambdaw
## (Intercept) 0.9059372 0.004011857
#Ajustando modelo Lognormal
ajust3<-survreg(Surv(tempos,cens)~1,dist='lognormal')
ajust3## Call:
## survreg(formula = Surv(tempos, cens) ~ 1, dist = "lognormal")
##
## Coefficients:
## (Intercept)
## 5.063596
##
## Scale= 1.607539
##
## Loglik(model)= -161.9 Loglik(intercept only)= -161.9
## n= 45
#Estimativas do par?metro no modelo Lognormal
mu<- ajust3$coefficients
sigma<- ajust3$scale
cbind(mu,sigma)## mu sigma
## (Intercept) 5.063596 1.607539
B)
Vamos obter a sobrevivencia estimada para todos os valores do banco:
ekm<-survfit(Surv(tempos,cens)~1) #estimador de kaplan-meier
time<-ekm$time #tempos
st<-ekm$surv #sobreviv?ncia kaplan meier
ste<- exp(-time*lambdae) #sobreviv?ncia exponencial
stw<- exp(-(time*lambdaw)^gama) #sobreviv?ncia weibull
stln<- pnorm((-log(time)+ mu)/sigma) #sobreviv?ncia lognormal
knitr::kable(cbind(time,st,ste,stw,stln))| time | st | ste | stw | stln |
|---|---|---|---|---|
| 7 | 0.9777778 | 0.9713768 | 0.9614627 | 0.9737746 |
| 8 | 0.9555556 | 0.9673552 | 0.9566161 | 0.9682981 |
| 10 | 0.9333333 | 0.9593619 | 0.9471575 | 0.9570596 |
| 12 | 0.9111111 | 0.9514347 | 0.9379674 | 0.9456565 |
| 13 | 0.8888889 | 0.9474956 | 0.9334607 | 0.9399472 |
| 14 | 0.8888889 | 0.9435729 | 0.9290080 | 0.9342522 |
| 19 | 0.8660969 | 0.9242016 | 0.9074586 | 0.9062919 |
| 23 | 0.8433048 | 0.9089912 | 0.8909586 | 0.8848161 |
| 25 | 0.8433048 | 0.9014802 | 0.8829236 | 0.8744213 |
| 26 | 0.8198797 | 0.8977480 | 0.8789560 | 0.8693120 |
| 27 | 0.7964546 | 0.8940312 | 0.8750205 | 0.8642616 |
| 31 | 0.7730294 | 0.8793174 | 0.8595830 | 0.8446439 |
| 49 | 0.7488723 | 0.8160450 | 0.7952625 | 0.7669762 |
| 59 | 0.7488723 | 0.7828825 | 0.7625723 | 0.7301928 |
| 64 | 0.7488723 | 0.7668101 | 0.7469270 | 0.7132123 |
| 87 | 0.7230491 | 0.6970236 | 0.6802072 | 0.6449802 |
| 89 | 0.6972259 | 0.6912641 | 0.6747759 | 0.6397027 |
| 107 | 0.6714027 | 0.6415233 | 0.6282543 | 0.5960298 |
| 117 | 0.6455795 | 0.6154531 | 0.6041112 | 0.5743677 |
| 119 | 0.6197564 | 0.6103675 | 0.5994182 | 0.5702306 |
| 130 | 0.5939332 | 0.5831391 | 0.5743741 | 0.5485361 |
| 148 | 0.5681100 | 0.5411786 | 0.5360142 | 0.5164696 |
| 153 | 0.5422868 | 0.5300683 | 0.5258968 | 0.5082281 |
| 156 | 0.5164636 | 0.5235119 | 0.5199329 | 0.5034097 |
| 159 | 0.4906405 | 0.5170366 | 0.5140472 | 0.4986826 |
| 191 | 0.4648173 | 0.4527571 | 0.4558024 | 0.4532832 |
| 200 | 0.4648173 | 0.4361636 | 0.4408006 | 0.4419555 |
| 203 | 0.4648173 | 0.4307688 | 0.4359244 | 0.4383023 |
| 210 | 0.4648173 | 0.4184388 | 0.4247806 | 0.4300040 |
| 220 | 0.4338295 | 0.4014342 | 0.4094106 | 0.4186678 |
| 222 | 0.4004580 | 0.3981172 | 0.4064117 | 0.4164701 |
| 228 | 0.4004580 | 0.3883295 | 0.3975611 | 0.4100090 |
| 230 | 0.4004580 | 0.3851207 | 0.3946588 | 0.4078983 |
| 233 | 0.4004580 | 0.3803572 | 0.3903493 | 0.4047713 |
| 235 | 0.4004580 | 0.3772143 | 0.3875054 | 0.4027122 |
| 240 | 0.3504007 | 0.3694701 | 0.3804955 | 0.3976518 |
| 241 | 0.3504007 | 0.3679405 | 0.3791105 | 0.3966544 |
| 245 | 0.3504007 | 0.3618850 | 0.3736258 | 0.3927125 |
| 247 | 0.3504007 | 0.3588947 | 0.3709164 | 0.3907696 |
| 248 | 0.3504007 | 0.3574089 | 0.3695699 | 0.3898050 |
| 250 | 0.3504007 | 0.3544556 | 0.3668930 | 0.3878896 |
Vamos considerar o tempo 90 dias ate a ocorrencia dos primeiros sinais de alteracoes
t = 90
ste10<- exp(-t*lambdae)
stw10<- exp(-(t*lambdaw)^gama)
stln10<- pnorm((-log(t)+ mu)/sigma)
ste10 #Exponencial## (Intercept)
## 0.6884022
stw10 #Weibull## (Intercept)
## 0.6720808
stln10 #Lognormal## (Intercept)
## 0.6370984
C)
- Método 01
#M?todo Gr?fico 1
par(mfrow=c(1,3))
plot(ste,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): exponencial")
lines(c(0,1), c(0,1), type="l", lty=1)
plot(stw,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): Weibull")
lines(c(0,1), c(0,1), type="l", lty=1)
plot(stln,st,pch=16,ylim=range(c(0.0,1)), xlim=range(c(0,1)), ylab = "S(t): Kaplan-Meier",
xlab="S(t): log-normal")
lines(c(0,1), c(0,1), type="l", lty=1)- Método 02
#Linearizando as fun??es de sobreviv?ncia
#Exponencial:
lste=-log(st)
#Weibull:
lstw=log(-log(st))
#Lognormal:
lstln=qnorm(st)
par(mfrow=c(1,3))
plot(time, lste,pch=16,xlab="tempos",ylab="-log(S(t))",main="Exponencial")
abline(lm(lste ~ time))
plot(log(time),lstw,pch=16,xlab="log(tempos)",ylab="log(-log(S(t)))",main="Weibull")
abline(lm(lstw ~ log(time)))
plot(log(time),lstln, pch=16,xlab="log(tempos)", ylab=expression(Phi^-1 * (S(t))),main="Log-Normal")
abline(lm(lstln ~ log(time))) * Analise:
Será escolhido o modelo exponencial, por aparentar, primeiramente para o metodo 01, que seus pontos coiencidem em sua maioria com a reta y =x quando plotado versus o kaplan meier, aparentando estar com a sobrevivencia mais bem ajustada que os demais. Além disso, ao analisarmos o metodo 02, os pontos por mais que distoam em alguns tempos, o modelo exponencial ainda aparentou apresentar os seus valores estimados mais proximos da reta y=x, onde y é a função de sobrevivencia linearizade e y o tempo linearizado.
D)
fitgg <- flexsurvreg(formula = Surv(tempos, cens) ~ 1,dist="gengamma")
fite <- flexsurvreg(formula = Surv(tempos, cens) ~ 1, dist="exp")
#Obten??o das log-verossimilhan?as
loglikm1<-fite$loglik[1] #log-verossimilhan?a modelo exponencial
loglikG<-fitgg$loglik[1] #log-verossimilhan?a modelo generalizado
#Obten??o das estat?sticas de teste
RVe<-2*(loglikG-loglikm1) #Estat?stica da raz?o de verossimilhan?as para o modelo exponencial
#Obten??o dos p-valores dos testes
pRV1<-1-pchisq(RVe,2) #Exponencial
pRV1 #Exponencial## [1] 0.7567779
Considerando o teste de razão de verossimilhança para modelos encaixados, a um nivel de significancia de 5%, não rejeitamos a hipotese nula, ou seja, temos evidencias estatisticamente significantes para afirmar que o modelo exponencial está adequado para se utilizar.