PHC6177_HW3_question5

The rats data set in the survival package in R can be loaded by typing “data(rats)”. It contains data on 300 rats-3 selected from each of 100 litters-that were followed for tumor incidence. The variable time is the follow-up times and the variable status is the event indicator. a. How many rats had tumors? What was the total rat-time contributed by all the animals? What was the incidence rate for tumors?

library(survival)
##data(rats, package="survival")

rats1 <- rats[ which(rats$status=='1'), ] ##no. of rats who had tumor
##nrows(rats1)
sum(rats$time) ##total rat time

## [1] 27132

42 rats had tumors incidence rate=42/27132=0.00154

Calculate a Wald 95% confidence interval for the true incidence rate lambda. (which is also the exponential rate parameter and hazard).

## m=42
## T=27132
0.00154+1.96*sqrt(42/(27132^2))

## [1] 0.002008165

0.00154-1.96*sqrt(42/(27132^2))

## [1] 0.001071835

Calculate a log-transformed 95% confidence interval for lambda.

(42/27132)*exp(1.96*sqrt(1/42))

## [1] 0.002094658

(42/27132)*exp(-1.96*sqrt(1/42))

## [1] 0.001143989

Use the function survreg to fit an exponential distribution and calculate the 95% confidence interval for lambda. Does it agree with the Wald or log-transformed confidence intervals?

expfit<-survreg(Surv(time, status) ~1, data=rats, dist="exponential")
summary(expfit)

## 
## Call:
## survreg(formula = Surv(time, status) ~ 1, data = rats, dist = "exponential")
##             Value Std. Error    z      p
## (Intercept) 6.471      0.154 41.9 <2e-16
## 
## Scale fixed at 1 
## 
## Exponential distribution
## Loglik(model)= -313.8   Loglik(intercept only)= -313.8
## Number of Newton-Raphson Iterations: 5 
## n= 300

exp(-coef(expfit))

## (Intercept) 
## 0.001547988

confint(expfit)

##               2.5 %   97.5 %
## (Intercept) 6.16837 6.773229

exp(-confint(expfit))

##                   2.5 %      97.5 %
## (Intercept) 0.002094646 0.001143995

It agrees more with log-transformed confidence intervals.

Fit a Weibull distribution and get point estimates and 95% confidence intervals for the rate and shape parameters. Is the exponential distribution a good model for this data?

weibfit<-survreg(Surv(time, status) ~1, data=rats, dist="weibull")
summary(weibfit)

## 
## Call:
## survreg(formula = Surv(time, status) ~ 1, data = rats, dist = "weibull")
##               Value Std. Error     z      p
## (Intercept)  5.0791     0.0825 61.55 <2e-16
## Log(scale)  -1.2995     0.1425 -9.12 <2e-16
## 
## Scale= 0.273 
## 
## Weibull distribution
## Loglik(model)= -287.1   Loglik(intercept only)= -287.1
## Number of Newton-Raphson Iterations: 9 
## n= 300

lambda<-exp(-coef(weibfit))
lambda

## (Intercept) 
## 0.006225671

confint(weibfit)

##                2.5 %   97.5 %
## (Intercept) 4.917331 5.240818

exp(-confint(weibfit))

##                   2.5 %      97.5 %
## (Intercept) 0.007318641 0.005295925

Compared to Weibull distribution, exponential distribution is a good model for this data.

PHC6177_HW3_question5

Kunal Dolas

9/13/2021