Statistical Methods for Reliability Data

Chapter 9 - Bootstrap Confidence Intervals

W. Q. Meeker, L. A. Escobar, and J. K. Freels

03 August 2018

OVERVIEW

This chapter explains…

9.1 - Introduction

Comparison of Methods to Construct CI

9.2.1 - Bootstrap Sampling

General Idea

9.2.1 - Bootstrap Sampling - Example

Background

\[ \Big[\underset{\sim}{\theta}, \overset{\sim}{\theta}\Big] = \Big[\widehat{\theta}/w,\widehat{\theta}\times w\Big] \]

\[ Z_{\log(\hat{\theta)}}=\frac{\log(\hat{\theta})-\log(\theta)}{\widehat{se}_{\log(\hat{\theta})}}\sim NOR(0,1) \]

9.2.2 - Bootstrap Sampling Methods

Several bootstrap sampling methods exist

Procedure to compute a bootstrap CI

  1. Observe \(n\) observations generated from some process with unknown distrubution \(F(t|\mathbf{\theta})\)

  2. Use maximum likelihood to estimate the form of \(F(t)\) and the value(s) of \(\hat{\theta}\)

  3. Generate bootstrap samples of size \(n\) from the estimated distribution

    • For bootstrap sample \(i = 1\) simulate \(n\) observations from \(F(t|\mathbf{\hat{\theta}})\)

    • Use ML to estimate \(\hat{\theta}^*_1\) for bootstrap sample \(1\)

    • Repeat this process of sampling & estimating until \(B\) samples have been generated and \(B\) estimates of \(\hat{\theta}^*_i, i = 1,...,B\) have been computed

    • \(B\) should be a large number 10,000 is a small number of bootstrap sample

  4. Order the estimated values \(\hat{\theta}^*_i, i = 1,...,B\) from smallest to largest

  5. The upper and lower endpoints of the \(95\%\) CI for \(\theta\) using the bootstrap method are found as

    • The lower endpoint is the value of \(\hat{\theta}^*_i\) that is the \(B \times 2.5%\) smallest

    • The upper endpoint is the value of \(\hat{\theta}^*_i\) that is the \(B \times 97.5%\) smallest

9.3 - Exponential Distribution Confidence Intervals - Example

Example 9.1

library(SMRD)
## Step 1 - observe data (already done)

## Step 2 - Compute ML estimate for parameters based on assumed model
berk20.ld <- frame.to.ld(berkson20, 
                         response.column = c(1,2),
                         censor.column = 3,
                         case.weight.column = 4)

berk20.ml <- mlest(berk20.ld, distribution = 'exponential')

theta_ml <- print(berk20.ml)$mle[3,1]

theta_ml
[1] 440.1711
## Step 3 - Generate first set of n bootstrap observations from ML distribution
berk20_boot <- rexp(20, rate = 1/theta_ml)

# enforce censoring by cutting bootstrap samples into observation intervals 
breaks <- c(0,unique(c(berkson20[[1]], berkson20[[2]])),Inf)

Cut <- cut(berk20_boot, 
           breaks = breaks, 
           ordered_result = T)

# build data.frame or bootstrap data
berkson20_boot <- cbind(berkson20[,-4], as.numeric(table(Cut)))

# use this new data set to generate the first bootstrap sample for theta
berk20.ld_boot <- frame.to.ld(berkson20_boot, 
                              response.column = c(1,2),
                              censor.column = 3,
                              case.weight.column = 4)

berk20.ml_boot <- mlest(berk20.ld_boot, distribution = 'exponential')

theta_ml_boot <- print(berk20.ml_boot)$mle[3,1]

theta_ml_boot
[1] 477.402
library(SMRD)
## Step 1 - observe data (already done)

## Step 2 - Compute ML estimate for parameters based on assumed model
berk20.ld <- frame.to.ld(berkson20, 
                         response.column = c(1,2),
                         censor.column = 3,
                         case.weight.column = 4)

berk20.ml <- mlest(berk20.ld, distribution = 'exponential')

theta_ml <- print(berk20.ml)

# Set number of desired bootstrap samples
B = 2000

# Pre-allocate matrix to contain results
theta_star <- matrix(NA, nrow = B, ncol = 5)
colnames(theta_star) <- c('theta*', 
                          'log(theta*)', 
                          'se_log(theta*)',
                          'Z_theta*', 
                          'Z_log(theta*)')

for(i in 1:B) {
  
    # Generate n bootstrap observations from ML distribution
    berk20_boot <- rexp(20, rate = 1/theta_ml$mle[3,1])
    
    # enforce censoring by cutting bootstrap samples into observation intervals 
    breaks <- c(0,unique(c(berkson20[[1]], berkson20[[2]])),Inf)
    
    Cut <- cut(berk20_boot, 
               breaks = breaks, 
               ordered_result = T)
    
    # build data.frame or bootstrap data
    berkson20_boot <- cbind(berkson20[,-4], as.numeric(table(Cut)))
    
    # use this new data set to generate the first bootstrap sample for theta
    berk20.ld_boot <- frame.to.ld(berkson20_boot, 
                                  response.column = c(1,2),
                                  censor.column = 3,
                                  case.weight.column = 4)
    
    berk20.ml_boot <- print(mlest(berk20.ld_boot, distribution = 'exponential'))
    
    theta_star[i,1] <- berk20.ml_boot$mle[3,1]
    theta_star[i,2] <- log(berk20.ml_boot$mle[3,1])
    theta_star[i,3] <- berk20.ml_boot$mle[3,2]
    theta_star[i,4] <- (berk20.ml_boot$mle[3,1] - theta_ml$mle[3,1]) / berk20.ml_boot$mle[3,2]
    theta_star[i,5] <- (log(berk20.ml_boot$mle[3,1]) - log(theta_ml$mle[3,1])) / (berk20.ml_boot$mle[3,2] / berk20.ml_boot$mle[3,1])

}
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.774224 1.000000
[1] 6.774224 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.937314 1.000000
[1] 6.937314 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.038784 1.000000
[1] 7.038784 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.349231 1.000000
[1] 7.349231 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.922644 1.000000
[1] 6.922644 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.247793 1.000000
[1] 7.247793 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.856462 1.000000
[1] 6.856462 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.951772 1.000000
[1] 6.951772 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.247793 1.000000
[1] 7.247793 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.902743 1.000000
[1] 6.902743 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.984716 1.000000
[1] 6.984716 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.200425 1.000000
[1] 7.200425 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.003065 1.000000
[1] 7.003065 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.951772 1.000000
[1] 6.951772 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.802395 1.000000
[1] 6.802395 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.507278 1.000000
[1] 6.507278 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.04316 1.00000
[1] 7.04316 1.00000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.077498 1.000000
[1] 7.077498 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.007601 1.000000
[1] 7.007601 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.756932 1.000000
[1] 6.756932 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.84588 1.00000
[1] 6.84588 1.00000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.142827 1.000000
[1] 7.142827 1.000000
First derivatives of the loglikelihood= 0.333147794008255,0 

Start values
[1] 6.59987 1.00000
[1] 6.59987 1.00000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 7.077498 1.000000
[1] 7.077498 1.000000
First derivatives of the loglikelihood= 0,0 

Start values
[1] 6.807935 1.000000
[1] 6.807935 1.000000
par(mfrow = c(1,2))

hist(theta_star[,4], 
     breaks = 50, 
     col = scales::alpha('red', 0.35), 
     border = 'white',
     las = 1,
     xlab = expression(Z[theta^'*']),
     main = expression('Histogram of '*Z[theta^'*']))

hist(theta_star[,5], 
     breaks = 50, 
     col = scales::alpha('blue', 0.35), 
     border = 'white',
     las = 1,
     xlab = expression(Z[log(theta^'*')]),
     main = expression('Histogram of '*Z[log(theta^'*')]))

exp_plot <- function(theta, p = seq(0.01, 0.99, 0.01)) { -log(1 - p) * theta }

xmat <- matrix(NA, nrow = B, ncol = length(seq(0.01, 0.99, 0.01)))
ymat <- matrix(NA, nrow = B, ncol = length(seq(0.01, 0.99, 0.01)))

# fill ymat
for(i in 1:B) {
  
   ymat[i,] <- -log(1-seq(0.01, 0.99, 0.01))*theta_star[i,1]
   xmat[i,] <- exp_plot(theta = theta_star[i,1])
  
}

matplot(x = xmat, y = ymat, type = 'l')

plot(y = -log(1-seq(0.01, 0.99, 0.01))*theta_ml$mle[3,1], x = exp_plot(theta_ml$mle[3,1]))

9.3.1 - Bootstrap CI for \(\theta\)

9.3.2 - Bootstrap CI for \(f(\theta)\)

9.3.3 - Comparison of CI methods

9.4 - Weibull, Lognormal and Loglogistic Distribution CI’s

9.4.1 - Construction of CI for parameters \(\bar{p}\)

9.4.2 - Construction of CI for \(f(\bar{p})\)

9.4.3 - Comparison of CI methods

9.5 - Nonparametric Bootstrap Confidence Intervals

9.5.1 - Nonparametric bootstrap sampling

9.5.2 - Warning on NP bootstrap estimates

9.5.3 - Distribution of bootstrap statistics

9.5.4 - Pointwise NP bootstrap CI’s for \(F(t_i)\)

9.6 - Percentile Bootstrap Method