I. Introduction and Motivation

Reliability analysis focuses on quantifying how long a system or component continues to perform its intended function before failure. The fundamental metric in this analysis is the reliability (or survival) function, defined as:

\[ R(t) = P(T > t), \]

where \(T\) is a random variable representing the lifetime of the unit.

In engineering and biomedical applications, lifetime data is rarely complete. It is frequently subject to right-censoring, meaning that for some units, the event of interest (failure) has not occurred by the end of the observation period.

To address this, the Kaplan-Meier (KM) estimator provides a robust, nonparametric estimate of the reliability function, denoted as \(\hat{R}(t)\).

The Challenge: Pointwise vs. Simultaneous Uncertainty

While traditional reliability reports focus on the estimated curve \(\hat{R}(t)\) or construct confidence intervals using Greenwood’s variance formula, these methods typically produce pointwise intervals. A pointwise interval guarantees that the true reliability falls within the bounds at a specific time \(t\) with probability \(1-\alpha\).

However, engineering applications often require stronger guarantees. For example, when establishing warranty periods or safety certifications, analysts need to ensure that the entire reliability curve lies within a specific region. This necessitates the use of Simultaneous Confidence Bands (SCBs). Unlike pointwise intervals, SCBs provide a global envelope \([L(t), U(t)]\) such that:

\[ P\left( L(t) \le R(t) \le U(t) \ \text{for all } t \in [0, t_{\max}] \right) \approx 1 - \alpha. \]

Project Scope

The goal of this project is to construct and compare two distinct approaches for generating these global confidence bands:

Bootstrap-based bands: A data-driven, computational approach that uses resampling to approximate the distribution of the maximum deviation.
Borell-TIS Gaussian bounds: An analytical approach utilizing results from Gaussian process theory—specifically the Borell-TIS inequality—to derive rigorous probability bounds.

This comparison will illustrate the trade-offs between empirical adaptability (Bootstrap) and theoretical rigor (Borell-TIS) in the presence of censored data.

II. Objectives

To achieve the comparison outlined above, this project aims to:

Implement the Kaplan-Meier estimator \(\hat{R}(t)\) to handle right-censored lifetime data.
Calculate Greenwood’s variance to establish a baseline for pointwise uncertainty.
Construct Bootstrap-based simultaneous confidence bands for \(\hat{R}(t)\) using resampling techniques.
Derive and Compute Borell-TIS bounds for \(\hat{R}(t)\) by modeling the error process as a Gaussian process.
Compare the performance of both methods using:
1. Simulated Data: Weibull-distributed lifetimes with controlled censoring rates.
2. Empirical Data: The cancer dataset from the R survival package, serving as a realistic proxy for component reliability data.

The ultimate objective is to provide a clear recommendation on when to prefer computational resampling versus analytical Gaussian bounds for reliability estimation.

III. Methodology

A. The Kaplan-Meier Estimator

To estimate the reliability function \(R(t)\) in the presence of right-censoring, we utilize the Product-Limit (Kaplan-Meier) estimator.

Suppose \(n\) units are placed on test. For each unit \(i = 1, \dots, n\), we observe a pair \((X_i, \delta_i)\), where:

\[ X_i = \min(T_i, C_i), \quad \delta_i = \mathbf{1}\{T_i \le C_i\}. \]

Here, \(T_i\) represents the true lifetime of the unit, and \(C_i\) represents the censoring time (e.g., the end of the study or loss to follow-up). The indicator \(\delta_i\) equals 1 if the failure is observed and 0 if the observation is censored.

Let \(t_{(1)} < t_{(2)} < \dots < t_{(k)}\) denote the distinct ordered failure times observed in the sample. For each distinct failure time \(t_{(j)}\), we define:

\(d_j\): The number of failures occurring exactly at \(t_{(j)}\).
\(n_j\): The number of units “at risk” (still operating and uncensored) just prior to \(t_{(j)}\).

The Kaplan-Meier estimator is defined as the product over failure times:

\[ \hat{R}(t) = \prod_{t_{(j)} \le t} \left( 1 - \frac{d_j}{n_j} \right). \]

For \(t < t_{(1)}\), \(\hat{R}(t) = 1\). The estimator is a step function that drops only at observed failure times.

B. Greenwood’s Variance and Pointwise Intervals

To quantify the uncertainty of \(\hat{R}(t)\), we require an estimate of its variance. A standard large-sample approximation is provided by Greenwood’s formula:

\[ \widehat{\mathrm{Var}}[\hat{R}(t)] = \hat{R}(t)^2 \sum_{t_{(j)} \le t} \frac{d_j}{n_j (n_j - d_j)}. \]

Using this variance, we can construct a \((1-\alpha)\) pointwise confidence interval for \(R(t)\) at any fixed time \(t\):

\[ \hat{R}(t) \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}[\hat{R}(t)]}, \]

where \(z_{\alpha/2}\) is the upper \(\alpha/2\) quantile of the standard normal distribution.

Simulation Study: KM and Pointwise Intervals

To evaluate the performance of the estimators, we first generate synthetic data where the true reliability function is known. This allows us to visually verify the accuracy of the Kaplan-Meier estimator and the coverage of the Greenwood pointwise intervals.

Data Simulation

We simulate lifetimes from a Weibull distribution, a common model in reliability engineering for components exhibiting wear-out behavior.

True Lifetime (\(T\)): \(T \sim \text{Weibull}(\text{shape} = 2, \text{scale} = 100)\).
Censoring Time (\(C\)): \(C \sim \text{Uniform}(0, 150)\).
Sample Size (\(n\)): 500 units.

The observed time is \(X_i = \min(T_i, C_i)\) and the status is \(\delta_i = \mathbf{1}\{T_i \le C_i\}\).

knitr::opts_chunk$set(warning = FALSE, message = FALSE)
# library
library(survival)

# 1. set Seed
set.seed(2025)

# 2. parameters
n <- 500
shape_param <- 2
scale_param <- 100

# 3. generate Weibull distribution 
T_true <- rweibull(n, shape = shape_param, scale = scale_param)

# 4. generate censoring times (Uniform). set max=150 to ensure a moderate percentage of censoring
C_times <- runif(n, min = 0, max = 150)

# 5. construct observed data X_i
obs_time <- pmin(T_true, C_times)
status <- as.numeric(T_true <= C_times) # indicator delta_i (1=failure, 0=censored)

# survival object
surv_obj <- Surv(obs_time, status)

# see the first 5 observations
head(surv_obj)

## [1] 55.77884  62.56528+ 81.55441  83.44385  49.80930  82.74532

Kaplan-Meier Fit and Greenwood Intervals

Next, we fit the Kaplan-Meier estimator using the survfit function. R also calculates the standard error using Greenwood’s formula and produces pointwise 95% confidence intervals.

# fit the Kaplan-Meier estimator
km_fit <- survfit(surv_obj ~ 1)

# define dynamic time points for the summary
# we pick 5 evenly spaced points based on the data range to inspect the fit
time_points <- pretty(c(0, max(obs_time)), n = 5)
time_points <- time_points[time_points > 0]

# summary
summary(km_fit, times = time_points)

## Call: survfit(formula = surv_obj ~ 1)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    50    261      99    0.751  0.0219        0.710        0.795
##   100     55      98    0.368  0.0310        0.312        0.434

We now visualize the estimated reliability curve \(\hat{R}(t)\). The solid line represents the KM estimate, and the dashed lines represent the pointwise intervals. Because this is a simulation, we can also overlay the true Weibull reliability curve (in red) to verify accuracy.

# plotting the KM Curve with pointwise CIs
plot(km_fit, 
     main = "Kaplan-Meier Estimate with Pointwise 95% CI",
     xlab = "Time (t)", 
     ylab = "Reliability R(t)",
     conf.int = TRUE,   
     col = "black",        
     lwd = 2, 
     las = 1)

# add in the TRUE theoretical Weibull curve for comparison
curve(pweibull(x, shape=shape_param, scale=scale_param, lower.tail=FALSE), 
      add=TRUE, col="red", lwd=2, lty=2)

legend("bottomleft", 
       legend = c("KM Estimate", "95% Pointwise CI", "True Weibull R(t)"),
       col = c("black", "black", "red"),
       lty = c(1, 2, 2),
       lwd = c(2, 1, 2))

The KM estimate (step function) tracks the True Weibull curve (red dashed line) very closely. Although, we see as time jumps from \(t=50\) to \(t=100\) the standard error increase, hence the confidence band gets wider.

C. Simultaneous Confidence Bands

While pointwise intervals are useful for local estimation, they do not quantify the uncertainty of the survival curve as a whole. We seek Simultaneous Confidence Bands (SCBs) \([L(t), U(t)]\) such that:

\[ P\left( L(t) \le R(t) \le U(t) \ \text{for all } t \in [0, t_{\max}] \right) \approx 1 - \alpha. \]

The probability that the entire curve remains within the bounds is significantly lower than the probability that a single point remains within the bounds. Therefore, to maintain a 95% global confidence level, simultaneous bands must be wider than pointwise intervals.

We employ two methods to construct these bands.

1. Bootstrap-based Bands (Empirical)

The bootstrap method approximates the distribution of the global deviation statistic by resampling the observed data.

The Algorithm: 1. Resample: Draw \(B\) random samples of size \(n\) from the original data \(\{(X_i, \delta_i)\}\) with replacement. 2. Estimate: For each bootstrap sample \(b\), compute the Kaplan-Meier estimator \(\hat{R}^{*(b)}(t)\). 3. Supremum Statistic: Calculate the maximum absolute difference between the bootstrap estimate and the original estimate over the entire time range: \(T_b = \sup_{t} |\hat{R}^{*(b)}(t) - \hat{R}(t)\) 4. Threshold: Determine the critical value \(c_\alpha\) as the \((1-\alpha)\) empirical quantile of the statistics \(\{T_1, \dots, T_B\}\).

The resulting band is uniform in width: \[ L(t) = \hat{R}(t) - c_\alpha, \quad U(t) = \hat{R}(t) + c_\alpha. \]

2. Borell-TIS Gaussian Bounds (Analytical Approach)

We now derive the analytical bounds. This method relies on the asymptotic convergence of the normalized Kaplan-Meier error process.

Step 1: The Normalized Error Process

We define the normalized Kaplan-Meier error process as:

\[ G_n(t) = \sqrt{n}(\hat{R}(t) - R(t)) \]

By the Central Limit Theorem in function space, as \(n \to \infty\):

\[ G_n(t) \implies \mathcal{G}(t) \]

where \(\mathcal{G}(t)\) is a zero-mean Gaussian process. Under the hypothesis that \(\hat{R}(t)\) is the correct estimator, this process should fluctuate around zero. We seek a confidence band that covers the supremum (maximum absolute value) of this process with probability \(1-\alpha\).

Step 2: The Borell-TIS Inequality

The Borell-TIS inequality states that if \(\mathcal{G}(t)\) is a centered Gaussian process with maximum variance \(\sigma^2_{\max} = \sup_t \mathrm{Var}(\mathcal{G}(t))\), then for any large threshold \(u > 0\):

\[ P\left( \sup_{t} \mathcal{G}(t) - \mathbb{E}[\sup_t \mathcal{G}(t)] > u \right) \le \exp\left(-\frac{u^2}{2\sigma^2_{\max}}\right) \]

This theorem tells us that the maximum of the process concentrates sharply around its mean.

Step 3: Solving for the Threshold

We want to find a threshold \(u_\alpha\) such that the probability of exceeding it is at most \(\alpha\). Setting the exponential bound equal to \(\alpha\):

\[ \exp\left(-\frac{u_\alpha^2}{2\sigma^2_{\max}}\right) \le \alpha \]

Taking the natural log of both sides (and noting \(\ln(\alpha) < 0\)):

\[ -\frac{u_\alpha^2}{2\sigma^2_{\max}} \le \ln(\alpha) \]

Multiplying by \(-1\) reverses the inequality:

\[ \frac{u_\alpha^2}{2\sigma^2_{\max}} \ge -\ln(\alpha) = \ln(1/\alpha) \]

Solving for \(u_\alpha\), we get the smallest value satisfying the condition:

\[ u_\alpha = \sigma_{\max} \sqrt{2 \ln(1/\alpha)} \]

Step 4: Translating back to the Empirical Estimator

The derivation above applies to the normalized process \(G_n(t) = \sqrt{n}(\hat{R}(t) - R(t))\). To find the band width for the actual reliability function \(R(t)\), we must rescale by dividing by \(\sqrt{n}\).

\[ |\hat{R}(t) - R(t)| \le \frac{u_\alpha}{\sqrt{n}} \]

Substituting our solution for \(u_\alpha\):

\[ |\hat{R}(t) - R(t)| \le \frac{\sigma_{\max} \sqrt{2\ln(1/\alpha)}}{\sqrt{n}} \]

This yields the conservative simultaneous confidence bands:

\[ L(t) = \hat{R}(t) - \frac{\sigma_{\max} k_\alpha}{\sqrt{n}}, \quad U(t) = \hat{R}(t) + \frac{\sigma_{\max} k_\alpha}{\sqrt{n}} \]

where \(k_\alpha = \sqrt{2\ln(1/\alpha)}\). In practice, we estimate \(\sigma_{\max}\) using the maximum of the standard errors calculated via Greenwood’s formula.

Constructing Simultaneous Confidence Bands

We now calculate the global confidence bands using both methods described in the methodology.

# 1. Bootstrap CIs
B <- 1000  # numbers of samples
alpha <- 0.05

# we evaluate on the specific failure times observed in the original data
time_grid <- summary(km_fit)$time
R_hat <- summary(km_fit)$surv # KM estimate

# placeholder for maximum deviations
sup_deviations <- numeric(B)

# bootstrap Loop
for(b in 1:B) {
  #resample data with replacement
  indices <- sample(1:n, n, replace = TRUE)
  boot_time <- obs_time[indices]
  boot_status <- status[indices]
  
  # then fit KM to bootstrap sample
  km_boot <- survfit(Surv(boot_time, boot_status) ~ 1)
  
  # now evaluate bootstrap KM on the exact same grid
  R_boot <- summary(km_boot, times = time_grid, extend = TRUE)$surv
  
  # finally calculate max absolute deviation for this loop
  diffs <- abs(R_boot - R_hat)
  sup_deviations[b] <- max(diffs, na.rm = TRUE)}

# Bootstrap critical value (c_alpha)

c_alpha_boot <- quantile(sup_deviations, 1 - alpha)


# 2. Borell-TIS CIs
# extract standard errors (Greenwood)
se_greenwood <- summary(km_fit)$std.err

# estimate maximum variance of the process: sigma^2_max = max(n * SE^2)
sigma_sq_max <- max(n * se_greenwood^2, na.rm = TRUE)

# calculate the threshold u_alpha: 
u_alpha <- sqrt(sigma_sq_max) * sqrt(2 * log(1/alpha))

# calculate the Borell-TIS uniform half-width:
width_bt <- u_alpha / sqrt(n)


# 3. plot
# plot KM Estimate (Black)
plot(km_fit, conf.int = FALSE, 
     main = paste("Simultaneous Confidence Bands (n=", n, ")", sep=""),
     xlab = "Time", ylab = "Reliability R(t)",
     lwd = 2, las = 1)

# add Bootstrap bands (blue, dashed)
lines(km_fit$time, pmax(0, km_fit$surv - c_alpha_boot), col = "blue", lty = 2, lwd = 2)
lines(km_fit$time, pmin(1, km_fit$surv + c_alpha_boot), col = "blue", lty = 2, lwd = 2)

# add Borell-TIS bands (Red, Dotted)
lines(km_fit$time, pmax(0, km_fit$surv - width_bt), col = "red", lty = 3, lwd = 2)
lines(km_fit$time, pmin(1, km_fit$surv + width_bt), col = "red", lty = 3, lwd = 2)

legend("bottomleft", 
       legend = c("KM Estimate", "Bootstrap SCB", "Borell-TIS SCB"),
       col = c("black", "blue", "red"),
       lty = c(1, 2, 3), lwd = 2)

# print results for comparison
cat("Bootstrap Half-Width (c_alpha):", round(c_alpha_boot, 4), "\n")

## Bootstrap Half-Width (c_alpha): 0.11

cat("Borell-TIS Half-Width (u_alpha/sqrt(n)):", round(width_bt, 4), "\n")

## Borell-TIS Half-Width (u_alpha/sqrt(n)): 0.0838

Summary of Simulation Results

The simulation for \(n=500\) yields the following insights regarding the reliability estimation and uncertainty quantification:

Visual Verification: The Kaplan-Meier estimator (black line) remains strictly within both sets of simultaneous confidence bands throughout the observation period, confirming the validity of the bands.
Comparison to Pointwise: As expected, both simultaneous bands are significantly wider than the pointwise Greenwood intervals. This additional width is the “cost” required to guarantee that the entire curve is covered with 95% probability.
Method Comparison:
- The Bootstrap Band (Blue Dashed) produced a uniform half-width of 0.11.
- The Borell-TIS Band (Red Dotted) produced a uniform half-width of 0.0838.
In this specific simulation, the Borell-TIS method provided a tighter (more precise) confidence envelope compared to the Bootstrap. This indicates that the Gaussian process approximation provided a sharp bound for the error process, while the nonparametric bootstrap was more sensitive to sampling variability in the tails of the distribution.

IV. Real Data Analysis: The Cancer Dataset

For the empirical application, we utilize the NCCTG Lung Cancer dataset included in the R survival package. This dataset contains survival times (in days) for patients with advanced lung cancer from the North Central Cancer Treatment Group.

Although this is biomedical data, it is structurally identical to reliability data:

Patients \(=\) Components.
Death \(=\) Failure.
Alive \(=\) Right-Censoring

Data Dictionary

The dataset consists of 228 observations. While there are covariates like age and sex, our reliability analysis focuses on the time-to-event variables.

Variable	Description	Role in Analysis
time	Survival time in days.	The lifetime variable (\(X_i\)).
status	Censoring status.	The event indicator (\(\delta_i\)).
age	Patient age in years.	Covariate (not used for univariate bands).
sex	Patient sex (1=Male, 2=Female).	Covariate.
ph.ecog	ECOG performance score (0-5).	Measure of functional capacity.

Loading and Pre-processing

We load the data and recode the status variable to align with standard reliability notation (\(1=\) Failure, \(0=\) Censored).

Note: In the raw survival package data, Status is coded as 1 (censored) and 2 (dead). We subtract 1 to convert this to the standard 0/1 format.

# Load library and dataset
library(survival)
data(lung, package = "survival")

# rename dataset for modification 
cancer_data <- lung
#View(cancer_data)

# Recode Status: 
# original: 1 = Censored, 2 = Dead
# new:      0 = Censored, 1 = Dead (Failure)
cancer_data$status <- cancer_data$status - 1 

# re-check dataset
#View(cancer_data)

# see the first 5 rows 
head(cancer_data)

# only care about time and status 
# check for missing values
na_time <- sum(is.na(cancer_data$time))
na_status <- sum(is.na(cancer_data$status))

cat("\nMissing Values in Time:", na_time, "\n")

## 
## Missing Values in Time: 0

cat("Missing Values in Status:", na_status, "\n")

## Missing Values in Status: 0

A-B. Kaplan-Meier Fit and Greenwood Intervals

Just like we did with the simulation, we start by fitting the standard Kaplan-Meier estimator and looking at the pointwise intervals. This gives us our baseline “survival curve” for the lung cancer patients.

We fit the Kaplan-Meier estimator to the empirical data. Note that we define the formula as Surv(time, status) ~ 1, which treats the entire dataset as a single population.

# 1. Fit the Kaplan-Meier Estimator to the cancer data

km_cancer <- survfit(Surv(time, status) ~ 1, data = cancer_data)

# 2. inspect the fit at certain time points
 
time_points_cancer <- pretty(c(0, max(cancer_data$time, na.rm=TRUE)), n = 5)
time_points_cancer <- time_points_cancer[time_points_cancer > 0]

summary(km_cancer, times = time_points_cancer)

## Call: survfit(formula = Surv(time, status) ~ 1, data = cancer_data)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   200    144      72   0.6803  0.0311       0.6219        0.744
##   400     57      54   0.3768  0.0358       0.3128        0.454
##   600     24      22   0.2136  0.0335       0.1571        0.290
##   800      8      15   0.0783  0.0246       0.0423        0.145
##  1000      2       2   0.0503  0.0228       0.0207        0.123

We visualize the survival curve with the standard pointwise 95% confidence intervals.

plot(km_cancer, 
     main = "Kaplan-Meier Estimate: Lung Cancer Survival",
     xlab = "Time (Days)", 
     ylab = "Survival Probability",
     conf.int = TRUE,      
     col = "black", 
     lwd = 2, 
     las = 1)

legend("topright", 
       legend = c("KM Estimate", "95% Pointwise CI"),
       col = c("black", "black"),
       lty = c(1, 2), lwd = 2)

High Mortality: by day 400, survival drops to ~37%. By day 1000, only ~5% of patients are expected to survive.
Declining Sample Size: we can see n.risk dropping rapidly (from 144 down to 2). This means our variance (uncertainty) will be high at the tail end, just like in the simulation.

C. Simultaneous Confidence Bands for Lung Cancer Survival

We now construct the simultaneous confidence bands to quantify the global uncertainty of the lung cancer survival curve.

# parameters
n <- nrow(cancer_data)  # n =228
B <- 1000               # Bootstrap iterations
alpha <- 0.05

# define the grid based on observed failure times in the real data
time_grid <- summary(km_cancer)$time
R_hat <- summary(km_cancer)$surv

# 2. Bootstrap construction
sup_deviations <- numeric(B)

for(b in 1:B) {
  # Resample indices
  indices <- sample(1:n, n, replace = TRUE)
  
  # create bootstrap sample
  boot_dat <- cancer_data[indices, ]
  
  # fit KM to bootstrap sample
  km_boot <- survfit(Surv(time, status) ~ 1, data = boot_dat)
  
  R_boot <- summary(km_boot, times = time_grid, extend = TRUE)$surv
  
  # calculate max deviation
  diffs <- abs(R_boot - R_hat)
  sup_deviations[b] <- max(diffs, na.rm = TRUE)
}

# Bootstrap multiplier
c_alpha_boot <- quantile(sup_deviations, 1 - alpha)


# 3. Borell-TIS construction 
# extract Standard Errors from KM fit
se_greenwood <- summary(km_cancer)$std.err

# estimate maximum variance of the process
sigma_sq_max <- max(n * se_greenwood^2, na.rm = TRUE)

# calculate threshold u_alpha
u_alpha <- sqrt(sigma_sq_max) * sqrt(2 * log(1/alpha))

# calculate Uniform margin of error
width_bt <- u_alpha / sqrt(n)


# 4. Plot
plot(km_cancer, conf.int = FALSE, 
     main = "Simultaneous Bands: Lung Cancer Data",
     xlab = "Time (Days)", ylab = "Survival Probability",
     lwd = 2, las = 1)

# add Bootstrap bands (blue)
lines(km_cancer$time, pmax(0, km_cancer$surv - c_alpha_boot), col = "blue", lty = 2, lwd = 2)
lines(km_cancer$time, pmin(1, km_cancer$surv + c_alpha_boot), col = "blue", lty = 2, lwd = 2)

# add Borell-TIS bands (Red)
lines(km_cancer$time, pmax(0, km_cancer$surv - width_bt), col = "red", lty = 3, lwd = 2)
lines(km_cancer$time, pmin(1, km_cancer$surv + width_bt), col = "red", lty = 3, lwd = 2)

legend("topright", 
       legend = c("KM Estimate", "Bootstrap SCB", "Borell-TIS SCB"),
       col = c("black", "blue", "red"),
       lty = c(1, 2, 3), lwd = 2)

# compare
cat("Real Data - Bootstrap Width:", round(c_alpha_boot, 4), "\n")

## Real Data - Bootstrap Width: 0.0915

cat("Real Data - Borell-TIS Width:", round(width_bt, 4), "\n")

## Real Data - Borell-TIS Width: 0.0877

Summary of Real Data Analysis

The application of both methods to the NCCTG lung cancer dataset (\(n=228\)) yielded results consistent with the simulation study:

Coverage: The Kaplan-Meier estimate remains fully contained within both sets of simultaneous bands, indicating that both methods successfully captured the global uncertainty.
Width Comparison:
- Bootstrap Half-Width: 0.0915
- Borell-TIS Half-Width: 0.0877
The Borell-TIS bands were approximately 4% narrower than the bootstrap bands. This suggests that the Gaussian process approximation remains robust even at moderate sample sizes (\(n \approx 228\)) and provides a slightly more precise confidence envelope for this dataset.

V. Discussion and Conclusion

This project set out to compare two methodologies for constructing simultaneous confidence bands for the reliability function under right-censoring: the data-driven Bootstrap method and the analytical Borell-TIS method (based on Gaussian process theory).

Key Findings

Accuracy of Pointwise Intervals: As demonstrated in the simulation, standard Greenwood pointwise intervals become increasingly unreliable at the tail of the distribution where the risk set is small. This confirms the necessity of simultaneous bands for making global guarantees about system performance.
Performance Comparison: Across both the Simulation Study (\(n=500\)) and the Real Data Analysis (\(n=228\)), the Borell-TIS method consistently produced narrower confidence bands compared to the Bootstrap approach.
- Simulation: Borell-TIS (0.084) vs. Bootstrap (0.110).
- Real Data: Borell-TIS (0.088) vs. Bootstrap (0.092).
Trade-offs:
- Bootstrap: Offers high flexibility and requires fewer theoretical assumptions, but is computationally intensive and produced slightly wider (more conservative) bands in our tests.
- Borell-TIS: Proved to be computationally efficient (instant calculation) and provided tighter bounds. This suggests that the Gaussian approximation for the normalized Kaplan-Meier process converges relatively quickly, making it a powerful tool for reliability engineers.

Final Recommendation

For datasets with moderate to large sample sizes (\(n > 200\)), the Borell-TIS approach is highly recommended. It provides rigorous probabilistic guarantees with superior precision and computational speed. The Bootstrap remains a valuable alternative for smaller sample sizes or complex censoring patterns where asymptotic normality may be in doubt.

VI. References

Kaplan, E. L., & Meier, P. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53(282), 457–481.
Meeker, W. Q., & Escobar, L. A. (1998). Statistical Methods for Reliability Data. Wiley.
Borell, C. (1975). The Brunn–Minkowski Inequality in Gauss Space. Mathematical Scandinavica.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press.

Nonparametric Reliability Estimation with Simultaneous Confidence Bands: Bootstrap versus Borell-TIS

Mimi Tran

2025-12-7