Optimizing Surveillance Test Designs

Using a Simulation-Based Approach to Detect Performance Degradation

Bryant Chen

About Me

Professional Background

  • Work
    • Statistician at Lockheed Martin Space (2016 - )
      • MMIII RSRV
      • Sentinel RVI
      • F35 ICP, IRADs, etc.
    • Math Tutor at Salt Lake Community College (2024 - )
  • Education
    • B.S. & B.A. in Industrial Engineering & Mathematics (2016)
    • M.S. in Statistics (2022)
  • Memberships
    • SDNS Member
    • SLC R Users

Areas of Statistical Experience

  • Reliability Analyses
  • Bayesian Statistics
  • Computational Statistics and Simulations
  • Regression Modelling
  • Sampling

Hobbies and Interests

  • Skiing
  • Backpacking
  • Biking
  • Climbing
  • Pickleballing
  • Volunteering

Aging Surveillance in DoD Programs

  • Assessments of age-related degradation in component performance are a key part of evaluating the weapons stockpile

  • Aging mechanisms are monitored by gathering and trending data on performance as a function of age against limit(s) to predict age-out
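
A minimal sketch of this trending idea, using fabricated data and a hypothetical performance limit (none of the values below are program data): fit a linear trend of performance against age and extrapolate to estimate when it crosses the limit.

# Illustrative only: fabricated performance data and a hypothetical lower limit
set.seed(1)
age   = runif(30, 0, 15)                       # asset ages in years
perf  = 10 - 0.2 * age + rnorm(30, sd = 0.5)   # performance degrading with age
limit = 5                                      # hypothetical performance limit

fit = lm(perf ~ age)
b = coef(fit)
age.out = unname((limit - b[1]) / b[2])  # age at which the fitted trend crosses the limit
age.out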

Data Collection

  • Data collection and trending is a sequential process
    • Fielded assets are randomly selected according to predetermined sampling requirements (quantity and frequency)
    • Interim analyses are performed periodically to identify the presence of aging degradation (a minimal sketch of this process follows this list)
  • Asset availability is of concern due to population depletion from destructive testing
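
A stripped-down sketch of one pass through this sequential process; the sample size, population ages, and simulated degradation below are placeholders, not the actual sampling requirements.

# Illustrative sequential sampling with periodic interim analyses (placeholder values)
set.seed(2)
pop.age = runif(400, 0, 5)       # ages of fielded assets
tested  = integer(0)             # indices already consumed by destructive testing
x = numeric(0); y = numeric(0)   # accumulated surveillance data

for (round in 1:3) {             # three interim analyses, 4 units per round
  pick   = sample(setdiff(seq_along(pop.age), tested), 4)
  tested = c(tested, pick)
  x = c(x, pop.age[pick])
  y = c(y, 10 - 0.3 * pop.age[pick] + rnorm(4, sd = 0.5))   # simulated performance
  print(coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"])          # interim test of the aging slope
  pop.age = pop.age + 1          # the fielded population ages between rounds
}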

How Can We Address This?

  • Common Solutions
    • Procure more units? Can be too expensive.
    • Source new capable manufacturers? Lack of capable manufacturers.
    • Redesign for compatibility? Long lead-times.
    • Reduce testing to prolong asset availability? Risk undetected performance degradation.
  • Can we develop a dynamic decision criterion to reevaluate test quantities and testing intervals as we test? Let’s try (one possible rule is sketched after this list).
    • Preference towards asset conservation
    • Statistical criteria
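
One possible shape for such a rule is sketched below; the 0.80 power target and the relaxation steps are placeholders for discussion, not program requirements.

# Hypothetical decision rule: relax the design once simulated power is high enough
update_design = function(power.est, n, t, target = 0.80) {
  if (!is.na(power.est) && power.est >= target) {
    # the current design over-tests for the degradation we need to detect: conserve assets
    list(n = max(n - 2, 2), t = t + 1)
  } else {
    # otherwise keep the current quantity and interval
    list(n = n, t = t)
  }
}

update_design(power.est = 0.92, n = 8, t = 1)  # suggests n = 6 units every t = 2 years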

  • A large age effect in the presence of low variability has a higher chance of being detected as a true trend with less data overall than a small age effect in the presence of high variability (illustrated below)
  • By simulating data closely resembling our original dataset, we can identify the combination of test quantity and test interval that detects a true aging slope with high probability while accounting for the inherent variability in the data
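
A compact illustration of the first point using fabricated data (the full parameter sweep appears in the simulation procedure below): at the same sample size, a steep slope against low noise is detected far more often than a shallow slope against high noise.

# Detection probability for two effect/variability regimes at a fixed sample size
detect = function(B1, sigma, n = 12, alpha = 0.05, nsim = 2000) {
  mean(replicate(nsim, {
    x = runif(n, 0, 10)
    y = 10 + B1 * x + rnorm(n, sd = sigma)
    coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"] < alpha
  }))
}

detect(B1 = 2,   sigma = 0.5)  # large effect, low variability: close to 1
detect(B1 = 0.1, sigma = 5)    # small effect, high variability: much lower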

Related Statistical Topics

  • Regression Modelling
  • Simulation and Power Analysis
  • Sequential Methods
  • Adaptive Designs in Clinical Trials

Simulation Procedure

library(reshape2)
library(tidyverse)
library(magrittr)

B0 = 10  # true intercept
#B1 = 0 # for assessing the false positive rate
B1 = c(.1, .5, 1, 2, 5)  # vector of true slopes
sigma = c(.1, .5, 1, 2, 5)  # vector of true sds 
max = 20  # max number of interim analyses
t.vec = c(1, 2, 3)  # vector of test frequencies in years
n.vec = c(2, 4, 6, 8)  # vector of test quantities
overall_alpha = .05

nsim = 1000  # Number of simulations

# Function to calculate slope detection and p-value
slope.detect = function(x.new, B0, B1, sigma) {
  # x_i, y_i (previously accumulated data) and n (test quantity) are read from the global environment
  x.old = x_i
  y.old = y_i
  
  y_det = B0 + B1 * x.new
  y.new = rnorm(n = n, mean = y_det, sd = sigma)
  
  x = append(x.old, x.new)
  y = append(y.old, y.new)
  model = lm(y ~ x)
  return(c(x = list(x), y = list(y), p.value = coef(summary(model))["x", "Pr(>|t|)"]))
}

# Function for dynamic alpha to control false positive rate
dynamic_alpha = function(m, overall_alpha){
  # z_alpha = qnorm(1 - overall_alpha / 2)
  # adj_alpha = 2 * (1 - pnorm(z_alpha / sqrt(m)))  # O'Brien-Fleming formula
  adj_alpha = overall_alpha/m  # simple 1/m alpha-spending rule at the m-th interim analysis
  
  return(adj_alpha)
}
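# Example: with overall_alpha = .05 the per-look threshold shrinks as 1/m, e.g.
# dynamic_alpha(m = 1:5, overall_alpha = .05) returns 0.05, 0.025, 0.0167, 0.0125, 0.01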

# Function to simulate power
power.sim = function(x.new, B0, B1, sigma, nsim, m, overall_alpha) {
  adj_alpha = dynamic_alpha(m = m, overall_alpha = overall_alpha)
  # compute power: proportion of simulated datasets whose slope is flagged at adj_alpha
  power.result = sum(replicate(n = nsim, expr = slope.detect(x.new = x.new,
                                                             B0 = B0,
                                                             B1 = B1,
                                                             sigma = sigma)$p.value) < adj_alpha) / nsim
  # compute type 1 error: the same check with a true slope of zero
  alpha.result = sum(replicate(n = nsim, expr = slope.detect(x.new = x.new,
                                                             B0 = B0,
                                                             B1 = 0,
                                                             sigma = sigma)$p.value) < adj_alpha) / nsim
  
  return(list(power.result = power.result, alpha.result = alpha.result))
}

#Create a grid of parameter combinations for power computation
table.power = array(data = NA, dim = c(length(n.vec),
                                       length(t.vec),
                                       max,
                                       length(B1),
                                       length(sigma)),
                    dimnames = list(c(n.vec),
                                    c(t.vec),
                                    c(seq(1, max, by = 1)),
                                    c(B1),
                                    c(sigma)))

table.alpha = array(data = NA, dim = c(length(n.vec),
                                       length(t.vec),
                                       max,
                                       length(B1),
                                       length(sigma)),
                    dimnames = list(c(n.vec),
                                    c(t.vec),
                                    c(seq(1, max, by = 1)),
                                    c(B1),
                                    c(sigma)))

for(l in 1:length(n.vec)){
  for(c in 1:length(t.vec)){
    for(i in 1:length(B1)){
      for(j in 1:length(sigma)){
        
        t = t.vec[c] # set test frequency in years
        n = n.vec[l] # set test quantity
        
        continue = TRUE
        m = 0
        k = 0
        time = 0
        x_i = numeric()
        y_i = numeric()
        
        pop.N = 400
        pop.age = seq(0, 5, length.out = 1000) |> sample(pop.N) |> round(digits = 2)
        pop.age_indices = seq(1, pop.N, by = 1)
        
        while (continue) {
          m = m + 1
          k = k + n
          time = time + t
          pop.age = pop.age + t
          
          sampled.indices = sample(pop.age_indices, size = n, replace = FALSE) # sampling the indices
          sample.x = pop.age[sampled.indices]
          
          # one call returns both the power estimate (true slope B1[i]) and the type I error (slope 0)
          sim.result = power.sim(x.new = sample.x, B0 = B0,
                                 B1 = B1[i],
                                 sigma = sigma[j],
                                 nsim = nsim, m = m,
                                 overall_alpha = overall_alpha)
          
          table.power[l, c, m, i, j] = sim.result$power.result
          table.alpha[l, c, m, i, j] = sim.result$alpha.result
          
          x_i = append(x_i, sample.x)
          sample.y = rnorm(n = n, mean = B0 + B1[i]*sample.x, sd = sigma[j])
          y_i = append(y_i, sample.y)
          pop.age_indices = setdiff(pop.age_indices, sampled.indices) # remaining population indices after sampling
          
          if(m >= max){continue = FALSE}
        }
        
      }
    }
  }
}

# Convert the power results array to a data frame
table.power = table.power |> melt(varnames = c("n","t","N","B1","sigma"))
table.alpha = table.alpha |> melt(varnames = c("n","t","N","B1","sigma"))

# Plot the power results
labeller = label_bquote(cols = sigma == .(sigma), rows = beta[1] == .(B1))

table.power |> ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.05)")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] != 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))

# Plotting the type 1 error
table.alpha |> ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Type I Error Curves",
       subtitle = expression(paste("Type I error as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.01)")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] == 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))



# Smooth out the random fluctuations
table.power |> 
  ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)), 
            alpha = 0.5) +  # Raw data
  geom_smooth(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)), 
              method = "loess", se = FALSE, linewidth = .5) +  # Smoothed curve
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves with Smoothed and Raw Data",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.10")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] != 0, "]")),
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))
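
One way to turn these results into the decision criteria described earlier is to look up, for each effect/variability scenario, the first analysis at which a candidate design reaches a power target; the 0.80 target below is a placeholder, not a program requirement.

# For each (n, t, B1, sigma), find the first analysis where estimated power reaches 0.80
table.power |>
  filter(!is.na(value)) |>
  group_by(n, t, B1, sigma) |>
  summarise(first.N.80 = if (any(value >= 0.80)) min(N[value >= 0.80]) else NA_real_,
            .groups = "drop") |>
  arrange(B1, sigma, n, t)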

True Positive Rates

False Positive Rates

Conclusion

  • This work supports modifying the original test quantities and intervals to address asset-availability concerns and enable program life-extension efforts
  • Once a sufficient level of power has been achieved, we can consider reducing the number of tests required and increasing the interval between tests while still maintaining confidence in detecting performance degradation over time

Future Work

  • Extending to non-linear relationships
  • Bootstrapping effect and variability estimates (a sketch follows this list)
  • Fitting a non-linear model and/or smoother to power and false positive estimates to better encapsulate and visualize trends
    • Characterize uncertainty in these estimates
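
For the bootstrapping item, one simple approach is to resample the original surveillance data and refit the trend, feeding the resulting distributions of slope and residual-SD estimates into the simulation in place of fixed B1 and sigma values. The data frame dat with columns age and perf below is a placeholder for the original dataset.

# Sketch: nonparametric bootstrap of the slope and residual SD (dat is placeholder data)
boot.est = replicate(1000, {
  idx = sample(nrow(dat), replace = TRUE)
  fit = lm(perf ~ age, data = dat[idx, ])
  c(B1 = unname(coef(fit)["age"]), sigma = summary(fit)$sigma)
})
apply(boot.est, 1, quantile, probs = c(.05, .5, .95))  # uncertainty in B1 and sigma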