Optimizing Surveillance Test Designs

Using a Simulation-Based Approach to Detect Performance Degradation

Bryant Chen

About Me

Professional Background

  • Work
    • Statistician at Lockheed Martin Space (2016 - )
      • MMIII RSRV
      • Sentinel RVI
      • F35 ICP, IRADs, etc.
    • Math Tutor at Salt Lake Community College (2024 - )
  • Education
    • B.S. & B.A. in Industrial Engineering & Mathematics (2016)
    • M.S. in Statistics (2022)
  • Memberships
    • SDNS Member
    • SLC R Users

Areas of Statistical Experience

  • Reliability Analyses
  • Bayesian Statistics
  • Computational Statistics and Simulations
  • Regression Modelling
  • Sampling

Hobbies and Interests

  • Skiing
  • Backpacking
  • Biking
  • Climbing
  • Pickleballing
  • Volunteering

Aging Surveillance in DoD Programs

  • Assessments of age-related degradation in component performance are a key part of evaluating the weapons stockpile

  • Aging mechanisms are monitored by gathering and trending data on performance as a function of age against limit(s) to predict age-out
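
A minimal sketch of this trending idea, using fabricated data and a hypothetical performance limit (none of the values below are program data): fit a linear trend of performance against age and extrapolate to estimate when it crosses the limit.

# Illustrative only: fabricated performance data and a hypothetical lower limit
set.seed(1)
age   = runif(30, 0, 15)                       # asset ages in years
perf  = 10 - 0.2 * age + rnorm(30, sd = 0.5)   # performance degrading with age
limit = 5                                      # hypothetical performance limit

fit = lm(perf ~ age)
b = coef(fit)
age.out = unname((limit - b[1]) / b[2])  # age at which the fitted trend crosses the limit
age.out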

Data Collection

  • Data collection and trending is a sequential process
    • Fielded assets are randomly selected according to predetermined sampling requirements (quantity and frequency)
    • Interim analyses are performed periodically to identify the presence of aging degradation (a minimal sketch of this process follows this list)
  • Asset availability is of concern due to population depletion from destructive testing
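
A stripped-down sketch of one pass through this sequential process; the sample size, population ages, and simulated degradation below are placeholders, not the actual sampling requirements.

# Illustrative sequential sampling with periodic interim analyses (placeholder values)
set.seed(2)
pop.age = runif(400, 0, 5)       # ages of fielded assets
tested  = integer(0)             # indices already consumed by destructive testing
x = numeric(0); y = numeric(0)   # accumulated surveillance data

for (round in 1:3) {             # three interim analyses, 4 units per round
  pick   = sample(setdiff(seq_along(pop.age), tested), 4)
  tested = c(tested, pick)
  x = c(x, pop.age[pick])
  y = c(y, 10 - 0.3 * pop.age[pick] + rnorm(4, sd = 0.5))   # simulated performance
  print(coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"])          # interim test of the aging slope
  pop.age = pop.age + 1          # the fielded population ages between rounds
}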

How Can We Address This?

  • Common Solutions
    • Procure more units? Can be too expensive.
    • Source new capable manufacturers? Lack of capable manufacturers.
    • Redesign for compatibility? Long lead-times.
    • Reduce testing to prolong asset availability? Risk undetected performance degradation.
  • Can we develop a dynamic decision criterion to reevaluate test quantities and testing intervals as we test? Let’s try (one possible rule is sketched after this list).
    • Preference towards asset conservation
    • Statistical criteria
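
One possible shape for such a rule is sketched below; the 0.80 power target and the relaxation steps are placeholders for discussion, not program requirements.

# Hypothetical decision rule: relax the design once simulated power is high enough
update_design = function(power.est, n, t, target = 0.80) {
  if (!is.na(power.est) && power.est >= target) {
    # the current design over-tests for the degradation we need to detect: conserve assets
    list(n = max(n - 2, 2), t = t + 1)
  } else {
    # otherwise keep the current quantity and interval
    list(n = n, t = t)
  }
}

update_design(power.est = 0.92, n = 8, t = 1)  # suggests n = 6 units every t = 2 years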

  • A large age effect in the presence of low variability has a higher chance of being detected as a true trend with less data overall than a small age effect in the presence of high variability (illustrated below)
  • By simulating data closely resembling our original dataset, we can identify the combination of test quantity and test interval that detects a true aging slope with high probability while accounting for the inherent variability in the data
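
A compact illustration of the first point using fabricated data (the full parameter sweep appears in the simulation procedure below): at the same sample size, a steep slope against low noise is detected far more often than a shallow slope against high noise.

# Detection probability for two effect/variability regimes at a fixed sample size
detect = function(B1, sigma, n = 12, alpha = 0.05, nsim = 2000) {
  mean(replicate(nsim, {
    x = runif(n, 0, 10)
    y = 10 + B1 * x + rnorm(n, sd = sigma)
    coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"] < alpha
  }))
}

detect(B1 = 2,   sigma = 0.5)  # large effect, low variability: close to 1
detect(B1 = 0.1, sigma = 5)    # small effect, high variability: much lower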

Related Statistical Topics

  • Regression Modelling
  • Simulation and Power Analysis
  • Sequential Methods
  • Adaptive Designs in Clinical Trials

Simulation Procedure

library(reshape2)
library(tidyverse)
library(magrittr)

B0 = 10  # true intercept
#B1 = 0 # for assessing the false positive rate
B1 = c(.1, .5, 1, 2, 5)  # vector of true slopes
sigma = c(.1, .5, 1, 2, 5)  # vector of true sds 
max = 20  # max number of interim analyses
t.vec = c(1, 2, 3)  # vector of test frequencies in years
n.vec = c(2, 4, 6, 8)  # vector of test quantities
overall_alpha = .05

nsim = 1000  # Number of simulations

# Function to calculate slope detection and p-value
slope.detect = function(x.new, B0, B1, sigma) {
  # x_i, y_i (previously accumulated data) and n (test quantity) are read from the global environment
  x.old = x_i
  y.old = y_i
  
  y_det = B0 + B1 * x.new
  y.new = rnorm(n = n, mean = y_det, sd = sigma)
  
  x = append(x.old, x.new)
  y = append(y.old, y.new)
  model = lm(y ~ x)
  return(c(x = list(x), y = list(y), p.value = coef(summary(model))["x", "Pr(>|t|)"]))
}

# Function for dynamic alpha to control false positive rate
dynamic_alpha = function(m, overall_alpha){
  # z_alpha = qnorm(1 - overall_alpha / 2)
  # adj_alpha = 2 * (1 - pnorm(z_alpha / sqrt(m)))  # O'Brien-Fleming formula
  adj_alpha = overall_alpha/m  # simple 1/m alpha-spending rule at the m-th interim analysis
  
  return(adj_alpha)
}
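# Example: with overall_alpha = .05 the per-look threshold shrinks as 1/m, e.g.
# dynamic_alpha(m = 1:5, overall_alpha = .05) returns 0.05, 0.025, 0.0167, 0.0125, 0.01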

# Function to simulate power
power.sim = function(x.new, B0, B1, sigma, nsim, m, overall_alpha) {
  adj_alpha = dynamic_alpha(m = m, overall_alpha = overall_alpha)
  # compute power: proportion of simulated datasets whose slope is flagged at adj_alpha
  power.result = sum(replicate(n = nsim, expr = slope.detect(x.new = x.new,
                                                             B0 = B0,
                                                             B1 = B1,
                                                             sigma = sigma)$p.value) < adj_alpha) / nsim
  # compute type 1 error: the same check with a true slope of zero
  alpha.result = sum(replicate(n = nsim, expr = slope.detect(x.new = x.new,
                                                             B0 = B0,
                                                             B1 = 0,
                                                             sigma = sigma)$p.value) < adj_alpha) / nsim
  
  return(list(power.result = power.result, alpha.result = alpha.result))
}

#Create a grid of parameter combinations for power computation
table.power = array(data = NA, dim = c(length(n.vec),
                                       length(t.vec),
                                       max,
                                       length(B1),
                                       length(sigma)),
                    dimnames = list(c(n.vec),
                                    c(t.vec),
                                    c(seq(1, max, by = 1)),
                                    c(B1),
                                    c(sigma)))

table.alpha = array(data = NA, dim = c(length(n.vec),
                                       length(t.vec),
                                       max,
                                       length(B1),
                                       length(sigma)),
                    dimnames = list(c(n.vec),
                                    c(t.vec),
                                    c(seq(1, max, by = 1)),
                                    c(B1),
                                    c(sigma)))

for(l in 1:length(n.vec)){
  for(c in 1:length(t.vec)){
    for(i in 1:length(B1)){
      for(j in 1:length(sigma)){
        
        t = t.vec[c] # set test frequency in years
        n = n.vec[l] # set test quantity
        
        continue = TRUE
        m = 0
        k = 0
        time = 0
        x_i = numeric()
        y_i = numeric()
        
        pop.N = 400
        pop.age = seq(0, 5, length.out = 1000) |> sample(pop.N) |> round(digits = 2)
        pop.age_indices = seq(1, pop.N, by = 1)
        
        while (continue) {
          m = m + 1
          k = k + n
          time = time + t
          pop.age = pop.age + t
          
          sampled.indices = sample(pop.age_indices, size = n, replace = FALSE) # sampling the indices
          sample.x = pop.age[sampled.indices]
          
          # one call returns both the power estimate (true slope B1[i]) and the type I error (slope 0)
          sim.result = power.sim(x.new = sample.x, B0 = B0,
                                 B1 = B1[i],
                                 sigma = sigma[j],
                                 nsim = nsim, m = m,
                                 overall_alpha = overall_alpha)
          
          table.power[l, c, m, i, j] = sim.result$power.result
          table.alpha[l, c, m, i, j] = sim.result$alpha.result
          
          x_i = append(x_i, sample.x)
          sample.y = rnorm(n = n, mean = B0 + B1[i]*sample.x, sd = sigma[j])
          y_i = append(y_i, sample.y)
          pop.age_indices = setdiff(pop.age_indices, sampled.indices) # remaining population indices after sampling
          
          if(m >= max){continue = FALSE}
        }
        
      }
    }
  }
}

# Convert the power results array to a data frame
table.power = table.power |> melt(varnames = c("n","t","N","B1","sigma"))
table.alpha = table.alpha |> melt(varnames = c("n","t","N","B1","sigma"))

# Plot the power results
labeller = label_bquote(cols = sigma == .(sigma), rows = beta[1] == .(B1))

table.power |> ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.05)")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] != 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))

# Plotting the type 1 error
table.alpha |> ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Type I Error Curves",
       subtitle = expression(paste("Type I error as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.01)")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] == 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))



# Smooth out the random fluctuations
table.power |> 
  ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)), 
            alpha = 0.5) +  # Raw data
  geom_smooth(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)), 
              method = "loess", se = FALSE, linewidth = .5) +  # Smoothed curve
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves with Smoothed and Raw Data",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ",
                                   sigma, ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (", 
                                   alpha, " = 0.10")),
       y = expression(paste("P[t " >= " ", t[1-alpha], "|", beta[1] != 0, "]")),
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))
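
One way to turn these results into the decision criteria described earlier is to look up, for each effect/variability scenario, the first analysis at which a candidate design reaches a power target; the 0.80 target below is a placeholder, not a program requirement.

# For each (n, t, B1, sigma), find the first analysis where estimated power reaches 0.80
table.power |>
  filter(!is.na(value)) |>
  group_by(n, t, B1, sigma) |>
  summarise(first.N.80 = if (any(value >= 0.80)) min(N[value >= 0.80]) else NA_real_,
            .groups = "drop") |>
  arrange(B1, sigma, n, t)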

True Positive Rates

False Positive Rates

Conclusion

  • This work supports modifying the original test quantities and intervals to address asset-availability concerns and enable program life-extension efforts
  • Once a sufficient level of power has been achieved, we can consider reducing the number of tests required and increasing the interval between tests while still maintaining confidence in detecting performance degradation over time

Future Work

  • Extending to non-linear relationships
  • Bootstrapping effect and variability estimates (a sketch follows this list)
  • Fitting a non-linear model and/or smoother to power and false positive estimates to better encapsulate and visualize trends
    • Characterize uncertainty in these estimates
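
For the bootstrapping item, one simple approach is to resample the original surveillance data and refit the trend, feeding the resulting distributions of slope and residual-SD estimates into the simulation in place of fixed B1 and sigma values. The data frame dat with columns age and perf below is a placeholder for the original dataset.

# Sketch: nonparametric bootstrap of the slope and residual SD (dat is placeholder data)
boot.est = replicate(1000, {
  idx = sample(nrow(dat), replace = TRUE)
  fit = lm(perf ~ age, data = dat[idx, ])
  c(B1 = unname(coef(fit)["age"]), sigma = summary(fit)$sigma)
})
apply(boot.est, 1, quantile, probs = c(.05, .5, .95))  # uncertainty in B1 and sigma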