Using a Simulation-Based Approach to Detect Performance Degradation
Bryant Chen
About Me
Professional Background
Work
Statistician at Lockheed Martin Space (2016–present)
MMIII RSRV
Sentinel RVI
F35 ICP, IRADs, etc.
Math Tutor at Salt Lake Community College (2024–present)
Education
B.S. & B.A. in Industrial Engineering & Mathematics (2016)
M.S. in Statistics (2022)
Memberships
SDNS Member
SLC R Users
Areas of Statistical Experience
Reliability Analyses
Bayesian Statistics
Computational Statistics and Simulations
Regression Modelling
Sampling
Hobbies and Interests
Skiing
Backpacking
Biking
Climbing
Pickleballing
Volunteering
Aging Surveillance in DoD Programs
Assessment of age-related degradation in component performance is a key part of evaluating the weapons stockpile
Aging mechanisms are monitored by gathering performance data, trending it as a function of age against performance limit(s), and using the trend to predict age-out (sketched below)
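To make the trending idea concrete, here is a minimal sketch of fitting a linear trend to performance-vs-age data and extrapolating to the limit crossing. The data, the limit value, and the linear form are all hypothetical, for illustration only.

```r
# A minimal sketch of trending performance vs. age against a limit to predict
# age-out. The data, limit value, and linear form are hypothetical.
set.seed(1)
age  = runif(30, min = 0, max = 15)            # asset ages (years)
perf = 10 - 0.2 * age + rnorm(30, sd = 0.5)    # simulated performance readings
limit = 5                                      # assumed lower performance limit

fit = lm(perf ~ age)
b   = coef(fit)

# Predicted age-out: the age at which the fitted trend crosses the limit
age.out = unname((limit - b["(Intercept)"]) / b["age"])
age.out
```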
Data Collection
Data collection and trending form a sequential process
Fielded assets are randomly selected according to predetermined sampling requirements (quantity and frequency)
Interim analyses are performed periodically to identify the presence of aging degradation
Asset availability is a concern because destructive testing steadily depletes the population (see the sketch below)
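A back-of-the-envelope sketch of the depletion concern. The population size, test quantity, and test interval below are hypothetical, not program values.

```r
# Back-of-the-envelope look at population depletion under destructive testing.
# Population size, test quantity, and test interval are hypothetical.
pop.N = 400   # assumed fielded population size
n     = 8     # units destroyed per interim analysis
t     = 1     # years between interim analyses

analyses.until.depleted = floor(pop.N / n)              # 50 analyses at this pace
years.until.depleted    = analyses.until.depleted * t
years.until.depleted    # halving n (or doubling t) doubles this horizon
```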
How Can We Address This?
Common Solutions
Procure more units? Can be too expensive.
Source new manufacturers? Capable manufacturers are scarce.
Redesign for compatibility? Long lead-times.
Reduce testing to prolong asset availability? Risk undetected performance degradation.
Can we develop a dynamic decision criterion to reevaluate test quantities and testing intervals as we test? Let’s try.
Preference towards asset conservation
Statistical criteria
Statistical Trending
P-value: the probability of observing results at least as extreme as those obtained, assuming there is no real effect or relationship
I.e., a small p-value indicates that a result this extreme would be unlikely under chance alone, suggesting a real effect
If the true slope is small and the data are noisy, more data are needed to see the “bigger picture” and differentiate a true relationship from random chance with high probability
[1]
A large age effect in the presence of low variability has a higher chance of being detected as a true trend, with fewer data overall, than a small age effect in the presence of high variability
By simulating data very similar to our original dataset, we can identify an optimal combination of test quantity and test interval that maximizes our ability to detect a true aging slope with high probability, accounting for the inherent variability in the data (a minimal sketch of the idea follows)
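The sketch below estimates detection probability (power) by repeated simulation: generate data under a known slope and noise level, fit a regression, and count how often the slope is flagged as significant. The slopes, noise levels, and sample size are illustrative assumptions, not values from the actual study.

```r
# Minimal power-by-simulation sketch: probability of detecting a true slope.
set.seed(42)
sim.power = function(n, B1, sigma, nsim = 2000, alpha = 0.05) {
  mean(replicate(nsim, {
    x = runif(n, min = 0, max = 10)              # simulated asset ages
    y = 10 + B1 * x + rnorm(n, sd = sigma)       # performance with true slope B1
    coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"] < alpha
  }))
}

sim.power(n = 20, B1 = 0.5, sigma = 0.5)  # large effect, low noise: high power
sim.power(n = 20, B1 = 0.1, sigma = 2)    # small effect, high noise: low power
```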
Relevant Statistical Topics
Regression Modelling
Simulation and Power Analysis
Sequential Methods
Adaptive Designs in Clinical Trials
[2]
[3]
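Because each interim analysis re-tests the same hypothesis, the per-analysis significance level has to be adjusted to keep the overall false positive rate controlled. This sketch compares the two rules that appear in dynamic_alpha() in the simulation code below: the Bonferroni-style split actually used, and the commented-out O'Brien-Fleming-style bound, which is very stringent early and relaxes at later looks.

```r
# Two alpha-spending rules for repeated interim analyses.
m = 1:10                          # interim analysis number
bonf.alpha = 0.05 / m             # Bonferroni-style split (the rule used)

z = qnorm(1 - 0.05 / 2)
obf.alpha = 2 * (1 - pnorm(z / sqrt(m)))   # O'Brien-Fleming-style bound
                                           # (the commented-out alternative)
round(cbind(m, bonf.alpha, obf.alpha), 4)
```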
Simulation Procedure
```r
library(reshape2)
library(tidyverse)
library(magrittr)

B0 = 10                      # true intercept
# B1 = 0                     # for assessing the false positive rate
B1 = c(.1, .5, 1, 2, 5)      # vector of true slopes
sigma = c(.1, .5, 1, 2, 5)   # vector of true sds
max = 20                     # max number of interim analyses
t.vec = c(1, 2, 3)           # vector of test frequencies in years
n.vec = c(2, 4, 6, 8)        # vector of test quantities
overall_alpha = .05
nsim = 1000                  # number of simulations

# Function to calculate slope detection and p-value.
# x_i and y_i hold the accumulated test history (set in the loop below).
slope.detect = function(x.new, B0, B1, sigma) {
  x.old = x_i
  y.old = y_i
  y_det = B0 + B1 * x.new
  y.new = rnorm(n = length(x.new), mean = y_det, sd = sigma)
  x = append(x.old, x.new)
  y = append(y.old, y.new)
  model = lm(y ~ x)
  return(c(x = list(x), y = list(y),
           p.value = coef(summary(model))["x", "Pr(>|t|)"]))
}

# Function for dynamic alpha to control the false positive rate
dynamic_alpha = function(m, overall_alpha) {
  # z_alpha = qnorm(1 - overall_alpha / 2)
  # adj_alpha = 2 * (1 - pnorm(z_alpha / sqrt(m)))  # O'Brien-Fleming formula
  adj_alpha = overall_alpha / m                     # Bonferroni-style split
  return(adj_alpha)
}

# Function to simulate power and type I error at interim analysis m
power.sim = function(x.new, B0, B1, sigma, nsim, m, overall_alpha) {
  adj_alpha = dynamic_alpha(m = m, overall_alpha = overall_alpha)
  # compute power
  power.result = sum(replicate(n = nsim,
                               expr = slope.detect(x.new = x.new, B0 = B0,
                                                   B1 = B1,
                                                   sigma = sigma)$p.value) < adj_alpha) / nsim
  # compute type I error
  alpha.result = sum(replicate(n = nsim,
                               expr = slope.detect(x.new = x.new, B0 = B0,
                                                   B1 = 0,
                                                   sigma = sigma)$p.value) < adj_alpha) / nsim
  return(list(power.result = power.result, alpha.result = alpha.result))
}

# Create a grid of parameter combinations for power computation
table.power = array(data = NA,
                    dim = c(length(n.vec), length(t.vec), max, length(B1), length(sigma)),
                    dimnames = list(c(n.vec), c(t.vec), c(seq(1, max, by = 1)), c(B1), c(sigma)))
table.alpha = array(data = NA,
                    dim = c(length(n.vec), length(t.vec), max, length(B1), length(sigma)),
                    dimnames = list(c(n.vec), c(t.vec), c(seq(1, max, by = 1)), c(B1), c(sigma)))

for (l in 1:length(n.vec)) {
  for (c in 1:length(t.vec)) {
    for (i in 1:length(B1)) {
      for (j in 1:length(sigma)) {
        t = t.vec[c]   # set test frequency in years
        n = n.vec[l]   # set test quantity
        continue = TRUE
        m = 0          # interim analysis counter
        k = 0          # cumulative units tested
        time = 0
        x_i = numeric()
        y_i = numeric()
        pop.N = 400
        pop.age = seq(0, 5, length.out = 1000) |> sample(pop.N) |> round(digits = 2)
        pop.age_indices = seq(1, pop.N, by = 1)
        while (continue) {
          m = m + 1
          k = k + n
          time = time + t
          pop.age = pop.age + t
          sampled.indices = sample(pop.age_indices, size = n, replace = FALSE)  # sampling the indices
          sample.x = pop.age[sampled.indices]
          table.power[l, c, m, i, j] = power.sim(x.new = sample.x, B0 = B0,
                                                 B1 = B1[i], sigma = sigma[j],
                                                 nsim = nsim, m = m,
                                                 overall_alpha = overall_alpha)$power.result
          table.alpha[l, c, m, i, j] = power.sim(x.new = sample.x, B0 = B0,
                                                 B1 = 0, sigma = sigma[j],
                                                 nsim = nsim, m = m,
                                                 overall_alpha = overall_alpha)$alpha.result
          x_i = append(x_i, sample.x)
          sample.y = rnorm(n = n, mean = B0 + B1[i] * sample.x, sd = sigma[j])
          y_i = append(y_i, sample.y)
          pop.age_indices = setdiff(pop.age_indices, sampled.indices)  # remaining population after destructive sampling
          if (m >= max) { continue = FALSE }
        }
      }
    }
  }
}

# Convert the results arrays to long-format data frames
table.power = table.power |> melt(varnames = c("n", "t", "N", "B1", "sigma"))
table.alpha = table.alpha |> melt(varnames = c("n", "t", "N", "B1", "sigma"))

# Plot the power results
labeller = label_bquote(cols = sigma == .(sigma), rows = beta[1] == .(B1))
table.power |>
  ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ", sigma,
                                   ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (",
                                   alpha, " = 0.05)")),
       y = expression(paste("P[t " >= " ", t[1 - alpha], "|", beta[1] != 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))

# Plot the type I error
table.alpha |>
  ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t))) +
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Type I Error Curves",
       subtitle = expression(paste("Type I error as a function of ", beta[1], ", ", sigma,
                                   ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (",
                                   alpha, " = 0.05)")),
       y = expression(paste("P[t " >= " ", t[1 - alpha], "|", beta[1] == 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))

# Smooth out the random fluctuations
table.power |>
  ggplot() +
  geom_line(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)),
            alpha = 0.5) +                                     # raw data
  geom_smooth(aes(x = N, y = value, col = as.factor(n), linetype = as.factor(t)),
              method = "loess", se = FALSE, linewidth = .5) +  # smoothed curve
  facet_grid(rows = vars(B1), cols = vars(sigma), labeller = labeller) +
  theme_bw() +
  labs(title = "Power Curves with Smoothed and Raw Data",
       subtitle = expression(paste("Confidence (Power) as a function of ", beta[1], ", ", sigma,
                                   ", Test Quantity (n), Test Frequency (t), # of Trend Analyses (N) and Significance Lvl (",
                                   alpha, " = 0.05)")),
       y = expression(paste("P[t " >= " ", t[1 - alpha], "|", beta[1] != 0, "]")),
       x = "N Trend Analyses",
       color = "Test Quantity", linetype = "Test Frequency") +
  scale_x_continuous(breaks = seq(2, max, by = 2))
```
True Positive Rates
False Positive Rates
Conclusion
This work supports modifying the original test quantities and intervals to sustain program life-extension when asset availability is a concern
If a sufficient level of power has been achieved, we can consider reducing the number of tests required and increasing the interval between tests, while still maintaining confidence in detecting performance degradation over time (see the sketch below)
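A hedged sketch of how the simulated power table could drive such a revised test plan: among the (n, t) combinations that reach a target power by some analysis N, pick the one that consumes the fewest assets. This assumes the long-format table.power produced by the simulation code; target.power, B1.hat, and sigma.hat are illustrative values, with B1 and sigma imagined as estimated from the fielded data.

```r
library(dplyr)

target.power = 0.80
B1.hat = 0.5
sigma.hat = 1

table.power |>
  filter(B1 == B1.hat, sigma == sigma.hat, value >= target.power) |>
  group_by(n, t) |>
  summarise(first.N = min(N), .groups = "drop") |>   # earliest analysis hitting target
  mutate(assets.used = n * first.N) |>               # units destroyed by that point
  arrange(assets.used)
```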
Future Work
Extending to non-linear relationships
Bootstrapping effect and variability estimates (see the sketch after this list)
Fitting a non-linear model and/or smoother to the power and false-positive estimates to better capture and visualize trends
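A minimal sketch of the bootstrapping item above: resample the observed (age, performance) pairs to quantify uncertainty in the slope and residual SD, which could then feed the power simulation. The data here are a simulated stand-in for real surveillance data.

```r
set.seed(7)
age  = runif(40, min = 0, max = 10)
perf = 10 - 0.3 * age + rnorm(40, sd = 1)

boot.est = replicate(2000, {
  idx = sample(seq_along(age), replace = TRUE)       # resample with replacement
  fit = lm(perf[idx] ~ age[idx])
  c(slope = unname(coef(fit)[2]), sigma = summary(fit)$sigma)
})

# 95% bootstrap percentile intervals for the slope and residual SD
apply(boot.est, 1, quantile, probs = c(0.025, 0.5, 0.975))
```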