Terminal High Altitude Area Defense interceptor

1 Terminal High Altitude Area Defense

Terminal High Altitude Area Defense (THAAD) is an American anti-ballistic missile defense system designed to shoot down short-, medium-, and intermediate-range ballistic missiles in their terminal phase (descent or reentry) by intercepting with a hit-to-kill approach. THAAD was developed after the experience of Iraq’s Scud missile attacks during the Gulf War in 1991. The THAAD interceptor carries no warhead, but relies on its kinetic energy of impact to destroy the incoming missile. A kinetic energy hit minimizes the risk of exploding conventional-warhead ballistic missiles, and the warhead of nuclear-tipped ballistic missiles will not detonate upon a kinetic-energy hit.

For more see:

https://en.wikipedia.org/wiki/Terminal_High_Altitude_Area_Defense

[THAAD Review video]

2 THAAD Test Events Data

We can import THAAD test events from the open source WIKI data like this, applying substitution FAILURE=FALSE and SUCCESS=TRUE. In fact we can produce two data sets for THAAD test events: Demonstration and Validation (DV) phase and Engineering and Manufacturing (EM) phase.

library(rvest)

## Loading required package: xml2

library(htmlTable)
library(DT)
url <- "https://en.wikipedia.org/wiki/Terminal_High_Altitude_Area_Defense"
#Demonstration and validation phase
thaad_test <- url %>%
  read_html()%>%
  html_nodes(xpath='//*[@id="mw-content-text"]/div/table[2]') %>%
  html_table()
thaad_test_data <- data.frame(cbind(thaad_test[[1]]$Date,thaad_test[[1]]$Result))

thaad_test_data[,2] <- ifelse(thaad_test[[1]]$Result=="Failure",FALSE,TRUE)

names(thaad_test_data)=c("Date","Result")

datatable(thaad_test_data,caption = "Demonstration and validation")

#print("Demonstration and validation")
#print(thaad_test_data) 

#Engineering and manufacturing  phase
thaad_test_2 <- url %>%
  read_html()%>%
  html_nodes(xpath='//*[@id="mw-content-text"]/div/table[3]') %>%
  html_table()
thaad_test_data_2 <- data.frame(cbind(thaad_test_2[[1]]$Date,thaad_test_2[[1]]$Result))

thaad_test_data_2[,2] <- ifelse(thaad_test_2[[1]]$Result=="Success",TRUE,FALSE)

names(thaad_test_data_2)=c("Date","Result")

datatable(thaad_test_data_2,caption = "Engineering and manufacturing")

#print("Engineering and manufacturing")
#print(thaad_test_data_2)

3 Reliability Point Estimate

3.1 Demonstration and Validation

summary(thaad_test_data$Result)

##    Mode   FALSE    TRUE 
## logical       6       5

NS <- sum(thaad_test_data$Result)
NALL <-length(thaad_test_data$Result) 
prop_success <- NS/NALL
prop_success

## [1] 0.4545455

As we see THAAD has got 45.5% reliability according to the test results during Demonstration and Validation phase.

3.2 Engineering and Manufacturing

summary(thaad_test_data_2$Result)

##    Mode   FALSE    TRUE 
## logical       4      13

NS2 <- sum(thaad_test_data_2$Result)
NALL2 <-length(thaad_test_data_2$Result) 
prop_success2 <- NS2/NALL2
prop_success2

## [1] 0.7647059

As we see THAAD has got 76.5% reliability according to the test results during Engineering and Manufacturing phase. Can we make out something else applying current data? Let’s see!

4 Reliability confidence interval

4.1 Demonstration and Validation

binom.test(NS,NALL,0.85)

## 
##  Exact binomial test
## 
## data:  NS and NALL
## number of successes = 5, number of trials = 11, p-value = 0.002657
## alternative hypothesis: true probability of success is not equal to 0.85
## 95 percent confidence interval:
##  0.1674881 0.7662064
## sample estimates:
## probability of success 
##              0.4545455

4.2 Engineering and Manufacturing

binom.test(NS2,NALL2,0.85)

## 
##  Exact binomial test
## 
## data:  NS2 and NALL2
## number of successes = 13, number of trials = 17, p-value = 0.3075
## alternative hypothesis: true probability of success is not equal to 0.85
## 95 percent confidence interval:
##  0.5010067 0.9318923
## sample estimates:
## probability of success 
##              0.7647059

One can see that THAAD has made great success during Engineering and Manufacturing phase compared to the Demonstration and Validation

5 Proportion of success model

We apply THAAD test data (both phases) for the proportion of success as function of the current test event using beta prior as Bayesian likelihood.

#The prop_model function - Rasmus Bååth R code

# This function takes a number of successes and failuers coded as a TRUE/FALSE
# or 0/1 vector. This should be given as the data argument.
# The result is a visualization of the how a Beta-Binomial
# model gradualy learns the underlying proportion of successes 
# using this data. The function also returns a sample from the
# posterior distribution that can be further manipulated and inspected.
# The default prior is a Beta(1,1) distribution, but this can be set using the
# prior_prop argument.

# Make sure the packages tidyverse and ggridges are installed, otherwise run:
# install.packages(c("tidyverse", "ggridges"))

# Example usage:
# data <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
# prop_model(data)
prop_model <- function(data = c(), prior_prop = c(1, 1), n_draws = 10000,
                       gr_name="Proportion graph") {
  library(tidyverse)
 
  data <- as.logical(data)
  # data_indices decides what densities to plot between the prior and the posterior
  # For 20 datapoints and less we're plotting all of them.
  data_indices <- round(seq(0, length(data), length.out = min(length(data) + 1, 40)))
  
  # dens_curves will be a data frame with the x & y coordinates for the 
  # denities to plot where x = proportion_success and y = probability
  proportion_success <- c(0, seq(0, 1, length.out = 100), 1)
  dens_curves <- map_dfr(data_indices, function(i) {
    value <- ifelse(i == 0, "Prior", ifelse(data[i], "Success", "Failure"))
    label <- paste0("n=", i)
    probability <- dbeta(proportion_success,
                         prior_prop[1] + sum(data[seq_len(i)]),
                         prior_prop[2] + sum(!data[seq_len(i)]))
    probability <- probability / max(probability)
    data_frame(value, label, proportion_success, probability)
  })
  # Turning label and value into factors with the right ordering for the plot
  dens_curves$label <- fct_rev(factor(dens_curves$label, levels =  paste0("n=", data_indices )))
  dens_curves$value <- factor(dens_curves$value, levels = c("Prior", "Success", "Failure"))
  
  graph_label <- paste("Prior likelihood distribution Beta(a =", 
                       as.character(prior_prop[1]),", b =",
                                    as.character(prior_prop[2]),")") 
  
  p <- ggplot(dens_curves, aes(x = proportion_success, y = label,
                               height = probability, fill = value)) +
    ggridges::geom_density_ridges(stat="identity", color = "white", alpha = 0.8,
                                  panel_scaling = TRUE, size = 1) +
    scale_y_discrete("", expand = c(0.01, 0)) +
    scale_x_continuous("Proportion of success") +
    scale_fill_manual(values = hcl(120 * 2:0 + 15, 100, 65), name = "", drop = FALSE,
                      labels =  c("Prior   ", "Success   ", "Failure   ")) +
    ggtitle(paste0(gr_name, ": ", sum(data),  " successes, ", sum(!data), " failures"),
            subtitle = graph_label) +
    labs(caption = "based on Rasmus Bååth R code") +
    theme_light() +
    theme(legend.position = "top")
  print(p)
  
  # Returning a sample from the posterior distribution that can be further 
  # manipulated and inspected
  posterior_sample <- rbeta(n_draws, prior_prop[1] + sum(data), prior_prop[2] + sum(!data))
  invisible(posterior_sample)
}

5.1 Demonstration and Validation

5.1.1 Uniform prior Beta(1,1)

prop_model(data = thaad_test_data$Result,prior_prop = c(1,1),gr_name = "Demonstration and validation")

5.1.2 Gaussian prior Beta(2,2)

prop_model(data = thaad_test_data$Result,prior_prop = c(2,2),gr_name = "Demonstration and validation")

5.2 Engineering and Manufacturing

5.2.1 Uniform prior Beta(1,1)

prop_model(data = thaad_test_data_2$Result,prior_prop = c(1,1),gr_name = "Engineering and manufacturing")

5.2.2 Gaussian prior Beta(2,2)

prop_model(data = thaad_test_data_2$Result,prior_prop = c(2,2),gr_name = "Engineering and manufacturing")

6 Bayesian STAN beta-bernouilli model

See relevant math stuff here:

## 
## data { 
##   int<lower=0> N; 
##   int<lower=0,upper=1> y[N];
##   
##   real a;
##   real b;
## } 
## 
## transformed data{
##   int<lower=0,upper=1> yy[N];
##   for (k in 1:N)
##   yy[k]=abs(1-y[k]);
## }
## 
## 
## parameters {
##   real<lower=0,upper=1> theta[N];
## } 
## model {
##   for (j in 1:N){
##   theta[j] ~ beta(a+sum(y[1:j]),b+sum(yy[1:j]));
##       for (n in 1:j) //
##       y[j] ~ bernoulli(theta[j]);
##   }
## }
## 
## generated quantities {
##   real<lower=0,upper=1> pp[N]; 
##   for (j in 1:N)
##   pp[j]=beta_rng(a+sum(y[1:j]),b+sum(yy[1:j]));
## }

6.1 Posterior sample with Beta(2,2)

6.1.1 Demonstration and Validation

y <- as.integer(thaad_test_data$Result)
N <- length(y)
data_list <- list(N=N,y=y,a=2,b=2)
fit.ber <- stan(model_code = stan_code,data = data_list,chains = 2,verbose = FALSE,
            iter = 5000,warmup = 1000,seed=12345)

rstan::traceplot(fit.ber)

s1 <- as.array(fit.ber)
pp_par <- paste("pp[",1:NALL,"]",sep = "")
pp_par2 <- paste("theta[",1:NALL,"]",sep = "")
bayesplot::mcmc_intervals(x = s1,pars = pp_par)

bayesplot::mcmc_intervals(x = s1,pars = pp_par2)

6.1.2 Engineering and Manufacturing

y2 <- as.integer(thaad_test_data_2$Result)
N2 <- length(y2)
data_list <- list(N=N2,y=y2,a=2,b=2)
fit.ber.2 <- stan(model_code = stan_code,data = data_list,chains = 2,verbose = FALSE,
            iter = 5000,warmup = 1000,seed=12345)

rstan::traceplot(fit.ber.2)

s2 <- as.array(fit.ber.2)
pp_par2 <- paste("pp[",1:NALL2,"]",sep = "")
pp_par22 <- paste("theta[",1:NALL2,"]",sep = "")
bayesplot::mcmc_intervals(x = s2,pars = pp_par2)

bayesplot::mcmc_intervals(x = s2,pars = pp_par22)

7 Reliability crystall ball magic

7.1 Demonstration and Validation

last_par_pp<- paste("pp[",NALL,"]",sep = "")
last_par_theta<- paste("theta[",NALL,"]",sep = "")
bayesplot::mcmc_hist(x = s1,pars = c(last_par_pp,last_par_theta))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

plot(fit.ber,pars = c(last_par_pp,last_par_theta))

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

print(fit.ber,pars = c(last_par_pp,last_par_theta))

## Inference for Stan model: 8110aaf154812cced3e43a9d64740f79.
## 2 chains, each with iter=5000; warmup=1000; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=8000.
## 
##           mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff Rhat
## pp[11]    0.47       0 0.12 0.23 0.38 0.46 0.55  0.71  8000    1
## theta[11] 0.69       0 0.09 0.51 0.63 0.70 0.76  0.85  8000    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Mar 19 16:26:59 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

7.2 Engineering and Manufacturing

last_par_pp_2<- paste("pp[",NALL2,"]",sep = "")
last_par_theta_2<- paste("theta[",NALL2,"]",sep = "")
bayesplot::mcmc_hist(x = s2,pars = c(last_par_pp_2,last_par_theta_2))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

plot(fit.ber.2,pars = c(last_par_pp_2,last_par_theta_2))

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

print(fit.ber.2,pars = c(last_par_pp_2,last_par_theta_2))

## Inference for Stan model: 8110aaf154812cced3e43a9d64740f79.
## 2 chains, each with iter=5000; warmup=1000; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=8000.
## 
##           mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff Rhat
## pp[17]    0.71       0 0.10 0.50 0.65 0.72 0.78  0.89  8000    1
## theta[17] 0.84       0 0.06 0.72 0.81 0.85 0.88  0.94  8000    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Mar 19 16:27:20 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

Here is a moment of truth - we see significant difference between two distributions of the posterior samples: \(theta\), produced by the data through Bernoulli experiment with the prior Beta(a,b) distribution as Bayesian likelihood and \(pp\), produced by Beta(a,b) random distribution only. The last one is similar to the proportion model above. Both samples use Gaussian like \(Beta(2,2)\) distribution. Now we are ready to make the final conclusion about THAAD reliability.

8 Conclusion

The point estimate of THAAD reliability looks like \(P_{DV}=0.455\) and \(P_{EM}=0.765\) in accordance with the data we have so far.
The 95% confidence interval for THAAD reliability looks like \(0.167\le P_{DV}\le0.766\) and \(0.501\le P_{EM}\le0.932\).
Proportion model makes our belief in THAAD reliability stronger and stronger from one event to another, from phase to phase no matter what prior we use: \(Beta(1,1)\) or \(Beta(2,2)\).
Bayesian estimate of the posterior sample for the THAAD reliability \(theta\) produced by Bernoulli experiment with data and \(Beta(2,2)\) prior as likelihood is significantly greater than \(Beta(2,2)\) random distribution for the proportion of success.
We believe that true THAAD reliability in the current Engineering and Manufacturing phase looks like 95% credible interval \(0.72\le P_{EM}\le0.94\) with the mean value \(M[P_{EM}]=0.85\).
All the estimates made above are based on the assumption that THAAD flight test data are correct in terms of \(Success\) and \(Failure\) events.

Demystifying THAAD Reliability

Alexander Levakov, Senior Research Fellow, Ph.D

March, 2019