M51
Flight test data
Beta-Binomial Proportion model
- Results
  - Flat Prior
  - Gaussian Prior
Bayesian Binomial Model with Gaussian Prior
- Markov Chain Trace plots
- Posterior Samples
Comparing models
Conclusion

M51

The M51 SLBM is a French submarine-launched ballistic missile, built by ArianeGroup, and deployed with the French Navy. Designed to replace the M45 SLBM (In French terminology the MSBS – Mer-Sol-Balistique-Stratégique “Sea-ground-Strategic ballistic”), it was first deployed in 2010.

See:

Flight test data

We use open source data on M51 flight test program.

M51_data <- c("TRUE","TRUE","TRUE", "TRUE","FALSE","TRUE")

Beta-Binomial Proportion model

#The prop_model function - Rasmus Bååth R code

# This function takes a number of successes and failuers coded as a TRUE/FALSE
# or 0/1 vector. This should be given as the data argument.
# The result is a visualization of the how a Beta-Binomial
# model gradualy learns the underlying proportion of successes 
# using this data. The function also returns a sample from the
# posterior distribution that can be further manipulated and inspected.
# The default prior is a Beta(1,1) distribution, but this can be set using the
# prior_prop argument.

# Make sure the packages tidyverse and ggridges are installed, otherwise run:
# install.packages(c("tidyverse", "ggridges"))

# Example usage:
# data <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
# prop_model(data)
prop_model <- function(data = c(), prior_prop = c(1, 1), n_draws = 10000,
                       gr_name="Proportion graph") {
  library(tidyverse)
  
  data <- as.logical(data)
  # data_indices decides what densities to plot between the prior and the posterior
  # For 20 datapoints and less we're plotting all of them.
  data_indices <- round(seq(0, length(data), length.out = min(length(data) + 1, 40)))
  
  # dens_curves will be a data frame with the x & y coordinates for the 
  # denities to plot where x = proportion_success and y = probability
  proportion_success <- c(0, seq(0, 1, length.out = 100), 1)
  dens_curves <- map_dfr(data_indices, function(i) {
    value <- ifelse(i == 0, "Prior", ifelse(data[i], "Success", "Failure"))
    label <- paste0("n=", i)
    probability <- dbeta(proportion_success,
                         prior_prop[1] + sum(data[seq_len(i)]),
                         prior_prop[2] + sum(!data[seq_len(i)]))
    probability <- probability / max(probability)
    data_frame(value, label, proportion_success, probability)
  })
  # Turning label and value into factors with the right ordering for the plot
  dens_curves$label <- fct_rev(factor(dens_curves$label, levels =  paste0("n=", data_indices )))
  dens_curves$value <- factor(dens_curves$value, levels = c("Prior", "Success", "Failure"))
  
  graph_label <- paste("Prior likelihood distribution Beta(a =", 
                       as.character(prior_prop[1]),", b =",
                       as.character(prior_prop[2]),")") 
  
  p <- ggplot(dens_curves, aes(x = proportion_success, y = label,
                               height = probability, fill = value)) +
    ggridges::geom_density_ridges(stat="identity", color = "white", alpha = 0.8,
                                  panel_scaling = TRUE, size = 1) +
    scale_y_discrete("", expand = c(0.01, 0)) +
    scale_x_continuous("Proportion of success") +
    scale_fill_manual(values = hcl(120 * 2:0 + 15, 100, 65), name = "", drop = FALSE,
                      labels =  c("Prior   ", "Success   ", "Failure   ")) +
    ggtitle(paste0(gr_name, ": ", sum(data),  " successes, ", sum(!data), " failures"),
            subtitle = graph_label) +
    labs(caption = "based on Rasmus Bååth R code") +
    theme_light() +
    theme(legend.position = "top")
  print(p)
  
  # Returning a sample from the posterior distribution that can be further 
  # manipulated and inspected
  posterior_sample <- rbeta(n_draws, prior_prop[1] + sum(data), prior_prop[2] + sum(!data))
  print(quantile(posterior_sample, c(0.025, 0.5, 0.975)))
  invisible(posterior_sample)
}

Results

Flat Prior

prop_model(data = M51_data,prior_prop = c(1,1),gr_name = "M51 flight test")

##      2.5%       50%     97.5% 
## 0.4272853 0.7716352 0.9623021

Gaussian Prior

prop_model(data = M51_data,prior_prop = c(2,2),gr_name = "M51 flight test")

##      2.5%       50%     97.5% 
## 0.3925804 0.7128904 0.9242871

Bayesian Binomial Model with Gaussian Prior

library(rstan)
library(dplyr)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
y <- recode(M51_data,"TRUE"=1,"FALSE"=0)

N <- length(y)
data_list <- list(N=N,y=y,a=2,b=2)
fit.ber <- stan(file = "stan_code.stan",data = data_list,chains = 2,verbose = FALSE,
                iter = 10000,warmup = 2000,seed=12345)
stan_model(file = "stan_code.stan")

## S4 class stanmodel 'stan_code' coded as follows:
##    
##    data { 
##      int<lower=0> N; 
##      int<lower=0,upper=1> y[N];
##      
##      real a;
##      real b;
##    } 
##    
##    transformed data{
##      int<lower=0,upper=1> yy[N];
##      for (k in 1:N)
##      yy[k]=abs(1-y[k]);
##    }
##    
##    
##    parameters {
##      real<lower=0,upper=1> theta[N];
##    } 
##    model {
##      for (j in 1:N){
##      theta[j] ~ beta(a+sum(y[1:j]),b+sum(yy[1:j]));
##          for (n in 1:j) //
##          y[j] ~ bernoulli(theta[j]);
##      }
##    }
##    
##    generated quantities {
##      real<lower=0,upper=1> pp[N]; 
##      for (j in 1:N)
##      pp[j]=beta_rng(a+sum(y[1:j]),b+sum(yy[1:j]));
##    }
##

Markov Chain Trace plots

traceplot(fit.ber, pars="theta",inc_warmup=TRUE)

Posterior Samples

NALL <- N
s1 <- as.array(fit.ber)
pp_par <- paste("pp[",1:NALL,"]",sep = "")
pp_par2 <- paste("theta[",1:NALL,"]",sep = "")

bayesplot::mcmc_intervals(x = s1,pars = pp_par2)

bayesplot::mcmc_intervals(x = s1,pars = pp_par)

bayesplot::mcmc_areas(x = s1,pars = pp_par2)

bayesplot::mcmc_areas(x = s1,pars = pp_par)

Comparing models

last_par_pp<- paste("pp[",NALL,"]",sep = "")
last_par_theta<- paste("theta[",NALL,"]",sep = "")
bayesplot::mcmc_hist(x = s1,pars = c(last_par_pp,last_par_theta))

plot(fit.ber,pars = c(last_par_pp,last_par_theta))

print(fit.ber,pars = c(last_par_pp,last_par_theta))

## Inference for Stan model: stan_code.
## 2 chains, each with iter=10000; warmup=2000; thin=1; 
## post-warmup draws per chain=8000, total post-warmup draws=16000.
## 
##          mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff Rhat
## pp[6]    0.70       0 0.14  0.4 0.61 0.71 0.80  0.93 16424    1
## theta[6] 0.81       0 0.09  0.6 0.76 0.83 0.88  0.96 23060    1
## 
## Samples were drawn using NUTS(diag_e) at Mon Nov 18 10:41:50 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

Conclusion

Both models should be applied for binomial proportion test.
Beta-Binomial model with Gaussian prior is preferable than flat prior one.
Bayesian binomial model is robust given Markov Chain trace plots results.
All in all Bayesian model gives more adequate estimate of M51 current reliability as credible interval \(P(0.6\le P_{M51}\le 0.96)=0.95\) with the mean value \(P_{M51}=0.83\)

M51 Reliability

Alexander Levakov, Senior Research Fellow, Ph.D

November, 2019

M51