alt text

library(binom)
library(pwr)
library(ggplot2)

Problem

We want to estimate sufficient and statistically significant number of trials in a real experiment, provided we have got knowledge of some historical precedent.

Main assumptions

We consider random experiment with exactly two possible outcomes, “success” and “failure”, in which the probability of success is the same every time the experiment is conducted. (http://en.wikipedia.org/wiki/Bernoulli_trial).
We need some example of a real experiment. Let’s see it here: https://en.wikipedia.org/wiki/M51_(missile), http://www.air-cosmos.com/reussite-d-un-tir-de-m51-43891.

Preliminary estimate

The data for M51 test flights is a real experiment with two possible outcomes: success and failure, relevant with Bernoulli trial. So far M51 has 6 trials with 5 successful outcomes. So we got our first estimate of the probability of success: \(P=5/6=0.83\)

Historical precedent

Is it enough? Good question! We do know that Trident D5 had gone through a full scaled test flight program, resulting in 28 events with 4 failures and 1 “no-test” event. http://fas.org/nuke/guide/usa/slbm/d-5.htm. So D5 had 27 trials with 23 successful outcomes. We got our estimate of the probability of success: \(P=23/27=0.85\)

alt text

We suppose that the submarine on patrol being on high alert must be able to launch no less than 90% of missiles on board. So we have got for french and american submarines low limits as \(Nfr=16*0.9=14\) and \(Nus=24*0.9=22\). What is the probability for successful launch of 90% of SLBM for french P(14<=X<=16) and american P(22<=X<=24) submarines ?

sum(dbinom(14:16,16,0.83))

## [1] 0.4723415

sum(dbinom(22:24,24,0.85))

## [1] 0.2798276

The result is unsatisfactory for both subs and especially for american one. We can estimate the low limit for probability of successful launch for one missile and the sub on the whole as the following:

sum(dbinom(14:16,16,0.85))

## [1] 0.5613793

sum(dbinom(22:24,24,0.9))

## [1] 0.5642737

Test for null hypothesis

First, we postulate null hypothesis H₀| P=0.85 and H₁| P<0.85 for M51. We use binom.test function.

binom.test(x = 5,n = 6,p = 0.85,alternative = "less")

## 
##  Exact binomial test
## 
## data:  5 and 6
## number of successes = 5, number of trials = 6, p-value = 0.6229
## alternative hypothesis: true probability of success is less than 0.85
## 95 percent confidence interval:
##  0.0000000 0.9914876
## sample estimates:
## probability of success 
##              0.8333333

Next, we postulate null hypothesis H₀| P=0.9 and H₁| P<0.9 for D5.

binom.test(x = 23,n = 27,p = 0.9,alternative = "less")

## 
##  Exact binomial test
## 
## data:  23 and 27
## number of successes = 23, number of trials = 27, p-value = 0.2821
## alternative hypothesis: true probability of success is less than 0.9
## 95 percent confidence interval:
##  0.0000000 0.9477668
## sample estimates:
## probability of success 
##              0.8518519

As we can see both M51 and D5 have got statistically significant results i.e. H₀ is not rejected.

Power of the test

To be honest 85% and 90% for successful launch of missile is not the goal of France and USA at all. The goal is 99%. How much trials do we need to get this result? We use pwr.p.test function.

pwr.p.test(h = ES.h(23/27,0.99),power = 0.8,alternative = "less")

## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.5898645
##               n = 17.76901
##       sig.level = 0.05
##           power = 0.8
##     alternative = less

pwr.p.test(h = ES.h(5/6,0.99),power = 0.8,alternative = "less")

## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.6407338
##               n = 15.0596
##       sig.level = 0.05
##           power = 0.8
##     alternative = less

So we got 18 additional trials for D5 and 15 ones for M51. What is the power of test for D5 and M51?

pwr.p.test(h = ES.h(23/27,0.99),n=27,alternative = "less")

## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.5898645
##               n = 27
##       sig.level = 0.05
##           power = 0.9222212
##     alternative = less

pwr.p.test(h = ES.h(5/6,0.99),n=6,alternative = "less")

## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.6407338
##               n = 6
##       sig.level = 0.05
##           power = 0.4699551
##     alternative = less

As we can see D5 test flight program has got 92% power while M51 - only 47% so far.

In fact D5 has got 136 consecutive successful events so far and together with 27 initial fly test events our estimate for P_D5 would be the following:

binom.test(x = 149,n = 153,p = 0.99,alternative = "less")

## 
##  Exact binomial test
## 
## data:  149 and 153
## number of successes = 149, number of trials = 153, p-value =
## 0.06851
## alternative hypothesis: true probability of success is less than 0.99
## 95 percent confidence interval:
##  0.0000000 0.9910216
## sample estimates:
## probability of success 
##              0.9738562

It means H₀|P_D5=0.99 can’t be rejected.

Bayes estimate

There is another way to consider this problem - Bayesian statistics (http://en.wikipedia.org/wiki/Bayesian_statistics). We can use binom.bayes and binom.bayes.densityplot functions for this purpose as far as M51 is considered:

binom.bayes(x = 5,n = 6)

##   method x n shape1 shape2      mean     lower     upper        sig
## 1  bayes 5 6    5.5    1.5 0.7857143 0.5047529 0.9989366 0.04999999

bs=binom.bayes(x = 5,n = 6)
binom.bayes.densityplot(bs,npoints = 500,alpha = 0.5)

binom.bayes(x = 5+15,n = 6+15)

##   method  x  n shape1 shape2      mean     lower     upper  sig
## 1  bayes 20 21   20.5    1.5 0.9318182 0.8282383 0.9998891 0.05

bs=binom.bayes(x = 5+15,n = 6+15)
binom.bayes.densityplot(bs,npoints = 500,alpha = 0.5)

The result is satisfactory for M51 so far and very optimistic for the future after 15 consecutive successful trials.

Binomial test: how much is enough?

Alexander Levakov