“Binomial test for Aegis”


author: “Alexander Levakov”

date: “Wednesday, November 12, 2014”

If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too… Rudyard Kipling, If

Main assumptions

We consider a random experiment with exactly two possible outcomes, success and failure, in which the probability of success is the same every time the experiment is conducted (http://en.wikipedia.org/wiki/Bernoulli_trial).

library(binom)
library(pwr)
library(ggplot2)

Data set and scope of experiment

To be honest, we need an example of a real experiment. Two good sources are http://www.mda.mil/global/documents/pdf/aegis_tests.pdf and http://en.wikipedia.org/wiki/Aegis_Ballistic_Missile_Defense_System. The Aegis test flight record is a real experiment with HIT (success) and MISS (failure) outcomes, consistent with a Bernoulli trial.

As defined in Title 10, United States Code: "The field test, under realistic combat conditions, of any item of (or key component of) weapons, equipment, or munitions for use in combat by typical military users; and the evaluation of the results of such tests."

Research question

Our goal is to calculate the number of additional trials needed to be confident that the probability of success (HIT) is 99%: \[1 - (1-0.99)^2 = 0.9999,\] i.e. by firing two Aegis missiles (shots) we would have a 99.99% probability of hitting the target with at least one of them (https://www.princeton.edu/sgs/publications/sgs/pdf/8_2Wilkening.pdf).

# P(at least one hit out of two shots, each with 0.99 kill probability)
dbinom(1,2,0.99)+dbinom(2,2,0.99)
## [1] 0.9999

Note

In fact, we can consider the experiment for 1, 2, 3 and 4 shots, but two shots (the blue curve) is the more realistic situation with Aegis so far.

# Overall kill probability for 2, 3 and 4 shots as a function of single-shot Pkill
pkill=seq(0.5,0.99,0.01)
p2=dbinom(1,2,pkill)+dbinom(2,2,pkill)
p3=dbinom(1,3,pkill)+dbinom(2,3,pkill)+dbinom(3,3,pkill)
p4=dbinom(1,4,pkill)+dbinom(2,4,pkill)+dbinom(3,4,pkill)+dbinom(4,4,pkill)
# Black diagonal: single-shot case; blue, magenta, red: 2, 3 and 4 shots
plot(pkill,pkill,type="l",col="black",main="Kill Probability with Multiple Shots", xlab="Pkill in one shot", ylab="Overall Pkill")
lines(pkill,p2,col="blue")
lines(pkill,p3,col="magenta")
lines(pkill,p4,col="red")
grid()
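
As a quick cross-check (a small sketch, not part of the original figure code), the same curves follow from the closed form \(1-(1-P_{kill})^n\), since missing with all n shots is the only way to miss overall; each comparison below should return TRUE.

# Sanity check: P(at least one hit in n shots) = 1 - (1 - pkill)^n
all.equal(p2, 1-(1-pkill)^2)
all.equal(p3, 1-(1-pkill)^3)
all.equal(p4, 1-(1-pkill)^4)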

Null and alternative hypotheses

First, we postulate the null hypothesis H0: P = 0.99 and the alternative hypothesis H1: P < 0.99. We use the binom.test function to test it against the observed record of 29 hits in 35 trials:

binom.test(x = 29,n = 35,p = 0.99,alternative = "less")
## 
##  Exact binomial test
## 
## data:  29 and 35
## number of successes = 29, number of trials = 35, p-value =
## 1.265e-06
## alternative hypothesis: true probability of success is less than 0.99
## 95 percent confidence interval:
##  0.0000000 0.9226056
## sample estimates:
## probability of success 
##              0.8285714
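
Before interpreting the result, a complementary two-sided view of the same 29/35 record (a small sketch, not in the original analysis) is the exact Clopper-Pearson interval from binom.confint in the binom package loaded above:

# Exact (Clopper-Pearson) two-sided 95% interval for 29 hits in 35 trials
binom.confint(x = 29,n = 35,methods = "exact")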

We got a very low p-value (the probability of falsely rejecting H0, a type I error) and should reject the null hypothesis H0: P = 0.99 in favor of the alternative H1: P < 0.99. In fact, after 35 trials we got 29 successes, i.e. \(29/35 \approx 0.83\), or an 83% estimated probability of success. Here is an interesting aspect of the binomial test for a Bernoulli trial: if we take the last test (number 35) as the cutting point for crossing the Rubicon and ask how many further trials are needed, we can use the pwr.p.test function.

# n needed to detect the observed 29/35 rate against H0: P = 0.99 with 80% power
pwr.p.test(h = ES.h(29/35,0.99),power = 0.8,alternative = "less")
## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.6534395
##               n = 14.47965
##       sig.level = 0.05
##           power = 0.8
##     alternative = less

So we need 15 additional trials (rounding 14.48 up) to detect, at the 5% significance level and with 80% power, a probability of success as low as the current estimate of 0.83 when testing H0: P = 0.99. If we then got at least 14 successes out of those 15 trials, i.e. 14/15, binom.test would no longer reject H0:

binom.test(x = 14,n = 15,p = 0.99,alternative = "less")
## 
##  Exact binomial test
## 
## data:  14 and 15
## number of successes = 14, number of trials = 15, p-value = 0.1399
## alternative hypothesis: true probability of success is less than 0.99
## 95 percent confidence interval:
##  0.0000000 0.9965863
## sample estimates:
## probability of success 
##              0.9333333
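
Conversely (a quick check, not in the original write-up), a second miss among those 15 additional shots would already count against H0: the one-sided p-value for 13 hits out of 15 drops to about 0.01, below the 5% level.

# Two misses in 15 additional trials: H0: P = 0.99 would be rejected
binom.test(x = 13,n = 15,p = 0.99,alternative = "less")$p.value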

Note

But the point is that in our data there are 6 consecutive successful trials after the failed one (25-Oct-12). This means we should recompute the number of additional trials starting from trial number 29, again using the pwr.p.test function.

pwr.p.test(h = ES.h(23/29,0.99),power = 0.8,alternative = "less")
## 
##      proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = -0.7440925
##               n = 11.16642
##       sig.level = 0.05
##           power = 0.8
##     alternative = less

So we need about 11 additional tests to support our strong belief in a 99% probability of success. Since 6 successes are already in hand, only 4 more successful outcomes are needed, provided at most one of the 11 trials fails. Let's check that 10 successes out of 11 would indeed not reject H0:

binom.test(x = 10,n = 11,p = 0.99,alternative = "less")
## 
##  Exact binomial test
## 
## data:  10 and 11
## number of successes = 10, number of trials = 11, p-value = 0.1047
## alternative hypothesis: true probability of success is less than 0.99
## 95 percent confidence interval:
##  0.0000000 0.9953478
## sample estimates:
## probability of success 
##              0.9090909

Power of test

The power of a test is another side of our confidence in the estimate of the probability of success; a short introduction is here: http://en.wikipedia.org/wiki/Statistical_power. In our case we are interested in the cutoff number of trials needed to reach 80% power. To find it we use the binom.power function.

# Power to detect a true probability of 0.5 when testing H0: P = 0.99, for n = 1..30
ntrials=1:30
power=binom.power(p.alt=0.5,n=ntrials,p=0.99)
plot(ntrials,power,type="l",main="Power of binomial test",xlab="number of trials")
grid()

binom.power(p.alt=0.5,n=c(24,25),p=0.99)
## [1] 0.7854478 0.8615198
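
The same cutoff can also be read off programmatically from the power curve computed above (a small sketch reusing the ntrials and power vectors); with the values shown it returns 25.

# Smallest number of trials with at least 80% power against p.alt = 0.5
min(ntrials[power >= 0.8])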

So to have 80% power to see the difference between P = 0.99 (H0) and P = 0.5 (H1) we need at least 25 trials, or even more; 24 trials are not enough. This is a very important point when deciding how long to run the experiment. For example, if someone after 19 trials has got only 11 successes and keeps believing in H0: P = 0.99, he deceives not only himself, he is fooling everyone with his experiment.

binom.power(p.alt=11/19,n=19,p=0.99)
##      alpha 
## 0.08104226

In this example the power is only about 8%, i.e. there is roughly a 92% chance of a type II error: accepting H0: P = 0.99 when the true probability of success is really 11/19.
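
A minimal sketch (reusing the binom.power call above) makes the type II error explicit:

# Power to detect a true probability of 11/19 with 19 trials is about 0.08,
# so the probability of a type II error is about 0.92
pw=binom.power(p.alt=11/19,n=19,p=0.99)
1-pw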

Bayesian statistics

There is another way to look at this problem: Bayesian statistics (http://en.wikipedia.org/wiki/Bayesian_statistics). We can use the binom.bayes and binom.bayes.densityplot functions for this purpose.

# Posterior summary for 10 hits in 11 trials (Jeffreys Beta(0.5, 0.5) prior by default)
binom.bayes(x = 10,n = 11)
##   method  x  n shape1 shape2  mean     lower     upper        sig
## 1  bayes 10 11   10.5    1.5 0.875 0.6947615 0.9996966 0.04999999
bs=binom.bayes(x = 10,n = 11)
binom.bayes.densityplot(bs,npoints = 500,alpha = 0.5)

Densityplot

We can use binom.bayes.densityplot to see how the estimate of the probability of success looked after 29 trials, before the decision about additional trials was made.

binom.bayes(x = 23,n = 29)
##   method  x  n shape1 shape2      mean     lower     upper        sig
## 1  bayes 23 29   23.5    6.5 0.7833333 0.6369896 0.9189328 0.05000001
bs=binom.bayes(x = 23,n = 29)
binom.bayes.densityplot(bs,npoints = 500,alpha = 0.5)
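
To see both situations on one plot (a sketch, assuming binom.bayes.densityplot draws one posterior density per row of its input; binom.bayes itself is vectorized over x and n):

# Posterior densities for 23/29 (before the decision) and 10/11 (hypothetical additional trials)
bs.both=binom.bayes(x = c(23,10),n = c(29,11))
binom.bayes.densityplot(bs.both,npoints = 500,alpha = 0.5)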

Conclusions

I hope this example of the binomial test with R was useful, no matter what data we used. Please don't deceive yourself about Aegis capability: the current SM-3 Block I is considered too slow to intercept a strategic ballistic missile. The game is not over!