Formally: the problem of estimating the maximum N of a discrete uniform distribution on 1, ..., N from a sample drawn without replacement.
Named after a WW2 application: the Allies wanted to estimate the total number of German tanks from the serial numbers of captured tanks.
The reasoning relies on the mediocrity principle: it is very unlikely that a random sample of serial numbers would all be clustered at the beginning or the end of the range, so the largest observed serial should not sit far below the true maximum.
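A quick simulation can make this concrete. The sketch below assumes a hypothetical true total of N = 100 and samples of k = 4 serials (illustrative values, not taken from these notes). It checks how often every sampled serial lands in the lower half of the range, and what the estimator m + m/k - 1 gives on average.

```r
set.seed(1)      # reproducible draws
N <- 100         # hypothetical true number of tanks (assumption)
k <- 4           # sample size, matching the worked examples in these notes
reps <- 10000    # number of simulated samples

# Largest of k serial numbers drawn without replacement, repeated reps times
maxes <- replicate(reps, max(sample.int(N, k)))

# Share of samples whose serials all fall in the lower half of the range
lowerHalf <- mean(maxes <= N / 2)
exactLowerHalf <- choose(N / 2, k) / choose(N, k)

# Average of the estimator m + m/k - 1 over all simulated samples
avgEstimate <- mean(maxes + maxes / k - 1)

c(simulated = lowerHalf, exact = exactLowerHalf, avgEstimate = avgEstimate)
```

Under these assumptions, samples confined to the lower half are rare (the exact share is choose(50, 4) / choose(100, 4), roughly 6%), and the estimator should average out close to N.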
It has also been used to estimate iPod and Commodore 64 production, and it can be applied to randomly sampled, sequentially assigned user IDs to estimate traffic to websites etc.
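obs <- c(2, 6, 7, 14)       # observed serial numbers
m <- max(obs)               # largest observed serial
k <- length(obs)            # sample size
freqN <- m + (m / k) - 1    # frequentist point estimate of the total
freqN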
[1] 16.5
# 95% confidence interval for the total (continuous approximation)
lowconfinv <- m / (0.975^(1 / k))
highconfinv <- m / (0.025^(1 / k))
paste0("[", format(lowconfinv, digits = 5), ",",
       format(highconfinv, digits = 5), "]")
[1] "[14.089,35.208]"
Frequentist point estimate with a 95% confidence interval.
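One way to read the point estimate is as the sample maximum plus the average gap between observed serials. The sketch below is an illustration rather than part of the original calculation; the names gaps, avgGap and gapEstimate are introduced here. It counts the unobserved serials below each observation and checks that the result matches m + (m/k) - 1.

```r
obs <- c(2, 6, 7, 14)    # observed serial numbers
m <- max(obs)            # largest observed serial
k <- length(obs)         # sample size

# Unobserved serials before the smallest observation and
# between each consecutive pair of observations
gaps <- diff(c(0, sort(obs))) - 1
avgGap <- mean(gaps)

# Sample maximum plus the average gap
gapEstimate <- m + avgGap

# TRUE if this agrees with the closed form m + (m / k) - 1
isTRUE(all.equal(gapEstimate, m + (m / k) - 1))
```

The average gap is (m - k) / k, so adding it to m gives m + m/k - 1: the estimate assumes the unseen stretch above the maximum is about as long as a typical gap.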
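obs <- c(2, 6, 7, 14)    # observed serial numbers
m <- max(obs)            # largest observed serial
k <- length(obs)         # sample size
# Posterior mean (needs k > 2) and standard deviation (needs k > 3)
bayesMean <- (m - 1) * ((k - 1) / (k - 2))
bayesSD <- sqrt(((k - 1) * (m - 1) * (m - k + 1)) /
                ((k - 3) * ((k - 2)^2)))
paste0(format(bayesMean, digits = 5), "±",
       format(bayesSD, digits = 5))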
[1] "19.5±10.356"
The Bayesian approach yields a full posterior probability distribution for the total. Only its mean and standard deviation are computed here, since computing and plotting the whole distribution can be computationally intensive.
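For a sample this small, a truncated version of the posterior is cheap to tabulate. The sketch below assumes the standard posterior for this problem under a uniform prior on the total, P(N = n | m, k) = ((k - 1) / k) * choose(m - 1, k - 1) / choose(n, k) for n >= m. It cuts the unbounded support off at nMax = 1e6, an arbitrary choice made here for illustration; the names post, postMean, postSD and upper95 are likewise introduced here.

```r
obs <- c(2, 6, 7, 14)    # observed serial numbers
m <- max(obs)            # largest observed serial
k <- length(obs)         # sample size

nMax <- 1e6              # truncation point (assumption; the true support is unbounded)
n <- m:nMax              # candidate values for the total

# Posterior probability of each candidate total (valid for k >= 2)
post <- ((k - 1) / k) * choose(m - 1, k - 1) / choose(n, k)

totalMass <- sum(post)                          # should be very close to 1
postMean <- sum(n * post)                       # compare with bayesMean
postSD <- sqrt(sum(n^2 * post) - postMean^2)    # compare with bayesSD

# Smallest total whose cumulative posterior probability reaches 95%
upper95 <- n[which(cumsum(post) >= 0.95)[1]]

c(totalMass = totalMass, mean = postMean, sd = postSD, upper95 = upper95)
```

The tabulated mean and standard deviation should agree closely with the closed-form values. The standard deviation comes out marginally low because the heavy tail beyond nMax is dropped. A call such as plot(n[1:60], post[1:60], type = "h") would draw the first part of the distribution.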