Prussian Cavalry Example

In 1898, the Russian economist and statistician Ladislaus Josephovich Bortkiewicz published an interesting finding about the probability distribution of the number of Prussian soldiers accidentally killed by horse kicks. The data came from ten army corps observed over 20 years, giving 200 corps-year observations in total, during which 122 soldiers were killed by horse kicks. On average, the number of deaths per corps per year is

\[ \lambda = \frac{122}{200} = 0.61\]

Using \(\lambda = 0.61\), Bortkiewicz applied the Poisson formula to predict the probability of \(x\) deaths in a corps in a given year, for \(x = 0, 1, 2, 3, 4, 5, 6\):

library(magrittr)  # provides the %>% pipe used below
dpois(0:6, lambda = 0.61) %>% round(4)
## [1] 0.5434 0.3314 0.1011 0.0206 0.0031 0.0004 0.0000
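
The probabilities above come from the Poisson mass function \(P(X = x) = e^{-\lambda}\lambda^{x}/x!\). As a minimal sketch, the formula can be evaluated by hand and checked against dpois, then scaled by the 200 observations to give the expected number of corps-years with each death count. The names poisson_pmf and expected_counts are illustrative and do not appear in the original analysis.

```r
lambda <- 0.61   # average deaths per corps-year, 122 / 200
x <- 0:6

# Poisson mass function written out explicitly
poisson_pmf <- exp(-lambda) * lambda^x / factorial(x)

# Should agree with R's built-in density function
all.equal(poisson_pmf, dpois(x, lambda = lambda))

# Expected number of corps-years (out of 200) with x deaths
expected_counts <- 200 * poisson_pmf
round(expected_counts, 1)
```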

Simulation Exercise

set.seed(12345)
Cavalry <- rpois(200,lambda=0.61)
Cavalry
##   [1] 1 2 1 2 0 0 0 0 1 3 0 0 1 0 0 0 0 0 0 2 0 0 2 1 1 0 1 1 0 0 1 0 0 1 0 0 1
##  [38] 2 1 0 1 0 2 1 0 0 0 0 0 1 2 1 0 0 1 0 1 0 0 0 1 0 3 1 3 0 2 0 1 2 1 0 0 0
##  [75] 0 1 2 1 0 0 1 0 0 0 0 0 1 0 1 0 2 1 0 1 1 0 1 1 0 0 0 1 2 1 0 2 1 1 0 0 1
## [112] 0 1 0 1 2 3 1 0 0 1 0 0 1 1 1 2 0 0 1 0 1 1 1 2 2 1 0 0 1 0 1 0 1 0 2 3 0
## [149] 2 0 1 1 1 0 0 1 1 0 1 1 1 1 1 1 0 0 3 0 1 0 2 0 1 0 0 1 1 2 1 1 0 0 0 0 2
## [186] 1 0 2 0 0 1 2 0 0 3 0 0 1 0 0
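
One way to read the simulated vector is to tabulate how many of the 200 values equal 0, 1, 2, and so on, next to the counts the Poisson model expects. The sketch below regenerates the sample from the same seed; the names observed and expected are illustrative.

```r
set.seed(12345)
Cavalry <- rpois(200, lambda = 0.61)

# Count how often each value occurs; fixed levels keep empty categories as zeros
observed <- table(factor(Cavalry, levels = 0:6))

# Expected counts under Poisson(0.61) for the same categories
expected <- 200 * dpois(0:6, lambda = 0.61)

# Side-by-side comparison, one row per number of deaths
data.frame(deaths   = 0:6,
           observed = as.vector(observed),
           expected = round(expected, 1))
```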

Mean and Variance

mean(Cavalry);var(Cavalry)
## [1] 0.705
## [1] 0.6612814

Dispersion Parameter

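For a Poisson random variable the variance equals the mean, so the dispersion parameter estimated from a sample should be close to 1:

\[ \theta = \frac{\mathrm{Var}(X)}{\mathrm{E}(X)} \approx 1\]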

Theta <-  var(Cavalry)/mean(Cavalry)

Theta %>% round(3)
## [1] 0.938

Simulation Exercise

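N <- 100000          # number of simulated samples
Theta <- numeric(N)  # dispersion estimate from each sample

# These samples use lambda = 5, not the cavalry value 0.61
for (i in 1:N) {
  X <- rpois(200, lambda = 5)
  Theta[i] <- var(X) / mean(X)
}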
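
The same simulation can be written without an explicit loop. The following is a sketch using replicate, with a much smaller number of repetitions so that it runs quickly; the names n_sims and theta_sim are illustrative, and the seed is an arbitrary choice that the original code does not set.

```r
set.seed(1)      # arbitrary seed, only so the sketch is repeatable
n_sims <- 2000   # far fewer than the 100000 repetitions in the loop, for speed

# Each repetition draws 200 Poisson(5) values and returns var / mean
theta_sim <- replicate(n_sims, {
  X <- rpois(200, lambda = 5)
  var(X) / mean(X)
})

length(theta_sim)
mean(theta_sim)   # should sit close to 1 for Poisson data
```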

95% Tolerance Interval of Theta Values

hist(Theta, breaks = 100, col=c("lightblue","lightpink"))

quantile(Theta, c(0.025,0.975))  %>% round(3)
##  2.5% 97.5% 
## 0.816 1.207