This is an “RPubs format” of my page: https://dataz4s.com/statistics/the-binomial-distribution/
Explained with these slides:
Let X be the number of tails from flipping a fair coin 5 times. We can denote:
\[X= \text{the number of tails in 5 flips}\] \[X \in \{0,1,2,3,4,5\}\] \[X\sim B(5; 0.5)\]
Total possible outcomes = 32: \[2 \times 2 \times 2 \times 2 \times 2 = 2^5 = 32\] The probability of flipping 0 tails in 5 flips can be written: \[\frac{^5C_0}{32}\]
Each point probability for flipping 0-5 tails in 5 flips can be expressed like this:
\[P(X=0)= \frac{^5C_0}{32}= \frac{1}{32}\] \[P(X=1)= \frac{^5C_1}{32}= \frac{5}{32}\] \[P(X=2)= \frac{^5C_2}{32}= \frac{10}{32}\] \[P(X=3)= \frac{^5C_3}{32}= \frac{10}{32}\] \[P(X=4)= \frac{^5C_4}{32}= \frac{5}{32}\] \[P(X=5)= \frac{^5C_5}{32}= \frac{1}{32}\]
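As a quick check, the six point probabilities above can be reproduced in R. This is only a minimal sketch: the variable names `k`, `by_counting` and `by_dbinom` are my own choices for illustration, and `dbinom` is the built-in function covered further down.

```r
# Possible numbers of tails in 5 flips
k <- 0:5

# Counting approach: favourable outcomes divided by the 2^5 = 32 possible outcomes
by_counting <- choose(5, k) / 2^5

# The same point probabilities from R's built-in binomial function
by_dbinom <- dbinom(k, size = 5, prob = 0.5)

# Both approaches agree, and the probabilities sum to 1
all.equal(by_counting, by_dbinom)
sum(by_counting)
```

The counting approach only works here because the coin is fair, so all 32 sequences are equally likely.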
If I have a 25% chance of winning a tennis match, what is the probability that I win exactly 4 matches out of 6?
\[P(\text{exactly 4 wins out of 6 matches}) = P(X=4) = \binom{6}{4}\times 0.25^4 \times 0.75^2 \approx 0.0330\]
There are 15 ways to choose which 4 of the 6 matches are won:
\[^6C_4 = \binom{6}{4} = \frac{6!}{4!(6-4)!} = 15\]
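The tennis calculation can be broken into its two parts in R: the number of ways to place the 4 wins, and the probability of one specific sequence of 4 wins and 2 losses. The sketch below assumes the matches are independent; the variable names are illustrative only.

```r
n <- 6      # matches played
k <- 4      # matches won
p <- 0.25   # probability of winning a single match

# Number of ways to choose which 4 of the 6 matches are won
ways <- choose(n, k)

# Probability of any one specific sequence with 4 wins and 2 losses
one_sequence <- p^k * (1 - p)^(n - k)

# P(X = 4): number of sequences times the probability of each
p_four_wins <- ways * one_sequence
round(p_four_wins, 4)
```

The result is roughly 0.033, so winning exactly 4 of 6 matches happens in about 3.3% of cases.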
The point probabilities:
\[P(k \text{ wins in } n \text{ attempts}) = \binom{n}{k}\times P(\text{win})^{k} \times P(\text{lose})^{n-k} \]
Generalized formula, where p is the probability of success in a single trial:
\[P(X=k) = \binom{n}{k}\times p^{k} \times (1-p)^{n-k} \]
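The generalized formula translates almost literally into R. The helper `binom_pmf` below is hypothetical (it is not a base R function) and is written only to show that the formula matches the built-in `dbinom`.

```r
# Hypothetical helper: point probability P(X = k) for n trials
# with success probability p, written directly from the formula
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Tennis example: exactly 4 wins in 6 matches with a 25% chance per match
binom_pmf(4, n = 6, p = 0.25)

# The helper is vectorised over k and agrees with dbinom for every k
all.equal(binom_pmf(0:6, n = 6, p = 0.25),
          dbinom(0:6, size = 6, prob = 0.25))
```

In practice `dbinom` is the better choice; the helper is just the formula spelled out.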
What is my probability of losing more than 3 times out of 6 matches?
\[X = \text{number of lost matches}\] \[X\sim B(n=6;p=0.75)\] \[P(X>3)\]
\[P(X=0) = \binom{6}{0}\times 0.75^{0} \times (1-0.75)^{6-0} = 0.0002\] \[P(X=1) = \binom{6}{1}\times 0.75^{1} \times (1-0.75)^{6-1}= 0.0044\] \[P(X=2) = \binom{6}{2}\times 0.75^{2} \times (1-0.75)^{6-2}=0.0330\] \[P(X=3) = \binom{6}{3}\times 0.75^{3} \times (1-0.75)^{6-3}=0.1318 \] \[P(X=4) = \binom{6}{4}\times 0.75^{4} \times (1-0.75)^{6-4}=0.2966 \] \[P(X=5) = \binom{6}{5}\times 0.75^{5} \times (1-0.75)^{6-5}= 0.3560\] \[P(X=6) = \binom{6}{6}\times 0.75^{6} \times (1-0.75)^{6-6}= 0.1780\] \[P(X>3) = P(X=4) + P(X=5) + P(X=6) = 0.2966 + 0.3560 + 0.1780 = 0.8306\]
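The same tail probability can be computed in R in a few equivalent ways. This is a sketch using the built-in `dbinom` and `pbinom` functions; the variable names are my own.

```r
n <- 6          # matches played
p_lose <- 0.75  # probability of losing a single match

# Sum the point probabilities for 4, 5 and 6 losses
by_sum <- sum(dbinom(4:6, size = n, prob = p_lose))

# Complement rule: P(X > 3) = 1 - P(X <= 3)
by_complement <- 1 - pbinom(3, size = n, prob = p_lose)

# Upper tail directly: lower.tail = FALSE returns P(X > q)
by_upper_tail <- pbinom(3, size = n, prob = p_lose, lower.tail = FALSE)

round(c(by_sum, by_complement, by_upper_tail), 4)
```

All three values round to 0.8306, matching the hand calculation.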
Let’s look at the dbinom, pbinom, rbinom and qbinom functions in R:
Say that X is binomially distributed with n = 20 trials and probability of success p = 1/6:
\[X\sim B(n=20; p=1/6)\]
dbinom returns values of the probability mass function of X, f(x) = P(X = x) (R calls this the “density”):
# P(X=3)
dbinom(x=3, size = 20, prob = 1/6)
## [1] 0.2378866
Multiple probabilities:
# P(X=0) & P(X=1) & P(X=2) & P(X=3)
dbinom(x=0:3, size = 20, prob=1/6)
## [1] 0.02608405 0.10433621 0.19823881 0.23788657
# P(X <= 3)
sum( dbinom(x=0:3, size = 20, prob=1/6) )
## [1] 0.5665456
This calculation can also be done with the pbinom function.
pbinom returns values of the cumulative distribution function of X, F(x) = P(X ≤ x):
# P(X <= 3)
pbinom(q=3, size = 20, prob = 1/6, lower.tail = TRUE)
## [1] 0.5665456
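The link between the two functions holds for every value of x, not only for x = 3: the cumulative probabilities are the running sums of the point probabilities. A small sketch (variable names are illustrative):

```r
x <- 0:20  # every possible number of successes in 20 trials

# Point probabilities P(X = x) and their running sums
point_probs <- dbinom(x, size = 20, prob = 1/6)
running_sum <- cumsum(point_probs)

# Cumulative probabilities P(X <= x) straight from pbinom
cumulative <- pbinom(x, size = 20, prob = 1/6)

# The two vectors agree, and the last cumulative value is 1
all.equal(running_sum, cumulative)
cumulative[length(cumulative)]
```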
rbinom draws a random sample from a binomial distribution. Say we produce digital gadgets for an IT company, 150 gadgets per day. Defective gadgets are returned for reworking, and there is a 5% error rate. The number of gadgets that we need to rework on each day of a 5-day working week can be simulated like this:
rbinom(n = 5, size = 150, prob = 0.05)
## [1] 12 5 6 8 8
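Because rbinom is random, every run gives different numbers. The sketch below fixes the random seed so the simulation can be repeated, and compares the simulated week with the expected value n × p. The seed value 123 and the variable names are arbitrary assumptions of mine, not part of the original example.

```r
set.seed(123)  # arbitrary seed (an assumption), only to make the run repeatable

# Simulated number of defective gadgets on each of 5 working days
daily_rework <- rbinom(n = 5, size = 150, prob = 0.05)
daily_rework

# Simulated total for the week
weekly_rework <- sum(daily_rework)
weekly_rework

# Expected values from the binomial mean n * p
expected_per_day <- 150 * 0.05       # 7.5 gadgets per day
expected_per_week <- 5 * 150 * 0.05  # 37.5 gadgets per week
```

A single simulated week can land well above or below 37.5; the expected value only describes the long-run average.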
rbinom can model Bernoulli trials by setting the ‘size’ (number of trials) equal to 1. For example, for the outcome of 10 coin flips:
# 10 coin flips
rbinom(10, 1,.5)
## [1] 1 0 1 0 0 0 0 0 0 0
or for the number of successes in each of 10 rounds of flipping 100 coins:
# 10 rounds of 100 coin flips each
rbinom(10,100,.5)
## [1] 52 47 52 55 49 47 48 47 46 50
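The two calls are connected: a binomial count is a sum of Bernoulli trials. The sketch below illustrates this by simulating single flips and adding them up in groups of 100. The seed, the sample size of 10,000 and the variable names are arbitrary choices of mine.

```r
set.seed(42)  # arbitrary seed (an assumption), only for reproducibility

# 10,000 single coin flips: each draw is one Bernoulli trial (size = 1)
flips <- rbinom(n = 10000, size = 1, prob = 0.5)

# Share of successes; with this many flips it should be close to 0.5
mean(flips)

# Summing the flips in groups of 100 gives 100 counts of the same kind
# that rbinom(n = 100, size = 100, prob = 0.5) would produce directly
counts <- colSums(matrix(flips, nrow = 100))
head(counts)
```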
The qbinom function is the quantile function of the binomial distribution; it inverts the operation performed by pbinom. We give it a cumulative probability and it returns the smallest number of successes x for which P(X ≤ x) is at least that probability. For example, the 25th percentile of a B(10; 0.5) distribution:
qbinom(0.25,10,.5)
## [1] 4
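To see why the answer is 4, the result can be fed back into pbinom. This is a small sketch; the variable names `q`, `below` and `at` are mine.

```r
# 25th percentile of a binomial distribution with 10 trials and p = 0.5
q <- qbinom(0.25, size = 10, prob = 0.5)
q

# One step below q the cumulative probability is still under 0.25 (about 0.172)
below <- pbinom(q - 1, size = 10, prob = 0.5)

# At q the cumulative probability has reached at least 0.25 (about 0.377)
at <- pbinom(q, size = 10, prob = 0.5)

c(below, at)
```

Because the distribution is discrete, no value of x hits a cumulative probability of exactly 0.25, so qbinom returns the first x that reaches or passes it.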