R Functions dbinom, pbinom, and rbinom

Random varaible \(X\) is distributed \(X \sim b(n,p)\) with mean \(\mu=np\) and variance \(\sigma^2 = np(1-p)\) if \(X\) is the count of successful events in \(n\) identical and independent Bernoulli trials with constant probability of success \(p\). The probability of \(X=x\) successes in \(n\) trials is \(P(X=k) = \frac{{n!}}{{x!(n-x)!}} p^x (1-p)^{n-x}\).

R function dbinom(x, size, prob) is the probability of x successes in size trials when the probability of success is prob. R function pbinom(q, size, prob, lower.tail) is the cumulative probability (lower.tail = TRUE for left tail, lower.tail = FALSE for right tail) of less than or equal to q successes. R function rbinom(n, size, prob) returns n random numbers from the binomial distribution x~b(size,prob).

Example

What is the probability of 2 heads in 10 coin flips where probability of heads is 0.3?

# exact
dbinom(x = 2, size = 10, prob = 0.3)
## [1] 0.2334744
# simulated
mean(rbinom(n = 10000, size = 10, prob = 0.3) == 2)
## [1] 0.2351
library(dplyr)
library(ggplot2)
#library(scales)

data.frame(heads = 0:10, prob = dbinom(x = 0:10, size = 10, prob = 0.3)) %>%
  mutate(Heads = ifelse(heads == 2, "2", "other")) %>%
ggplot(aes(x = factor(heads), y = prob, fill = Heads)) +
  geom_col() +
  geom_text(
    aes(label = round(prob,2), y = prob + 0.01),
    position = position_dodge(0.9),
    size = 3,
    vjust = 0
  ) +
  labs(title = "Probability of X = 2 successes.",
       subtitle = "b(10, .3)",
       x = "Successes (x)",
       y = "probability") 

Example

What is the probability of <=5 heads in 10 coin flips where probability of heads is 0.3?

# exact
pbinom(q = 5, size = 10, p = 0.3, lower.tail = TRUE)
## [1] 0.952651
# simulated
mean(rbinom(n = 10000, size = 10, prob = 0.3) <= 5)
## [1] 0.9549
library(dplyr)
library(ggplot2)

data.frame(heads = 0:10, 
           pmf = dbinom(x = 0:10, size = 10, prob = 0.3),
           cdf = pbinom(q = 0:10, size = 10, prob = 0.3, 
                        lower.tail = TRUE)) %>%
  mutate(Heads = ifelse(heads <= 5, "<=5", "other")) %>%
ggplot(aes(x = factor(heads), y = cdf, fill = Heads)) +
  geom_col() +
  geom_text(
    aes(label = round(cdf,2), y = cdf + 0.01),
    position = position_dodge(0.9),
    size = 3,
    vjust = 0
  ) +
  labs(title = "Probability of X <= 5 successes.",
       subtitle = "b(10, .3)",
       x = "Successes (x)",
       y = "probability") 

Example

What is the probability of >=5 heads in 10 coin flips where probability of heads is 0.3?

# exact
pbinom(q = 4, size = 10, p = 0.3, lower.tail  = FALSE)
## [1] 0.1502683
# simulated
mean(rbinom(n = 10000, size = 10, prob = 0.3) >= 5)
## [1] 0.1452
library(dplyr)
library(ggplot2)

data.frame(heads = 0:10, 
           pmf = dbinom(x = 0:10, size = 10, prob = 0.3),
           cdf = pbinom(q = -1:9, size = 10, prob = 0.3, 
                        lower.tail = FALSE)) %>%
  mutate(Heads = ifelse(heads >= 5, ">=5", "other")) %>%
ggplot(aes(x = factor(heads), y = cdf, fill = Heads)) +
  geom_col() +
  geom_text(
    aes(label = round(cdf,2), y = cdf + 0.01),
    position = position_dodge(0.9),
    size = 3,
    vjust = 0
  ) +
  labs(title = "Probability of X >= 5 successes.",
       subtitle = "b(10, .3)",
       x = "Successes (x)",
       y = "probability") 

Example

What is the expected number of heads in 25 coin flips where probability of heads is 0.3?

# exact
25 * 0.3
## [1] 7.5
mean(rbinom(n = 10000, size = 25, prob = .3))
## [1] 7.5342
# Variance
25 * 0.3 * (1 - 0.3)
## [1] 5.25
var(rbinom(n = 10000, size = 25, prob = .3))
## [1] 5.2044
library(dplyr)
library(ggplot2)

data.frame(heads = 0:25, 
           pmf = dbinom(x = 0:25, size = 25, prob = 0.3)) %>%
ggplot(aes(x = factor(heads), y = pmf)) +
  geom_col() +
  geom_text(
    aes(label = round(pmf,2), y = pmf + 0.01),
    position = position_dodge(0.9),
    size = 3,
    vjust = 0
  ) +
  labs(title = "Probability of X = x successes.",
       subtitle = "b(25, .3)",
       x = "Successes (x)",
       y = "probability") 

Example

Suppose X is a random b(10, .6) variable and Y is a random b(10, .7) variable, and X and Y they are independent. What is the probability that either variable is <=4?

(x_le4 <- pbinom(q = 4, size = 10, prob = 0.6, lower.tail = TRUE))
## [1] 0.1662386
(y_le4 <- pbinom(q = 4, size = 10, prob = 0.7, lower.tail = TRUE))
## [1] 0.04734899
(x_or_y_le4 <- x_le4 + y_le4 - x_le4 * y_le4)
## [1] 0.2057164

Example

A pharmaceutical company claims that a new treatment is successful in reducing fever in more than 60% of the cases. The treatment was tried on 40 randomly selected cases and 11 were successful. Do you doubt the company’s claim?

If the claim is valid, \(\pi = 0.6\)

pi = 0.6
n = 40
k = 11
p = k / 40

# probability of k <= 11 when n = 40 and pi = 0.6
pbinom(q = k, size = 40, prob = pi)
## [1] 3.163713e-05
# 3 standard deviations from the expected value
n * pi - 3 * sqrt(n * pi * (1 - pi))
## [1] 14.70484
n * pi + 3 * sqrt(n * pi * (1 - pi))
## [1] 33.29516

Doubt the claim because 11 successes are not within 3 SD of expected value n * p = 24 (interval is 15 to 33).