Discrete distributions are the probability distributions of discrete
random variables.
Unlike continuous random variables, which can take on
any value within a given range, discrete random variables take
distinct, separate values, typically whole numbers or
counts. Each possible value of the random variable is assigned a
probability indicating its likelihood of occurrence, and these probabilities sum to 1.
Common discrete distributions include:
Bernoulli: Distribution representing a binary outcome (success or failure).
Binomial: Counts the number of successes in a fixed number of independent Bernoulli trials.
Multinomial: Generalizes the binomial distribution to more than two categories.
Poisson: Models the number of events occurring in a fixed interval of time or space.
Geometric: Represents the number of trials needed for the first success in a sequence of independent Bernoulli trials.
Hypergeometric: Models the number of successes in a sample drawn without replacement from a finite population.
Negative Binomial: Generalizes the geometric distribution and represents the number of trials needed for a fixed number of successes.
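One practical point when moving from these definitions to R: the base R samplers rgeom() and rnbinom() return the number of failures before the target success, not the number of trials. The sketch below shows one way to convert between the two conventions. The seed, the value of p, and the target r are arbitrary choices made only for illustration.

```r
set.seed(1)   # arbitrary seed, assumed here only for reproducibility
p <- 0.3      # illustrative probability of success
r <- 3        # illustrative target number of successes

# rgeom() counts failures before the first success; add 1 to count trials
trials_to_first_success <- rgeom(5, prob = p) + 1

# rnbinom() counts failures before the r-th success; add r to count trials
trials_to_r_successes <- rnbinom(5, size = r, prob = p) + r

trials_to_first_success
trials_to_r_successes
```

With this shift, every geometric value is at least 1 and every negative binomial value is at least r, which matches the "number of trials" wording used in the list.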
Some statistical applications require simulating (generating) binomial
random outcomes. In R, this is done with the rbinom() function
(typing ?rbinom in the R console opens its R Documentation page):
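rbinom(number of experiments (m), number of trials (n), probability of success (p))
In R's own argument names the call is rbinom(n, size, prob), where n is the number of experiments (m above) and size is the number of trials (n above).
Note: If the number of trials is \(n=1\), we have a
Bernoulli random variable (distribution).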
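As a small illustration of the note above, the following sketch draws 10 Bernoulli outcomes by setting the number of trials to 1. The seed and the success probability of 0.6 are illustrative assumptions.

```r
set.seed(42)  # arbitrary seed, assumed here only for reproducibility

# size = 1 means one trial per experiment, so each draw is Bernoulli
bernoulli_draws <- rbinom(10, size = 1, prob = 0.6)

bernoulli_draws       # each value is 0 (failure) or 1 (success)
sum(bernoulli_draws)  # total number of successes across the 10 trials
```

The sum of the 10 independent Bernoulli draws is itself a single binomial count with 10 trials, which is how the two distributions are linked.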
Simulate (generate) 50 random numbers from the
binomial distribution with parameters \(n=5\) and \(p=0.60\) using the rbinom()
function.
# Set parameters for the simulation
m <- 50 # Number of experiments
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Generate binomial random numbers
X <- rbinom(m, n, p) # Binomial random numbers
# Output
X
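Because rbinom() draws random numbers, the values printed will differ from run to run. If reproducible output is wanted, one option is to fix the seed of the random number generator with set.seed() before simulating. The seed value 123 and the variable names below are arbitrary choices for this sketch.

```r
# Fixing the seed makes the simulated values repeatable
set.seed(123)
first_run <- rbinom(50, 5, 0.60)

# Resetting the same seed reproduces exactly the same draws
set.seed(123)
second_run <- rbinom(50, 5, 0.60)

identical(first_run, second_run)  # TRUE
```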
Plot a bar graph or chart of the simulated binomial random numbers (Frequency Distribution).
barplot(table(X),
        xlab = "X",
        ylab = "Frequency",
        col = "green",
        main = "Frequency Distribution of X")
grid()
The bar graph shows the observed frequencies; dividing each frequency by the number of simulated values (50) gives an estimate of the probability distribution \(P(X = x)\). A bar chart is appropriate because the data are discrete.
The theoretical Binomial distribution is given as follows:
\[P(X=x) = \frac{n!}{x!(n-x)!}\cdot p^x \cdot (1-p)^{n-x} \text{ for } x=0, 1, \ldots,n.\]
\(P(X=x)\) can be calculated in R with the
dbinom(x, n, p) function.
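To see that dbinom() evaluates the formula above, the probabilities can also be computed term by term with choose(), which returns the binomial coefficient \(\frac{n!}{x!(n-x)!}\). This is only a sketch for checking the formula; the name manual is an illustrative choice.

```r
n <- 5      # Number of trials
p <- 0.6    # Probability of success
x <- 0:n    # Possible number of successes

# Binomial formula written out: coefficient * p^x * (1 - p)^(n - x)
manual <- choose(n, x) * p^x * (1 - p)^(n - x)

manual
all.equal(manual, dbinom(x, n, p))  # TRUE when the two agree
```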
From the binomial distribution defined in Example 1, calculate \(P(X=0), P(X=1),\cdots, P(X=5)\) using the
dbinom() function.
# Set parameters
n <- 5 # Number of trials
x <- 0:n # Possible number of successes
p <- 0.6 # Probability of success
# Calculate binomial probabilities
probabilities <- dbinom(x, n, p)
# Output
probabilities
Plot the probabilities using the barplot() function.
barplot(probabilities,
        names.arg = x,   # label each bar with its number of successes
        xlab = "Number of Successes", ylab = "Probability",
        main = "Binomial Probabilities (n = 5, p = 0.6)",
        col = "green",
        ylim = c(0,0.4))
grid()
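The simulated frequencies and the theoretical probabilities can be placed side by side to see how closely a sample of 50 follows the binomial distribution. The sketch below draws its own sample under an arbitrary seed and uses illustrative names (X_sim, relative_frequency, comparison) so that it does not overwrite the objects created earlier.

```r
set.seed(2024)  # arbitrary seed, assumed here only for reproducibility
m <- 50         # Number of experiments
n <- 5          # Number of trials
p <- 0.60       # Probability of success
X_sim <- rbinom(m, n, p)

# factor() with explicit levels keeps values that never occurred (count 0)
relative_frequency <- as.numeric(table(factor(X_sim, levels = 0:n))) / m
theoretical <- dbinom(0:n, n, p)

comparison <- rbind(simulated = relative_frequency, theoretical = theoretical)
colnames(comparison) <- 0:n
round(comparison, 3)

# Grouped bars: one pair (simulated, theoretical) per number of successes
barplot(comparison, beside = TRUE, legend.text = TRUE,
        xlab = "Number of Successes", ylab = "Probability")
```

With only 50 draws the two rows usually differ noticeably; the gap tends to shrink as the number of experiments grows.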
Calculate the sample mean, sample variance, and sample standard deviation of the simulated binomial random numbers (the sample) from Example 1.
# Calculate sample mean, variance, and standard deviation
sample_mean <- mean(X)
sample_variance <- var(X)
sample_sd <- sd(X)
sample_mean
sample_variance
sample_sd
The sample mean is an estimate of the
population mean (or expected value). The expected value of
a binomial random variable is given by:
\[E(X) = \mu = n \cdot p\]
The variance of a binomial random variable is given by:
\[Var(X) = n\cdot p\cdot(1 - p)\]
The corresponding standard deviation is:
\[SD(X)=\sqrt{Var(X)}=\sqrt{n\cdot p\cdot(1 - p)}\]
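These closed-form expressions can be checked against the general definitions \(E(X)=\sum_x x\,P(X=x)\) and \(Var(X)=\sum_x (x-\mu)^2\,P(X=x)\) by summing over the probabilities from dbinom(). The following is a sketch of that check for \(n=5\) and \(p=0.60\); the variable names are illustrative.

```r
n <- 5      # Number of trials
p <- 0.60   # Probability of success
x <- 0:n    # Possible number of successes
pmf <- dbinom(x, n, p)

# Mean and variance computed directly from the probability distribution
mean_from_pmf <- sum(x * pmf)
variance_from_pmf <- sum((x - mean_from_pmf)^2 * pmf)

# Each pair should agree up to floating-point rounding
c(from_pmf = mean_from_pmf, formula = n * p)
c(from_pmf = variance_from_pmf, formula = n * p * (1 - p))
```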
For the Binomial distribution defined in Example 1 (Parameters: \(n = 5\) and \(p = 0.60\)), calculate these quantities: mean, variance, and standard deviation.
# Parameters
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Calculate mean, variance, and standard deviation
mean_binomial <- n * p
variance_binomial <- n * p * (1 - p)
sd_binomial <- sqrt(variance_binomial)
# Output
mean_binomial
variance_binomial
sd_binomial
Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 binomial random variables.
# Set parameters for the larger sample
m <- 10000 # Number of experiments
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Generate a large sample of binomial random numbers
X <- rbinom(m, n, p)
# Calculate sample mean, variance, and standard deviation for the larger sample
sample_mean_large <- mean(X)
sample_variance_large <- var(X)
sample_sd_large <- sd(X)
# Output
sample_mean_large
sample_variance_large
sample_sd_large
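To make the comparison more concrete, the gap between the sample mean and \(n \cdot p\) can be tracked for several sample sizes, together with the standard error \(\sqrt{n \cdot p \cdot (1-p)/m}\) that indicates the typical size of that gap. The sample sizes, the seed, and the names below are assumptions chosen for this sketch.

```r
set.seed(7)   # arbitrary seed, assumed here only for reproducibility
n <- 5        # Number of trials
p <- 0.60     # Probability of success
sizes <- c(50, 500, 5000, 50000)  # illustrative numbers of experiments

# Absolute difference between each sample mean and the theoretical mean
abs_error <- sapply(sizes, function(m) abs(mean(rbinom(m, n, p)) - n * p))

# Standard error of the sample mean for each sample size
std_error <- sqrt(n * p * (1 - p) / sizes)

data.frame(m = sizes, abs_error = abs_error, std_error = std_error)
```

The standard error shrinks with the square root of the sample size, so the sample statistics from 10000 draws are expected to sit much closer to the theoretical values than those from 50 draws.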
The rpois() function in R can be used to simulate
N independent Poisson random variables. Its call has
the following form:
rpois(number of random values, lambda)
Generate 10 Poisson random numbers with parameter \(\lambda = 3\) as follows:
# Set the parameter for the Poisson distribution
lambda <- 3 # Rate parameter
# Generate 10 Poisson random numbers
X <- rpois(10, lambda)
# Output
X
Plot a bar graph or chart of the simulated Poisson random numbers (Frequency Distribution).
# Calculate frequencies
frequency_table <- table(X)
# Create a barplot of the frequencies
barplot(frequency_table,
xlab = "Number of Events",
ylab = "Frequency",
ylim = c(0, max(frequency_table)),  # fit the tallest bar; a fixed upper limit of 4 can clip it
col = "blue",
main = "Frequencies of Simulated Poisson Distribution with lambda = 3")
grid()
The theoretical Poisson distribution is given as follows:
\[P(X=x) = \frac{\text{e}^{-\lambda} \cdot \lambda^x}{x!} \text{ for } x=0, 1, \ldots\]
\(P(X=x)\) can be calculated in R with the
dpois(x, lambda, log = FALSE) function.
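As a check on the formula, the Poisson probabilities can be written out with exp() and factorial() and compared with dpois(). This is a sketch only; the names k and manual are illustrative.

```r
lambda <- 3   # Rate parameter
k <- 0:5      # Values at which to evaluate the probabilities

# Poisson formula written out: e^(-lambda) * lambda^k / k!
manual <- exp(-lambda) * lambda^k / factorial(k)

manual
all.equal(manual, dpois(k, lambda))  # TRUE when the two agree
```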
From the Poisson distribution defined in Example 1, calculate \(P(X=0), P(X=1),\cdots, P(X=5)\) using the
dpois() function.
# Set the parameter for the Poisson distribution
lambda <- 3 # Rate parameter
N <- 0:5 # Values from 0 to 5
# Calculate probabilities for values 0 to 5
probabilities <- dpois(N, lambda)
# Display the calculated probabilities
probabilities
Plot the probabilities using the barplot() function.
barplot(probabilities,
        names.arg = N,   # label each bar with its number of events
        xlab = "Number of Events",
        ylab = "Probability",
        col = "blue",
        ylim = c(0, 0.4),
        main = "Poisson Probabilities (lambda = 3)")
grid()
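Unlike the binomial case, a Poisson random variable has no upper limit, so the probabilities for 0 to 5 do not add up to 1. The sketch below measures how much probability those six values cover and how much is left in the tail, using ppois() for the cumulative probability. The names covered and tail_probability are illustrative.

```r
lambda <- 3   # Rate parameter

# Total probability of the values 0, 1, ..., 5
covered <- sum(dpois(0:5, lambda))

# ppois(5, lambda) is the cumulative probability P(X <= 5)
all.equal(covered, ppois(5, lambda))

# Remaining probability P(X > 5) that the plot does not show
tail_probability <- 1 - covered
tail_probability
```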
Calculate the sample mean, sample variance, and sample standard deviation of the simulated Poisson random numbers (the sample) from Example 1.
# Calculate sample mean, variance, and standard deviation
sample_mean <- mean(X)
sample_variance <- var(X)
sample_sd <- sd(X)
# Output
sample_mean
sample_variance
sample_sd
The sample mean is an estimate of the
population mean (or expected value). The expected value of
a Poisson random variable is \(E(X) = \lambda\).
The variance of a Poisson random variable is \(Var(X) = \lambda\), and the corresponding standard deviation is \(SD(X)=\sqrt{Var(X)}=\sqrt{\lambda}\).
For the Poisson distribution defined in Example 1 (Parameter: \(\lambda = 3\)), calculate these quantities: mean, variance, and standard deviation.
\[\text{mean}=\text{variance}=\lambda=3\]
\[\text{standard deviation}=\sqrt{\lambda}=\sqrt{3} \approx 1.7321\]
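The same quantities can be obtained in R, mirroring the binomial calculation. The sketch also includes a numerical check of the mean that sums \(x \cdot P(X=x)\) over the values 0 to 100; that cut-off is an assumption, chosen because the probability beyond it is negligible for \(\lambda = 3\). All variable names are illustrative.

```r
lambda <- 3   # Rate parameter

# Theoretical mean, variance, and standard deviation
mean_poisson <- lambda
variance_poisson <- lambda
sd_poisson <- sqrt(variance_poisson)

mean_poisson
variance_poisson
sd_poisson

# Numerical check of the mean from the probabilities (truncated at 100)
support <- 0:100
mean_check <- sum(support * dpois(support, lambda))
mean_check
```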
Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 Poisson random variables.
n <- 10000 # number of random values
lambda <- 3 # Poisson parameter
# Generate a large sample of Poisson random numbers
large_sample <- rpois(n, lambda)
# Calculate statistics for the large sample
large_sample_mean <- mean(large_sample)
large_sample_variance <- var(large_sample)
large_sample_sd <- sd(large_sample)
# Output
large_sample_mean
large_sample_variance
large_sample_sd
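One way to organize the comparison is a small table of simulated against theoretical values, plus the ratio of sample variance to sample mean, which should be close to 1 for Poisson data because the mean and variance are both \(\lambda\). The sketch draws its own sample under an arbitrary seed; the names sim, comparison, and dispersion_ratio are illustrative.

```r
set.seed(99)  # arbitrary seed, assumed here only for reproducibility
lambda <- 3   # Poisson parameter
sim <- rpois(10000, lambda)

# Simulated statistics next to their theoretical counterparts
comparison <- data.frame(
  statistic   = c("mean", "variance", "sd"),
  simulated   = c(mean(sim), var(sim), sd(sim)),
  theoretical = c(lambda, lambda, sqrt(lambda))
)
comparison

# Variance-to-mean ratio; values near 1 are consistent with a Poisson model
dispersion_ratio <- var(sim) / mean(sim)
dispersion_ratio
```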