Discrete distributions are the probability distributions of discrete
random variables.
Unlike continuous random variables, which can take on
any value within a given range, discrete random variables assume
distinct and separate values typically represented by whole numbers or
counts. Each possible value of the random variable is assigned a
corresponding probability indicating its likelihood of occurrence.
Bernoulli: Distribution representing a binary outcome (success or failure).
Binomial: Counts the number of successes in a fixed number of independent Bernoulli trials.
Multinomial: Generalizes the binomial distribution to more than two categories.
Poisson: Models the number of events occurring in a fixed interval of time or space.
Geometric: Represents the number of trials needed for the first success in a sequence of independent Bernoulli trials.
Hypergeometric: Models the number of successes in a sample drawn without replacement from a finite population.
Negative Binomial: Generalizes the geometric distribution and represents the number of trials needed for a fixed number of successes.
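Base R's stats package has a probability mass function for each distribution listed above. The sketch below evaluates one probability per distribution. The parameter values are arbitrary choices made for illustration and do not come from the text. Note that dgeom() and dnbinom() count failures rather than trials, so the trial counts described above have to be shifted before calling them.

```r
p <- 0.6  # illustrative success probability (an assumption)

# Bernoulli: a binomial with a single trial
bern <- dbinom(1, size = 1, prob = p)

# Binomial: 3 successes in 5 trials
binom <- dbinom(3, size = 5, prob = p)

# Multinomial: counts (2, 1, 2) over three categories
multi <- dmultinom(c(2, 1, 2), prob = c(0.5, 0.3, 0.2))

# Poisson: 2 events when the mean count is 3
pois <- dpois(2, lambda = 3)

# Geometric: first success on trial 4 means 3 failures come first
geom <- dgeom(4 - 1, prob = p)

# Hypergeometric: 2 successes in a sample of 5 drawn without replacement
# from a population of 10 successes and 15 failures
hyper <- dhyper(2, m = 10, n = 15, k = 5)

# Negative binomial: 3rd success on trial 7 means 4 failures come first
nbin <- dnbinom(7 - 3, size = 3, prob = p)

c(bernoulli = bern, binomial = binom, multinomial = multi,
  poisson = pois, geometric = geom, hypergeometric = hyper,
  negative_binomial = nbin)
```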
In some statistical applications, one needs to simulate (generate)
random outcomes that follow a binomial distribution. In R this is done
with the rbinom() function (typing ?rbinom in the R console opens its
R Documentation page):
rbinom(number of experiments (m), number of trials (n), probability of success (p))
Note: The R documentation names these three arguments n, size, and prob, so R's own n is the number of experiments (m here), not the number of trials. If the number of trials is \(n=1\), we have a
Bernoulli random variable (distribution).
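As a small illustration of the note, the sketch below draws Bernoulli outcomes by setting the number of trials to 1. The seed and the sample size of 10 are arbitrary assumptions chosen for this example.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only

# 10 Bernoulli(0.6) outcomes: one trial per experiment
flips <- rbinom(10, size = 1, prob = 0.60)
flips                                        # each value is 0 (failure) or 1 (success)

# Adding up 5 Bernoulli outcomes gives one Binomial(5, 0.6) value
one_binomial <- sum(rbinom(5, size = 1, prob = 0.60))
one_binomial
```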
Simulate (generate) 50 binomial random numbers from the Binomial distribution with parameters \(n=5\) and \(p=0.60\) using the rbinom() function.
# Set parameters for the simulation
m <- 50 # Number of experiments
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Generate binomial random numbers
X <- rbinom(m, n, p) # Binomial random numbers
# Output
X
Plot a bar graph or chart of the simulated binomial random numbers (Frequency Distribution).
barplot(table(X), xlab = "X", ylab = "Frequency", main = "Frequency Distribution of X")
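The bar graph approximates the shape of the probability distribution \(P(X = x)\); dividing each frequency by the sample size gives an estimate of the probabilities themselves. A bar chart is appropriate because the data are discrete.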
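A minimal sketch of turning a frequency table into relative frequencies, which estimate \(P(X = x)\) directly. The seed is an arbitrary assumption, and factor(X, levels = 0:5) is used so that values that never occur in the sample still appear with frequency zero.

```r
set.seed(123)                            # arbitrary seed, for reproducibility only
X <- rbinom(50, size = 5, prob = 0.60)   # same setup as the simulation above

# Count every possible value 0..5, including values that were never drawn
freq <- table(factor(X, levels = 0:5))

# Relative frequencies: estimated P(X = x)
rel_freq <- freq / length(X)
rel_freq
```

Passing rel_freq to barplot() instead of table(X) draws the same chart on a probability scale.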
The theoretical Binomial distribution is given as follows:
\[P(X=x) = \frac{n!}{x!(n-x)!}\cdot p^x \cdot (1-p)^{n-x} \text{ for } x=0, 1, \ldots,n.\]
\(P(X=x)\) can be calculated in R with the
dbinom(x, n, p) function.
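To connect the formula with the function, the sketch below evaluates the binomial formula term by term and checks it against dbinom(). The variable names by_hand and built_in are introduced here for illustration only.

```r
n <- 5       # number of trials
p <- 0.60    # probability of success
x <- 0:n     # every possible number of successes

# The formula: n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)

# The built-in probability mass function
built_in <- dbinom(x, size = n, prob = p)

all.equal(by_hand, built_in)   # the two agree up to floating-point error
sum(built_in)                  # the probabilities over 0..n add up to 1
```

For instance, \(P(X=3) = 10 \cdot 0.6^3 \cdot 0.4^2 = 0.3456\).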
From the binomial distribution defined in Example 1, calculate \(P(X=0), P(X=1), \ldots, P(X=5)\) using the dbinom() function.
# Set parameters
n <- 5 # Number of trials
x <- 0:n # Possible number of successes
p <- 0.6 # Probability of success
# Calculate binomial probabilities
probabilities <- dbinom(x, n, p)
# Output
probabilities
Plot the probabilities using the barplot() function.
barplot(probabilities, names.arg = x, xlab = "x", ylab = "P(X = x)",
main = "Binomial Probabilities (n = 5, p = 0.6)")
Calculate the sample mean, sample variance, and sample standard deviation of the generated or simulated binomial random numbers (sample) in Example 1.
# Calculate sample mean, variance, and standard deviation
sample_mean <- mean(X)
sample_variance <- var(X)
sample_sd <- sd(X)
sample_mean
sample_variance
sample_sd
The sample mean is an estimate of the
population mean (or expected value). The expected value of
a binomial random variable is given by:
\[E(X) = \mu = n \cdot p\]
The variance of a binomial random variable is calculated from the formula:
\[Var(X) = n\cdot p\cdot(1 - p).\]
The corresponding standard deviation is:
\[SD(X)=\sqrt{Var(X)}=\sqrt{n\cdot p\cdot(1 - p)}.\]
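These closed forms can be checked against the definitions of expected value and variance, \(\sum x \cdot P(X=x)\) and \(\sum (x-\mu)^2 \cdot P(X=x)\). The following is a sketch of that check for \(n=5\) and \(p=0.60\); the names pmf, mean_from_pmf, and var_from_pmf are introduced for illustration.

```r
n <- 5
p <- 0.60
x <- 0:n
pmf <- dbinom(x, size = n, prob = p)   # P(X = x) for x = 0..n

# Expected value and variance computed directly from the definitions
mean_from_pmf <- sum(x * pmf)
var_from_pmf  <- sum((x - mean_from_pmf)^2 * pmf)

# Each pair below should match: definition vs. closed-form formula
c(definition = mean_from_pmf, formula = n * p)
c(definition = var_from_pmf,  formula = n * p * (1 - p))
```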
For the Binomial distribution defined in Example 1 (Parameters: \(n = 5\) and \(p = 0.60\)), calculate these quantities: mean, variance, and standard deviation.
# Parameters
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Calculate mean, variance, and standard deviation
mean_binomial <- n * p
variance_binomial <- n * p * (1 - p)
sd_binomial <- sqrt(variance_binomial)
# Output
mean_binomial
variance_binomial
sd_binomial
Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 binomial random variables.
# Set parameters for the larger sample
m <- 10000 # Number of experiments
n <- 5 # Number of trials
p <- 0.60 # Probability of success
# Generate a large sample of binomial random numbers
X <- rbinom(m, n, p)
# Calculate sample mean, variance, and standard deviation for the larger sample
sample_mean_large <- mean(X)
sample_variance_large <- var(X)
sample_sd_large <- sd(X)
# Output
sample_mean_large
sample_variance_large
sample_sd_large
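To make the comparison explicit, the sketch below repeats the simulation for several sample sizes and prints the estimates next to the theoretical values. The seed and the particular sample sizes are assumptions chosen for illustration. The estimates should generally move closer to the theoretical values as the sample size grows.

```r
set.seed(2024)   # arbitrary seed, for reproducibility only
n <- 5           # number of trials
p <- 0.60        # probability of success
sizes <- c(50L, 1000L, 10000L, 100000L)   # numbers of experiments to try

# One column of estimates per sample size
comparison <- sapply(sizes, function(m) {
  X <- rbinom(m, size = n, prob = p)
  c(mean = mean(X), variance = var(X), sd = sd(X))
})
colnames(comparison) <- sprintf("m=%d", sizes)
round(comparison, 4)

# Theoretical values for reference
theory <- c(mean = n * p, variance = n * p * (1 - p), sd = sqrt(n * p * (1 - p)))
round(theory, 4)
```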
The rpois() function in R can be used to simulate
N independent Poisson random variables. Its arguments are:
rpois(number of random values, lambda)
Generate 10 Poisson random numbers with parameter \(\lambda = 3\) as follows:
# Set the parameter for the Poisson distribution
lambda <- 3 # Rate parameter
# Generate 10 Poisson random numbers
X <- rpois(10, lambda)
# Output
X
Plot a bar graph or chart of the simulated Poisson random numbers (Frequency Distribution).
# Calculate frequencies
frequency_table <- table(X)
# Create a barplot of the frequencies
barplot(frequency_table, xlab = "Number of Events", ylab = "Frequency",
main = "Frequencies of Simulated Poisson Distribution with lambda = 3")
The theoretical Poisson distribution is given as follows:
\[P(X=x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!} \text{ for } x=0, 1, 2, \ldots\]
\(P(X=x)\) can be calculated in R with the
dpois(x, lambda, log = FALSE) function.
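As a check on the formula, the sketch below evaluates it directly and compares the result with dpois(), including the log = TRUE variant. The names by_hand and built_in are introduced for illustration.

```r
lambda <- 3
x <- 0:5

# The formula: e^(-lambda) * lambda^x / x!
by_hand <- exp(-lambda) * lambda^x / factorial(x)

# The built-in probability mass function
built_in <- dpois(x, lambda)
all.equal(by_hand, built_in)

# With log = TRUE, dpois() returns log-probabilities instead
all.equal(exp(dpois(x, lambda, log = TRUE)), built_in)

# Unlike the binomial case, the values 0..5 do not exhaust the support,
# so these probabilities sum to less than 1 (the sum equals ppois(5, lambda))
sum(built_in)
ppois(5, lambda)
```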
From the Poisson distribution defined in Example 1, calculate \(P(X=0), P(X=1), \ldots, P(X=5)\) using the dpois() function.
# Set the parameter for the Poisson distribution
lambda <- 3 # Rate parameter
N <- 0:5 # Values from 0 to 5
# Calculate probabilities for values 0 to 5
probabilities <- dpois(N, lambda)
# Display the calculated probabilities
probabilities
Plot the probabilities using the barplot() function.
barplot(probabilities, names.arg = N, xlab = "Number of Events", ylab = "Probability",
main = "Poisson Probabilities (lambda = 3)")
Calculate the sample mean, sample variance, and sample standard deviation of the generated or simulated Poisson random numbers (sample) in Example 1.
# Calculate sample mean, variance, and standard deviation
sample_mean <- mean(X)
sample_variance <- var(X)
sample_sd <- sd(X)
# Output
sample_mean
sample_variance
sample_sd
The sample mean is an estimate of the
population mean (or expected value). The expected value of
a Poisson random variable is given by: \(E(X) = \lambda\).
The variance of a Poisson random variable is \(Var(X) = \lambda\). The corresponding standard deviation is: \(SD(X)=\sqrt{Var(X)}=\sqrt{\lambda}\).
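For readers who want to see where these results come from, here is a short derivation sketch based on the Poisson probabilities and the series \(\sum_{k=0}^{\infty} \lambda^k/k! = e^{\lambda}\). For the mean:
\[E(X) = \sum_{x=0}^{\infty} x \cdot \frac{e^{-\lambda} \lambda^x}{x!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.\]
The same trick applied to \(X(X-1)\) gives
\[E[X(X-1)] = \sum_{x=2}^{\infty} x(x-1) \cdot \frac{e^{-\lambda} \lambda^x}{x!} = \lambda^2 e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} = \lambda^2,\]
and therefore
\[Var(X) = E[X(X-1)] + E(X) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.\]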
For the Poisson distribution defined in Example 1 (Parameter: \(\lambda = 3\)), calculate these quantities: mean, variance, and standard deviation.
\[\text{mean}=\text{variance}=\lambda=3\]
\[\text{standard deviation}=\sqrt{\lambda}=\sqrt{3} \approx 1.7321\]
Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 Poisson random variables.
n <- 10000 # number of random values
lambda <- 3 # Poisson parameter
# Generate a large sample of Poisson random numbers
large_sample <- rpois(n, lambda)
# Calculate statistics for the large sample
large_sample_mean <- mean(large_sample)
large_sample_variance <- var(large_sample)
large_sample_sd <- sd(large_sample)
# Output
large_sample_mean
large_sample_variance
large_sample_sd
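The comparison can be laid out side by side, as in the sketch below. The seed is an arbitrary assumption, and the names estimates and theory are introduced for illustration. Because the Poisson mean and variance are both \(\lambda\), the sample mean and sample variance of a large sample should be close to each other as well as to 3.

```r
set.seed(42)     # arbitrary seed, for reproducibility only
lambda <- 3
large_sample <- rpois(10000, lambda)

# Sample estimates from the simulated data
estimates <- c(mean = mean(large_sample),
               variance = var(large_sample),
               sd = sd(large_sample))

# Theoretical values: mean = variance = lambda, sd = sqrt(lambda)
theory <- c(mean = lambda, variance = lambda, sd = sqrt(lambda))

# Rows: estimates, theory, and their difference
round(rbind(estimates, theory, difference = estimates - theory), 4)
```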