MA223: Unit 5 R Tutorials (Spring 2024)

Discrete Distributions in R

Discrete Distributions

Discrete distributions are the probability distributions of discrete random variables.

Unlike continuous random variables, which can take on any value within a given range, discrete random variables assume distinct and separate values typically represented by whole numbers or counts. Each possible value of the random variable is assigned a corresponding probability indicating its likelihood of occurrence.

Common discrete distributions

Bernoulli: Distribution representing a binary outcome (success or failure). - Example: Coin flip (success: heads, failure: tails).

Binomial: Counts the number of successes in a fixed number of independent Bernoulli trials. - Example: Number of heads in 10 coin flips.

Multinomial: Generalizes the binomial distribution to more than two categories. - Example: Number of 1s, 2s, 3s, 4s, 5s, and 6s in 10 rolls of a die.

Poisson: Models the number of events occurring in a fixed interval of time or space. - Example: Number of typos per page.

Geometric: Represents the number of trials needed for the first success in a sequence of independent Bernoulli trials. - Example: Number of coin flips until the first head.

Hypergeometric: Models the number of successes in a sample drawn without replacement from a finite population. - Example: Number of green balls drawn from a bag without replacement.

Negative Binomial: Generalizes the geometric distribution and represents the number of trials needed for a fixed number of successes. - Example: Number of coin flips until the third head.
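Base R's stats package provides a random-number generator for each of these distributions. The sketch below draws a few values from each one; the parameter values and the seed are arbitrary choices made for illustration. Note that rgeom() and rnbinom() return the number of failures before the target success rather than the total number of trials, so the number of successes is added to match the definitions above.

```r
set.seed(223)  # arbitrary seed, only so the draws can be reproduced

bern  <- rbinom(5, size = 1, prob = 0.5)              # Bernoulli: 5 coin flips (1 = heads)
binom <- rbinom(5, size = 10, prob = 0.5)             # Binomial: heads in 10 flips
multi <- rmultinom(1, size = 10, prob = rep(1/6, 6))  # Multinomial: face counts in 10 die rolls
pois  <- rpois(5, lambda = 2)                         # Poisson: e.g. typos per page (assumed rate 2)
geom  <- rgeom(5, prob = 0.5) + 1                     # Geometric: flips until the first head
hyper <- rhyper(5, m = 4, n = 6, k = 3)               # Hypergeometric: green balls among 3 drawn
                                                      # from a bag of 4 green and 6 other balls
nbin  <- rnbinom(5, size = 3, prob = 0.5) + 3         # Negative binomial: flips until the third head
```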

Simulating Binomial Random Variables

Some statistical applications require simulating random outcomes that follow a binomial distribution. In R, this is done with the rbinom() function:

rbinom(number of experiments, number of trials, probability of success)

Note: If the number of trials \(=1\), we have a Bernoulli random variable (distribution).
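As a small illustration of this note, the sketch below simulates 10 Bernoulli trials by setting the number of trials to 1. The seed and the success probability of 0.5 are arbitrary choices.

```r
set.seed(1)  # arbitrary seed
flips <- rbinom(10, size = 1, prob = 0.5)  # 10 Bernoulli trials: 1 = success, 0 = failure
flips
sum(flips)   # the total number of successes is one Binomial(10, 0.5) observation
```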

Example 1:

Generate or simulate 50 binomial random numbers from the Binomial distribution with parameters \(n=5\) and \(p=0.60\) using the rbinom() function.

m <- 50    # number of experiments
n <- 5     # number of trials
p <- 0.60  # probability of success
X <- rbinom(m, n, p) # binomial random numbers

Plot a bar graph or chart of the simulated binomial random numbers.

barplot(table(X))

The bar heights are counts; divided by the number of experiments, they estimate the probability distribution \(P(X = x)\). A bar chart is appropriate because the data are discrete.
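The counts can be converted to relative frequencies before plotting, so that the bar heights are directly comparable to probabilities. A minimal sketch, assuming the same setup as Example 1 (the seed and the axis labels are illustrative choices):

```r
set.seed(223)                              # arbitrary seed
X <- rbinom(50, 5, 0.60)                   # same parameters as Example 1
counts <- table(factor(X, levels = 0:5))   # keep values that never occurred as zero-height bars
rel_freq <- counts / length(X)             # relative frequencies estimate P(X = x)
barplot(rel_freq, xlab = "x", ylab = "relative frequency")
```

Wrapping X in factor() with explicit levels keeps all six possible values on the axis even when a value does not appear in a small sample.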

The theoretical Binomial distribution is given as follows:

\[P(X=x) = \frac{n!}{x!(n-x)!}\cdot p^x \cdot (1-p)^{n-x} \text{ for } x=0, 1, \ldots,n.\]

\(P(X=x)\) can be calculated using the dbinom(x, n, p) function in R.
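To see that dbinom() implements the formula above, the following sketch evaluates the formula directly with choose() and compares the two results. The variable names by_formula and by_dbinom are introduced here for illustration.

```r
n <- 5
p <- 0.60
x <- 0:n

by_formula <- choose(n, x) * p^x * (1 - p)^(n - x)  # choose(n, x) = n! / (x! (n - x)!)
by_dbinom  <- dbinom(x, n, p)

all.equal(by_formula, by_dbinom)  # checks agreement up to floating-point tolerance
sum(by_dbinom)                    # the probabilities over x = 0, ..., n add up to 1
```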

Example 2:

From the binomial distribution defined in Example 1, calculate \(P(X=0), P(X=1),\cdots, P(X=5)\) using the dbinom() function.

probabilities <- dbinom(0:5, 5, 0.6)
probabilities

## [1] 0.01024 0.07680 0.23040 0.34560 0.25920 0.07776

Plot the probabilities using the barplot() function.

barplot(probabilities, names.arg = 0:5)  # label each bar with its x value
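The simulated and theoretical distributions can also be drawn side by side. The sketch below is one possible way to do this; the sample size of 10000, the seed, and the plot labels are assumptions chosen for illustration.

```r
set.seed(223)                     # arbitrary seed
m <- 10000                        # number of experiments (illustrative choice)
X <- rbinom(m, 5, 0.60)

simulated   <- as.vector(table(factor(X, levels = 0:5))) / m  # relative frequencies
theoretical <- dbinom(0:5, 5, 0.60)                           # exact probabilities

# One pair of bars per value of x; the legend uses the row names of the matrix
barplot(rbind(simulated, theoretical), beside = TRUE, names.arg = 0:5,
        legend.text = TRUE, xlab = "x", ylab = "probability")
```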

Example 3:

Calculate the sample mean, sample variance, and sample standard deviation of the generated or simulated binomial random numbers (sample) in Example 1.

mean(X)

## [1] 3.24

var(X)

## [1] 1.206531

sd(X)

## [1] 1.098422
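The values above come from one particular random sample, so rerunning the code will give slightly different numbers. A sketch of how set.seed() makes a simulation repeatable (the seed value 223 is an arbitrary choice and is not the seed behind the output shown above):

```r
set.seed(223)
X1 <- rbinom(50, 5, 0.60)

set.seed(223)              # resetting the same seed restarts the random number stream
X2 <- rbinom(50, 5, 0.60)

identical(X1, X2)          # TRUE: same seed, same sample
```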

The sample mean is an estimate of the population mean (or expected value). The expected value of a binomial random variable is given by:

\[E(X) = \mu = n \times p\]

The variance of a binomial random variable is calculated from the formula:

\[Var(X) = n\cdot p\cdot(1 - p).\]

The corresponding standard deviation is:

\[SD(X)=\sqrt{Var(X)}=\sqrt{n\cdot p\cdot(1 - p)}.\]
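These formulas can be checked numerically from the definitions \(E(X)=\sum_x x\,P(X=x)\) and \(Var(X)=\sum_x (x-\mu)^2\,P(X=x)\). A minimal sketch using the parameters from Example 1 (the names px, mu, and sigma2 are introduced here for illustration):

```r
n <- 5
p <- 0.60
x <- 0:n
px <- dbinom(x, n, p)               # P(X = x) for every possible x

mu     <- sum(x * px)               # expected value from the definition
sigma2 <- sum((x - mu)^2 * px)      # variance from the definition

c(from_definition = mu,     from_formula = n * p)
c(from_definition = sigma2, from_formula = n * p * (1 - p))
```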

Example 4:

For the Binomial distribution defined in Example 1 (Parameters: \(n = 5\) and \(p = 0.60\)), calculate these quantities: mean, variance, and standard deviation.

n*p

## [1] 3

n*p*(1-p)

## [1] 1.2

sqrt(n*p*(1-p))

## [1] 1.095445

Example 5:

Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 binomial random variables.

m <- 10000    # number of experiments
n <- 5     # number of trials
p <- 0.60  # probability of success
X <- rbinom(m, n, p) # binomial random numbers

mean(X)

## [1] 3.0064

var(X)

## [1] 1.186678

sd(X)

## [1] 1.089347
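With 10000 experiments the sample statistics are much closer to the theoretical values than with 50. The sketch below repeats the simulation for several sample sizes to show this tendency; the sizes and the seed are arbitrary choices, and because the samples are random the error is not guaranteed to shrink at every step.

```r
set.seed(223)                        # arbitrary seed
n <- 5
p <- 0.60
sizes <- c(50, 500, 5000, 50000)     # illustrative numbers of experiments

# Sample mean of a fresh simulation for each sample size
sample_means <- sapply(sizes, function(m) mean(rbinom(m, n, p)))

data.frame(m = sizes,
           sample_mean = sample_means,
           abs_error = abs(sample_means - n * p))
```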

Simulating Poisson Random Variables

The rpois() function simulates independent Poisson random variables:

rpois(number of random values, lambda)

Example 1:

Generate 10 Poisson random numbers with parameter \(\lambda = 3\) as follows:

X <- rpois(10, 3)

Plot a bar graph or chart of the simulated Poisson random numbers.

barplot(table(X))

The theoretical Poisson distribution is given as follows:

\[P(X=x) = \frac{ \text{e}^{-\lambda} \cdot \lambda^x}{x!}, \text{ } x=0, 1, \ldots\]

\(P(X=x)\) can be calculated using the dpois(x, lambda, log = FALSE) function in R.
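The sketch below evaluates the Poisson formula directly and compares it with dpois(). It also shows the effect of the log argument, which returns log-probabilities when set to TRUE. The variable names are introduced here for illustration.

```r
lambda <- 3
x <- 0:5

by_formula <- exp(-lambda) * lambda^x / factorial(x)
by_dpois   <- dpois(x, lambda)

all.equal(by_formula, by_dpois)      # checks agreement up to floating-point tolerance
log_p <- dpois(x, lambda, log = TRUE)
all.equal(exp(log_p), by_dpois)      # exponentiating the log-probabilities recovers P(X = x)
```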

Example 2:

From the Poisson distribution defined in Example 1, calculate \(P(X=0), P(X=1),\cdots, P(X=5)\) using the dpois() function.

probabilities <- dpois(0:5, 3)
probabilities

## [1] 0.04978707 0.14936121 0.22404181 0.22404181 0.16803136 0.10081881

Plot the probabilities using the barplot() function.

barplot(probabilities, names.arg = 0:5)  # label each bar with its x value
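Unlike the binomial case, a Poisson random variable has no upper limit, so the six probabilities plotted here do not add up to 1. A short sketch of how to measure the part that is left out, using the cumulative distribution function ppois():

```r
lambda <- 3
shown <- sum(dpois(0:5, lambda))  # P(X <= 5), about 0.916
shown
ppois(5, lambda)                  # the cumulative distribution function gives the same value
1 - ppois(5, lambda)              # P(X > 5): probability not shown in the bar plot
```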

Example 3:

Calculate the sample mean, sample variance, and sample standard deviation of the generated or simulated Poisson random numbers (sample) in Example 1.

mean(X)

## [1] 2.5

var(X)

## [1] 1.388889

sd(X)

## [1] 1.178511

The sample mean is an estimate of the population mean (or expected value). The expected value of a Poisson random variable is given by: \(E(X) = \lambda\)

The variance of a Poisson random variable is \(Var(X) = \lambda\)

The corresponding standard deviation is: \(SD(X)=\sqrt{Var(X)}=\sqrt{\lambda}\)

Example 4:

For the Poisson distribution defined in Example 1 (Parameters: \(\lambda = 3\)), calculate these quantities: mean, variance, and standard deviation.

\[\text{mean}=\text{variance}=\lambda=3\]

\[\text{standard deviation}=\sqrt{\lambda}=\sqrt{3} \approx 1.7321\]
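The same quantities can be computed in R, mirroring the binomial Example 4. The sketch below also checks them numerically by summing over the distribution; the truncation point of 100 is an assumption, chosen because the probability beyond it is negligible for \(\lambda = 3\).

```r
lambda <- 3
lambda          # mean and variance
sqrt(lambda)    # standard deviation

x  <- 0:100                       # truncated support (assumption: mass beyond 100 is negligible)
px <- dpois(x, lambda)
mu     <- sum(x * px)             # expected value from the definition
sigma2 <- sum((x - mu)^2 * px)    # variance from the definition
c(mean = mu, variance = sigma2)
```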

Example 5:

Compare your results from Example 4 to what you would obtain from a simulated sample of 10000 Poisson random variables.

n <- 10000    # number of random values
lambda <- 3     # Poisson parameter
X <- rpois(n, lambda) # Poisson random numbers

mean(X)

## [1] 3.0136

var(X)

## [1] 2.996915

sd(X)

## [1] 1.73116


5.3 - 5.4: Simulation and Calculating Probabilities from Discrete Distributions in R

2024-03-28
