MULTINOMIAL DISTRIBUTION

Author

SRAVYA

ABSTRACT

In probability theory, the Multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of counts for each side of a k-sided die rolled n times. For n independent trials, each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.

INTRODUCTION

The Multinomial distribution is a common distribution for characterizing categorical variables. Suppose a random variable Z has k categories. We can code each category as an integer, so that Z ∈ {1, 2, · · ·, k}, and write P(Z = j) = p_j for j = 1, · · ·, k. The parameters {p_1, · · ·, p_k} describe the entire distribution of Z (with the constraint that p_1 + · · · + p_k = 1). Suppose we generate Z_1, · · ·, Z_n IID from the above distribution and let

X_j = the number of Z_1, · · ·, Z_n that equal j,    j = 1, · · ·, k.

We then write

(X_1, · · ·, X_k) ~ Multinomial(n; p_1, · · ·, p_k)

to denote a Multinomial distribution.
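As a quick illustration of this construction, the R sketch below (the values of k, n, and the probability vector are arbitrary choices for the example) draws n IID categorical values and tabulates them into the count vector, which is a single draw from the corresponding Multinomial distribution.

# Draw n IID categorical values Z_1, ..., Z_n and tabulate them into the
# count vector (X_1, ..., X_k); k, n and prob are illustrative choices.
set.seed(1)
k <- 3
n <- 10
prob <- c(0.2, 0.3, 0.5)

z <- sample(1:k, size = n, replace = TRUE, prob = prob)  # Z_1, ..., Z_n
x <- tabulate(z, nbins = k)                              # X_1, ..., X_k
x        # one draw from Multinomial(n; p_1, ..., p_k)
sum(x)   # the counts always sum to n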

Example:

Suppose that in a three-way election for a large country, candidate A received 20% of the votes, candidate B received 30% of the votes, and candidate C received 50% of the votes. If six voters are selected randomly, what is the probability that there will be exactly one supporter for candidate A, two supporters for candidate B and three supporters for candidate C in the sample?

Note: Since we're assuming that the voting population is large, it is reasonable and permissible to think of the probabilities as unchanging once a voter is selected for the sample. Technically speaking this is sampling without replacement, so the correct distribution is the multivariate hypergeometric distribution, but the distributions converge as the population grows large in comparison to a fixed sample size.
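With the counts (1, 2, 3) and probabilities (0.20, 0.30, 0.50), the answer follows directly from the multinomial probability; it can be computed by hand or with base R's dmultinom:

# Probability of exactly 1 supporter of A, 2 of B and 3 of C among 6 voters.
factorial(6) / (factorial(1) * factorial(2) * factorial(3)) *
  0.20^1 * 0.30^2 * 0.50^3
# [1] 0.135

dmultinom(x = c(1, 2, 3), prob = c(0.20, 0.30, 0.50))  # same result
# [1] 0.135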

DEFINITIONS

PROBABILITY MASS FUNCTION

Suppose one does an experiment of extracting n balls of k different colors from a bag, replacing the extracted balls after each draw. Balls of the same color are equivalent. Let X_i denote the number of extracted balls of color i (i = 1, · · ·, k), and let p_i denote the probability that a given extraction is of color i. The probability mass function of this Multinomial distribution is:

f(x_1, · · ·, x_k; n, p_1, · · ·, p_k) = P(X_1 = x_1, · · ·, X_k = x_k) = n! / (x_1! · · · x_k!) · p_1^x_1 · · · p_k^x_k

for non-negative integers x_1, · · ·, x_k with x_1 + · · · + x_k = n, and 0 otherwise.
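The PMF can also be coded directly and checked against base R's dmultinom; the helper name dmultinom_manual below is an illustrative choice, not a standard function.

# Direct implementation of the PMF above, checked against dmultinom.
dmultinom_manual <- function(x, prob) {
  n <- sum(x)
  factorial(n) / prod(factorial(x)) * prod(prob^x)
}

# Example: n = 8 draws from a bag with 3 colors, observed counts (4, 3, 1).
dmultinom_manual(c(4, 3, 1), prob = c(0.5, 0.3, 0.2))
# [1] 0.0945
dmultinom(x = c(4, 3, 1), prob = c(0.5, 0.3, 0.2))
# [1] 0.0945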

PROPERTIES

EXPECTED VALUE AND VARIANCE

The expected number of times outcome i is observed over n trials is

E(X_i) = n p_i.

Each diagonal entry of the covariance matrix is the variance of a binomially distributed count,

Var(X_i) = n p_i (1 − p_i),

and the off-diagonal entries are the covariances

Cov(X_i, X_j) = −n p_i p_j    for i ≠ j.

All covariances are negative because, for fixed n, an increase in one component of a multinomial vector requires a decrease in another component. When these expressions are combined into a matrix with (i, j) element Cov(X_i, X_j), the result is a k × k positive semi-definite covariance matrix of rank k − 1. In the special case where k = n and where the p_i are all equal, the covariance matrix is the centering matrix.

The entries of the corresponding correlation matrix are

ρ(X_i, X_i) = 1,

ρ(X_i, X_j) = Cov(X_i, X_j) / sqrt(Var(X_i) Var(X_j)) = −p_i p_j / sqrt(p_i (1 − p_i) p_j (1 − p_j)) = −sqrt( p_i p_j / ((1 − p_i)(1 − p_j)) )    for i ≠ j.

Note that the number of trials n drops out of this expression.
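These moment formulas are easy to verify by simulation; the R sketch below (the values of n and p are arbitrary choices for the example) compares the sample mean and covariance of many simulated multinomial vectors with the theoretical values n p and n (diag(p) − p p^T).

# Empirical check of the mean and covariance formulas; n and p are illustrative.
set.seed(42)
n <- 20
p <- c(0.2, 0.3, 0.5)

draws <- t(rmultinom(100000, size = n, prob = p))  # 100000 x 3 matrix of counts

colMeans(draws)                 # approximately n * p = (4, 6, 10)
cov(draws)                      # approximately the theoretical covariance matrix
n * (diag(p) - p %*% t(p))      # theoretical covariance n * (diag(p) - p p^T)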

MATRIX NOTATION

In matrix notation, writing X = (X_1, · · ·, X_k)^T and p = (p_1, · · ·, p_k)^T, the mean vector and covariance matrix can be stated compactly as

E(X) = n p    and    Var(X) = n (diag(p) − p p^T),

where p p^T is the outer product of p with itself.

FORMULAS OF THE DISTRIBUTION

In summary, for X = (X_1, · · ·, X_k) ~ Multinomial(n; p_1, · · ·, p_k):

Support: non-negative integers x_1, · · ·, x_k with x_1 + · · · + x_k = n
PMF: n! / (x_1! · · · x_k!) · p_1^x_1 · · · p_k^x_k
Mean: E(X_i) = n p_i
Variance: Var(X_i) = n p_i (1 − p_i)
Covariance: Cov(X_i, X_j) = −n p_i p_j for i ≠ j

APPLICATIONS

BAYES’ RULE EXAMPLE

As mentioned before, multinomial distributions are generalized versions of binomial distributions. In chemical engineering applications, multinomial distributions are relevant to situations where there are more than two possible outcomes (e.g., temperature ∈ {high, med, low}). Multinomial systems are a useful analysis tool when a "success-failure" description is insufficient to understand the system. A continuous analog of the multinomial distribution is the Dirichlet distribution.

Using Bayes’ Rule is one of the major applications of multinomial distributions. For example, Bayes’ Rule can be used to predict the pressure of a system given the temperature and statistical data for the system. As mentioned above, Bayes’ Rule can be used to determine the probability of an event or outcome. Additional details on Bayes’ Rule, conditional probability, and independence can be found in the LibreTexts reference listed below.
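As a rough illustration of this kind of calculation (the regimes, counts, and probabilities below are invented purely for the example and do not come from real process data), the R sketch below uses multinomial likelihoods for observed temperature-category counts under two hypothetical pressure regimes and applies Bayes’ Rule to obtain posterior probabilities for the regimes.

# Hypothetical Bayes' Rule example with multinomial likelihoods.
# All numbers are illustrative assumptions, not data from a real system.
prior  <- c(low_pressure = 0.5, high_pressure = 0.5)  # prior over the two regimes
p_low  <- c(0.2, 0.3, 0.5)   # P(temperature in {high, med, low} | low pressure)
p_high <- c(0.5, 0.3, 0.2)   # P(temperature in {high, med, low} | high pressure)

counts <- c(6, 3, 1)         # observed counts of {high, med, low} readings

# Multinomial likelihood of the observed counts under each regime:
lik <- c(dmultinom(counts, prob = p_low),
         dmultinom(counts, prob = p_high))

# Bayes' Rule: posterior is proportional to prior times likelihood.
posterior <- prior * lik / sum(prior * lik)
posterior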

PROBLEMS AND SOLUTIONS

CHESS GAME PREDICTION

Suppose two chess players play a series of games in which the probability that Player A wins a given game is 0.40, the probability that Player B wins is 0.35, and the probability that the game ends in a draw is 0.25.

The multinomial distribution can be used to answer questions such as: "If these two chess players played 12 games, what is the probability that Player A would win 7 games, Player B would win 2 games, and the remaining 3 games would be drawn?"

dmultinom(x=c(7,2,3), prob = c(0.4,0.35,0.25))
[1] 0.02483712
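The same value can be checked directly from the multinomial probability mass function:

# Check the dmultinom result directly from the multinomial PMF.
factorial(12) / (factorial(7) * factorial(2) * factorial(3)) *
  0.40^7 * 0.35^2 * 0.25^3
# [1] 0.02483712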

THROWING DICE AS MULTINOMIAL DISTRIBUTION

A multinomial distribution shows the likelihood of the possible results of an experiment with repeated trials in which each trial can result in one of a specified number of outcomes that is greater than two. For example, a multinomial distribution could describe the results of rolling a die, because a die can land on one of six possible values. By contrast, the results of a coin toss would be described using a binomial distribution because there are only two possible results of each toss, heads or tails.

Two additional key characteristics of a multinomial distribution are that the trials it illustrates must be independent (e.g., in the dice experiment, rolling a five does not have any impact on the number that will be rolled next) and the probability of each possible result must be constant (e.g., on each roll, there is a one in six chance of any number on the die coming up).

ROLLING A DIE N = 100 TIMES

# Simulate a single roll of a fair six-sided die.
one.dice <- function(){
  dice <- sample(1:6, size = 1, replace = TRUE)
  return(dice)
}

one.dice()
[1] 2
one.dice()
[1] 1
# Repeat the experiment of 100 rolls four times and plot the four frequency tables.
par(mfrow = c(2, 2))

for (i in 1:4){
  sims <- replicate(100, one.dice())
  print(table(sims))                  # counts of each face
  print(table(sims) / length(sims))   # relative frequencies
  plot(table(sims), xlab = 'Event', ylab = 'Frequency')
}

ROLLING A DIE N = 10000 TIMES

# Repeat with 10000 rolls per experiment; the relative frequencies settle near 1/6.
par(mfrow = c(2, 2))

for (i in 1:4){
  sims <- replicate(10000, one.dice())
  print(table(sims))                  # counts of each face
  print(table(sims) / length(sims))   # relative frequencies
  plot(table(sims), xlab = 'Event', ylab = 'Frequency')
}
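Rather than simulating one roll at a time and tabulating, the face counts for all N rolls can be drawn in a single step with base R's rmultinom; the sketch below does this for the fair-die case.

# Draw the face counts for N rolls of a fair die directly as one multinomial sample.
N <- 10000
counts <- rmultinom(n = 1, size = N, prob = rep(1/6, 6))
counts          # counts of faces 1 through 6; they sum to N
counts / N      # relative frequencies, close to 1/6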

REFERENCES

https://eng.libretexts.org/Bookshelves/Industrial_and_Systems_Engineering/Book%3A_Chemical_Process_Dynamics_and_Controls_(Woolf)/13%3A_Statistics_and_Probability_Background/13.10%3A_Multinomial_Distributions

https://en.wikipedia.org/wiki/Multinomial_distribution

https://rpubs.com/JanpuHou/296336

https://www.britannica.com/science/multinomial-distribution