JOINT = MARGINAL * CONDITIONAL

This will be a whimsical example of the great factorization of joint distributions.

Suppose we pick ten letters from an infinite pile of scrabble tiles, and arrange them face down. What’s the probability that they spell STATISTICS?
This is the probability of getting the right letters in the right order,
where \(j\) is the position in the word,
and \(m_j\) is the count for each of the 26 letters.

\[{p_S}{p_T}{p_A}{p_T}{p_I}{p_S}{p_T}{p_I}{p_C}{p_S} = p_S^3p_T^3p_A^{}p_I^2p_C^{} = \prod\limits_{j = 1}^{26} {p_j^{{m_j}}} \]

Suppose that each letter has chance 1/26. All \(p_j\) = 1/26. The pile being infinite, this is the same as if there just one of each,
but you always put it back and mix them up again: sampling WITH REPLACEMENT.

ourword = "STATISTICS"
wordLength = nchar(ourword)
jointProbability = print( (1/26)^wordLength )
## [1] 7.083804e-15

You’d have to be pretty dang lucky.
Lucking out and getting STATISTICS is the joint distribution of
two lucky events.

Marginal probability

First, you need to select the right number of each letter.
Second, you need to get those letters in the right order.

For Step One, the required number of each letter is given by

byLetter = strsplit(ourword, split="")[[1]]
letterTable = print(table(byLetter))
## byLetter
## A C I S T 
## 1 1 2 3 3
letterFreq = rep(0, 26)
names(letterFreq) = LETTERS
letterFreq[names(letterTable)] = letterTable
print(letterFreq)
## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
## 1 0 1 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 3 3 0 0 0 0 0 0
marginalProbability = dmultinom(
  x = letterFreq, #size = 10, not necessary.
  prob = rep(1/26, 26 ) )
print(marginalProbability)
## [1] 3.570237e-10

This is the multinomial probability:

\[\Pr (\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle\thicksim}$}}{M} = \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle\thicksim}$}}{m} ) = \left( {\begin{array}{*{20}{c}} n \\ {\begin{array}{*{20}{c}} {{m_1}}&{...}&{{m_{26}}} \end{array}} \end{array}} \right)p_1^{{m_1}}...p_{26}^{{m_{26}}} = \frac{{n!}}{{\prod\limits_{j = 1}^{26} {{m_j}!} }}\;\;\prod\limits_{j = 1}^{26} {p_j^{{m_j}}} \]

This is the probability of getting the right number of each letter.
Marginal, in the sense that,
in the worlds in which that lucky thing happens,
still almost none also have the second lucky thing happen.

Conditional probability

Step Two, given the right number of each letter,
what is the conditional probability of putting the letters in the right order?
Each permutation is equally likely, so this is the proportion of permutations which are correct:

\[\frac{{\Pr ({\text{correct letter order}})}}{{\Pr (\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle\thicksim}$}}{M} = \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle\thicksim}$}}{m} )}} = \frac{{\prod\limits_{j = 1}^{26} {p_j^{{m_j}}} }}{{\frac{{n!}}{{\prod\limits_{j = 1}^{26} {{m_j}!} }}\;\;\prod\limits_{j = 1}^{26} {p_j^{{m_j}}} }} = \frac{{\prod\limits_{j = 1}^{26} {{m_j}!} }}{{n!}}\]

which is:

nDistinctLetters = length(letterTable)
conditionalProbability = print(
  prod(factorial(letterTable)) / 
    factorial(wordLength)
)
## [1] 1.984127e-05

This makes sense, because
\(n!\)
is the total number of permutations of the \(n\) slots,
and the numerator \(\prod\limits_{j = 1}^{26} {{m_j}!}\)
is the number of permutations that match
the correct slots for the word STATISTICS.

Pretty unlikely, but 10^5 more likely than the first event.

Confirming on the boundary cases:

Let’s check our work:

JOINT = MARGINAL * CONDITIONAL, really?

marginalProbability * conditionalProbability
## [1] 7.083804e-15

Verify:

jointProbability
## [1] 7.083804e-15

Sure enough.