I use Markov Chains to solve Michael Harris’s coin toss problem:
Let \(X_t\) denote the result of the \(t^\text{th}\) coin toss. The state space of this random variable is \(\{\text{H}, \text{T}\}\). We make two assumptions: the coin is fair, so each toss lands heads or tails with probability \(1/2\), and the tosses are independent of one another.
We do not build the Markov chain with \(X_t\), but rather with \(Y_t\): \[ Y_t = \begin{cases} X_t & t=1 \\ X_{t-1}X_t & t=2 \\ X_{t-2}X_{t-1}X_t & t \geq 3 \\ \end{cases} \] Note that here we are not multiplying, but rather “concatenating” the last (up to) three coin tosses into a string. The state space of \(Y_t\) is \(\{\text{H}, \text{HT}, \text{HH}, \text{HTT}, \text{HTH}, \text{HHT}, \text{HHH}, \text{TTT}, \text{TTH}, \text{THT}, \text{THH}, \text{TT}, \text{TH}, \text{T} \}\). The associated Markov chain, expressed as a graph, is:
The graph looks a bit chaotic, but we can see how the one-letter states lead to two-letter states and these, in turn, to three-letter states. From then on, the MC loops among the three-letter states. In this Markov chain, the three-letter states form a recurrent class, while the one- and two-letter states are transient. The probability transition matrix of this MC is: \[ P = \begin{bmatrix} 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 \\ \end{bmatrix} \] where the order of the states is the same as in the state space list. We save this matrix to a variable:
P <- c(
# H HT HH HTT HTH HHT HHH TTT TTH THT THH TT TH T
0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # H
0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # HT
0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # HH
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, # HTT
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, # HTH
0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # HHT
0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # HHH
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, # TTT
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, # TTH
0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # THT
0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # THH
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, # TT
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, # TH
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0 # T
)
P <- matrix(P, byrow = TRUE, nrow = 14)
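As a quick sanity check (my addition, not part of the original solution), we can confirm that \(P\) is a valid transition matrix, i.e. that every row sums to one; the vector states below is just a label vector I introduce for readability:
# Sanity check: every row of P should sum to exactly 1.
states <- c("H", "HT", "HH", "HTT", "HTH", "HHT", "HHH",
            "TTT", "TTH", "THT", "THH", "TT", "TH", "T")
setNames(rowSums(P), states)   # each entry should be 1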
The riddle asks for the probability of a streak of three heads (\(\text{HHH}\)) or a streak of three tails (\(\text{TTT}\)) after having tossed the coin 20 times. To answer this, we define a new MC in which the states \(\text{HHH}\) and \(\text{TTT}\) are absorbing. The probability transition matrix of this new MC is:
\[ Q = \begin{bmatrix} 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 \\ \end{bmatrix} \]
We also save this to a variable:
Q <- P
Q[7, ] <- 0     # clear the outgoing probabilities of HHH (state 7)...
Q[7, 7] <- 1    # ...and make it absorbing
Q[8, ] <- 0     # same for TTT (state 8)
Q[8, 8] <- 1
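Rows 7 (\(\text{HHH}\)) and 8 (\(\text{TTT}\)) are now absorbing; a quick check (again my addition) confirms that \(Q\) is still a valid transition matrix:
# Q should remain stochastic, with HHH and TTT absorbing.
all(rowSums(Q) == 1)            # TRUE
Q[7, 7] == 1 && Q[8, 8] == 1    # TRUE: both states map to themselves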
The 20-step transition probability matrix is the \(20^\text{th}\) power of \(Q\): \[ Q^{20} = \underbrace{QQ \dots QQ}_\text{20 factors} \] Let’s calculate this matrix, using repeated squaring to keep the number of multiplications down:
Q_2  <- Q %*% Q       # Q^2
Q_4  <- Q_2 %*% Q_2   # Q^4
Q_8  <- Q_4 %*% Q_4   # Q^8
Q_16 <- Q_8 %*% Q_8   # Q^16
Q_20 <- Q_16 %*% Q_4  # Q^20 = Q^16 Q^4
round(Q_20, 4)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## [1,] 0 0 0 0.0032 0.0032 0.0020 0.5630 0.4201 0.0020 0.0032 0.0032
## [2,] 0 0 0 0.0032 0.0032 0.0020 0.4201 0.5630 0.0020 0.0032 0.0032
## [3,] 0 0 0 0.0020 0.0020 0.0012 0.7091 0.2805 0.0012 0.0020 0.0020
## [4,] 0 0 0 0.0020 0.0020 0.0012 0.2805 0.7091 0.0012 0.0020 0.0020
## [5,] 0 0 0 0.0032 0.0032 0.0020 0.5630 0.4201 0.0020 0.0032 0.0032
## [6,] 0 0 0 0.0032 0.0032 0.0020 0.4201 0.5630 0.0020 0.0032 0.0032
## [7,] 0 0 0 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000
## [8,] 0 0 0 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000
## [9,] 0 0 0 0.0032 0.0032 0.0020 0.5630 0.4201 0.0020 0.0032 0.0032
## [10,] 0 0 0 0.0032 0.0032 0.0020 0.4201 0.5630 0.0020 0.0032 0.0032
## [11,] 0 0 0 0.0020 0.0020 0.0012 0.7091 0.2805 0.0012 0.0020 0.0020
## [12,] 0 0 0 0.0020 0.0020 0.0012 0.2805 0.7091 0.0012 0.0020 0.0020
## [13,] 0 0 0 0.0032 0.0032 0.0020 0.5630 0.4201 0.0020 0.0032 0.0032
## [14,] 0 0 0 0.0032 0.0032 0.0020 0.4201 0.5630 0.0020 0.0032 0.0032
## [,12] [,13] [,14]
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 0 0 0
## [4,] 0 0 0
## [5,] 0 0 0
## [6,] 0 0 0
## [7,] 0 0 0
## [8,] 0 0 0
## [9,] 0 0 0
## [10,] 0 0 0
## [11,] 0 0 0
## [12,] 0 0 0
## [13,] 0 0 0
## [14,] 0 0 0
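For reference, the repeated-squaring steps above can be replaced by a single fold over 20 copies of \(Q\); this is just an alternative sketch of my own (slower, but harder to get wrong), and Q_20_alt is a name I introduce here:
# Alternative: Q^20 as a left-to-right fold of 20 matrix multiplications.
Q_20_alt <- Reduce(`%*%`, replicate(20, Q, simplify = FALSE))
all.equal(Q_20_alt, Q_20)   # TRUE, up to floating-point error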
Then, for instance, the probability of being absorbed in \(\text{HHH}\) (that is, of seeing three heads in a row before three tails in a row) after 20 coin tosses, given the first toss was tails, is the element in the row corresponding to \(\text{T}\) (the last row) and in the column corresponding to \(\text{HHH}\) (the \(7^\text{th}\) column): 0.4201.
The probability of encountering 3 heads in a row or 3 tails in a row after 20 coin tosses, given the first toss was tails, is therefore
Q_20[14,7] + Q_20[14,8]
## [1] 0.9831095
By symmetry, this number is the same given the first toss was heads, but to be sure:
Q_20[1,7] + Q_20[1,8]
## [1] 0.9831095
Denote by \(\mathcal{A}\) the event of encountering 3 heads or 3 tails in a row. We can then write \[ \mathbb{P}(\mathcal{A}|Y_1 = \text{H}) = \mathbb{P}(\mathcal{A}|Y_1 = \text{T}) = 0.9831095 \]
Finding the unconditional probability \(\mathbb{P}(\mathcal{A})\) is easy, since the state space of \(Y_1\) is just \(\{ \text{H}, \text{T}\}\) and each of the two outcomes has probability \(1/2\): \[ \begin{align} \mathbb{P}(\mathcal{A}) &= \mathbb{P}(\mathcal{A}|Y_1 = \text{H})\mathbb{P}(Y_1 = \text{H}) + \mathbb{P}(\mathcal{A}|Y_1 = \text{T})\mathbb{P}(Y_1 = \text{T})\\ &= \mathbb{P}(\mathcal{A}|Y_1 = \text{H})\frac{1}{2} + \mathbb{P}(\mathcal{A}|Y_1 = \text{T})\frac{1}{2} \\ &= 0.9831095 \end{align} \]
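Equivalently, this averaging over the two possible first tosses can be written as an initial distribution vector multiplied by \(Q^{20}\); the following is a small sketch of my own (pi0 and pi20 are names I introduce):
# Initial distribution of Y_1: state H (index 1) or state T (index 14), each with prob. 1/2.
pi0 <- rep(0, 14)
pi0[c(1, 14)] <- 0.5
pi20 <- pi0 %*% Q_20    # distribution after 20 transitions
pi20[7] + pi20[8]       # mass in HHH and TTT; matches the 0.9831095 above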
This method does not generalize easily in the direction of “what is the probability of \(n\) heads or \(n\) tails in a row?”, since the state space (and hence the transition matrix) grows exponentially with \(n\). However, we can easily generalize the number of coin tosses and see how the probability varies. There is almost certainly a closed-form expression for this function, but I’ll leave that for another time. We loop through each time step and retrieve the desired probabilities \(\mathbb{P}(\mathcal{A}_t)\), where \(\mathcal{A}_t\) is the event that, after \(t\) tosses, we have encountered at least 3 heads or 3 tails in a row.
prob <- matrix(nrow = 40, ncol = 1)   # prob[i] will store P(A_i)
temp <- diag(1, 14)                   # identity matrix; accumulates Q^i
for (i in 1:40){
  temp <- temp %*% Q                  # temp is now Q^i
  prob[i] <- 0.5*(temp[1,7]+temp[14,7]) + 0.5*(temp[1,8]+temp[14,8])
}
plot(prob, type = "l",
xlab = "t", ylab = expression(P(A[t])))
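As a final consistency check (my addition), the 20th entry of prob should reproduce the value obtained from \(Q^{20}\) above:
prob[20]   # expected: 0.9831095, the same value as Q_20[14,7] + Q_20[14,8]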