Smith is in jail and has 1 dollar; he can get out on bail if he has 8 dollars.
A guard agrees to make a series of bets with him. If Smith bets A dollars,
he wins A dollars with probability .4 and loses A dollars with probability .6.
Find the probability that he wins 8 dollars before losing all of his money if
(a) he bets 1 dollar each time (timid strategy);
(b) he bets, each time, as much as possible but not more than necessary to bring his fortune up to 8 dollars (bold strategy).
Under the timid strategy, Smith’s wealth follows a Markov chain with 9 nodes representing the possible wealth levels. We describe and visualize that Markov chain below and then calculate the probabilities of being absorbed into either absorbing state.
Smith’s wealth at time \(t \geq 1\) depends on his wealth at time \(t-1\) and on the outcome of the bet at time \(t\). In the Markov chain, the nodes are the wealth levels, which range from 0 (no wealth) to 8 (winning the game).
The wealth nodes 0 and 8 are absorbing states: once Smith’s wealth reaches either one, the Markov chain never leaves it. Below, we depict the transition probabilities between nodes as directed edges of a digraph.
We will seek the probability that starting from node 1, Smith reaches the node 8 of the Markov chain.
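To make the chain concrete, here is a minimal sketch (our own addition) that builds the full \(9 \times 9\) transition matrix \(P\) over the wealth states 0 through 8 and draws the chain as a digraph; it assumes the igraph package, which is not used elsewhere in this analysis.
P = matrix( 0, nrow = 9, ncol = 9, dimnames = list(0:8, 0:8) )
P["0","0"] = 1   # absorbing state: broke
P["8","8"] = 1   # absorbing state: bail reached
for ( w in c(1:7) )   # transient wealth levels
{
  P[as.character(w), as.character(w+1)] = 0.4  # win the 1-dollar bet
  P[as.character(w), as.character(w-1)] = 0.6  # lose the 1-dollar bet
}
rowSums(P)   # sanity check: every row should sum to 1
library(igraph)
g = graph_from_adjacency_matrix( P, mode = "directed", weighted = TRUE )
plot( g, edge.label = E(g)$weight, layout = layout_in_circle(g) )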
Using the canonical form for absorbing Markov chains described in Section 11.2 of the Grinstead and Snell textbook, the transition matrix \(P\) has size \(9 \times 9\); its submatrix \(Q\) of transitions between transient states has size \(7 \times 7\), and its submatrix \(R\) of transitions from transient to absorbing states has size \(7 \times 2\).
Using Theorem 11.6 of the textbook, we need to calculate the \(7 \times 2\) matrix \(B\) whose entry \(b_{ij}\) is the probability of being absorbed into absorbing state \(j\) when starting from transient state \(i\). The theorem states that \[B = NR,\] where \(N = (I - Q)^{-1}\) is the \(7 \times 7\) fundamental matrix and \(R\) is as described above.
We write out explicitly what \(R\), \(Q\) look like and show the equations tying them to \(B\).
\[ R = \left[ \begin{array}{rr} 0.6 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0.4 \\ \end{array} \right] , Q = \left[ \begin{array}{rrrrrrr} 0 & 0.4 & 0 & 0 & 0 & 0 & 0 \\ 0.6 & 0 & 0.4 & 0 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0.4 & 0 & 0 & 0 \\ 0 & 0 & 0.6 & 0 & 0.4 & 0 & 0 \\ 0 & 0 & 0 & 0.6 & 0 & 0.4 & 0 \\ 0 & 0 & 0 & 0 & 0.6 & 0 & 0.4 \\ 0 & 0 & 0 & 0 & 0 & 0.6 & 0 \\ \end{array} \right] \]
\[ B = NR = (I - Q )^{-1}R \]
Observe that the desired answer will be the \((1,2)\) entry of \(B\) once we do the matrix calculations. That entry is the probability of transitioning from starting wealth 1 to absorbing wealth 8; in other words, it gives the probability of the timid strategy succeeding.
Let’s do the above calculations in R below by first constructing \(Q\) and \(N\).
I = diag(7) # identity matrix 7x7
Q = I # initialize as the identity matrix but add off diagonal values
for ( j in c(2:7) )
{
Q[j,j-1] = 0.6
Q[j-1,j] = 0.4
}
(Q = Q - I) # Subtract the main diagonal 1's
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0.0 0.4 0.0 0.0 0.0 0.0 0.0
## [2,] 0.6 0.0 0.4 0.0 0.0 0.0 0.0
## [3,] 0.0 0.6 0.0 0.4 0.0 0.0 0.0
## [4,] 0.0 0.0 0.6 0.0 0.4 0.0 0.0
## [5,] 0.0 0.0 0.0 0.6 0.0 0.4 0.0
## [6,] 0.0 0.0 0.0 0.0 0.6 0.0 0.4
## [7,] 0.0 0.0 0.0 0.0 0.0 0.6 0.0
(N = solve( I - Q )) # the fundamental matrix
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1.6328311 1.054718 0.6693101 0.4123711 0.2410785 0.1268834 0.05075337
## [2,] 1.5820777 2.636796 1.6732752 1.0309278 0.6026963 0.3172086 0.12688343
## [3,] 1.5059477 2.509913 3.1792228 1.9587629 1.1451229 0.6026963 0.24107851
## [4,] 1.3917526 2.319588 2.9381443 3.3505155 1.9587629 1.0309278 0.41237113
## [5,] 1.2204600 2.034100 2.5765266 2.9381443 3.1792228 1.6732752 0.66931007
## [6,] 0.9635210 1.605868 2.0340999 2.3195876 2.5099128 2.6367962 1.05471848
## [7,] 0.5781126 0.963521 1.2204600 1.3917526 1.5059477 1.5820777 1.63283109
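As a quick sanity check (our addition, not part of the textbook derivation), we can confirm numerically that \(N\) really is the inverse of \(I - Q\):
round( N %*% (I - Q), digits = 10 )  # should reproduce the 7x7 identity matrix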
Next construct \(R\):
R = matrix( 0, nrow = 7, ncol = 2)
R[1,1] = 0.6
R[7,2] = 0.4
(R) # display the matrix
## [,1] [,2]
## [1,] 0.6 0.0
## [2,] 0.0 0.0
## [3,] 0.0 0.0
## [4,] 0.0 0.0
## [5,] 0.0 0.0
## [6,] 0.0 0.0
## [7,] 0.0 0.4
(B = N %*% R)
## [,1] [,2]
## [1,] 0.9796987 0.02030135
## [2,] 0.9492466 0.05075337
## [3,] 0.9035686 0.09643140
## [4,] 0.8350515 0.16494845
## [5,] 0.7322760 0.26772403
## [6,] 0.5781126 0.42188739
## [7,] 0.3468676 0.65313243
(prob_reaching_8 = B[1,2])
## [1] 0.02030135
(prob_reaching_0 = B[1,1])
## [1] 0.9796987
We conclude that the probability of the timid strategy reaching 8 (and Smith winning) is \(2.03\%\) while the probability of ending up broke is \(97.97\%\).
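As an independent cross-check of the matrix calculation (our own addition), the classical gambler’s-ruin formula gives the probability of reaching 8 from starting wealth 1 before hitting 0 as \[ \frac{1 - (q/p)^{1}}{1 - (q/p)^{8}}, \qquad p = 0.4,\ q = 0.6. \] Evaluating it in R reproduces the \((1,2)\) entry of \(B\).
p = 0.4; q = 0.6
(gamblers_ruin_check = (1 - (q/p)^1) / (1 - (q/p)^8))  # equals 128/6305, approximately 0.0203, matching B[1,2]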
The bold strategy does not require a Markov chain; we can analyze it directly with a simple decision tree of the possible scenarios. Smith’s starting wealth is 1. If he bets his entire wealth each time and wins every bet, the chain of events, however unlikely, proceeds as follows:
On toss 1, Smith doubles his wealth from 1 to 2. On toss 2, Smith doubles his wealth from 2 to 4. On toss 3, Smith doubles his wealth from 4 to 8, thereby getting out of jail.
In every other scenario, Smith loses at toss 1, 2, or 3 and ends the game broke.
The probability of winning tosses 1, 2, and 3 in a row is:
(prob_winning_8 = 0.4 * 0.4 * 0.4)
## [1] 0.064
We conclude that the probability of winning 3 tosses in a row (and getting out of jail) is \(6.4\%\).
Smith should prefer the bold strategy, which gives him a \(6.4\%\) probability of winning and is roughly three times as likely to succeed as the timid strategy.
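For readers who want an empirical corroboration of both numbers, here is a short Monte Carlo sketch (our own addition; the function name, seed, and number of trials are arbitrary choices) that simulates the two betting strategies directly.
simulate_strategy = function( bold, goal = 8, start = 1, p_win = 0.4 )
{
  wealth = start
  while ( wealth > 0 && wealth < goal )
  {
    bet = if (bold) min( wealth, goal - wealth ) else 1  # bold: bet everything, capped at what is needed
    wealth = wealth + ifelse( runif(1) < p_win, bet, -bet )
  }
  wealth == goal  # TRUE if Smith reaches the bail amount
}
set.seed(1)
n_trials = 100000
mean( replicate( n_trials, simulate_strategy( bold = FALSE ) ) )  # should be close to 0.0203
mean( replicate( n_trials, simulate_strategy( bold = TRUE ) ) )   # should be close to 0.064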