a) The Madrid pollution protocol can be modeled as a Markov chain, since the probability of visiting a state depends only on the state of the previous step (no additional information is needed).
The following flow chart shows the possible transitions between the different states:
This chain has 6 states: No pollution, Scenarios 1, 2, 3 and 4, and the Alert state. The transitions are described in the problem statement. From the No pollution state, the chain can stay in the same state, move to Scenario 1, or jump directly to the Alert state. From each numbered scenario, the chain can move to the next scenario (increasing the number by one), to the Alert state, or back to the No pollution state. In addition, Scenario 4 can revisit itself if the pollution level is maintained for more than 4 days, since the number cannot increase further. Finally, from the Alert state the chain can only return to the No pollution state or stay in the Alert state.
This finite-state Markov chain has a single communication class, since all its states communicate, so it is irreducible and has a unique stationary distribution. Whether this is also the limiting distribution is checked in Section e).
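As an illustration (a minimal sketch, not part of the required solution), the structure of the flow chart can be encoded in R as a 0/1 matrix, where a 1 marks an allowed transition:

```r
# Sketch of the flow chart: 1 = transition allowed by the protocol
states <- c("NR", "Sc1", "Sc2", "Sc3", "Sc4", "Alert")
allowed <- matrix(c(
  1, 1, 0, 0, 0, 1,  # NR: stay, Sc1, or Alert
  1, 0, 1, 0, 0, 1,  # Sc1: NR, Sc2, or Alert
  1, 0, 0, 1, 0, 1,  # Sc2: NR, Sc3, or Alert
  1, 0, 0, 0, 1, 1,  # Sc3: NR, Sc4, or Alert
  1, 0, 0, 0, 1, 1,  # Sc4: NR, stay, or Alert
  1, 0, 0, 0, 0, 1   # Alert: NR or stay
), nrow = 6, byrow = TRUE, dimnames = list(states, states))
```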
b) The maximum likelihood estimation was done using an R script, which can be found in the Appendix: Question 1. It converts the character data into state indices and counts the transitions between the six states in a matrix. Each row is then divided by the total number of transitions from that state. The resulting transition matrix is shown below:
| | NR | Sc1 | Sc2 | Sc3 | Sc4 | Alert |
|---|---|---|---|---|---|---|
| NR | 0.9631 | 0.0369 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Sc1 | 0.4314 | 0.0000 | 0.5294 | 0.0000 | 0.0000 | 0.0392 |
| Sc2 | 0.4074 | 0.0000 | 0.0000 | 0.4444 | 0.0000 | 0.1481 |
| Sc3 | 0.5833 | 0.0000 | 0.0000 | 0.0000 | 0.4167 | 0.0000 |
| Sc4 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.1667 | 0.3333 |
| Alert | 0.8889 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.1111 |
The transitions we found in Section a) are mostly consistent with the ones found in the data, with the exception of the transitions from No pollution and Scenario 3 to the Alert state. Although these transitions are possible theoretically, we did not find any examples in our dataset.
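As a quick sanity check (a sketch assuming the stoch_mat object built by the appendix script), every row of the estimated matrix should sum to 1:

```r
# Each row of a stochastic matrix must sum to 1 (up to rounding)
rowSums(stoch_mat)
```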
c) We know that during the first week the pollution states are one day of Scenario 1 followed by 6 days of No pollution. Thus, given that the initial distribution is concentrated on No pollution (\(\alpha = (1,0,0,0,0,0)\)), we need to calculate:
\[ P_{\text{NR},\text{Sc1}} \, P_{\text{Sc1},\text{NR}} \, P_{\text{NR},\text{NR}}^{5} = 0.0369 \times 0.4314 \times {0.9631}^{5} = 0.0132 \]
So the probability of observing Scenario 1 on the first day (starting from No pollution), followed by 6 days of No pollution, is 0.0132.
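This probability can also be computed directly in R (a sketch assuming the stoch_mat object from the appendix script, which has named rows and columns):

```r
# P(NR -> Sc1) * P(Sc1 -> NR) * P(NR -> NR)^5
stoch_mat["NR", "Sc1"] * stoch_mat["Sc1", "NR"] * stoch_mat["NR", "NR"]^5
# approximately 0.0132
```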
d) The stationary distribution of the transition matrix can be computed using matrix operations as follows:
\[ \pi P^* = (1,0,0,0,0,0)\]
Where \(P^*\) is the transition matrix after subtracting the identity matrix and replacing the first column with a vector of 1s. Then, solving this linear system, we obtain the stationary distribution \(\pi\). For these computations, we used the script developed in class, which can also be found in the Appendix: Question 1.
| NR | Sc1 | Sc2 | Sc3 | Sc4 | Alert |
|---|---|---|---|---|---|
| 0.9294 | 0.0343 | 0.0182 | 0.0081 | 0.0040 | 0.0061 |
To implement the estimation directly from the data, we used the definition of \(\pi\), which ends up being the proportion of time spent at each state. With that in mind, we measured the number of times a given state occurred, and divided it by the total number of samples (1460). This produced the following table:
| NR | Sc1 | Sc2 | Sc3 | Sc4 | Alert |
|---|---|---|---|---|---|
| 0.9281 | 0.0349 | 0.0185 | 0.0082 | 0.0041 | 0.0062 |
As we can see, the results are quite similar to those obtained from the transition matrix. The small differences can be attributed to the finite length of the sample and to the time the chain takes to settle into the stationary distribution: the first days of the trajectory slightly skew the empirical proportions.
e) Given that we have a finite-state irreducible Markov chain, there is a unique stationary distribution, as explained in Section a). For this distribution to also be the limiting distribution, we must check aperiodicity and positive recurrence.
Since both properties are class properties, it is enough to check a single state, so we look at the No pollution state. It is aperiodic because it can transition to itself in one step (\(P_{\text{NR},\text{NR}} > 0\)), so its period is 1; and it is positive recurrent because every state of a finite irreducible chain is positive recurrent. Thus, the unique stationary distribution is also the limiting distribution.
This means that, regardless of the initial state, the pollution episodes will eventually follow a fixed distribution, the limiting distribution. Following our calculations in Section d), most days there will be no pollution, and the few days of Scenario 1 rarely develop into worse states. Thus, the high-risk scenarios (Scenario 3, Scenario 4 and Alert) are very unlikely.
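To illustrate this convergence numerically (a sketch assuming the stoch_mat object from the appendix script), we can raise the transition matrix to a large power and check that every row approaches the stationary distribution:

```r
# P^100: all rows should be approximately equal to pi
P100 <- Reduce(`%*%`, replicate(100, stoch_mat, simplify = FALSE))
round(P100, 4)
```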
f) In order to calculate the expected number of days of Scenario 3 or worse, we must first obtain the limiting distribution of the Markov chain. However, we know from Exercise e) that it will be our stationary distribution, which we found in Exercise d). Thus, we use it to calculate:
\[ (\pi_{Sc3}+\pi_{Sc4}+\pi_{Alert}) \times 365 = (0.0081 + 0.0040 + 0.0061) \times 365 = 0.0182 \times 365 = 6.6430 \text{ days}\]
Thus, in a single year, we expect driving to be forbidden approximately 7 days because of pollution.
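The same quantity can be computed in one line (a sketch assuming the StationaryDistribution vector from the appendix script, with states ordered NR, Sc1, ..., Alert):

```r
# Expected days per year in Scenario 3, Scenario 4 or Alert (entries 4 to 6)
sum(StationaryDistribution[4:6]) * 365  # approximately 6.64 days
```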
a) In order to obtain the stationary distribution of the chain (for \(p \neq 0, 1\)), we can use the following scheme (Dobrow, Stochastic Processes with R):
First, find a non-negative vector \(x\) satisfying \(xP = x\), where \(P\) is the transition matrix, by fixing the first component and solving the system for \(x = (1, x_{2}, x_{3}, \ldots)\).
Then, the unique probability vector is \(\pi = cx\) with \(c = \frac{1}{\sum_{j}x_{j}}\), so the unique stationary distribution is:
\[\pi = \frac{1}{1 + x_{2} + x_{3} + \cdots}\,(1, x_{2}, x_{3}, \ldots)\]
In our case, the system \(xP = x\) with \(x = (1, x_{2}, x_{3},\ldots)\) reads:
\[\begin{align*} p + px_{2} + px_{3} + px_{4} + ... &= 1 \\ (1 - p) &= x_{2} \\ (1 - p)x_{2} &= x_{3} \\ (1 - p)x_{3} &= x_{4} \\ \vdots &= \vdots \end{align*}\]
It is important to observe the following pattern for the variables \(x_{2}\), \(x_{3}\) and so on:
\[\begin{align*} x_{2} &= (1 - p) \\ x_{3} &= {(1 - p)}^{2} \\ x_{4} &= {(1 - p)}^{3} \\ \vdots &= \vdots \end{align*}\]
As a result, \(x_{n} = {(1 - p)}^{n-1}\), and we obtain the solution \(x = (1, (1-p), {(1-p)}^2,\ldots)\).
Finally:
\[1 + x_{2} + x_{3} + ... = 1 + \sum_{n \geq 2} x_{n} = 1 + \sum_{n \geq 2} {(1 - p)}^{n-1} = \sum_{n \geq 1} {(1 - p)}^{n-1}\]
Since \(|1 - p| < 1\), we have that:
\[\sum_{n \geq 1} {(1 - p)}^{n-1} = \frac{1}{1 - (1-p)} = \frac{1}{p}\] As a result, the stationary distribution is given by:
\[\begin{align*} \pi &= \frac{1}{\sum_{n \geq 1} x_{n} } (1, x_{2}, x_{3},...) \\ &= p(1, (1-p), {(1-p)}^{2}, {(1-p)}^{3},... ) \\ &= (p, p(1-p), p{(1-p)}^{2}, p{(1-p)}^{3},... ) \end{align*}\]
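As a numerical check (a minimal sketch, not required by the exercise; the values of \(p\) and the truncation level \(N\) are arbitrary choices), we can truncate the infinite chain at a large state \(N\) and verify that the geometric vector is stationary:

```r
# Truncate the chain at state N and check that pi P = pi
p <- 0.3; N <- 50
P <- matrix(0, N + 1, N + 1)
P[, 1] <- p                          # from any state, return to 0 w.p. p
for (i in 1:N) P[i, i + 1] <- 1 - p  # otherwise move one state to the right
P[N + 1, 1] <- 1                     # truncation: the last state returns to 0
pi_geom <- p * (1 - p)^(0:N)         # candidate stationary distribution
max(abs(pi_geom %*% P - pi_geom))    # essentially 0 (up to truncation error)
```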
In part b), we will prove that this Markov chain is irreducible and ergodic. As a result, the expected return time to state \(0\) is given by:
\[\mu_{0} = E[T_{0} \mid X_{0} = 0] = \frac{1}{\pi_{0}} = \frac{1}{p}\]
b) In order to determine the number of communication classes of this chain, observe that it is possible to go from any state to any other state in a finite number of steps. Indeed, take states \(i\) and \(j\):
If \(i < j\), you can go from \(i\) to \(j\) in \(j - i\) steps, moving one state to the right at each step.
If \(i \geq j\), you can go from \(i\) to \(j\) in \(j + 1\) steps: first return to \(0\), then move \(j\) states to the right. For example, going from \(i = 5\) to \(j = 2\) takes \(3\) steps.
As a result, there is only one communication class, so the chain is irreducible.
To classify the states, by irreducibility it is enough to study one of them. Taking state \(i = 0\), note that the first return to \(0\) happens at step \(n\) exactly when the chain moves right \(n-1\) times and then jumps back, so \(f_{00}^{(n)} = p{(1-p)}^{n-1}\). The probability of ever returning to state \(0\) is then:
\[\begin{align*} f_{0} &= \sum_{n \geq 1} f_{00}^{(n)} \\ &= p + p{(1-p)} + p{(1-p)}^{2} + p{(1-p)}^{3} + \cdots \\ &= p \sum_{n \geq 1} {(1-p)}^{n-1} \\ &= p \cdot \frac{1}{1 - (1-p)} \\ &= 1 \end{align*}\]
So, the probability of revisiting state \(i = 0\) is 1, and the chain is recurrent.
We can also prove that this chain is positive recurrent. By irreducibility, it is enough to prove for one state (we again take \(i = 0\)) that the expected return time is finite.
The expected return time to state \(i = 0\) is given by
\[E(T_{0} \mid X_{0} = 0) = \sum_{n \geq 1} n f_{00}^{(n)} = \sum_{n \geq 1} np{(1-p)}^{n-1}\] Taking \(a_{n} = np{(1-p)}^{n-1}\) and applying the ratio test for series, we see that:
\[\begin{align*} \lim_{n \to \infty } \frac{|a_{n+1}|}{|a_{n}|} &= \lim_{n \to \infty } \frac{(n+1)p{(1-p)}^{n}}{np{(1-p)}^{n-1}} \\ &= \lim_{n \to \infty } \frac{(n+1){(1-p)}}{n} \\ &= 1-p \\ &< 1 \end{align*}\]
As a result, the series \(\sum_{n \geq 1} np{(1-p)}^{n-1}\) converges and the expected return time to state \(i = 0\) is finite.
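In fact, the series has a closed form: \(f_{00}^{(n)} = p{(1-p)}^{n-1}\) is the probability mass function of a geometric distribution with success probability \(p\), whose mean is
\[E(T_{0} \mid X_{0} = 0) = \sum_{n \geq 1} np{(1-p)}^{n-1} = \frac{p}{{(1-(1-p))}^{2}} = \frac{1}{p},\]
in agreement with the expected return time \(\mu_{0} = 1/p\) obtained in part a).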
Finally, the chain is aperiodic: the greatest common divisor of the possible return times to state \(i = 0\) is 1, since \(P_{00} = p > 0\) allows a return in a single step. Because \(0\) is an aperiodic state and the chain is irreducible, all states are aperiodic.
Summing up: since the chain is irreducible and positive recurrent, the stationary distribution \(\pi = (p, p(1-p), p{(1-p)}^{2}, p{(1-p)}^{3},\ldots)\) is unique. Moreover, since the chain is also aperiodic, it is ergodic, so the unique stationary distribution is the limiting distribution of the chain. In summary, for all states \(i\) and \(j\):
\[\pi_{j} = \lim_{n \to \infty}(P^{n})_{ij}\]
c) In order to create this simulation, we wrote an R function called simulation with three parameters: the transition probability \(p\), the initial state \(x_0\), and the number of steps \(n\).
The core of this function is the R function sample(), used with probability \(p\) of returning to state \(0\) and probability \(1-p\) of moving from the current state \(i\) to state \(i+1\). Details of the code can be found in the appendix.
One example of an application of this function uses the following parameters:
Transition probability: \(p = 0.5\)
Initial position: \(x_0 = 25\)
Number of steps: \(n = 1000\)
The resulting trajectory is shown in the figure below.
Figure: Example of the function simulation
d) In this exercise, the probability is given by:
\[p = \frac{1}{k+1} = \frac{1}{4+1} = \frac{1}{5}\] Moreover, we generate \(8\) trajectories of length \(1000\) for the following initial values:
| Initial values |
|---|
| 100 |
| 228 |
| 357 |
| 485 |
| 614 |
| 742 |
| 871 |
| 1000 |
The trajectories are represented in the figure below.
Figure: Simulation of 8 trajectories of length 1000
In conclusion, independently of the starting point, since this chain has a limiting distribution, all trajectories display the same long-run behavior after a few steps. We can also check the theoretical expected return time to \(0\), which is \(1/0.2 = 5\) steps. For each initial value \(x_{0}\), the empirical sample mean of the number of steps needed to return to state 0 is shown in the following table (a sketch of this computation follows the table).
| Initial Value | Empirical return to zero |
|---|---|
| 100 | 5.0000 |
| 228 | 4.5972 |
| 357 | 4.6215 |
| 485 | 5.0821 |
| 614 | 4.6355 |
| 742 | 4.9799 |
| 871 | 4.8873 |
| 1000 | 4.7990 |
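A minimal sketch of this computation (assuming the results list produced by the appendix script): the return times to 0 are the gaps between consecutive visits to state 0.

```r
# Sample mean of the return times to state 0 for each trajectory
mean_return_time <- function(traj) {
  visits <- which(traj == 0)  # positions of the visits to state 0
  mean(diff(visits))          # average gap between consecutive visits
}
sapply(results, mean_return_time)  # one estimate per initial value
```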
Appendix: Question 1
library(formattable) # Used below to format the output tables
load('PollutionMadrid.RData')
x = X[4,] # Row index is our position in the class list (4 in our case)
trans_mat = matrix(0,nrow= 6, ncol = 6) # Matrix to save the transitions
idx_mat = match(x,c("NR","Sc1","Sc2","Sc3","Sc4","Alert")) # Convert the state labels to integer indices
v = rep(0,6) # Vector to save the total number of transitions per row
# Iterate through the length of the sample vector
for(i in 2:length(x)){
# Increase the count for the transition matrix and vector depending on the previous and current states
v[idx_mat[i-1]] = v[idx_mat[i-1]] +1
trans_mat[idx_mat[i-1],idx_mat[i]] = trans_mat[idx_mat[i-1],idx_mat[i]] +1
}
# Compute the MLE transition matrix using the values obtained above
stoch_mat = trans_mat / v # Divide each row by its total transition count to get the stochastic matrix
stoch_mat = round(stoch_mat,4)
# Format and print the resulting table
colnames(stoch_mat) = c("NR","Sc1","Sc2","Sc3","Sc4","Alert")
rownames(stoch_mat) = c("NR","Sc1","Sc2","Sc3","Sc4","Alert")
formattable(as.data.frame(stoch_mat))
# Obtaining pi from the transition matrix
# Function to calculate the stationary distribution pi
stat_dis = function(P){
n = nrow(P)
A = P - diag(n) # Subtract the identity from the input matrix
A[,1] = rep(1,n) # Replace the first column with ones
b = c(1,rep(0,n-1)) # Create the right-hand side vector (1,0,...,0)
pi = solve(t(A),b) # Solve the linear system for pi
return(pi)
}
# We use the function developed in class, and format the output
StationaryDistribution = round(stat_dis(stoch_mat),4)
formattable(as.data.frame(t(StationaryDistribution)))
# Obtaining pi directly from the data
# Obtain the frequency of each state and divide by the number of samples (1460)
statDisData = t(as.data.frame(round(table(x)/1460,4)))
statDisData = statDisData[,c(2,3,4,5,6,1)] # Reorder the values
formattable(as.data.frame(statDisData)) # Create the output table
Appendix: Question 2
simulation <- function(p, x0, n){
  # p:  transition probability of returning to state 0
  # x0: initial value of the chain
  # n:  number of steps to generate
  x <- x0 # vector of visited states
  for(i in 1:n){
    mov <- sample(x = c(0, x[i] + 1),
                  size = 1,
                  prob = c(p, 1 - p)) # either return to 0 or move one step right
    x <- append(x, mov)
  }
  return(x)
}
#Probability
p <- 0.5
#Initial position
x0 <- 25
# Steps to generate
n <- 1000
# Simple application of the function
set.seed(2020)
X <- simulation(p, x0, n)
plot(X,type = "l", main = "Simulation MC with probability 0.5", xlab = "Time", ylab = "states")
# Vector with initial values in the range
initial_values <- floor(seq(from = 100, to = 1000, length.out = 8))
# Probability group
k <- 4
p <- 1/(k+1)
# Vector of values
results <- list()
plot(1, type="n",
xlab="Time",
ylab="State",
xlim=c(0, 1000),
ylim=c(0, 1100),
main = "Simulation MC with probability p = 0.2 \n 8 different initial values on [100,1000]" )
for(i in 1:8){
  set.seed(i) # different seed for each trajectory
  X <- simulation(p, initial_values[i], n)
  results[[i]] <- X
  points(X, type="l", col=i) # overlay each trajectory in a different color
}