Markov Chain Basics

Markov chains are an essential component of Markov chain Monte Carlo (MCMC) techniques.

Introduction: http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf.

Good reference: https://people.eecs.berkeley.edu/~sinclair/cs294/n2.pdf.

Another one: https://www.cs.cmu.edu/~venkatg/teaching/CStheory-infoage/book-chapter-5.pdf.

In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d. or iid or IID) if each random variable has the same probability distribution as the others and all are mutually independent.

Definition: A stochastic process X1,X2,.,X1,X2,., is a Markov process ( or Markov chain) if for any discrete time index n=1,2,. ,Pr(Xn+1=xn+1|Xn=xn,.,X1=x1)=Pr(Xn+1=xn+1|Xn=xn)

In Markov chain, the probability of a random variable X being equal to some value x at time n + 1, given all the x values that came before it in the sequence, is equal to the probability of X being equal to some value x at time n + 1 given just the value of x that came before it. In other words, X at time n + 1 is only dependent on x at time n, not any other value of x. So in a sequence, you can say that X at time n + 1 is independent of all other x except X at time n.

All values in the Markov Chain are not independent of each other because P(Xn+1|Xn)???P(Xn+1)P(Xn+1|Xn)???P(Xn+1). After enough iterations, the chain (usually) converges in distribution so they would be identically distributed.

Markov chains represent a class of stochastic processes of great interest for the wide spectrum of practical applications. In particular, discrete time Markov chains (DTMC) permit to model the transition probabilities between discrete states by the aid of matrices.

Note: the row sum automatically equals to 1

What is Transition Matrix (Stochastic Matrix)?

library(markovchain)
## Package:  markovchain
## Version:  0.6.9.8-1
## Date:     2017-08-15
## BugReport: http://github.com/spedygiorgio/markovchain/issues
library(diagram)
## Loading required package: shape
# define a transition matrix
tmA <- matrix(c(0.25,0.65,0.1,.25,0.25,.5,.35,.25,0.4),nrow = 3, byrow = TRUE)
# create the DTMC
dtmcA <- new("markovchain",transitionMatrix=tmA, states=c("No Rain","Light Rain","Heavy Rain"), name="MarkovChain A") 
dtmcA
## MarkovChain A 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  No Rain, Light Rain, Heavy Rain 
##  The transition matrix  (by rows)  is defined as follows: 
##            No Rain Light Rain Heavy Rain
## No Rain       0.25       0.65        0.1
## Light Rain    0.25       0.25        0.5
## Heavy Rain    0.35       0.25        0.4
plot(dtmcA)

Visulization by plotmat function from diagram package

stateNames <- c("No Rain","Light Rain","Heavy Rain")
row.names(tmA) <- stateNames; colnames(tmA) <- stateNames
plotmat(tmA,pos = c(1,2), 
        lwd = 1, box.lwd = 2, 
        cex.txt = 0.8, 
        box.size = 0.1, 
        box.type = "circle", 
        box.prop = 0.5,
        box.col = "light blue",
        arr.length=.1,
        arr.width=.1,
        self.cex = .6,
        self.shifty = -.01,
        self.shiftx = .14,
        main = "Markov Chain")

n-steps and steady states distribution

You can simulate states distribution after n-steps and steady states.

# It is possible to simulate states distribution after n-steps
initialState<-c(0,1,0)
steps<-2
finalState<-initialState*dtmcA^steps #using power operator
finalState
##      No Rain Light Rain Heavy Rain
## [1,]     0.3       0.35       0.35
# As well as steady states distribution
steadyStates(dtmcA)
##        No Rain Light Rain Heavy Rain
## [1,] 0.2850877  0.3640351  0.3508772

Forecasting Example: Baby Crying, Eating and Sleeping

Let’s assume a young baby has 3 states, eat, cry and sleep. We watch and record her behavior every hour, and determine the possibility from one state to another to develop her transition matrix.

An we are interested in what is the possibility that after eating this hour, that she will go to sleep without crying in next hour?

Note: the efficient operator %^% from the expm package is used to raise the Oz matrix to the third power.

Fit the pass performance to a transition matrix

library(expm)
## Loading required package: Matrix
## 
## Attaching package: 'expm'
## The following object is masked from 'package:Matrix':
## 
##     expm
library(pracma)
## 
## Attaching package: 'pracma'
## The following objects are masked from 'package:expm':
## 
##     expm, logm, sqrtm
## The following objects are masked from 'package:Matrix':
## 
##     expm, lu, tril, triu
stateNames <- c("Crying","Eating","Sleeping")
Oz <- matrix(c(0.25,0.65,0.1,.25,0.25,.5,.35,.25,0.4),
             nrow=3, byrow=TRUE)
row.names(Oz) <- stateNames; colnames(Oz) <- stateNames
Oz
##          Crying Eating Sleeping
## Crying     0.25   0.65      0.1
## Eating     0.25   0.25      0.5
## Sleeping   0.35   0.25      0.4
plotmat(Oz,pos = c(1,2), 
        lwd = 1, box.lwd = 2, 
        cex.txt = 0.8, 
        box.size = 0.1, 
        box.type = "circle", 
        box.prop = 0.5,
        box.col = "light blue",
        arr.length=.1,
        arr.width=.1,
        self.cex = .4,
        self.shifty = -.01,
        self.shiftx = .13,
        main = "")

Define current state

She is eating now, that means starting vector (0,1,0)

x1 <- matrix(c(0,1,0),nrow=1, byrow=TRUE)

Forecast the state in next hour

Predict what will happen in next hour

x1 %*% Oz
##      Crying Eating Sleeping
## [1,]   0.25   0.25      0.5

That means she has 50% chance will go to sleep in the next hour

Forecast the state after 5 hours

Oz5 <- Oz %^% 5
Oz5=round(Oz5,3)

plotmat(Oz5,pos = c(1,2), 
        lwd = 1, box.lwd = 2, 
        cex.txt = 0.8, 
        box.size = 0.12, 
        box.type = "circle", 
        box.prop = 0.7,
        box.col = "light blue",
        arr.length=.4,
        arr.width=.2,
        self.cex = .6,
        self.shifty = -.01,
        self.shiftx = .17,
        main = "Markov Chain Transition Matrix")

round(x1 %*% Oz5,3)
##      Crying Eating Sleeping
## [1,]  0.285  0.364    0.351

Now she has only 35% chance will go to sleep after 5 hours

Forecasting Example: Rain Fall Time Series

Let’s study rain fall as an example, if we record the past 1096 days rain fall accumulation, classify them into 3 states: (0), (1-5 inches), and (6+ inches). Based on the record, we can fit the transition matrix.

Fit the pass time series to a transition matrix

data(rain)
mysequence<-rain$rain
createSequenceMatrix(mysequence)
##       0 1-5  6+
## 0   362 126  60
## 1-5 136  90  68
## 6+   50  79 124
head(rain)
##   V1 rain
## 1  3   6+
## 2  2  1-5
## 3  2  1-5
## 4  2  1-5
## 5  2  1-5
## 6  2  1-5
myFit<-markovchainFit(data=mysequence,confidencelevel = .9,method = "mle")
myFit
## $estimate
## MLE Fit 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  0, 1-5, 6+ 
##  The transition matrix  (by rows)  is defined as follows: 
##             0       1-5        6+
## 0   0.6605839 0.2299270 0.1094891
## 1-5 0.4625850 0.3061224 0.2312925
## 6+  0.1976285 0.3122530 0.4901186
## 
## 
## $standardError
##              0        1-5         6+
## 0   0.03471952 0.02048353 0.01413498
## 1-5 0.03966634 0.03226814 0.02804834
## 6+  0.02794888 0.03513120 0.04401395
## 
## $confidenceLevel
## [1] 0.9
## 
## $lowerEndpointMatrix
##             0       1-5         6+
## 0   0.6160891 0.2036763 0.09137435
## 1-5 0.4117506 0.2647692 0.19534713
## 6+  0.1618105 0.2672305 0.43371243
## 
## $upperEndpointMatrix
##             0       1-5        6+
## 0   0.7050788 0.2561777 0.1276038
## 1-5 0.5134195 0.3474757 0.2672379
## 6+  0.2334464 0.3572754 0.5465247
## 
## $logLikelihood
## [1] -1040.419
alofiMc<-myFit$estimate
alofiMc
## MLE Fit 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  0, 1-5, 6+ 
##  The transition matrix  (by rows)  is defined as follows: 
##             0       1-5        6+
## 0   0.6605839 0.2299270 0.1094891
## 1-5 0.4625850 0.3061224 0.2312925
## 6+  0.1976285 0.3122530 0.4901186
a11=alofiMc[1,1]
a12=alofiMc[1,2]
a13=alofiMc[1,3]
a21=alofiMc[2,1]
a22=alofiMc[2,2]
a23=alofiMc[2,3]
a31=alofiMc[3,1]
a32=alofiMc[3,2]
a33=alofiMc[3,3]

## Hard code the transition matrix
stateNames <- c("No Rain","Light Rain","Heavy Rain")
ra <- matrix(c(a11,a12,a13,a21,a22,a23,a31,a32,a33),nrow=3, byrow=TRUE)
#ra <- matrix(c(0.660,0.230,0.110,0.463,0.306,0.231,0.198,0.312,0.490),nrow=3, byrow=TRUE)

dtmcA <- new("markovchain",transitionMatrix=ra, states=c("No Rain","Light Rain","Heavy Rain"), name="MarkovChain A") 

dtmcA
## MarkovChain A 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  No Rain, Light Rain, Heavy Rain 
##  The transition matrix  (by rows)  is defined as follows: 
##              No Rain Light Rain Heavy Rain
## No Rain    0.6605839  0.2299270  0.1094891
## Light Rain 0.4625850  0.3061224  0.2312925
## Heavy Rain 0.1976285  0.3122530  0.4901186
plot(dtmcA)

## Using plotmat from diagram package

row.names(ra) <- stateNames; colnames(ra) <- stateNames
ra = round(ra,3)
plotmat(ra,pos = c(1,2), 
        lwd = 1, box.lwd = 2, 
        cex.txt = 0.8, 
        box.size = 0.1, 
        box.type = "circle", 
        box.prop = 0.5,
        box.col = "light blue",
        arr.length=.1,
        arr.width=.1,
        self.cex = .4,
        self.shifty = -.01,
        self.shiftx = .13,
        main = "Markov Chain Transition Matrix")

Define current state

If today has No Rain, that means starting vector (1,0,0)

x1 <- matrix(c(1,0,0),nrow=1, byrow=TRUE)

Forecast the state of tomorrow

Predict what will happen tomorrow (in one day)

x1 %*% ra
##      No Rain Light Rain Heavy Rain
## [1,]   0.661       0.23      0.109

That means we will have 66% chance will have No Rain tomorrow, 23% chance will have Light Rain, and 11% chance to have Heavy Rain.

Forecast the states for the next 7 days

ra2 <- ra %^% 2
ra3 <- ra %^% 3
ra4 <- ra %^% 4
ra5 <- ra %^% 5
ra6 <- ra %^% 6
ra7 <- ra %^% 7
cat("Day 1 Forecast")
## Day 1 Forecast
round(x1%*%ra,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.661       0.23      0.109
cat("Day 2 Forecast")
## Day 2 Forecast
round(x1%*%ra2,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.565      0.256      0.179
cat("Day 3 Forecast")
## Day 3 Forecast
round(x1%*%ra3,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.528      0.264      0.208
cat("Day 4 Forecast")
## Day 4 Forecast
round(x1%*%ra4,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.512      0.267      0.221
cat("Day 5 Forecast")
## Day 5 Forecast
round(x1%*%ra5,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.506      0.268      0.226
cat("Day 6 Forecast")
## Day 6 Forecast
round(x1%*%ra6,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.503      0.269      0.228
cat("Day 7 Forecast")
## Day 7 Forecast
round(x1%*%ra7,3)
##      No Rain Light Rain Heavy Rain
## [1,]   0.502      0.269      0.229
ra7=round(ra7,3)

plotmat(ra7,pos = c(1,2), 
        lwd = 1, box.lwd = 2, 
        cex.txt = 0.8, 
        box.size = 0.1, 
        box.type = "circle", 
        box.prop = 0.5,
        box.col = "light blue",
        arr.length=.1,
        arr.width=.1,
        self.cex = .6,
        self.shifty = -.01,
        self.shiftx = .14,
        main = "The transition matrix after 7 days")