Can we infer the weather on a given day from how many people went to the park? The goal of this application is to dive into Hidden Markov Models by developing an HMM that predicts the weather (the hidden state) from a set of park mobility observations.
There are 5 components we need to define to run our HMM (each is constructed from the data below):
Hidden states
Set of observations
Transition matrix
Emission matrix
Initial Probability Distribution
The data used here was created from two datasets:
Google mobility data: https://www.google.com/covid19/mobility/
NOAA climate data from Chicago O’Hare: https://www.ncdc.noaa.gov/
The dataset consists of 90 days of observed weather and park mobility from March 3, 2020 to June 1, 2020. These dates were chosen based on the availability of Google's mobility data.
weather_mob <- read.csv("/Users/andrewleonard/Library/Mobile Documents/com~apple~CloudDocs/Grad School/Linear:Non Linear/weather_mobility.csv")
#pull out spring dates only
spring <- weather_mob[1:90,]
Define 3 hidden states:
Clear (below 53 degrees and no precipitation)
Rainy (below 53 degrees with precipitation)
Warm (53 degrees or above)
Add the states to the dataset.
#find rainy days
spring$rain <- ifelse(spring$prcp > 0, 1, 0)
#classify each observation as a specific state: rain, clear, or warm
spring$markov_states <- ifelse(spring$rain == 1 & spring$tmax < 53, "rain",
                               ifelse(spring$rain == 0 & spring$tmax < 53, "clear", "warm"))
Park mobility observations are taken from the Google Mobility data, which compares mobility on the report date to a baseline day and reports the change as a positive or negative percentage (values are calculated for the report date itself unless there are gaps in the data). The baseline is the median value for the corresponding day of the week over the 5‑week period Jan 3 – Feb 6, 2020.
Bin the mobility data to create 3 types of observations:
Low Mobility (less than -10)
Moderate Mobility (-10 up to, but not including, 40)
High Mobility (40 or greater)
#visualize park mobility with the bin cutoffs
plot(spring$parks)
abline(h = c(-10, 40), col = "red")
#bin mobility data as low (L), moderate (M), or high (H)
spring$park_obs <- ifelse(spring$parks < -10, "L",
                   ifelse(spring$parks >= -10 & spring$parks < 40, "M", "H"))
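The objects passed to initHMM() below (S, V_labels, pi, A, B, plus the full observation sequence V used later for decoding) are not constructed in the original write-up. Here is a minimal sketch of one plausible way to estimate them from the observed data; the printed model below is the authoritative reference, and the initial distribution in particular appears to have been chosen differently, so treat this as illustrative.
#load the HMM package providing initHMM() and viterbi()
library(HMM)
#hidden states and observation symbols
S <- sort(unique(spring$markov_states))   #"clear" "rain" "warm"
V_labels <- sort(unique(spring$park_obs)) #"H" "L" "M"
#full observation sequence, used later for decoding
V <- spring$park_obs
#transition matrix A: row-normalized counts of day-to-day state transitions
#(2-d tables behave as matrices in the code below)
A <- prop.table(table(head(spring$markov_states, -1),
                      tail(spring$markov_states, -1)), margin = 1)
#emission matrix B: row-normalized counts of each observation within each state
B <- prop.table(table(spring$markov_states, spring$park_obs), margin = 1)
#initial distribution pi: one plausible choice is the empirical state frequencies
pi <- as.vector(table(spring$markov_states) / nrow(spring))
names(pi) <- S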
# Initialise HMM
hmm = initHMM(S, V_labels, startProbs = pi, transProbs = A, emissionProbs = B)
print(hmm)
## $States
## [1] "clear" "rain" "warm"
##
## $Symbols
## [1] "H" "L" "M"
##
## $startProbs
## clear rain warm
## 0.26 0.14 0.60
##
## $transProbs
## to
## from clear rain warm
## clear 0.1428571 0.42857143 0.4285714
## rain 0.4545455 0.09090909 0.4545455
## warm 0.2105263 0.01754386 0.7719298
##
## $emissionProbs
## symbols
## states H L M
## clear 0.04761905 0.3809524 0.5714286
## rain 0.00000000 1.0000000 0.0000000
## warm 0.25862069 0.1724138 0.5689655
Our model is defined as \(\theta\) and our sequence of observations as \(V^T\).
Given \(\theta\) and \(V^T\), estimate \(p(V^T|\theta)\).
In other words, let us calculate the probability our model generated a particular sequence \(V^T\).
If we do not know the specific sequence of weather events (our hidden states) that generated our observations, we can still compute the probability of seeing our observation vector by summing over all possible sequences of weather events, each weighted by its probability under the model.
Calculate the joint probability of the sequence of observations generated by a specific sequence of hidden states:
\(p(V^T, S^T) = p(V^T|S^T)\,p(S^T)\)
Simple Example:
spring$park_obs[1:3]
## [1] "M" "M" "M"
spring$markov_states[1:3]
## [1] "clear" "warm" "clear"
\(p(M, M, M, clear, warm, clear) = p(clear|initial state) \times p(warm|clear) \times p(clear|warm) \times p(M|clear) \times p(M|warm) \times p(M|clear)\)
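As a quick check, this product can be computed directly from the model matrices (a small sketch, assuming the named A, B, and pi objects from the estimation sketch above):
#joint probability of observing (M, M, M) with hidden states (clear, warm, clear)
pi["clear"] * B["clear", "M"] *
  A["clear", "warm"] * B["warm", "M"] *
  A["warm", "clear"] * B["clear", "M"]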
The product above gives the probability for one specific sequence of hidden states. To answer our question, we must account for every possible sequence of hidden states by summing over all the joint probabilities: \(p(V^T|\theta) = \sum_{S^T} p(V^T|S^T)\,p(S^T)\).
Enumerating every sequence quickly becomes expensive; the forward algorithm is more efficient because it derives the probabilities at each step from the probabilities already computed for the previous step. The goal of the forward algorithm is to compute \(\alpha_t(s) = p(V_{1:t}, S_t = s)\), the joint probability of the observations up to time \(t\) and the hidden state at time \(t\).
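Concretely, the recursion implemented by the forward function below is

\[
\alpha_1(s) = \pi_s \, b_s(v_1), \qquad \alpha_t(s) = b_s(v_t) \sum_{s'} \alpha_{t-1}(s') \, a_{s's},
\]

where \(a_{s's}\) is the transition probability from state \(s'\) to state \(s\) and \(b_s(v)\) is the probability of emitting \(v\) in state \(s\). The sequence probability is then \(p(V^T|\theta) = \sum_s \alpha_T(s)\).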
Simple example: our HMM model is \(\theta\). What is the probability that the set of observations \((M, M, M)\) came from model \(\theta\)? That is, \(p(V^T|\theta)\).
#a short sample sequence: the first 3 observations (all "M")
V_sample <- c("M", "M", "M")
forward = function(v, a, b, initial_distribution){
  T = length(v)   #number of time steps
  m = nrow(a)     #number of hidden states
  alpha = matrix(0, T, m)
  #initialization: initial distribution weighted by the first emission
  alpha[1, ] = initial_distribution * b[, v[1]]
  #recursion: propagate through the transitions, then apply the emission
  for(t in 2:T){
    tmp = alpha[t-1, ] %*% a
    alpha[t, ] = tmp * b[, v[t]]
  }
  return(alpha)
}
answer <- forward(V_sample,A,B,pi)
answer
## [,1] [,2] [,3]
## [1,] 0.14857143 0 0.3413793
## [2,] 0.05319647 0 0.1861623
## [3,] 0.02673803 0 0.0947343
#p(V^T | theta): sum the final-step alphas over all states
sum(answer[3,])
## [1] 0.1214723
As a check, there are \(3^3 = 27\) possible observation sequences of length 3, and their probabilities under the model should sum to 1.
library(gtools)
## Warning: package 'gtools' was built under R version 4.0.2
#all possible sequences
test <- permutations(3, 3, V_labels, repeats.allowed = TRUE)
total <- 0
for (i in 1:27){
  x <- test[i, ]
  #probability of the i-th observation sequence under the model
  p <- sum(forward(x, A, B, pi)[3, ])
  print(p)
  total <- total + p
}
## [1] 0.00697557
## [1] 0.008380835
## [1] 0.01863386
## [1] 0.007182617
## [1] 0.01503328
## [1] 0.0204982
## [1] 0.01723678
## [1] 0.0273872
## [1] 0.04622502
## [1] 0.01063998
## [1] 0.01374071
## [1] 0.02844846
## [1] 0.01911666
## [1] 0.04803857
## [1] 0.05982393
## [1] 0.02887392
## [1] 0.05610437
## [1] 0.07770931
## [1] 0.01826295
## [1] 0.02200183
## [1] 0.04878749
## [1] 0.02569668
## [1] 0.05662683
## [1] 0.0792162
## [1] 0.04528922
## [1] 0.07259721
## [1] 0.1214723
total
## [1] 1
#inspect one of the enumerated sequences
test[24,]
## [1] "M" "L" "M"
#backward algorithm: beta_t(s) = p(V_{t+1:T} | S_t = s)
backward = function(V, A, B){
  T = length(V)
  m = nrow(A)
  beta = matrix(1, T, m)   #beta_T(s) = 1 for every state
  #work backwards from the final time step
  for(t in (T-1):1){
    tmp = as.matrix(beta[t+1, ] * B[, V[t+1]])
    beta[t, ] = t(A %*% tmp)
  }
  return(beta)
}
backward(V_sample,A,B)
## [,1] [,2] [,3]
## [1,] 0.1629997 0.2292378 0.2848891
## [2,] 0.3254750 0.5183609 0.5595022
## [3,] 1.0000000 1.0000000 1.0000000
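As a consistency check, the forward and backward quantities must agree on the total sequence probability: for any time step \(t\), \(\sum_s \alpha_t(s)\,\beta_t(s) = p(V^T|\theta)\). A quick sketch using the two functions above:
alpha <- forward(V_sample, A, B, pi)
beta <- backward(V_sample, A, B)
#both equal the forward-pass total, sum(answer[3, ])
sum(alpha[1, ] * beta[1, ])
sum(alpha[2, ] * beta[2, ])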
The decoding problem finds the most probable hidden state at each time step. In other words, what is the most probable path of hidden states given our set of observations? We find the most likely states, based on the A and B estimates from the observed data, using the Viterbi algorithm.
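The Viterbi recursion has the same shape as the forward recursion but replaces the sum with a max, tracking the single best path into each state:

\[
\delta_1(s) = \pi_s \, b_s(v_1), \qquad \delta_t(s) = b_s(v_t) \max_{s'} \delta_{t-1}(s') \, a_{s's},
\]

and the most probable path is recovered by backtracking the maximizing state at each step. Here we use the viterbi() function from the HMM package.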
#HMM most likely states
HMM_states <- viterbi(hmm, V)
#first 15 predicted states (the full accuracy comparison follows below)
HMM_states[1:15]
## [1] "warm" "warm" "warm" "warm" "warm" "warm" "warm" "warm" "warm"
## [10] "clear" "rain" "clear" "rain" "clear" "rain"
#how'd we do? % of states correct
sum(HMM_states == spring$markov_states)/length(spring$markov_states)
## [1] 0.7222222
#TRUE where the decoded state matches the actual state
plot(HMM_states == spring$markov_states)
Functions for forward, backward, Baum-Welch, and Viterbi are from: http://www.adeveloperdiary.com/data-science/machine-learning/implement-viterbi-algorithm-in-hidden-markov-model-using-python-and-r/