J Faleiro
Jul 18 2016
Predicting Sequencing of Events using Sample Data and Markov Chains
In this case study we use a corpus of about XXX words, extracted from different media:
We develop a model based on Markov chains that predicts, in real time, the next word you will type.
This model rests on the assumption that the probability of a word depends only on a limited number of preceding words in the sentence (a Markov assumption). Markov models are the class of probabilistic models that assume we can predict the probability of some future unit without looking too far into the past.[1]
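Written out for a sentence \( w_1, w_2, ..., w_n \), the assumption can be stated as the approximation below. The model order \( N \) (how many words the model keeps in view, counting the word being predicted) is a symbol introduced here for illustration only; the study does not state which order it uses.

\[ P(w_n | w_1, ..., w_{n-1}) \approx P(w_n | w_{n-N+1}, ..., w_{n-1}) \]

With \( N = 2 \) this reduces to a bigram model, in which each word depends only on the word immediately before it.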
Challenges
Method
Parsing of n-grams and assembly of Markov chains. For each sentence \( s \) we approximate the conditional probability of a word \( w \) as
\( P(w|s) = \frac{C(paste(s,w))}{C(s)} \)
where \( P(A|B) \) is the conditional probability of \( A \) given \( B \), \( C(x) \) is the number of times the string \( x \) occurs in the corpus, and \( paste(s_1, s_2, ..., s_n) \) is a string concatenation function.
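The count-based estimate above can be illustrated with a minimal Python sketch. It is not the code behind the app: the function names (`ngrams`, `build_counts`, `conditional_probability`, `predict_next`), the whitespace tokenization, the `max_n` limit, and the three-sentence toy corpus are all assumptions made for illustration. String concatenation with a single space stands in for \( paste \).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_counts(sentences, max_n=3):
    """Count all n-grams of length 1..max_n over the sample sentences (this is C)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive tokenization, an assumption
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts


def conditional_probability(counts, s, w):
    """Estimate P(w | s) = C(paste(s, w)) / C(s); 0.0 if s was never seen.

    s must hold at most max_n - 1 words, otherwise paste(s, w) was never counted.
    """
    if counts[s] == 0:
        return 0.0
    return counts[s + " " + w] / counts[s]


def predict_next(counts, s):
    """Return the single word most often seen right after s, or None."""
    prefix = s + " "
    candidates = {
        key[len(prefix):]: c
        for key, c in counts.items()
        if key.startswith(prefix) and " " not in key[len(prefix):]
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy corpus, invented for this example only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
counts = build_counts(corpus)

p = conditional_probability(counts, "the cat", "sat")  # C("the cat sat") / C("the cat")
next_word = predict_next(counts, "the")
print(p, next_word)
```

In this toy corpus "the cat" occurs twice and "the cat sat" once, so the estimate for "sat" after "the cat" is 0.5. A prefix that never occurs gets probability 0.0 here; how the app handles unseen prefixes is not described in this study.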
Findings
Future Work
Go ahead, give it a try
https://jfaleiro.shinyapps.io/markovchains/
(the app takes about 30 seconds to load, please be patient)
References