Mark St. John
Nov 29, 2018
Performance was measured on a test set of 2- to 5-grams built from 3,000 sentences that were randomly sampled from the corpus and withheld from model training. The score is how often the model predicted the actual next word.
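The evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `make_test_ngrams` and `accuracy`, the whitespace tokenization, and the `top_k` parameter are all assumptions made for the example. Each test n-gram is split into a context (all words but the last) and the actual next word, and a prediction counts as a hit when the actual word appears among the model's top suggestions.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[Tuple[str, ...], str]


def make_test_ngrams(sentences: Sequence[str], min_n: int = 2, max_n: int = 5) -> List[Pair]:
    """Turn held-out sentences into (context, next_word) pairs.

    Every n-gram with min_n <= n <= max_n is split into its first n-1
    words (the context) and its last word (the word to predict).
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    pairs: List[Pair] = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(min_n, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                pairs.append((tuple(gram[:-1]), gram[-1]))
    return pairs


def accuracy(pairs: Sequence[Pair],
             predict: Callable[[Tuple[str, ...]], List[str]],
             top_k: int = 1) -> float:
    """Fraction of pairs whose actual next word is in the top_k predictions."""
    if not pairs:
        return 0.0
    hits = sum(1 for context, actual in pairs if actual in predict(context)[:top_k])
    return hits / len(pairs)
```

Here `predict` stands in for whatever model is under test; it is assumed to take a tuple of context words and return candidate next words ordered from most to least likely.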
The Predictive Text App provides a web interface to use the algorithm.
Alternative approaches considered:
The success of the Markov Chain Backoff approach lies in storing a massive amount of training data (the corpus) in compact frequency lookup tables.
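To make the idea of frequency lookup tables with backoff concrete, here is a small sketch under stated assumptions. It is not the app's implementation: the function names `build_tables` and `predict_next`, the whitespace tokenization, and the default `max_n` are hypothetical, and the backoff rule shown is the simplest possible one (use the longest context that has been seen, with no discounting or score weighting).

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Tables = Dict[Tuple[str, ...], Counter]


def build_tables(sentences: Sequence[str], max_n: int = 3) -> Tables:
    """Count, for every context of 0..max_n-1 words, which word follows it."""
    tables: Tables = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])  # empty tuple when n == 1
                tables[context][tokens[i + n - 1]] += 1
    return tables


def predict_next(tables: Tables, words: Sequence[str],
                 max_n: int = 3, top_k: int = 3) -> List[str]:
    """Return up to top_k next words, backing off to shorter contexts."""
    context = tuple(w.lower() for w in words)
    # Keep at most max_n - 1 trailing words as the starting context.
    context = context[-(max_n - 1):] if max_n > 1 else ()
    while True:
        counts = tables.get(context)
        if counts:
            return [word for word, _ in counts.most_common(top_k)]
        if not context:
            return []  # nothing seen at all, even at the unigram level
        context = context[1:]  # back off: drop the oldest word


tables = build_tables([
    "i love new york",
    "i love new music",
    "we love new york",
])
print(predict_next(tables, ["love", "new"]))    # context seen directly
print(predict_next(tables, ["unseen", "new"]))  # backs off to ("new",)
```

The corpus itself is never consulted at prediction time; only the count tables are, which is what keeps lookups fast and storage compact relative to the raw text.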