Next-Word Prediction a la SwiftKey

Chia Lee
September 2017

The Next Word Predictor App

  • On-the-fly prediction of the next word in a partially complete sentence.

  • As the user starts typing in a sentence, the app automatically proposes the most likely next word in real time.

  • Great tool for improving ease and speed of typing on mobile devices.

  • App URL: https://lchiaying.shinyapps.io/Next-Word-Predictor/

Demo

[Image: demo of the app]

Predictive Model

  • The model was trained on corpora comprising blogs, news, and Twitter documents.
  • Combination of multiple sub-models:
    • \( N \)-gram model: Generate document-frequency matrices for 1-, 2-, 3-, and 4-grams from the data, so that the prediction is based on the last 0, 1, 2, or 3 typed words.
    • Markov chain model: Estimate the probability of the next word given the last \( N-1 \) words, using the frequency counts of the corresponding \( N \)-grams.
    • Back-off model: First use the last 3 words for prediction; if no matching 4-gram is found, fall back to the last 2 words, then the last 1, and finally to the most frequent single word.
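
The three sub-models can be illustrated with a minimal Python sketch. This is not the app's actual code: the function names `build_ngram_counts` and `predict_next`, the whitespace tokenization, and the toy corpus are all assumptions made for illustration. Counts are stored per context, the next-word probability is estimated as a relative frequency, and prediction backs off from the longest context to the empty one.

```python
from collections import Counter, defaultdict


def build_ngram_counts(sentences, max_n=4):
    """Map each context (a tuple of 0..max_n-1 words) to a Counter of next words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # naive tokenization (assumption)
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                *context, nxt = words[i:i + n]
                counts[tuple(context)][nxt] += 1
    return counts


def predict_next(counts, typed, max_n=4):
    """Return (word, estimated probability), backing off to shorter contexts."""
    words = typed.lower().split()
    # Try the last 3 words first, then 2, then 1, then the empty context.
    for k in range(min(max_n - 1, len(words)), -1, -1):
        context = tuple(words[len(words) - k:])
        followers = counts.get(context)
        if followers:
            word, freq = followers.most_common(1)[0]
            # Relative frequency of the N-gram among all N-grams sharing the context.
            return word, freq / sum(followers.values())
    return None, 0.0


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]
toy_counts = build_ngram_counts(toy_corpus)
print(predict_next(toy_counts, "a bird sat on the"))
```

An unseen context such as a single unknown word falls all the way through to the empty context, so the sketch then returns the most frequent word overall.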

Analytics for Model Prediction Accuracy

  • In-sample testing on 3000 randomly selected documents, each taken from its beginning up to a random point.
  • The score is the proportion of correct predictions.
  • Baseline for comparison: a random model that draws a word from the dictionary with probability equal to its frequency of occurrence.

    Predictive model    Random model
    0.1163929           0.0045439

  • The predictive model is correct about 11.6% of the time, versus about 0.45% for the random model, roughly a 25-fold improvement in accuracy.
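
The scoring procedure can be sketched as follows. This is a hedged illustration rather than the original evaluation code: it assumes one prediction per document, made at a randomly chosen cut point, and the names `accuracy` and `make_random_model` are hypothetical. Any function that maps the typed prefix to a single predicted word can be passed in as `predict`.

```python
import random
from collections import Counter


def accuracy(predict, documents, rng):
    """Proportion of documents whose next word after a random cut is predicted correctly."""
    hits = trials = 0
    for doc in documents:
        words = doc.lower().split()
        if len(words) < 2:
            continue  # need at least one typed word and one word to predict
        cut = rng.randrange(1, len(words))  # prefix is words[:cut], target is words[cut]
        hits += predict(" ".join(words[:cut])) == words[cut]
        trials += 1
    return hits / trials if trials else 0.0


def make_random_model(documents, rng):
    """Baseline: ignore the prefix and sample a word in proportion to its frequency."""
    freq = Counter(w for doc in documents for w in doc.lower().split())
    vocab, weights = list(freq), list(freq.values())
    return lambda prefix: rng.choices(vocab, weights=weights, k=1)[0]


# Hypothetical usage with a toy corpus; a real run would sample 3000 documents.
rng = random.Random(2017)
docs = ["the cat sat on the mat", "the dog sat on the mat"]
baseline = make_random_model(docs, rng)
print(accuracy(baseline, docs, rng))
```

Seeding the generator keeps the cut points and the baseline's draws reproducible, which makes it easier to compare two models on the same set of test positions.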