Next Word Prediction

Jacob Govshteyn
Aug 2015

SwiftKey capstone project.

App description

Internals:

N-grams of order 2 to 5 were constructed from ~150,000 lines of news, blog, and Twitter training data.
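As a rough illustration of the n-gram construction, here is a minimal Python sketch (the app itself is built with Shiny, and its source is not shown here). The function name `ngram_counts` and its parameters are hypothetical, and the input is assumed to be already tokenized.

```python
from collections import Counter

def ngram_counts(tokens, n_min=2, n_max=5):
    """Count every n-gram of order n_min..n_max in a list of tokens."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

tokens = "the cat sat on the mat".split()
counts = ngram_counts(tokens)
print(counts[("the", "cat")])  # frequency of one bigram
```

Keying the counter on tuples keeps all orders in one table, so a lookup for any context length uses the same structure.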

To analyze n-gram frequencies, the following preprocessing steps were performed:

Remove punctuation from text corpus.
Transform words to lower case.
Strip extra whitespace from the text.
Replace all sparse words with an <UNK> placeholder.
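The preprocessing steps above could be sketched as follows, in Python for illustration only. The helper names and the `min_count` threshold are assumptions: the slides do not say how rare a word must be to count as sparse.

```python
import re
import string
from collections import Counter

def normalize(line):
    """Remove punctuation, lower-case, and collapse extra whitespace."""
    line = line.translate(str.maketrans("", "", string.punctuation))
    line = line.lower()
    return re.sub(r"\s+", " ", line).strip()

def replace_sparse(lines, min_count=2, unk="<UNK>"):
    """Replace words seen fewer than min_count times with the unk token.

    min_count is a hypothetical threshold, not a value from the project.
    """
    tokenized = [normalize(line).split() for line in lines]
    freq = Counter(w for toks in tokenized for w in toks)
    return [[w if freq[w] >= min_count else unk for w in toks]
            for toks in tokenized]

corpus = ["The cat sat.", "The  cat ran!"]
print(replace_sparse(corpus))
```

Mapping rare words to a single placeholder keeps the n-gram tables small and gives the model a count for words it has effectively never seen.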

How the app works:

Enter Partial Phrase in Text Box

Submit Server Request

Complete The Phrase

Prediction Algorithm

We want a heuristic that more accurately estimates how often we might expect to see word w in a new, unseen context. The Kneser-Ney intuition is to base this estimate on the number of different contexts in which w has appeared (Huang, X. & Deng, L. (2010). An Overview of Modern Speech Recognition).

\( P_{\mathit{KN}}(w_i \mid w_{i-1}) = \dfrac{\max(c(w_{i-1}, w_i) - \delta, 0)}{\sum_{w'} c(w_{i-1}, w')} + \lambda(w_{i-1}) \dfrac{\left| \{ w' : c(w', w_i) > 0 \} \right|}{\left| \{ (w', w'') : c(w', w'') > 0 \} \right|} \)

where \[ \lambda(w_{i-1}) = \dfrac{\delta}{c(w_{i-1})} \left| \{w' : c(w_{i-1}, w') > 0\} \right| \]
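To make the bigram formula concrete, here is a minimal Python sketch. It is not the app's actual code: `train_bigram_kn` is a hypothetical name and `delta=0.75` is only a commonly used default. One assumption to flag: the denominator of the normalizing weight uses the total count of bigrams that start with the context word, which equals c(w_{i-1}) except when that word is the last token of the corpus. This choice makes the probabilities for a seen context sum to one.

```python
from collections import Counter, defaultdict

def train_bigram_kn(tokens, delta=0.75):
    """Return a function p(word, prev) giving a Kneser-Ney bigram estimate."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_total = Counter()      # sum over w' of c(prev, w')
    followers = defaultdict(set)   # distinct words seen after prev
    preceders = defaultdict(set)   # distinct words seen before word
    for (prev, word), c in bigrams.items():
        context_total[prev] += c
        followers[prev].add(word)
        preceders[word].add(prev)
    n_bigram_types = len(bigrams)

    def p(word, prev):
        # Continuation probability: share of bigram types that end in `word`.
        p_cont = len(preceders.get(word, ())) / n_bigram_types
        total = context_total[prev]
        if total == 0:
            # Unseen context (an assumption): use the continuation term alone.
            return p_cont
        discounted = max(bigrams[(prev, word)] - delta, 0) / total
        lam = delta / total * len(followers[prev])
        return discounted + lam * p_cont

    return p

tokens = "a b a c a b".split()
p = train_bigram_kn(tokens)
vocab = sorted(set(tokens))
best = max(vocab, key=lambda w: p(w, "a"))
print(best, p(best, "a"))
```

Predicting the next word then amounts to taking the highest-scoring candidate for the last word typed, as in the final lines of the sketch.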

Links and references

Word Predictor Shiny app
Data Science Specialization by Johns Hopkins University
Natural Language Processing by Stanford University on Coursera