Maria Mendoza
August 19, 2015
This app is a simple text prediction application that attempts to predict the next word based on your previous words.
The algorithm uses 4-grams and trigrams, backing off to bigrams and then unigrams.
The probability used to select the next word was calculated with Kneser-Ney smoothing.
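The backoff order described above can be sketched roughly as follows. This is only an illustration, not the app's actual code: the names `build_tables` and `predict_next` are hypothetical, the n-gram tables are assumed to be plain in-memory dictionaries, and candidates are ranked by raw counts for brevity rather than by Kneser-Ney probability.

```python
from collections import Counter, defaultdict


def build_tables(tokens, max_order=4):
    """Map each context tuple (0 to max_order-1 words) to a Counter of next words."""
    tables = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for k in range(max_order):
            if i - k < 0:
                break
            # k = 0 gives the empty context, i.e. the unigram table.
            context = tuple(tokens[i - k:i])
            tables[context][word] += 1
    return tables


def predict_next(tables, history, max_order=4):
    """Try the longest available context first, then back off to shorter ones."""
    history = list(history)
    for k in range(max_order - 1, -1, -1):
        if k > len(history):
            continue
        context = tuple(history[len(history) - k:])
        candidates = tables.get(context)
        if candidates:
            # Assumption: rank by raw count; the app ranks by smoothed probability.
            return candidates.most_common(1)[0][0]
    return None
```

With a toy corpus such as "i like green tea i like green tea i like black coffee", the history "i like green" is matched by the 4-gram table, while an unseen history such as "we like" falls through to the bigram table keyed on "like".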
Try the app here: https://mariamendoza.shinyapps.io/TextPredictionApp
notes:
The Kneser-Ney smoothing implemented here subtracts a fixed discount d = 0.75 from the count of each word observed after a given n-gram, then redistributes the freed probability mass over candidate next words in proportion to their continuation probability.
\[ P_{kn}(w_i \mid w_{i-1}) = \frac{\max(c(w_{i-1}, w_i) - d,\ 0)}{c(w_{i-1})} + \lambda(w_{i-1}) \, P_{continuation}(w_i) \]
where:
\[ \lambda(w_{i-1}) = \frac{d}{c(w_{i-1})} \, |\{w : c(w_{i-1}, w) > 0\}| \]
source: https://web.stanford.edu/~jurafsky/NLPCourseraSlides.html
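The bigram formula above can be checked numerically with a small sketch. It is not the app's implementation: the function name `kneser_ney_bigram` is hypothetical, and the notes do not define the continuation probability, so it is assumed here to be the usual one (the number of distinct bigram types ending in a word, divided by the total number of distinct bigram types). The context count c(w_{i-1}) is taken as the number of times the word starts a bigram, so that the probabilities for a seen context sum to 1.

```python
from collections import Counter


def kneser_ney_bigram(tokens, d=0.75):
    """Return a function prob(word, prev) giving P_kn(word | prev)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter()  # c(prev), counted as a bigram context
    followers = Counter()       # |{w : c(prev, w) > 0}|
    preceders = Counter()       # |{w' : c(w', word) > 0}|
    for (prev, word), n in bigrams.items():
        context_counts[prev] += n
        followers[prev] += 1
        preceders[word] += 1
    n_types = len(bigrams)      # number of distinct bigram types

    def prob(word, prev):
        # Assumed definition of the continuation probability (see lead-in).
        p_cont = preceders[word] / n_types
        if context_counts[prev] == 0:
            # Unseen context: fall back to the continuation probability alone.
            return p_cont
        discounted = max(bigrams[(prev, word)] - d, 0) / context_counts[prev]
        lam = d / context_counts[prev] * followers[prev]
        return discounted + lam * p_cont

    return prob
```

On the toy corpus "i like green tea i like green tea i like black coffee", the context "like" occurs 3 times with 2 distinct followers, so λ = 0.75 / 3 × 2 = 0.5, and P("green" | "like") = (2 − 0.75) / 3 + 0.5 × 1/6 = 0.5.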