Sebastian
03-Sep-19
This app is based on the n-gram model, a type of probabilistic language model for predicting the next word in a sequence.
An n-gram model treats the sequence of elements as an (n-1)-order Markov model, expressed through conditional probabilities (source: Wikipedia).
An n-gram model predicts the term \[ x_i \] given the previous elements \[ x_{i-(n-1)}, \dots, x_{i-1}. \]
In probability terms, this is \[ P(x_{i}\mid x_{i-(n-1)},\dots ,x_{i-1}) \]
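As a concrete case, the highest order used in this app is n = 4, so the prediction conditions on the three preceding words:
\[ P(x_{i}\mid x_{i-3}, x_{i-2}, x_{i-1}) \]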
This app uses a custom model and algorithm to find next-word predictions, similar in spirit to back-off models such as Katz back-off and stupid back-off.
N-grams of several orders are searched until 6 matches are found.
The search starts at the highest order, the 4-gram (n=4), and moves sequentially to the lower orders.
To improve performance, lower-order n-grams are skipped once enough matches have been found at a higher order.
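The search order described above could be sketched roughly as follows. This is a minimal illustration in Python, not the app's actual code: the function names `build_ngram_counts` and `predict_next`, the in-memory `Counter` tables, and the alphabetical tie-breaking are all assumptions made for the example.

```python
from collections import Counter


def build_ngram_counts(tokens, max_order=4):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts


def predict_next(counts, context, max_order=4, wanted=6):
    """Collect up to `wanted` candidate words, searching the highest order first."""
    predictions = []
    for n in range(max_order, 0, -1):
        history = tuple(context[-(n - 1):]) if n > 1 else ()
        if len(history) < n - 1:
            continue  # not enough typed words to query this order
        # All n-grams whose first n-1 words equal the typed history.
        matches = [(gram[-1], c) for gram, c in counts[n].items()
                   if gram[:-1] == history]
        # Most frequent continuation first; ties broken alphabetically (assumption).
        matches.sort(key=lambda m: (-m[1], m[0]))
        for word, _ in matches:
            if word not in predictions:
                predictions.append(word)
            if len(predictions) == wanted:
                return predictions  # enough matches: lower orders are skipped
    return predictions


tokens = "the cat sat on the mat the cat ran on the mat".split()
counts = build_ngram_counts(tokens)
print(predict_next(counts, ["on", "the"]))
```

Within a single order all candidates share the same history, so ranking by raw count gives the same order as ranking by the estimated probability.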
Maximum likelihood estimation (MLE) is used to calculate the probabilities.
\[ P(w_n \mid w_1 \dots w_{n-1}) = \frac{c(w_1 \dots w_n)}{c(w_1 \dots w_{n-1})} \]
No smoothing is applied to the probabilities.
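To make the estimate concrete, here is a small Python sketch that computes the count ratio on a toy corpus. The helper names `count_sequence` and `mle_probability` are hypothetical, and counting by scanning a token list is only an assumption for illustration; a real app would more likely look counts up in precomputed n-gram tables.

```python
def count_sequence(tokens, seq):
    """Count how often the word sequence `seq` occurs in `tokens`."""
    seq = tuple(seq)
    k = len(seq)
    return sum(1 for i in range(len(tokens) - k + 1)
               if tuple(tokens[i:i + k]) == seq)


def mle_probability(tokens, history, word):
    """Unsmoothed MLE: c(history + word) / c(history).

    Assumes `history` holds at least one word.
    """
    history_count = count_sequence(tokens, history)
    if history_count == 0:
        return 0.0  # unseen history: no estimate without smoothing
    return count_sequence(tokens, tuple(history) + (word,)) / history_count


tokens = "the cat sat on the mat the cat ran on the mat".split()
print(mle_probability(tokens, ("the",), "cat"))       # 2 / 4 = 0.5
print(mle_probability(tokens, ("on", "the"), "mat"))  # 2 / 2 = 1.0
```

Because no smoothing is applied, any word that never followed the given history in the corpus receives probability zero.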
The goal was to keep things simple yet accurate and performant, since low latency is important for a web app.
Type a single word or a few words into the text field.
The app parses this input, looks for predictions in the n-grams, and returns the result set.
Below the text field, 6 buttons show the current word predictions in order of decreasing probability.
Press the button of the word you wanted to type, and it is appended to the text area.
Appending a word by button or typing one yourself triggers a new prediction at once.
You can continue this way, chaining predicted words with manually typed ones to form a sentence.
Go online and try it yourself right away!
Looking forward to hearing your feedback!