Word Predictor App

Marcelo Tibau
March 10, 2017

The Model

The model is based on Google's Stupid Backoff algorithm.

Despite its name, it performs almost as well as some more complex models when given very large amounts of data. In the authors' own words: Stupid Backoff is inexpensive to calculate in a distributed environment while approaching the quality of Kneser-Ney smoothing for large amounts of data.

For more details: Brants, Thorsten; Popat, Ashok C; Xu, Peng; Och, Franz J; Dean, Jeffrey. “Large language models in machine translation”. EMNLP/CoNLL. 2007.
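
As a rough illustration of the scoring rule described in the cited paper, the sketch below uses the relative frequency of an n-gram when it has been seen, and otherwise backs off to a shorter context multiplied by a fixed factor (0.4 is the value the paper reports). The function names and the toy corpus are hypothetical and are not taken from the app's code.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor; the value reported in the cited paper


def count_ngrams(tokens, max_order=3):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff_score(word, context, counts, total_tokens, alpha=ALPHA):
    """Score `word` after `context` (a tuple of preceding words).

    Uses the relative frequency when the full n-gram was seen; otherwise
    drops the oldest context word and multiplies the result by alpha.
    """
    if not context:
        # Base case: unigram relative frequency.
        return counts[(word,)] / total_tokens
    full = context + (word,)
    if counts[full] > 0:
        return counts[full] / counts[context]
    return alpha * stupid_backoff_score(word, context[1:], counts,
                                        total_tokens, alpha)


# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat the cat ran".split()
counts = count_ngrams(tokens)

seen = stupid_backoff_score("sat", ("the", "cat"), counts, len(tokens))
unseen = stupid_backoff_score("mat", ("the", "cat"), counts, len(tokens))
print(seen, unseen)
```

The values returned are scores rather than probabilities: they are not normalized, which is part of what makes the method cheap to compute.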

Basic Idea of the Algorithm

  • Take the input and extract its last two words.
  • Search the 3-gram model for entries that begin with those two words; if there is a match, predict the third word.
  • If there is no match, back off to the last input word and search the 2-gram model; if there is a match, predict the second word.
  • If there is still no match, predict the most common words in the 1-gram model.
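
The lookup steps listed above can be sketched in a few lines of Python. This is a minimal illustration under assumed names (build_model, predict_next, max_words) and a made-up toy corpus, not the app's actual implementation, which is not shown in these slides.

```python
from collections import Counter, defaultdict


def build_model(tokens):
    """Build next-word counters keyed by the preceding two, one and zero words."""
    trigrams = defaultdict(Counter)
    bigrams = defaultdict(Counter)
    unigrams = Counter(tokens)
    for i in range(len(tokens) - 1):
        bigrams[tokens[i]][tokens[i + 1]] += 1
    for i in range(len(tokens) - 2):
        trigrams[(tokens[i], tokens[i + 1])][tokens[i + 2]] += 1
    return trigrams, bigrams, unigrams


def predict_next(text, model, max_words=5):
    """Return up to max_words candidates, backing off 3-gram -> 2-gram -> 1-gram."""
    trigrams, bigrams, unigrams = model
    words = text.lower().split()
    last_two = tuple(words[-2:])
    if len(last_two) == 2 and last_two in trigrams:
        candidates = trigrams[last_two]      # matched the last two words
    elif words and words[-1] in bigrams:
        candidates = bigrams[words[-1]]      # matched the last word only
    else:
        candidates = unigrams                # fall back to most common words
    return [word for word, _ in candidates.most_common(max_words)]


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat the cat ran off the cat sat down".split()
model = build_model(corpus)
print(predict_next("I saw the cat", model))
```

Each level is tried only when the longer context has no match, so a longer match always takes precedence over a shorter one.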

The Application

An interactive web app that takes text input and returns the predicted upcoming terms.

The app

How to use

  • Enter your text in the input field.
  • Select the maximum number of words with the slider.
  • You can choose to show 5, 10 or 100 entries.
  • See the suggested words in the word cloud or right below NextWord.
  • Have fun.

Step right up at: https://marcelotibau.shinyapps.io/word_predictor/