This interactive app takes any string as input and outputs the predicted next word. It also shows the five words most likely to follow the user’s string, each with its own likelihood “score”. Scores are based on how often each word follows the preceding 3-gram, 2-gram, and 1-gram, as well as how often the word follows the most recent non-stopword in skipGrams (n-2) that exclude stopwords.
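To make the scoring idea concrete, here is a minimal sketch in Python of combining follow-counts from the preceding 3-gram, 2-gram, and 1-gram into one score per candidate word. It is an illustration under stated assumptions, not the app's actual code: the `WEIGHTS` values, the function names, and the toy corpus are all hypothetical, and the skipGram component is left out for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical weights: longer contexts count for more.
# The app's real weighting scheme is not stated in this document.
WEIGHTS = {3: 0.5, 2: 0.3, 1: 0.2}


def build_counts(tokens, max_n=3):
    """Map each context (tuple of 1..max_n words) to a Counter of following words."""
    counts = defaultdict(Counter)
    for i in range(1, len(tokens)):
        for n in range(1, max_n + 1):
            if i - n >= 0:
                counts[tuple(tokens[i - n:i])][tokens[i]] += 1
    return counts


def top_predictions(counts, text, k=5):
    """Score candidates by weighted relative frequency after each context length."""
    words = text.lower().split()
    scores = Counter()
    for n, weight in WEIGHTS.items():
        if len(words) < n:
            continue  # input too short to form a context of this length
        followers = counts.get(tuple(words[-n:]))
        if not followers:
            continue  # this context was never seen in the corpus
        total = sum(followers.values())
        for word, count in followers.items():
            scores[word] += weight * count / total
    return scores.most_common(k)


# Toy corpus standing in for the real training text.
corpus = "i want to go home . i want to eat now . we want to go out".split()
counts = build_counts(corpus)
predictions = top_predictions(counts, "I want to")
```

Here `predictions` holds up to five `(word, score)` pairs, highest score first; the first entry plays the role of the predicted next word and the rest correspond to the ranked alternatives.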
The app draws on millions of tweets and blog posts to predict the user’s next word. It stores its data with data.table: tables of ngrams and their frequencies, combined with dictionaries that map ngrams to integer lookup keys.
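The storage layout described above can be sketched as follows: a dictionary assigns each word an integer ID, and the frequency table is keyed by those integers rather than by strings. This is a simplified Python analogy, not the app's data.table code; `build_dictionary`, `followers`, and the sample sentence are hypothetical names and data chosen for illustration.

```python
from collections import Counter


def build_dictionary(items):
    """Assign each distinct item an integer ID in first-seen order."""
    ids = {}
    for item in items:
        if item not in ids:
            ids[item] = len(ids)
    return ids


tokens = "the cat sat on the mat and the cat slept".split()
word_ids = build_dictionary(tokens)                   # word -> integer ID
id_words = {i: w for w, i in word_ids.items()}        # integer ID -> word

# Bigram frequency table stored as (context_id, next_id) integer pairs.
bigram_freq = Counter(
    (word_ids[a], word_ids[b]) for a, b in zip(tokens, tokens[1:])
)


def followers(word):
    """Return (next_word, frequency) pairs for `word`, most frequent first."""
    wid = word_ids.get(word)
    if wid is None:
        return []  # word is not in the dictionary
    rows = [(id_words[nxt], n) for (ctx, nxt), n in bigram_freq.items() if ctx == wid]
    return sorted(rows, key=lambda row: -row[1])
```

Storing integer pairs instead of repeated strings keeps the frequency table compact, which matters when the corpus runs to millions of documents; the reverse dictionary is only consulted when results are shown to the user.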