Stefan Putra Lionar
2 May 2018
Predicting the words a user intends to type can speed up typing and reduce misspellings.
This application demonstrates next-word prediction and word completion based on the sentence or phrase the user has entered.
The application is available at: https://splionar.shinyapps.io/wordpredict/
The predictive model is built by learning from text data in Twitter, news, and blog corpora.
N-grams of size 1, 2, and 3 are obtained from the sample after appropriate cleaning, i.e. conversion to lowercase and removal of numbers and special characters.
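The cleaning and counting step described above can be sketched roughly as follows. This is an illustrative sketch only, not the application's actual code: the function names `clean` and `ngram_counts` and the sample sentence are made up for the example, and replacing removed characters with a space (rather than deleting them) is an assumption.

```python
import re
from collections import Counter

def clean(text):
    """Lowercase the text, drop digits and special characters, and tokenize."""
    text = text.lower()
    # Assumption: every character that is not a-z or whitespace becomes a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical sample text, standing in for the Twitter/news/blog sample.
sample = "I want 2 go to Amsterdam! I want to go home."
tokens = clean(sample)
unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

print(tokens)
print(bigrams.most_common(3))
```

Each table maps a tuple of words to the number of times it occurs, which is the form a backoff lookup needs.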
A backoff model is then generated by the following rules:
Goal: Predict the completion of a partially typed word
Input Sentence: “I want to go to am”
From the sentence above:
The algorithm works as follows:
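As a rough illustration, one simple form of backoff completion could look like the sketch below. Everything in it is an assumption made for the example rather than the application's actual method: the toy frequency tables, the function names `split_input`, `candidates`, and `predict`, the rule that a trailing space means "predict the next word" instead of "complete the last word", and the choice to return results from the first n-gram level that has any match.

```python
from collections import Counter

# Toy frequency tables, invented purely for illustration.
TRIGRAMS = Counter({("go", "to", "amsterdam"): 3,
                    ("go", "to", "america"): 5,
                    ("go", "to", "bed"): 9})
BIGRAMS = Counter({("to", "amend"): 4, ("to", "be"): 20})
UNIGRAMS = Counter({("among",): 7, ("amazing",): 12, ("the",): 100})

def split_input(sentence):
    """Split the input into context words and the partially typed word."""
    words = sentence.lower().split()
    # Assumption: a trailing space means the last word is already complete.
    if sentence.endswith(" ") or not words:
        return words, ""
    return words[:-1], words[-1]

def candidates(table, context, prefix):
    """Words that follow `context` in `table` and start with `prefix`."""
    k = len(context)
    hits = Counter()
    for gram, count in table.items():
        if gram[:k] == context and gram[k].startswith(prefix):
            hits[gram[k]] += count
    return [word for word, _ in hits.most_common()]

def predict(sentence, top=3):
    """Try trigrams first, then back off to bigrams, then to unigrams."""
    context, prefix = split_input(sentence)
    for table, k in ((TRIGRAMS, 2), (BIGRAMS, 1), (UNIGRAMS, 0)):
        if len(context) < k:
            continue  # not enough preceding words for this n-gram order
        key = tuple(context[len(context) - k:])
        found = candidates(table, key, prefix)
        if found:
            return found[:top]
    return []

print(predict("I want to go to am"))
```

With these toy tables, the input "I want to go to am" is matched at the trigram level using the context "go to" and the prefix "am". An input such as "hello am" has no trigram or bigram match and falls back to the unigram table.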
Caching mechanism: Each time the user keys in a new filter, if the trigram and bigram inputs have not changed, the possible word predictions are taken from a cache. Filtering then only needs to scan the cached predictions instead of regenerating the n-gram frequency tables.
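The caching idea can be sketched as below. This is a minimal illustration under stated assumptions, not the application's implementation: the class `CachedPredictor`, its `rebuilds` counter, the toy tables, and the ranking rule (trigram matches first, then bigram matches) are all hypothetical.

```python
from collections import Counter

# Toy frequency tables, invented purely for illustration.
TRIGRAMS = Counter({("go", "to", "america"): 5,
                    ("go", "to", "amsterdam"): 3,
                    ("go", "to", "bed"): 9})
BIGRAMS = Counter({("to", "amend"): 4, ("to", "be"): 20})

class CachedPredictor:
    def __init__(self):
        self._key = None     # (trigram input, bigram input) of the cached list
        self._ranked = []    # cached candidate words, best first
        self.rebuilds = 0    # how many times the tables were scanned

    def _rank(self, tri, bi):
        """Scan the tables once and rank every word that can follow the context."""
        self.rebuilds += 1
        from_tri = Counter({g[2]: c for g, c in TRIGRAMS.items() if g[:2] == tri})
        from_bi = Counter({g[1]: c for g, c in BIGRAMS.items() if g[:1] == bi})
        ranked = [w for w, _ in from_tri.most_common()]
        ranked += [w for w, _ in from_bi.most_common() if w not in from_tri]
        return ranked

    def complete(self, context, prefix, top=3):
        """Return up to `top` candidates that start with `prefix`."""
        key = (tuple(context[-2:]), tuple(context[-1:]))
        if key != self._key:
            # Context changed: rebuild the candidate list and cache it.
            self._key, self._ranked = key, self._rank(*key)
        # Context unchanged: only the cached list is filtered.
        return [w for w in self._ranked if w.startswith(prefix)][:top]

predictor = CachedPredictor()
for typed in ("a", "am", "ams"):
    print(typed, predictor.complete(["go", "to"], typed))
print("table scans:", predictor.rebuilds)
```

Typing "a", "am", and "ams" after "go to" scans the tables only once, because the context key stays the same and only the filter changes.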