Jas Sohi - www.jassohi.com
December 12, 2014
The “Next Text” web app lets the user enter any three-word phrase and predicts the word most users would type next.
It also shows other likely predictions as a word cloud, in case the top prediction is not what the user had in mind.
EXAMPLE: If the user types “What are you”, the app predicts “doing”.
I used a bag-of-words approach to count the frequency of n-grams in the corpora (the documents used to train the model). Only how often each n-gram occurs matters; its position relative to other n-grams does not.
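As a rough illustration of the counting step, here is a minimal Python sketch. It is not the app's actual code: the function name count_ngrams, the lowercasing, the whitespace tokenization, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter

def count_ngrams(documents, n):
    """Count every run of n consecutive words across all documents.

    Assumption: words are lowercased and split on whitespace; the real
    app may clean and tokenize its text differently.
    """
    counts = Counter()
    for text in documents:
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

# Toy corpus, invented for illustration only.
corpus = [
    "what are you doing",
    "what are you doing today",
    "what are you thinking",
]
four_grams = count_ngrams(corpus, 4)
bigrams = count_ngrams(corpus, 2)
```

Each n-gram is stored only with its count, so where it appeared in the documents is discarded, which is the bag-of-words idea described above.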
I combined this with a back-off approach. The app first compares the user's three words to a 4-gram model (all sequences of four consecutive words), looking for 4-grams whose first three words match. If there is no match, it backs off to a trigram model (all sequences of three consecutive words) and tries to match the user's last two words to the first two words of a trigram. If there is still no match, it compares the user's last word with the first word of a bigram (all two-word sequences).
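The back-off lookup can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the app's implementation: build_model, predict, the top_k parameter, and the toy corpus are hypothetical names and data chosen for the example.

```python
from collections import Counter, defaultdict

def build_model(documents, n):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for text in documents:
        words = text.lower().split()  # assumption: simple whitespace tokens
        for i in range(len(words) - n + 1):
            prefix = tuple(words[i:i + n - 1])
            model[prefix][words[i + n - 1]] += 1
    return model

def predict(phrase, models, top_k=5):
    """Try the 4-gram model, then the trigram, then the bigram model.

    Returns candidate next words ranked by count, taken from the first
    model that has any match; an empty list if none matches.
    """
    words = phrase.lower().split()
    for n in (4, 3, 2):
        prefix = tuple(words[-(n - 1):])  # last n-1 words typed by the user
        followers = models[n].get(prefix)
        if followers:
            return [word for word, _ in followers.most_common(top_k)]
    return []

# Toy corpus, invented for illustration only.
corpus = [
    "what are you doing",
    "what are you doing today",
    "what are you thinking",
    "see you soon",
]
models = {n: build_model(corpus, n) for n in (4, 3, 2)}

top = predict("what are you", models)      # matched by the 4-gram model
backed_off = predict("who are you", models)  # falls back to the trigram model
last_word = predict("nice to see", models)   # falls back to the bigram model
```

In this sketch the first item of the returned list plays the role of the predicted word, and the remaining items are the kind of alternatives that could feed the word cloud.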
I removed n-grams that appeared only once, since they did not improve the accuracy of the prediction.
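The pruning step might look like the following Python sketch. The function name prune_singletons, the min_count parameter, and the sample counts are assumptions for illustration; the original only states that n-grams seen once were dropped.

```python
from collections import Counter

def prune_singletons(counts, min_count=2):
    """Keep only n-grams seen at least min_count times (drops count-1 entries)."""
    return Counter({ngram: c for ngram, c in counts.items() if c >= min_count})

# Sample counts, invented for illustration only.
counts = Counter({
    ("what", "are", "you", "doing"): 2,
    ("are", "you", "doing", "today"): 1,
    ("what", "are", "you", "thinking"): 1,
})
pruned = prune_singletons(counts)
```

Dropping these entries also shrinks the lookup tables, which would plausibly reduce the data the app has to load.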
Wait for the app to load the instructions and required data.
Once “Done Loading!” appears, enter any three-word phrase into the text box.
Click on the Predict! button.
Wait a few seconds and you will see the most likely next word, along with a word cloud of less likely alternatives if there are any.