Alejandro Morales
December, 2014.
It is estimated that the average cellphone user sends or receives up to 40 text messages per day, while younger users can send or receive up to a hundred (Pew Research Center, How Americans Text, 2011).
nextWord uses a trigram language model with interpolation and Kneser-Ney smoothing.
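For reference, the textbook form of interpolated Kneser-Ney for a trigram can be written as below. This is a sketch of the standard formulation, not necessarily the exact variant nextWord implements. The symbols are assumptions introduced here: c(.) is a count in the training text, d is a fixed discount between 0 and 1, and P_KN(w_i | w_{i-1}) is the lower-order (bigram) Kneser-Ney estimate.

```latex
P_{KN}(w_i \mid w_{i-2}, w_{i-1}) =
  \frac{\max\bigl(c(w_{i-2} w_{i-1} w_i) - d,\ 0\bigr)}{c(w_{i-2} w_{i-1})}
  + \lambda(w_{i-2} w_{i-1}) \, P_{KN}(w_i \mid w_{i-1})

\lambda(w_{i-2} w_{i-1}) =
  \frac{d}{c(w_{i-2} w_{i-1})} \,
  \bigl|\{\, w : c(w_{i-2} w_{i-1} w) > 0 \,\}\bigr|
```

The first term is the discounted trigram estimate. The weight lambda redistributes the discounted mass to the bigram estimate, so words never seen after the two-word context still receive some probability.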
The last two words are used to predict the next word, e.g.:
may no [nextWord] --> 1. longer
If that combination of words has not been seen, we back off to a bigram model, e.g.:
no [nextWord] --> 1. one, 2. longer, 3. matter
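The lookup order in the two examples above can be sketched in a few lines of Python. This is a minimal illustration, not the nextWord implementation: it ranks candidates by raw counts only, with no interpolation or Kneser-Ney smoothing, and the toy corpus and the names build_model and next_word are hypothetical.

```python
from collections import Counter, defaultdict

def build_model(tokens):
    """Count which word follows each two-word and each one-word context."""
    trigrams = defaultdict(Counter)
    bigrams = defaultdict(Counter)
    for i in range(len(tokens) - 1):
        bigrams[tokens[i]][tokens[i + 1]] += 1
        if i + 2 < len(tokens):
            trigrams[(tokens[i], tokens[i + 1])][tokens[i + 2]] += 1
    return trigrams, bigrams

def next_word(context, trigrams, bigrams, k=3):
    """Return up to k candidates, trying the trigram table before the bigram one."""
    words = context.lower().split()
    if len(words) >= 2 and tuple(words[-2:]) in trigrams:
        table = trigrams[tuple(words[-2:])]   # last two words were seen together
    elif words and words[-1] in bigrams:
        table = bigrams[words[-1]]            # back off to the last word only
    else:
        return []                             # nothing known about this context
    return [w for w, _ in table.most_common(k)]

# Toy corpus (made up for illustration), treated as one token stream.
corpus = ("you may no longer wait . no one knows . no one cares . "
          "no one left . no longer here . no matter").split()
trigrams, bigrams = build_model(corpus)

print(next_word("may no", trigrams, bigrams))  # trigram context is known
print(next_word("say no", trigrams, bigrams))  # unseen pair, falls back to "no"
```

With this toy corpus, "may no" is answered from the trigram table, while "say no" has never been seen as a pair and is answered from the bigram counts for "no".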
Data
[1] Coursera data
[2] N-gram language model
Predictive power
*When tested against a large text.