Predicting your next word

Plus One takes the sentence and
- turns all the letters to lower case,
- changes UTF-8 into latin1,
- converts contractions into full words,
- and keeps only letters and numbers.
The cleaned-up sentence is split into words, and only the last 3 words are used to search the n-grams with a backoff model.
Plus One returns the word with the highest frequency that matches the search.
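
The cleaning steps above can be sketched roughly as follows. This is only an illustration in Python, not the code Plus One runs: the function names, the tiny contraction table, and the choice to keep ASCII letters and digits only are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately small contraction table; the real list is not given here.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "i'm": "i am",
    "it's": "it is",
}
_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b"
)


def clean_sentence(sentence):
    """Return the cleaned-up words of a sentence (illustrative helper)."""
    text = sentence.lower()
    # Map the curly apostrophe to a plain one so contractions still match.
    text = text.replace("\u2019", "'")
    # Approximate the UTF-8 to latin1 step by dropping characters latin1 cannot encode.
    text = text.encode("latin-1", errors="ignore").decode("latin-1")
    # Expand contractions into full words.
    text = _CONTRACTION_RE.sub(lambda match: CONTRACTIONS[match.group(1)], text)
    # Keep only letters and numbers (ASCII only here, an assumption); the rest becomes spaces.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.split()


def search_words(sentence, n=3):
    """Return the last n cleaned words, which are the ones used for the search."""
    return clean_sentence(sentence)[-n:]
```

For example, `search_words("I can't believe it's NOT butter!")` gives `['is', 'not', 'butter']`.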

Plus One uses a backoff model. Starting with the last 3 words of the input sentence, it searches the 4-grams, then the 3-grams (using the last 2 words), then the 2-grams (using the last word), and turns to the 1-grams only if none of those match.
The n-grams are aggregated by frequency, and those with low frequency are discarded so that Plus One uses less memory and searches faster.
The 1-grams predict using a modified Kneser-Ney smoothing algorithm. The most frequent search results from the 2-gram set are put into a bucket, and one is chosen at random as the search result.
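
A minimal Python sketch of the search described above, under stated assumptions. The names `build_tables`, `fallback_bucket`, and `predict`, the `min_count` cutoff, and the bucket size are all hypothetical. The fallback is a simplified stand-in for the modified Kneser-Ney step, not a full implementation: it ranks words by a Kneser-Ney style continuation count (how many distinct 2-grams each word completes) and picks one of the top few at random, which is one possible reading of the bucket described above.

```python
import random
from collections import Counter, defaultdict


def build_tables(tokens, max_n=4, min_count=2):
    """Count 2- to max_n-grams and discard those seen fewer than min_count times."""
    tables = {}
    for n in range(2, max_n + 1):
        counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        table = defaultdict(Counter)
        for gram, count in counts.items():
            if count >= min_count:  # low-frequency n-grams are dropped
                table[gram[:-1]][gram[-1]] = count
        tables[n] = dict(table)
    return tables


def fallback_bucket(tables, size=5):
    """Words that complete the most distinct 2-grams (continuation count)."""
    continuation = Counter()
    for followers in tables[2].values():
        for word in followers:
            continuation[word] += 1
    return [word for word, _ in continuation.most_common(size)]


def predict(tables, words, max_n=4, rng=random):
    """Back off from the longest n-grams to the shortest; return one predicted word."""
    for n in range(max_n, 1, -1):
        if len(words) < n - 1:
            continue  # not enough context for this n-gram size
        context = tuple(words[-(n - 1):])
        followers = tables[n].get(context)
        if followers:
            return followers.most_common(1)[0][0]  # highest-frequency match
    bucket = fallback_bucket(tables)
    return rng.choice(bucket) if bucket else None
```

With a toy corpus such as `"the cat sat on the mat the cat sat on the rug the cat ran".split()` and `min_count=1`, `predict(tables, ["zzz", "the", "cat"])` finds no 4-gram match, backs off to the 3-grams, and returns `'sat'`.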

Interpolation did not produce significantly different results from the backoff model and required more resources, so it was abandoned.
Profanity was kept in the n-gram search set but filtered out of the predictions. If the predicted word is profane, Plus One returns (censored) instead.
The 4-grams are given the highest priority when matching. This may have caused overfitting to the training set, though tests on data outside the initial set showed good predictions.
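
The profanity handling described above, where profane words stay in the search set and only the returned prediction is filtered, might look something like this. The word list and the function name `censor` are placeholders invented for the example; the actual list is not given here.

```python
# Hypothetical placeholder list; the actual profanity list is not given.
PROFANITY = {"darn", "heck"}


def censor(prediction, banned=PROFANITY):
    """Return "(censored)" in place of a banned prediction; pass other words through."""
    return "(censored)" if prediction in banned else prediction
```

Filtering at this last step means the n-gram tables still see the full text, so words that tend to follow profanity can still be predicted.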