Kevin Roche
My next-word prediction app, available here, lets users type a sentence into a text box. The app then outputs the five most probable next words for the user's sentence.
To make its prediction, the app uses 3-gram, 4-gram, and 5-gram Stupid Back-Off (SBO) models, where the n in n-gram refers to the number of words in each word sequence the model considers when making its prediction.
The app is designed to be as intuitive as possible to use; instructions are on the next slide.
Stupid Back-Off (SBO) models are used to predict the next word in the sentence. SBO models work as follows:
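In general terms, an SBO model scores a candidate word by the relative frequency of the longest n-gram that ends in it. If that n-gram was never seen in training, the model drops the earliest context word, multiplies the score by a fixed penalty, and tries again, ending at plain unigram frequency. The Python sketch below illustrates the idea and is not the app's actual code: the function names (`build_counts`, `sbo_score`, `predict_next`) are hypothetical, and the penalty of 0.4 is the value commonly used with Stupid Back-Off, assumed here.

```python
from collections import Counter

ALPHA = 0.4  # back-off penalty; assumed value, commonly used with Stupid Back-Off


def build_counts(tokens, max_n):
    """Count every n-gram of length 1..max_n in a list of tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def sbo_score(word, context, counts, total_tokens):
    """Score `word` as the continuation of `context` (a tuple of words)."""
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen n-gram: relative frequency, scaled by accumulated penalty.
            return penalty * counts[ngram] / counts[context]
        context = context[1:]   # drop the earliest context word
        penalty *= ALPHA        # and pay the back-off penalty
    # No context left: fall back to unigram frequency.
    return penalty * counts[(word,)] / total_tokens


def predict_next(text, counts, total_tokens, vocab, max_n=5, top_k=5):
    """Return the top_k highest-scoring next words for the typed text."""
    words = text.lower().split()
    context = tuple(words[-(max_n - 1):]) if max_n > 1 else ()
    scores = {w: sbo_score(w, context, counts, total_tokens) for w in vocab}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda w: (-scores[w], w))[:top_k]


# Toy usage on a tiny corpus (illustrative data only).
tokens = "the cat sat on the mat the cat ate the fish".split()
counts = build_counts(tokens, max_n=3)
print(predict_next("the cat", counts, len(tokens), sorted(set(tokens)), max_n=3))
```

The scores are not true probabilities because they are never renormalized, which is what keeps the method cheap enough to run inside an interactive app.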
To evaluate the model, I split the corpus data into a training set (containing 80% of the data) and a testing set (containing the remaining 20% of the data).
I then trained the model on the training data, and used it to predict the next word in the testing data.
The model predicted the next word with ~19% accuracy. Given the ambiguity of the English language, correctly predicting the next word of a sentence 19% of the time is an impressive result.
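The evaluation procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: the split is assumed to be a random 80/20 split over sentences, a prediction is counted as correct when the true next word appears among the first `top_k` guesses, and the names `split_corpus` and `next_word_accuracy` are hypothetical. The slides do not say whether the reported figure counts only the top guess or all five.

```python
import random


def split_corpus(sentences, train_frac=0.8, seed=0):
    """Shuffle sentences and split them into training and testing sets."""
    shuffled = sentences[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def next_word_accuracy(predict, test_sentences, top_k=1):
    """Fraction of test positions where the true next word is in the top_k guesses.

    `predict` is any function mapping the text typed so far to a ranked
    list of candidate next words.
    """
    hits = 0
    total = 0
    for sentence in test_sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            guesses = predict(" ".join(words[:i]))[:top_k]
            hits += words[i] in guesses
            total += 1
    return hits / total if total else 0.0
```

Splitting before counting n-grams matters: if test sentences leak into the training counts, the measured accuracy overstates how the app behaves on unseen text.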
Thanks for checking out my product!