07/11/2020

Synopsis

The goal of the project was to build a next-word suggestion (or prediction) app. The app uses an NLP model trained on three datasets provided by SwiftKey: one from Twitter, one from news and one from blogs.

The resulting app is available at this webpage.

Prediction Model

The app uses a Katz back-off model, a generative n-gram language model, to predict the next word in a sentence.

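The model estimates the conditional probability of a word given the words that precede it (its n-gram history), based on the n-grams observed in the dataset. When the longest history was not observed, the model backs off to a shorter one. The app suggests the word with the highest estimated probability.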

The results don’t always make sense in long sentences, because the model doesn’t take the context of the whole sentence into account, only the last few words.
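
The back-off idea can be illustrated with a small Python sketch. This is not the app's code: it is a simplified version that assumes only bigrams and unigrams, and it uses a fixed absolute discount in place of the Good-Turing discounts of Katz's original method. The function names (train, backoff_prob, predict_next) and the toy sentence are hypothetical.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Count unigrams and bigrams in a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1
    return unigrams, bigrams


def backoff_prob(word, prev, unigrams, bigrams, discount=0.5):
    """Estimate P(word | prev), backing off from bigrams to unigrams.

    Assumption: a fixed absolute discount stands in for the
    Good-Turing discounts used in Katz's original formulation.
    """
    total = sum(unigrams.values())

    def p_uni(w):
        return unigrams[w] / total

    followers = bigrams.get(prev)
    if not followers:
        # History never observed: use the unigram estimate directly.
        return p_uni(word)

    hist_count = sum(followers.values())
    if word in followers:
        # Observed bigram: discounted relative frequency.
        return (followers[word] - discount) / hist_count

    # Unobserved bigram: share the mass freed by discounting among
    # the unseen words, in proportion to their unigram probabilities.
    leftover = discount * len(followers) / hist_count
    unseen_mass = 1.0 - sum(p_uni(w) for w in followers)
    if unseen_mass <= 0:
        return 0.0
    return leftover * p_uni(word) / unseen_mass


def predict_next(prev, unigrams, bigrams):
    """Return the vocabulary word with the highest estimated probability."""
    return max(unigrams, key=lambda w: backoff_prob(w, prev, unigrams, bigrams))


# Toy corpus, for illustration only.
tokens = "i love you i love cake i like you".split()
unigrams, bigrams = train(tokens)
print(predict_next("i", unigrams, bigrams))  # prints "love"
```

Because only the previous word is used as history, the sketch also shows the limitation described above: everything earlier in the sentence is ignored.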

The app

A demonstration of the app is shown below.