Data Science Capstone - JHU

Reinaldo Maciel
December 03, 2017

The goal of this project is to develop a Natural Language Processing data product for SwiftKey that predicts the next word a user will type.
The source data was unstructured English-language text.

The algorithm developed to predict the next word is based on a classic n-gram model: http://en.wikipedia.org/wiki/N-gram

1. The text prediction algorithm builds a vocabulary of n-grams from the training data.

2. The n-grams are sorted in descending order of frequency.

3. The user input is processed into n-grams in the same way.

4. The n-grams from the user input are compared against the model.

5. The first match, which is the one with the highest frequency, provides the predicted next word.
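
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the application's actual code: the function names (tokenize, build_model, predict_next), the toy corpus, the whitespace tokenizer, the maximum n-gram size of 3, and the fallback from longer to shorter contexts are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_model(corpus, max_n=3):
    # Map each context (a tuple of n-1 words) to a Counter of the
    # words that follow it, for every n from 2 up to max_n.
    model = defaultdict(Counter)
    for line in corpus:
        tokens = tokenize(line)
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                next_word = tokens[i + n - 1]
                model[context][next_word] += 1
    return model


def predict_next(model, user_input, max_n=3):
    # Tokenize the input the same way as the training data.
    tokens = tokenize(user_input)
    # Assumption: try the longest context first, then shorter ones.
    for n in range(max_n, 1, -1):
        context = tuple(tokens[-(n - 1):])
        if len(context) == n - 1 and context in model:
            # Return the follower with the highest frequency.
            return model[context].most_common(1)[0][0]
    return None  # no matching n-gram found


# Toy training data, invented for this example.
corpus = [
    "i love new york",
    "i love new york city",
    "i love you",
    "new york is big",
]
model = build_model(corpus)
print(predict_next(model, "I love"))
print(predict_next(model, "in new"))
```

Here the model stores follower counts per context rather than one globally sorted list, so picking the most frequent follower plays the role of "first match in descending frequency order". An input with no known context returns None.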

The application is available at the following link:

Enjoy it! ;)