12 December 2018
This project uses natural language processing to predict the next word, based on the text input by the user. Text prediction is becoming increasingly important as people spend more and more time on their mobile devices for email, social networking, banking and a whole range of other activities.
The app lets users enter text and predicts the most likely next word. Its prediction algorithm is based on a large data set that includes text from news, blogs and Twitter.
To use the app:
It is that simple to use!
The prediction algorithm is based on the capstone dataset, which contains English-language entries from blogs, news and Twitter.
A random sample of the dataset is used, as not all the data is required to build a model. Often, relatively few randomly selected chunks can yield an accurate approximation of the results that would be obtained using all the data. The sample is then cleaned to ensure consistency.
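The sampling and cleaning step described above can be sketched roughly as follows. This is only an illustration in Python, not the app's actual code: the function names `sample_lines` and `clean_line`, the seed, and the specific cleaning rules (lowercasing, keeping only letters and apostrophes) are assumptions chosen for the example.

```python
import random
import re


def sample_lines(lines, fraction, seed=0):
    """Keep each line independently with probability `fraction`.

    Hypothetical helper: the fixed seed only makes the sample reproducible.
    """
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]


def clean_line(line):
    """Normalise one line of text (assumed cleaning rules, for illustration).

    Lowercases the text, replaces anything that is not a letter,
    apostrophe or space with a space, then collapses repeated whitespace.
    """
    line = line.lower()
    line = re.sub(r"[^a-z' ]+", " ", line)
    return re.sub(r"\s+", " ", line).strip()
```

A small fraction (for example a few percent of the lines) keeps model building fast, at the cost of missing some rare word combinations.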
Appropriate tokens are identified, which allows n-grams to be built. The prediction algorithm uses these n-grams, which capture the frequencies of words and word pairs in the data set.
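To make the n-gram idea concrete, here is a minimal sketch in Python of a predictor built from word and word-pair frequencies. It is not the app's implementation: the names `build_model` and `predict_next` are hypothetical, only unigrams and bigrams are used, and falling back to the most frequent single word when the last word is unseen is an assumption made for this example.

```python
from collections import Counter, defaultdict


def build_model(sentences):
    """Count single words (unigrams) and word pairs (bigrams).

    `sentences` is assumed to be an iterable of already cleaned,
    lowercase strings with words separated by spaces.
    """
    unigrams = Counter()
    bigrams = defaultdict(Counter)  # previous word -> counts of next words
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            bigrams[prev][nxt] += 1
    return unigrams, bigrams


def predict_next(text, unigrams, bigrams):
    """Return the most frequent follower of the last word in `text`.

    Falls back to the most frequent word overall when the last word
    was never seen as the first word of a pair, and returns None
    when the model is empty.
    """
    tokens = text.lower().split()
    if tokens and tokens[-1] in bigrams:
        return bigrams[tokens[-1]].most_common(1)[0][0]
    if unigrams:
        return unigrams.most_common(1)[0][0]
    return None
```

For instance, after building a model from the sentences "i love new york", "i love new shoes" and "new york is big", calling `predict_next("I love new", unigrams, bigrams)` returns "york", because "york" follows "new" more often than "shoes" does. Longer n-grams would follow the same pattern, keyed on the last two or three words instead of one.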
Thank you!