João Pedro Schmitt
07, Feb 2017
Project presented as the capstone project for the Data Science Specialization from Coursera.
Around the world, people are spending an increasing amount of time on their mobile devices for email, social networking, banking and a whole range of other activities. But typing on mobile devices can be a serious pain. This project builds a smart algorithm that suggests the next word of a phrase. For example, if someone types “I went to the”, the keyboard presents three options for what the next word might be, such as “gym”, “store” or “restaurant”.
Goal: Using the dataset provided by SwiftKey, build an app that predicts the next word of a phrase.
The algorithm is composed of 6 parts:
The web app for word prediction is available at WordPrediction App. The app receives any phrase (1) and calculates the probability of each possible next word (2). Only the last three words of the entered sentence are used for the prediction: for example, if the sentence is “i like to buy a”, the words “to buy a” are used to predict the next word.
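The idea of predicting from the last three words can be illustrated with a minimal Python sketch. This is not the code behind the app: the function names (`last_n_words`, `build_model`, `predict`), the toy corpus, the plain relative-frequency estimate and the fallback to shorter contexts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def last_n_words(sentence, n=3):
    """Return the last n lowercase words of a sentence as a tuple."""
    return tuple(sentence.lower().split()[-n:])


def build_model(corpus, max_context=3):
    """Count which word follows each context of 1..max_context words."""
    model = defaultdict(Counter)
    for line in corpus:
        words = line.lower().split()
        for i in range(1, len(words)):
            for k in range(1, max_context + 1):
                if i - k >= 0:
                    model[tuple(words[i - k:i])][words[i]] += 1
    return model


def predict(model, sentence, top=3, max_context=3):
    """Return up to `top` (word, probability) pairs for the next word."""
    context = last_n_words(sentence, max_context)
    # Assumption: fall back to a shorter context when the longer one is unseen.
    while context:
        followers = model.get(context)
        if followers:
            total = sum(followers.values())
            return [(w, c / total) for w, c in followers.most_common(top)]
        context = context[1:]
    return []


# Toy corpus, purely illustrative (not the SwiftKey dataset).
toy_corpus = [
    "i like to buy a book",
    "i want to buy a car",
    "you need to buy a book",
]
toy_model = build_model(toy_corpus)
suggestions = predict(toy_model, "i like to buy a")
```

Here `suggestions` holds the candidate words that followed “to buy a” in the toy corpus, ranked by their relative frequency. A real model would be trained on a far larger corpus and would probably use a smoothing strategy rather than raw counts.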
The predictions from the app are accurate for very common phrases, but for more specific phrases the suggested words may not be the right choices. A very common example: if I write “my name”, the next word predicted is “is”, which is correct in most cases; but if I write “today i will buy a”, the predicted word “book” may not be the word I intended.
The algorithm, available at WordPrediction Code, is well organised and relies on standard libraries. Extensive research across various articles, books and examples went into finding a good mix of strategies for building a good algorithm. The accuracy achieved was 20% on average across the Twitter, news and blogs data.
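The text does not say how the accuracy was measured. One plausible way to compute such a figure, sketched below under that assumption, is to hide each word of a held-out sentence in turn and count how often the top suggestion matches it. The helper `top1_accuracy` and the `predict_fn` callable are hypothetical names, not part of the project.

```python
def top1_accuracy(predict_fn, sentences):
    """Fraction of positions where predict_fn(previous words) equals the true next word.

    predict_fn is a hypothetical callable: it takes a list of preceding
    words and returns a single predicted word (or None).
    """
    hits = 0
    total = 0
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(1, len(words)):
            total += 1
            if predict_fn(words[:i]) == words[i]:
                hits += 1
    return hits / total if total else 0.0


# Illustrative baseline only: always guess the word "the".
def always_the(previous_words):
    return "the"


baseline_score = top1_accuracy(always_the, ["the cat saw the dog"])
```

Averaging such a score over separate test samples of the Twitter, news and blogs data would give a single mean figure comparable in form to the one reported above.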