Charles Floyd
4/26/15
End of the Line is a simple but powerful app that predicts the next word you're going to type based on what's come before it!
End of the Line uses data collected from Twitter, blogs, and news articles to create a rich library of phrases, idioms, and sentences from different kinds of writing, including formal and informal communication, storytelling, shorthand, and more. The app lets the user choose a context suited to their anticipated writing style, or draw from all the available data for the best prediction!
The prediction algorithm builds a database of phrases from cleaned text data (profanity, punctuation, numerals, and other non-alphanumeric characters are removed). When a string is input for text prediction, it is cleaned the same way and the result is looked up in the database; the most common match from the training data informs the prediction. When a phrase is unseen in the training data, the prediction is simply the most common word for the chosen prediction context. This method is lightweight and uses minimal resources, making it portable to practically any computing environment!
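To make the lookup-and-fallback idea concrete, here is a minimal Python sketch. It is not the app's actual code: the names `clean`, `build_model`, and `predict`, the placeholder profanity list, and the two-word context limit are all assumptions made for illustration. Trying a shorter context before falling back to the most common word is also an assumption, not something the description above states.

```python
import re
from collections import Counter, defaultdict

# Placeholder list for illustration only; a real profanity list is assumed to exist elsewhere.
PROFANITY = {"darn"}

def clean(text):
    """Lowercase, drop apostrophes, replace non-letters with spaces, remove profanity."""
    text = text.lower().replace("'", "")
    text = re.sub(r"[^a-z\s]", " ", text)
    return [w for w in text.split() if w not in PROFANITY]

def build_model(lines, max_context=2):
    """Count which word follows each context of 1..max_context words."""
    unigrams = Counter()
    table = defaultdict(Counter)
    for line in lines:
        words = clean(line)
        unigrams.update(words)
        for i, word in enumerate(words):
            for n in range(1, max_context + 1):
                if i - n >= 0:
                    table[tuple(words[i - n:i])][word] += 1
    return table, unigrams

def predict(model, text, max_context=2):
    """Return the most common follower of the longest known context,
    or the most common word overall when the phrase is unseen."""
    table, unigrams = model
    words = clean(text)
    for n in range(min(max_context, len(words)), 0, -1):
        followers = table.get(tuple(words[-n:]))
        if followers:
            return followers.most_common(1)[0][0]
    # Unseen phrase: fall back to the most common word in this corpus.
    return unigrams.most_common(1)[0][0] if unigrams else None
```

Building one model per source (Twitter, blogs, news) and one from all sources combined would give the user the choice of context described earlier; the fallback word is then the most common word of whichever corpus was chosen.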
The app's design is extremely flexible. A developer who wants even more accurate predictions and has more computing resources can take larger samples from the text data used to power the present model. It's also easy to include more or different datasets by preprocessing (cleaning) those files and loading them into the app's directory. And a developer who wants the app to learn from its current user could feed the user's input into a dedicated input file, allowing future predictions to benefit from the user's writing history!
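One way the learn-from-the-user idea could work is sketched below in Python. This is an illustration under assumptions, not the app's implementation: the function names, the one-entry-per-line history file, and the simple one-word context are all hypothetical.

```python
import re
from collections import Counter, defaultdict
from pathlib import Path

def clean(text):
    """Lowercase and keep letters only, mirroring the cleaning step described above."""
    return re.sub(r"[^a-z\s]", " ", text.lower().replace("'", "")).split()

def record_user_text(text, history_path):
    """Append what the user typed to a history file (hypothetical layout: one entry per line)."""
    with open(history_path, "a", encoding="utf-8") as f:
        f.write(text.strip() + "\n")

def load_bigram_counts(paths):
    """Count next-word frequencies across every data file, skipping files that do not exist yet."""
    counts = defaultdict(Counter)
    for path in paths:
        p = Path(path)
        if not p.exists():
            continue
        for line in p.read_text(encoding="utf-8").splitlines():
            words = clean(line)
            for prev, nxt in zip(words, words[1:]):
                counts[prev][nxt] += 1
    return counts
```

Passing the history file alongside the original data files when the counts are rebuilt lets the user's own phrases compete with, and eventually outweigh, the phrases from the stock datasets.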