Data Science Capstone: Final Project

Jenny
August 2015

The Text Prediction App


Description

The application predicts the user's next word based on their previous input. A common use for such an application is text messaging, where the system suggests what the user is going to type next, saving time and increasing overall typing speed.

Access the app here.

The Text Prediction App (2)


How to use the app?

  1. Enter your text (English only!) in the input box provided.
  2. Press the “Enter” button to see the predicted next word.

Behind the app


The algorithm

A cleaned data sample was tokenised into n-grams (sequences of n items) to create frequency matrices, which serve as the dictionaries for the prediction model.
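
As a rough illustration of the tokenising step, the sketch below builds a small n-gram frequency table in Python. It is not the app's actual code: the function names `tokenize` and `build_ngram_table`, the regular expression used for cleaning, and the sample sentence are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_ngram_table(tokens, n):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        prefix = tuple(tokens[i:i + n - 1])
        next_word = tokens[i + n - 1]
        table[prefix][next_word] += 1
    return table


# Hypothetical sample text, standing in for the cleaned data sample.
sample = "the cat sat on the mat and the cat ran"
tokens = tokenize(sample)
bigrams = build_ngram_table(tokens, 2)

print(bigrams[("the",)].most_common(1))  # [('cat', 2)]
```

Keying each table by the preceding words means a lookup for the most common next word is a single dictionary access followed by `most_common(1)`.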

The process

  1. The algorithm determines N, the number of words entered (N ≤ 4).
  2. The algorithm searches the quadgram frequency table and returns the most common next word.
  3. If no match is found, the algorithm falls back to the next-largest n-gram frequency table.
  4. The algorithm repeats this fallback until a match is found.
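
The fallback steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the app's implementation: the `TABLES` contents are invented toy counts, `predict_next_word` is a hypothetical name, and the sketch assumes the quadgram table is keyed by the three preceding words (trigram by two, bigram by one).

```python
from collections import Counter

# Hypothetical toy frequency tables, keyed by prefix length:
# 3 = quadgram table, 2 = trigram table, 1 = bigram table.
TABLES = {
    3: {("thanks", "for", "the"): Counter({"help": 3, "update": 1})},
    2: {("for", "the"): Counter({"first": 5, "help": 2})},
    1: {("the",): Counter({"same": 4, "first": 3})},
}


def predict_next_word(text, tables=TABLES, max_prefix=3):
    """Return the most common next word, backing off to shorter prefixes."""
    words = text.lower().split()
    # Start with the longest usable prefix, then shrink it one word at a time.
    for k in range(min(len(words), max_prefix), 0, -1):
        prefix = tuple(words[-k:])
        candidates = tables.get(k, {}).get(prefix)
        if candidates:
            return candidates.most_common(1)[0][0]
    return None  # no table contains a matching prefix


print(predict_next_word("Thanks for the"))  # help  (quadgram match)
print(predict_next_word("wait for the"))    # first (falls back to trigram)
print(predict_next_word("see the"))         # same  (falls back to bigram)
```

Returning `None` when nothing matches is a choice made for this sketch; a real app might instead fall back to the most frequent single word.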

Further Improvements

This application is a very simple example of natural language processing. Because of size and speed constraints (as well as the abilities of this data scientist), the app is limited in both scope and depth.

Several improvements would make the app more useful to users, including:

  1. Multiple word suggestions
  2. Ability to predict in multiple languages