Predictive Text App

Ross Sweet

Text Prediction

To better serve our users, we developed an online application to predict the next word in a phrase from user text input. Our goals in creating this app were:

  1. Ease of use
  2. Speed
  3. Accuracy

Application

Our Predictive Text App provides a simple and intuitive interface. The user:

  1. Types into a text field,
  2. Clicks the “Submit” button,
  3. Sees their phrase followed by the highlighted predicted word.

The app detects input that is too short for the algorithm to predict a next word and shows the user an error message.
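As an illustration of this kind of input check, here is a minimal sketch in Python. The function name `validate_input`, the `min_words` threshold, and the message wording are assumptions made for the example, not details of the app itself.

```python
import re

def validate_input(text, min_words=1):
    """Return an error message if the input is too short, otherwise None.

    Hypothetical helper: the threshold and message are illustrative only.
    """
    # Count only runs of letters and apostrophes, so digits and stray
    # punctuation do not count as usable words.
    words = re.findall(r"[a-z']+", text.lower())
    if len(words) < min_words:
        return f"Please enter at least {min_words} word(s) before submitting."
    return None
```

A caller would display the returned message when it is not `None` and run the prediction otherwise.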

Training Set

We constructed a clean data set on which to base our predictions.

  • The training set contains more than 3.4 million lines of text.
  • Profanity, non-English characters and words, and numbers are filtered out.
  • A dictionary is built from all words that occur at least five times.
  • Common pairs (2-grams) and triples (3-grams) of words are counted and ranked by frequency.
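The preparation steps above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the app's actual code: the names `tokenize` and `build_tables`, the `banned` profanity set, and the choice to keep only n-grams made entirely of dictionary words are all hypothetical.

```python
import re
from collections import Counter

# A token is plain ASCII letters with an optional apostrophe part (e.g. "don't").
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")

def tokenize(line, banned=frozenset()):
    """Lowercase a line and keep only clean English-looking words."""
    tokens = []
    for raw in line.lower().split():
        word = raw.strip(".,;:!?\"()[]")
        # Drops numbers, words with non-English characters, and banned words.
        if TOKEN.fullmatch(word) and word not in banned:
            tokens.append(word)
    return tokens

def build_tables(lines, min_count=5, banned=frozenset()):
    """Return (dictionary, bigrams, trigrams) as frequency Counters."""
    unigrams, bigrams, trigrams = Counter(), Counter(), Counter()
    for line in lines:
        words = tokenize(line, banned)
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
        trigrams.update(zip(words, words[1:], words[2:]))

    # Dictionary: words seen at least `min_count` times.
    dictionary = Counter({w: c for w, c in unigrams.items() if c >= min_count})

    def keep(ngrams):
        # Assumption: discard n-grams containing any out-of-dictionary word.
        return Counter({g: c for g, c in ngrams.items()
                        if all(w in dictionary for w in g)})

    return dictionary, keep(bigrams), keep(trigrams)
```

Calling `most_common()` on any of the returned Counters gives the ranked list. Note that this sketch forms n-grams across the gap left by a filtered word; a stricter version would split the line at that point.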

Backoff Algorithm Steps

  • Extract the last two words of the user's input.
  • Compare them to the first two words of each 3-gram in the training set.
  • If they match one or more 3-grams, take the most common third word.
  • If not, repeat with the 2-grams using only the last word; if that also fails, fall back to the top word in the ranked dictionary.
  • Output the most likely next word.
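The backoff steps above can be sketched in Python as shown here. The function names, the Counter-based tables, and the tiny example counts are assumptions for illustration only. The linear scan in `best_continuation` is kept for readability; a production app would more likely precompute a lookup table from context to best next word.

```python
from collections import Counter

def best_continuation(ngrams, context):
    """Most frequent word following `context` in an n-gram Counter, or None."""
    n = len(context)
    candidates = {g[-1]: c for g, c in ngrams.items() if g[:n] == context}
    return max(candidates, key=candidates.get) if candidates else None

def predict_next(text, trigrams, bigrams, unigrams):
    """Back off from 3-grams to 2-grams to the most common single word."""
    words = text.lower().split()
    # Step 1: match the last two words against the 3-grams.
    if len(words) >= 2:
        guess = best_continuation(trigrams, tuple(words[-2:]))
        if guess is not None:
            return guess
    # Step 2: match only the last word against the 2-grams.
    if words:
        guess = best_continuation(bigrams, (words[-1],))
        if guess is not None:
            return guess
    # Step 3: fall back to the top-ranked dictionary word.
    return unigrams.most_common(1)[0][0] if unigrams else None

# Made-up example counts, not taken from the real training set.
trigrams = Counter({("thanks", "for", "the"): 3, ("thanks", "for", "your"): 1})
bigrams = Counter({("for", "the"): 5, ("of", "course"): 9, ("of", "a"): 2})
unigrams = Counter({"the": 20, "of": 10, "for": 6})

print(predict_next("Thanks for", trigrams, bigrams, unigrams))
```

With these example tables, "Thanks for" is resolved by the 3-grams, "king of" backs off to the 2-grams, and an unseen word such as "zebra" falls through to the dictionary.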

Performance

  • Our cleaning process reduces the training set to under 70 MB.
  • The app returns a prediction in slightly over 1 second for a line of user text.
  • Nearly all predictions retain proper grammatical structure.