Finding the Final Word

Kristin Abkemeier
June 17, 2018

A word prediction app that fits on your smartphone!

Final Word: Executive Summary

Summary: Final Word is a word prediction app that suggests the top 3 words to complete a phrase that you type into a textbox.

  • Final Word provides a colorful, easy-to-read graphical display of the relative probabilities of words.

  • The interface is compact enough to fit onto a smartphone screen.

  • Final Word's top-three prediction accuracy of 18.72% rated well against a popular benchmarking program, and the app is fairly compact at 61 MB of memory usage.

  • Try Final Word for yourself!

Using the App

When you type in a phrase, Final Word echoes it back verbatim and then displays the top three predicted next words in color, in brackets, after the echoed phrase. Naughty words are cleaned up!

Final Word adds a colorful bar plot showing the relative frequencies of the top three final words. Each bar represents the percentage likelihood of the predicted word within the group of three words returned.
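As a rough illustration of how such bar heights could be computed, the sketch below rescales the scores of the returned words so that they sum to 100%. The function name and the example scores are hypothetical and are not taken from the app.

```python
def relative_percentages(scores):
    """Rescale raw scores so the returned words' percentages sum to 100."""
    total = sum(scores.values())
    if total == 0:
        # Avoid dividing by zero when no word has any score.
        return {word: 0.0 for word in scores}
    return {word: 100.0 * score / total for word, score in scores.items()}


# Made-up scores for three predicted words (illustration only).
example_scores = {"day": 6.0, "time": 3.0, "week": 1.0}
percentages = relative_percentages(example_scores)

for word, pct in percentages.items():
    print(f"{word}: {pct:.1f}%")
```

Because the percentages are relative to the three returned words only, a tall bar means "most likely of these three", not "likely in absolute terms".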

The lookup words actually used along with the predicted word are printed next to the percent bar.

Final Word on smartphone screen

Final Word: How It Works

Final Word takes the phrase that you type in, converts it to lowercase, and strips out punctuation, profanity, and non-ASCII characters.
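A minimal sketch of this cleaning step is shown below. It is an assumption about how the steps might be ordered, not the app's actual code: the `PROFANITY` set is a two-word placeholder, and punctuation (including apostrophes) is simply deleted, which may differ from what Final Word does.

```python
import re

# Placeholder list; the app's real profanity list is not given here.
PROFANITY = {"darn", "heck"}


def clean_input(phrase):
    """Lowercase the phrase, drop non-ASCII and punctuation, filter profanity."""
    text = phrase.lower()
    # Drop any character that cannot be encoded as ASCII.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Delete everything except lowercase letters, digits, and whitespace.
    text = re.sub(r"[^a-z0-9\s]", "", text)
    # Split on whitespace and remove words on the profanity list.
    return [word for word in text.split() if word not in PROFANITY]


print(clean_input("What the HECK is that?"))
```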

The cleaned-up input is split into as many as three lookup words. The lookup words become the search keys to a series of four lookup tables that contain the most frequent sequences of one, two, three, and four consecutive words derived from over 4.3 million text samples from tweets, blogs, and news articles.
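The sketch below illustrates the general idea of counting word sequences of length one to four and of taking the last three cleaned words as lookup keys. The names `build_ngram_tables` and `lookup_words` are hypothetical, and the real tables keep only the most frequent sequences, a pruning step that is omitted here.

```python
from collections import Counter


def build_ngram_tables(samples, max_n=4):
    """Count every sequence of 1..max_n consecutive words in the samples."""
    tables = {n: Counter() for n in range(1, max_n + 1)}
    for sample in samples:
        words = sample.split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                tables[n][tuple(words[i:i + n])] += 1
    return tables


def lookup_words(cleaned_words, max_keys=3):
    """Return up to the last max_keys words as the search key."""
    return tuple(cleaned_words[-max_keys:])


# Tiny made-up corpus standing in for the tweets, blogs, and news samples.
tables = build_ngram_tables(["have a nice day", "have a nice time"])
key = lookup_words(["i", "hope", "you", "have", "a", "nice"])
print(key, tables[3][key])
```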

A “stupid backoff” model is used. It begins by using three lookup words to find matching 4-word sequences, and thus a set of candidate next words.

  • If fewer than three such sequences exist, two lookup words are used to find 3-word sequences, and thus the next word, with the likelihood discounted by a factor of 0.4.
  • If the 3-word sequences do not succeed either, one lookup word is used to find 2-word sequences.
  • If there are no other predictions, the top 3 most common single words are returned.
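The backoff steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the app's implementation: the tables are plain `Counter` objects keyed by word tuples, each backoff level multiplies the score by another factor of 0.4, backing off continues until three candidates are found, and the linear scan over each table stands in for whatever indexed lookup the app uses. The function name `predict` and the demo counts are hypothetical.

```python
from collections import Counter

BACKOFF = 0.4  # discount applied at each backoff step


def predict(tables, words, top_k=3):
    """Score next-word candidates, backing off to shorter contexts as needed."""
    scores = {}
    discount = 1.0
    context = tuple(words[-3:])
    while context and len(scores) < top_k:
        context_count = tables[len(context)].get(context, 0)
        if context_count:
            for ngram, count in tables[len(context) + 1].items():
                # Keep the score from the longest context that found the word.
                if ngram[:-1] == context and ngram[-1] not in scores:
                    scores[ngram[-1]] = discount * count / context_count
        context = context[1:]  # drop the earliest lookup word
        discount *= BACKOFF
    if len(scores) < top_k:
        # Last resort: fill up with the most common single words.
        total = sum(tables[1].values())
        for (word,), count in tables[1].most_common():
            if word not in scores:
                scores[word] = discount * count / total
            if len(scores) >= top_k:
                break
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Made-up counts for illustration only.
demo_tables = {
    1: Counter({("a",): 3, ("b",): 2, ("c",): 1, ("d",): 1}),
    2: Counter({("a", "b"): 2, ("b", "c"): 1, ("b", "d"): 1}),
    3: Counter(),
    4: Counter(),
}
demo_result = predict(demo_tables, ["a"])
print(demo_result)
```

Stupid backoff scores are relative frequencies with a fixed discount rather than true probabilities, which keeps the lookup cheap and suits a display that only compares the top three words with one another.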

Final Word by the Numbers

Overall top-3 score: 18.72% on the 28,464 samples used in the benchmark application

The model was run against a benchmark prediction program that has been used by many of the Coursera data science capstone students (available at Hernán Foffani's GitHub account).

Overall top-3 prediction scores between 15% and 19% have been commonly reported by students in the Coursera forums. Final Word performs near the high end of the range, while using a memory footprint of only 61 MB.

Try Final Word now!