wabi-sabi

Benjamin Kemp
December 13, 2014

A simple text prediction engine.

Overview

This simple but effective text prediction engine takes a word or phrase entered by the user, predicts the next word, and presents that prediction to the user.

  • Results are displayed in less than a second
  • Prediction Performance Statistics:
    • Min: 0.135 seconds, Max: 0.855 seconds, Avg: 0.602 seconds
  • Up to five words can be used for prediction
  • The prediction database contains millions of words, but is only 19.5 MB

The Algorithm ( Part 1 )

The last five words entered by the user are separated. “Once upon a time there” becomes: “Once”, “upon”, “a”, “time”, “there”.

A list is then created, starting with all of the words, then dropping the first word, then the second word, and so on. The last entry in the list, which is the last word the user entered, is used as the keyword in Part 2.

For example:

  • “Once upon a time there”
  • “upon a time there”
  • “a time there”
  • “time there”…
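
The list-building step above can be sketched in a few lines. This is only an illustration under assumptions: the function name `backoff_phrases`, the `max_words` parameter, and whitespace tokenization are hypothetical choices, not taken from the app itself.

```python
def backoff_phrases(text, max_words=5):
    """Return progressively shorter phrases built from the last words of text.

    Hypothetical helper: splits on whitespace, keeps at most max_words
    trailing words, then drops the leading word one at a time.
    """
    words = text.split()[-max_words:]
    return [" ".join(words[i:]) for i in range(len(words))]


phrases = backoff_phrases("Once upon a time there")
for phrase in phrases:
    print(phrase)

# The final, single-word entry serves as the keyword for Part 2.
keyword = phrases[-1]
```

Keeping the phrases ordered from longest to shortest means the position in the list also tells you how many words each query uses, which Part 2 relies on for weighting.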

The Algorithm ( Part 2 )

This list is then used to generate search queries against a locally stored database derived from millions of Twitter tweets, blog posts, and news articles.

The matches for each query are sorted by frequency of occurrence in the database, and only the top result is returned. Each result is parsed for the keyword, and the word that follows it is stored. These stored words are then aggregated, and the one with the highest frequency of occurrence, weighted by the number of words used in the query, is returned to the user.

  • “Once upon a time there” returns “was” using 5 words.
  • “upon a time there” returns “was” using 4 words…
  • “there” returns “is” using 1 word
  • “was” has the highest frequency of occurrence and a weight of 9, so it is returned to the user
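
The parsing and aggregation steps can be sketched as follows. This is a minimal illustration, not the app's actual code: the names `word_after` and `pick_prediction` are hypothetical, and the scoring rule (summing the number of query words behind each candidate) is an assumption inferred from the example, where 5 + 4 gives “was” a weight of 9. Only the three results spelled out above are used as input.

```python
from collections import defaultdict


def word_after(result_text, keyword):
    """Return the word that follows the first occurrence of keyword, or None."""
    tokens = result_text.split()
    for i, token in enumerate(tokens[:-1]):
        if token == keyword:
            return tokens[i + 1]
    return None


def pick_prediction(candidates):
    """Pick the candidate word with the largest total weight.

    candidates: list of (predicted_word, words_used_in_query) pairs.
    Assumed rule: a word's weight is the sum of the query lengths
    that produced it.
    """
    if not candidates:
        return None
    scores = defaultdict(int)
    for word, words_used in candidates:
        scores[word] += words_used
    return max(scores.items(), key=lambda item: item[1])


# The three query results listed in the example above.
candidates = [("was", 5), ("was", 4), ("is", 1)]
print(pick_prediction(candidates))
```

Weighting by query length lets a prediction backed by longer, more specific context outrank one that only matches the single keyword.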

The App

Using this app is easy! Simply enter a word or phrase in the text box…

…and the predicted word appears below. For those who are interested, the bottom of the page shows how long it took to generate the answer.