Word Prediction For Virtual Keyboard Applications

David M. Leonard
11/26/2017

The Challenge: tablets and smartphones rely on a virtual keyboard

  • Typing on a virtual keyboard is much slower than typing on a physical keyboard
    • Lack of tactile feedback often results in selecting the wrong key
    • Difficult to position hands for optimal typing performance
  • The solution: suggest options for the next word
    • Tapping a suggested word inserts it into the text stream
    • For a word of n letters, one tap replaces n keystrokes, saving n - 1; the longer the word, the larger the savings in time and keystrokes
    • As the letters of a word are typed, the suggestions can be updated dynamically to complete the word
  • Word prediction/completion needs to be fast and accurate
    • Suggestions must appear in less than 100 milliseconds to feel instantaneous
    • Accuracy is important to encourage adoption
    • Ultimately, what matters is not the number of words guessed but the number of keystrokes saved; word completion can improve these savings dramatically
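
The n - 1 arithmetic above can be sketched in a few lines. This is an illustration only: the function name `keystrokes_saved` and the assumptions that a tap on a suggestion costs exactly one keystroke and that trailing spaces are ignored are not taken from the app itself.

```python
def keystrokes_saved(word: str, typed_prefix_len: int = 0) -> int:
    """Keystrokes saved when the user taps a suggestion for `word`.

    Assumption: tapping a suggestion costs one keystroke, and the user has
    already typed `typed_prefix_len` letters of the word.
    """
    remaining = len(word) - typed_prefix_len  # letters still left to type
    return max(remaining - 1, 0)              # the tap itself costs one


# Word predicted before any letter is typed: 8 letters, 1 tap.
print(keystrokes_saved("keyboard"))      # 7
# Word completed after typing "key": 5 letters remain, 1 tap.
print(keystrokes_saved("keyboard", 3))   # 4
```

Under these assumptions, prediction saves the most on long words, while completion still recovers part of the savings when the first suggestions miss.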

The Solution: A text entry app that predicts the next word

  • Emulates the experience of a virtual keyboard user and suggests the next word
    • It can predict the next word before you start typing it, and
    • It can complete the word you are typing (if enabled by a checkbox)
  • As you type text, new predictions appear and can be chosen for insertion
  • It displays three suggested words, ordered left to right from highest to lowest probability
  • A dashboard displays metrics on prediction/completion accuracy and keystroke savings
  • A word prediction map highlights color-coded predicted and completed words
  • Prediction accuracy was measured by using 60% of available data to build a lookup table, and testing against a random sample of 10,000 sentences containing a total of 109,899 words
Predicted Word Rank   Predicted Word Frequency   Average Predicted Word Length   Keystroke Reduction   Next Word Prediction Accuracy
1                     17,727                     3.9                             12%                   16%
2                     6,765                      4.1                             5%                    6%
3                     4,436                      4.2                             3%                    4%
Total                 28,928                     4.0                             20%                   26%
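
The accuracy column can be reproduced from the counts in the table. The sketch below assumes that "Predicted Word Frequency" is the number of test words matched by the suggestion at that rank, and that accuracy is that count divided by the 109,899 test words; the reported percentages are consistent with this reading, but the exact definition is an assumption. The names `hits_by_rank` and `accuracy_pct` are illustrative.

```python
TOTAL_TEST_WORDS = 109_899

# Counts taken from the table: rank of the suggestion -> words predicted.
hits_by_rank = {1: 17_727, 2: 6_765, 3: 4_436}


def accuracy_pct(hits: int, total: int) -> int:
    """Share of test words predicted, as a whole-number percentage."""
    return round(100 * hits / total)


per_rank = {rank: accuracy_pct(hits, TOTAL_TEST_WORDS)
            for rank, hits in hits_by_rank.items()}
overall = accuracy_pct(sum(hits_by_rank.values()), TOTAL_TEST_WORDS)

print(per_rank)  # {1: 16, 2: 6, 3: 4}
print(overall)   # 26
```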

Word Prediction Details

  • Build lookup tables:

    • Gather a large corpus (collection of text documents) composed of tweets, blog posts, and news stories
    • Break up the text into individual fragments, called ngrams, of one to five words each
    • Build three lookup tables:
      • a table listing ngrams, the next word that occurred, and the number of times that next word occurred
      • a list of the most frequent first words in a sentence
      • a word (1-gram) dictionary used to complete a word as it is being typed
  • When the beginning of a sentence is detected, use recommendations from the First Word table

  • Use the ngram table and Katz's Back-off Model to find the most likely choices for the next word

  • Once the user begins typing a word, offer suggestions to complete the word using the word dictionary (a potential enhancement is to search the ngram table first, then the dictionary)
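
The steps above can be sketched roughly as follows. This is a minimal illustration, not the app's implementation: all function names are hypothetical, tokenization is a plain whitespace split, an empty context stands in for "beginning of sentence", and the back-off is a simple longest-context-first lookup that omits the discounting and back-off weights of Katz's model.

```python
from collections import Counter, defaultdict

MAX_CONTEXT = 4  # up to 4 context words plus the next word = a 5-gram


def ranked(counter):
    """Words ordered by descending count, ties broken alphabetically."""
    return [w for w, _ in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))]


def build_tables(sentences):
    """Build the three lookup tables from an iterable of sentences."""
    ngram_next = defaultdict(Counter)  # context tuple -> counts of next word
    first_words = Counter()            # most frequent sentence-initial words
    dictionary = Counter()             # 1-gram counts, used for completion
    for sentence in sentences:
        words = sentence.lower().split()
        if not words:
            continue
        first_words[words[0]] += 1
        dictionary.update(words)
        for i in range(1, len(words)):
            for n in range(1, MAX_CONTEXT + 1):
                if i - n < 0:
                    break
                ngram_next[tuple(words[i - n:i])][words[i]] += 1
    return ngram_next, first_words, dictionary


def predict_next(context_words, ngram_next, first_words, dictionary, k=3):
    """Suggest up to k next words, trying the longest context first."""
    if not context_words:  # assumed to mean the start of a sentence
        return ranked(first_words)[:k]
    context = [w.lower() for w in context_words][-MAX_CONTEXT:]
    suggestions = []
    # Back off from the longest context to shorter ones, then to 1-grams.
    sources = [ngram_next.get(tuple(context[s:])) for s in range(len(context))]
    sources.append(dictionary)
    for counts in sources:
        for word in ranked(counts or {}):
            if word not in suggestions:
                suggestions.append(word)
            if len(suggestions) == k:
                return suggestions
    return suggestions


def complete_word(prefix, dictionary, k=3):
    """Suggest up to k dictionary words that start with the typed prefix."""
    prefix = prefix.lower()
    matches = Counter({w: c for w, c in dictionary.items() if w.startswith(prefix)})
    return ranked(matches)[:k]


corpus = ["the cat sat on the mat", "the cat ate the fish", "the dog sat on the rug"]
tables = build_tables(corpus)
print(predict_next([], *tables))              # ['the']
print(predict_next(["the", "cat"], *tables))  # ['ate', 'sat', 'the']
print(complete_word("s", tables[2]))          # ['sat']
```

Scanning the whole dictionary for each prefix is fine for a toy corpus; at the scale of a real corpus, a sorted list or a trie would likely be needed to stay within an interactive response time.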

Text Editor With Word Prediction: The Application