Word Prediction For Virtual Keyboard Applications
David M. Leonard
11/26/2017
The Challenge: Tablets and smartphones rely on a virtual keyboard
- Typing on a virtual keyboard is much slower than typing on a physical keyboard
- The lack of tactile feedback often results in selecting the wrong key
- Difficult to position hands for optimal typing performance
- The solution: suggest options for the next word
- Tapping a suggested word inserts it into the text stream
- For a word of n letters, one tap replaces n keystrokes and saves n - 1; the longer the word, the larger the savings in time and keystrokes
- As the letters of a word are typed, the suggestions can be updated dynamically to complete the word
- Word prediction/completion needs to be fast and accurate
- Suggestions must appear in less than 100 milliseconds to feel instantaneous
- Accuracy is important to encourage adoption
- Ultimately, what matters is not the number of words guessed but the number of keystrokes saved; word completion can increase these savings dramatically
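The keystroke arithmetic above can be illustrated with a minimal sketch. The function names `keystrokes_saved` and `completion_savings` are hypothetical, and the sketch assumes that tapping a suggestion costs exactly one keystroke and that the trailing space is not counted; the application may count keystrokes differently.

```python
def keystrokes_saved(word: str) -> int:
    """Keystrokes saved when a whole word is inserted with a single tap.

    Typing the word takes len(word) keystrokes and tapping a suggestion
    takes one, so an n-letter word saves n - 1 keystrokes.
    """
    return max(len(word) - 1, 0)


def completion_savings(word: str, typed_prefix_len: int) -> int:
    """Keystrokes saved when a word is completed after some letters are typed.

    The letters already typed are spent, and the tap itself costs one
    keystroke, so the saving is the remaining letters minus one.
    """
    remaining = len(word) - typed_prefix_len
    return max(remaining - 1, 0)


print(keystrokes_saved("keyboard"))       # 7
print(completion_savings("keyboard", 3))  # 4
```

Under these assumptions, predicting a whole word always saves more than completing it, which is why both features together matter for the total savings.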
The Solution: A text entry app that predicts the next word
- Emulates the experience of a virtual keyboard user and suggests the next word
- It can predict the next word before you start typing it
- It can complete the word you are typing (if enabled by a checkbox)
- As you type text, new predictions appear and can be chosen for insertion
- It displays three suggested words, ordered left to right from highest to lowest probability
- A dashboard displays metrics on prediction/completion accuracy and keystroke savings
- A word prediction map highlights color-coded predicted and completed words
- Prediction accuracy was measured by using 60% of available data to build a lookup table, and testing against a random sample of 10,000 sentences containing a total of 109,899 words
| Predicted Word Rank | Predicted Word Frequency | Average Predicted Word Length | Keystroke Reduction | Next Word Prediction Accuracy |
|---|---|---|---|---|
| 1 | 17,727 | 3.9 | 12% | 16% |
| 2 | 6,765 | 4.1 | 5% | 6% |
| 3 | 4,436 | 4.2 | 3% | 4% |
| Total | 28,928 | 4.0 | 20% | 26% |
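The accuracy column in the table above is consistent with dividing each rank's frequency by the 109,899 test words. The sketch below checks that arithmetic and shows one way such a tally could be collected. The function `tally_hits_by_rank` and its `predict` argument are hypothetical, and counting every word (including the first word of each sentence) is an assumption rather than the documented test procedure.

```python
from collections import Counter


def tally_hits_by_rank(sentences, predict, top_k=3):
    """Count how often the actual next word appears at each suggestion rank.

    sentences: iterable of word lists.
    predict:   hypothetical function mapping the preceding words to a
               ranked list of suggestions (best first).
    """
    hits = Counter()
    total = 0
    for words in sentences:
        for i, actual in enumerate(words):
            total += 1
            suggestions = predict(words[:i])[:top_k]
            if actual in suggestions:
                hits[suggestions.index(actual) + 1] += 1
    return hits, total


# Recompute the accuracy column from the frequencies reported in the table.
rank_hits = {1: 17_727, 2: 6_765, 3: 4_436}
total_words = 109_899
accuracy = {rank: round(100 * n / total_words) for rank, n in rank_hits.items()}
overall = round(100 * sum(rank_hits.values()) / total_words)
print(accuracy)  # {1: 16, 2: 6, 3: 4}
print(overall)   # 26
```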
Word Prediction Details
Build lookup tables:
- Gather a large corpus (collection of text documents) composed of tweets, blog posts, and news stories
- Break up the text into individual fragments, called ngrams, of one to five words each
- Build three lookup tables:
- a table listing ngrams, the next word that occurred, and the number of times that next word occurred
- a list of most frequent first words in a sentence
- a word (1-gram) dictionary used to complete a word as it is being typed
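The table-building steps above can be sketched as follows. This is a minimal illustration, not the application's code: the names `build_tables`, `tokenize`, and `MAX_CONTEXT` are hypothetical, tokenization is a plain lowercase split, and a five-word ngram is assumed to mean up to four context words plus the next word.

```python
from collections import Counter, defaultdict

# Assumption: five-word ngrams = at most four context words + the next word.
MAX_CONTEXT = 4


def tokenize(sentence):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return sentence.lower().split()


def build_tables(sentences):
    """Build the three lookup tables from an iterable of sentences."""
    next_word_counts = defaultdict(Counter)  # context tuple -> next-word counts
    first_words = Counter()                  # first word of each sentence
    dictionary = Counter()                   # 1-gram counts for word completion
    for sentence in sentences:
        words = tokenize(sentence)
        if not words:
            continue
        first_words[words[0]] += 1
        dictionary.update(words)
        for i in range(1, len(words)):
            # Record every context of length 1..MAX_CONTEXT that ends just before word i.
            for n in range(1, MAX_CONTEXT + 1):
                if i - n < 0:
                    break
                context = tuple(words[i - n:i])
                next_word_counts[context][words[i]] += 1
    return next_word_counts, first_words, dictionary


tables = build_tables(["The cat sat", "The cat ran"])
next_word_counts, first_words, dictionary = tables
print(next_word_counts[("the",)]["cat"])  # 2
print(first_words["the"])                 # 2
```

Keying the ngram table by context tuple keeps each lookup a single dictionary access, which helps with the response-time target for suggestions.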
When the beginning of a sentence is detected, use recommendations from the First Word table
Use the ngram table and Katz's Back-off Model to find the most likely choices for next word
Once the user begins typing a word, offer suggestions to complete the word using the word dictionary (a potential enhancement is to search the ngram table first, then the dictionary)
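The prediction steps above can be sketched with toy tables. Note that this is a simplified back-off, not full Katz back-off: it tries the longest matching context first and falls back to shorter ones, but it omits the count discounting and back-off weights that Katz's model uses to produce probabilities. The names `predict_next` and `complete_word`, the toy data, and the four-word context limit are all assumptions made for illustration.

```python
from collections import Counter


def predict_next(context_words, next_word_counts, first_words, max_context=4, k=3):
    """Suggest up to k next words, backing off from long to short contexts."""
    if not context_words:
        # Start of a sentence: use the First Word table.
        return [w for w, _ in first_words.most_common(k)]
    suggestions = []
    start = min(max_context, len(context_words))
    for n in range(start, 0, -1):
        context = tuple(context_words[-n:])
        for word, _ in next_word_counts.get(context, Counter()).most_common():
            if word not in suggestions:
                suggestions.append(word)
            if len(suggestions) == k:
                return suggestions
    return suggestions  # may hold fewer than k words


def complete_word(prefix, dictionary, k=3):
    """Suggest the k most frequent dictionary words that start with prefix."""
    matches = Counter({w: c for w, c in dictionary.items() if w.startswith(prefix)})
    return [w for w, _ in matches.most_common(k)]


# Toy tables (invented data, for illustration only).
next_word_counts = {
    ("the",): Counter({"cat": 3, "dog": 2, "end": 1}),
    ("of", "the"): Counter({"day": 4, "cat": 1}),
}
first_words = Counter({"the": 5, "i": 4, "it": 2, "we": 1})
dictionary = Counter({"the": 10, "they": 6, "then": 3, "cat": 4})

print(predict_next([], next_word_counts, first_words))                     # ['the', 'i', 'it']
print(predict_next(["end", "of", "the"], next_word_counts, first_words))   # ['day', 'cat', 'dog']
print(complete_word("the", dictionary))                                    # ['the', 'they', 'then']
```

The linear scan in `complete_word` is fine for a toy dictionary; a full-size dictionary would likely need a sorted list or a prefix tree to stay within the response-time target.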
Text Editor With Word Prediction: The Application