Brian Francis
07-October-2016
Typing text into a smartphone or tablet can be quite slow. To improve the user experience, we propose an application that presents a “next word prediction” as the user types and allows the user to easily select a word while typing.
The prototype tool has good accuracy and is also fast, memory efficient, and intuitive to use.
To accomplish this, we implemented an n-gram language model trained on a large corpus of web text.
First, we parsed 3,415,742 text documents from news websites, blog entries, and Twitter into phrases of one to five words. We then estimated the probability of the “next word” based on the last four words the user entered.
For example, given the phrase “Joe and I got some yummy popcorn at the”, we can estimate the most likely word to follow “yummy popcorn at the” based on the phrases we saw in our training set.
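The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the application's actual code: the whitespace tokenizer, the function names `build_model` and `predict_next`, the toy training phrases, and the fallback to shorter contexts when the four-word context was never seen are all assumptions made for the example.

```python
from collections import Counter, defaultdict

MAX_ORDER = 5  # phrases of one to five words, as described in the text


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_model(documents, max_order=MAX_ORDER):
    """Map each context (a tuple of 0 to max_order-1 words) to next-word counts."""
    model = defaultdict(Counter)
    for doc in documents:
        words = tokenize(doc)
        for i, word in enumerate(words):
            # Record the word under every context length that fits before it.
            for n in range(max_order):
                if i - n < 0:
                    break
                model[tuple(words[i - n:i])][word] += 1
    return model


def predict_next(model, phrase, max_order=MAX_ORDER):
    """Return (word, relative frequency) from the longest context seen in training.

    Falling back to shorter contexts is an assumption of this sketch.
    """
    words = tokenize(phrase)
    for n in range(max_order - 1, -1, -1):
        if n > len(words):
            continue
        counts = model.get(tuple(words[len(words) - n:]))
        if counts:
            word, count = counts.most_common(1)[0]
            return word, count / sum(counts.values())
    return None, 0.0


# Hypothetical toy training set, standing in for the real corpus.
docs = [
    "we bought yummy popcorn at the movies",
    "they sold yummy popcorn at the movies",
    "she ate yummy popcorn at the fair",
    "meet me at the park",
]
model = build_model(docs)

# Only the last four words, "yummy popcorn at the", are used as context.
word, prob = predict_next(model, "Joe and I got some yummy popcorn at the")
print(word, prob)  # "movies", which follows this context in 2 of 3 training phrases
```

Storing raw counts per context keeps the lookup to a handful of dictionary reads per keystroke. A production model would likely also prune rare phrases and smooth the estimates, neither of which is shown here.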
The model performance statistics below are based on a held-out set of 426,967 documents. The application performance statistics are based on 227 input predictions.
| Model Performance | Application Performance |
|---|---|
| Accuracy: 29.04% | Speed: 10 msec |
| Out-of-vocabulary (OOV) rate: 0.84% | Memory: 225 MB |
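As a rough illustration of how figures like the accuracy and OOV rate in the table might be computed, the sketch below scores a predictor on held-out text. The definitions are assumptions, since the text does not spell them out: accuracy is taken as the share of positions where the top prediction equals the actual next word, and the OOV rate as the share of held-out tokens missing from the training vocabulary. The names `evaluate` and `toy_predict` and the toy data are hypothetical.

```python
def evaluate(predict, vocabulary, held_out_docs):
    """Return (accuracy, oov_rate) over whitespace-tokenized held-out documents.

    `predict` takes the list of preceding words and returns one guessed word.
    """
    correct = predictions = oov = tokens = 0
    for doc in held_out_docs:
        words = doc.lower().split()
        for i, actual in enumerate(words):
            tokens += 1
            if actual not in vocabulary:
                oov += 1
            if i == 0:
                continue  # no preceding words to predict from
            predictions += 1
            if predict(words[:i]) == actual:
                correct += 1
    accuracy = correct / predictions if predictions else 0.0
    oov_rate = oov / tokens if tokens else 0.0
    return accuracy, oov_rate


# Hypothetical stand-in predictor: guesses from the previous word only.
GUESSES = {"popcorn": "at", "at": "the", "the": "movies"}


def toy_predict(context):
    return GUESSES.get(context[-1])


vocabulary = {"popcorn", "at", "the", "movies"}
held_out = ["popcorn at the movies", "popcorn at the fair"]

accuracy, oov_rate = evaluate(toy_predict, vocabulary, held_out)
print(f"accuracy={accuracy:.2%} oov={oov_rate:.2%}")
```

In this toy run five of six predictions are correct and one of eight tokens ("fair") is out of vocabulary.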