Word Up! ... a simple yet powerful word prediction app
Paramjot Singh
2017-05-30
Motivation
- Do you want to type faster?
- Do you want to avoid inaccuracies while typing when using a small keyboard on your phone or tablet?
- We have got an app for you!!
Welcome to 'Word Up!'

App Features
- A simple and intuitive interface - simply start typing in the text-box
- Fast response time
- A final dictionary under 10 MB that loads quickly
- Displays the top 3 most likely next words almost instantaneously
Working Details
Building the data dictionary
- 80% of the entire data is used to build the final dictionary
- To overcome the limitations of RAM, the data was divided into 6 chunks (500,000 entries each) for further processing
- Each chunk is first cleaned by converting to lowercase and removing extra whitespace, numbers, punctuation and non-English characters
- After cleaning, n-grams of length one through five (unigrams to five-grams) are generated and stored in a frequency-sorted data table to allow for efficient lookup
- To get a better trade-off between accuracy and model size, only n-grams that occur 5 or more times are kept in the final dictionary. This kept the final dictionary under 10 MB while retaining good accuracy.
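The dictionary-building steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the app's actual code: the function names (`clean`, `count_ngrams`, `prune`) are hypothetical, and the cleaning rule (replace anything other than a-z and whitespace with a space) is an assumption about how numbers, punctuation and non-English characters were removed.

```python
import re
from collections import Counter

def clean(text):
    """Lowercase, strip non-letters, and collapse extra whitespace."""
    text = text.lower()
    # Assumption: numbers, punctuation and non-English characters are
    # replaced by a space (so "don't" becomes "don t" in this sketch).
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def count_ngrams(lines, max_n=5):
    """Count every n-gram of length 1..max_n within each cleaned line."""
    counts = Counter()
    for line in lines:
        tokens = clean(line).split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts

def prune(counts, min_count=5):
    """Keep n-grams seen at least min_count times, most frequent first."""
    kept = [(gram, c) for gram, c in counts.items() if c >= min_count]
    kept.sort(key=lambda pair: -pair[1])
    return kept
```

With chunked input, one way to apply this would be to run `count_ngrams` on each chunk, add the resulting counters together, and call `prune` once on the combined counts, so that an n-gram spread thinly across chunks is not dropped too early.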
Word Prediction Algorithm
- The last four words of the input string are used to predict the next word
- These four words are turned into search contexts of one, two, three and four words (unigram, bigram, trigram and four-gram), which are looked up using the backoff approach
- The backoff approach first looks for the most likely match using the four-gram context, then falls back to the trigram, the bigram and finally the unigram
- If no match is found, three of the top 10 most frequent unigrams from the training set are presented.
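The backoff lookup described above might look something like the sketch below. It assumes the dictionary is available as a mapping from n-gram tuples to counts; `build_lookup` and `predict` are hypothetical names. As a simplification, the final fallback returns the three most frequent unigrams rather than picking three from the top 10.

```python
from collections import Counter, defaultdict

def build_lookup(ngram_counts):
    """Map each context (all words but the last) to a Counter of next words."""
    lookup = defaultdict(Counter)
    for gram, count in ngram_counts.items():
        # For a unigram the context is the empty tuple ().
        lookup[gram[:-1]][gram[-1]] += count
    return lookup

def predict(text, lookup, k=3, max_context=4):
    """Return up to k candidate next words using simple backoff."""
    tokens = text.lower().split()
    context = tuple(tokens[-max_context:])
    # Try the longest context first, then drop the oldest word each round.
    while context:
        if context in lookup:
            return [word for word, _ in lookup[context].most_common(k)]
        context = context[1:]
    # Nothing matched: fall back to the most frequent unigrams.
    return [word for word, _ in lookup[()].most_common(k)]
```

For example, with a dictionary containing ("see", "the", "bird"), the input "I see the" misses on the three-word context, then matches on ("see", "the") and suggests "bird". Storing the candidates per context keeps each prediction to a handful of dictionary lookups, which fits the fast response time the app aims for.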
Acknowledgements
- Johns Hopkins University's Data Science Specialization Team (Jeff Leek, Roger Peng and Brian Caffo) for teaching wonderful courses in the specialization
- SwiftKey for providing the data
- Fellow learners in the specialization and the vast and active online R community
Thanks for checking the app out!