Word Up! ... a simple yet powerful word prediction app

Paramjot Singh
2017-05-30

Motivation


  • Do you want to type faster?
  • Do you want to make fewer typing errors on the small keyboard of your phone or tablet?
  • We have got an app for you!!

Welcome to 'Word Up!'


App Features

  • A simple and intuitive interface - simply start typing in the text-box
  • Fast response time
  • A final dictionary under 10 MB that loads quickly
  • Displays the top 3 most likely next words almost instantaneously




Working Details

Building the data dictionary

  • 80% of the entire data is used to build the final dictionary
  • To overcome the limitations of RAM, the data was divided into 6 chunks (500,000 entries each) for further processing
  • Each chunk is first cleaned by converting to lowercase and removing extra whitespace, numbers, punctuation and non-English characters
  • After cleaning, unigrams through five-grams are generated and stored in a frequency-sorted data table to allow for efficient processing
  • To balance accuracy against model size, only n-grams that occur 5 or more times are kept in the final dictionary. This keeps the final dictionary under 10 MB while retaining good accuracy.
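
The dictionary-building steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the app's own code: the function names (clean, ngram_counts, prune) are hypothetical, and the cleaning rule (keep only the letters a-z and spaces) is an assumption about what "non-English characters" means here.

```python
import re
from collections import Counter

def clean(text):
    """Lowercase, drop everything except a-z and whitespace, collapse spaces."""
    text = text.lower()
    # Assumption: digits, punctuation and non-English characters are all
    # removed by keeping only a-z and whitespace.
    text = re.sub(r"[^a-z\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()

def ngram_counts(lines, max_n=5):
    """Count 1- to max_n-grams over cleaned lines; keys are tuples of words."""
    counts = Counter()
    for line in lines:
        words = clean(line).split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts

def prune(counts, min_count=5):
    """Keep n-grams seen at least min_count times, most frequent first."""
    kept = [(gram, c) for gram, c in counts.items() if c >= min_count]
    return sorted(kept, key=lambda item: -item[1])
```

For a chunked workflow like the one described, the Counter from each chunk could be summed with the others before calling prune, so that the frequency threshold applies to the combined counts.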

Word Prediction Algorithm

  • The last four words of the input string are used to predict the next word
  • These four words are converted to a unigram, bigram, trigram and four-gram, which are used to search for the most likely next word using the backoff approach
  • The backoff approach first tries to find the most likely match for the four-gram, followed by the trigram, the bigram and finally the unigram
  • If no match is found, three of the 10 most frequent unigrams from the training set are presented.
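
A minimal Python sketch of this backoff lookup is shown below, under stated assumptions: the dictionary is a mapping from word tuples to counts, build_lookup and predict are hypothetical names, and the final fallback simply returns the three most frequent unigrams, which is one possible way to pick "three of the top 10".

```python
from collections import Counter, defaultdict

def build_lookup(ngram_counts):
    """Map each context (all words but the last) to a Counter of next words."""
    lookup = defaultdict(Counter)
    for gram, count in ngram_counts.items():
        # Unigrams get the empty context (), which serves as the fallback.
        lookup[gram[:-1]][gram[-1]] += count
    return lookup

def predict(text, lookup, k=3, max_context=4):
    """Return up to k candidate next words, backing off to shorter contexts."""
    words = text.lower().split()
    # Try the longest available context first (at most the last four words).
    for n in range(min(max_context, len(words)), 0, -1):
        context = tuple(words[-n:])
        if context in lookup:
            return [w for w, _ in lookup[context].most_common(k)]
    # No context matched: fall back to the most frequent unigrams.
    return [w for w, _ in lookup[()].most_common(k)]

# Tiny made-up dictionary, for illustration only.
toy = {("the",): 10, ("of",): 8, ("a",): 6,
       ("the", "cat"): 4, ("the", "dog"): 2,
       ("saw", "the", "bird"): 3, ("saw", "the", "cat"): 1}
lookup = build_lookup(toy)
print(predict("I saw the", lookup))
```

In this toy example the three-word context "i saw the" is not in the dictionary, so the lookup backs off to "saw the" and ranks the words that followed it by count.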

Acknowledgements

  • Johns Hopkins University's Data Science Specialization Team (Jeff Leek, Roger Peng and Brian Caffo) for teaching wonderful courses in the specialization
  • SwiftKey for providing the data
  • Fellow learners in the specialization and the vast and active online R community

Thanks for checking the app out!