Text Prediction Application

Poobalan
22 April 2016

This application attempts to predict the next word from user input, using at most the last three words entered. The prediction is based on the Twitter, blog, and news datasets provided by SwiftKey.

Challenges

The following challenges were faced:

  • hardware limitations
  • data cleansing
  • data size

Solutions:

  • using a smaller sample, about 10% of the provided dataset
  • comprehensive cleansing of the data by removing URLs, retweet markers (RT), and symbols such as @
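
The sampling and cleansing steps above could look roughly like the following Python sketch. It is only an illustration: the names `sample_lines` and `clean_line`, the fixed seed, and the exact regular expressions are assumptions and are not taken from the application's source code.

```python
import random
import re

# Assumed patterns: the slides only say that URLs, RTs and symbols such as @ are removed.
URL_RE = re.compile(r"(https?://|www\.)\S+")
RT_RE = re.compile(r"\bRT\b")
NON_WORD_RE = re.compile(r"[^a-z' ]+")


def sample_lines(lines, fraction=0.10, seed=42):
    """Return a reproducible random sample holding about `fraction` of the lines."""
    rng = random.Random(seed)
    k = int(len(lines) * fraction)
    return rng.sample(lines, k)


def clean_line(line):
    """Strip URLs, retweet markers and symbols, then lower-case and squeeze spaces."""
    line = URL_RE.sub(" ", line)      # remove URLs first, before punctuation is touched
    line = RT_RE.sub(" ", line)       # remove the upper-case retweet marker
    line = line.lower()
    line = NON_WORD_RE.sub(" ", line) # drop @, #, digits and other symbols
    return " ".join(line.split())


if __name__ == "__main__":
    print(clean_line("RT @user: Check this out http://t.co/abc #wow!"))
```

Removing URLs before the other symbols matters here, because stripping punctuation first would break a URL into ordinary-looking word fragments.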

Algorithm

Extract from formatted and cleaned data:

Usage Instructions

The user can enter input text and click the Submit button in the sidebar.

The resulting prediction will appear in the main panel. If no matches are found, the application predicts the word with the highest frequency.
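
One way to picture this behaviour is the minimal Python sketch below. It assumes a simple lookup table of word sequences built from the cleaned text, and it assumes that the application retries with a shorter context (three words, then two, then one) before falling back to the most frequent word. The slides do not confirm that back-off order, and the names `build_model` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict

MAX_CONTEXT = 3  # the application uses a maximum of 3 words to predict


def build_model(sentences):
    """Count single words, and which word follows each 1- to 3-word context."""
    unigrams = Counter()
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        for i, word in enumerate(words):
            for n in range(1, MAX_CONTEXT + 1):
                if i - n < 0:
                    break  # not enough preceding words for a context of length n
                table[tuple(words[i - n:i])][word] += 1
    return unigrams, table


def predict_next(text, unigrams, table):
    """Try the longest context first (assumed back-off), then shorter ones."""
    words = text.lower().split()[-MAX_CONTEXT:]
    for start in range(len(words)):
        followers = table.get(tuple(words[start:]))
        if followers:
            return followers.most_common(1)[0][0]
    # No match at any context length: use the most frequent word overall.
    return unigrams.most_common(1)[0][0] if unigrams else None


if __name__ == "__main__":
    corpus = ["i love new york", "we love new york",
              "the cat and the dog and the bird"]
    unigrams, table = build_model(corpus)
    print(predict_next("I love new", unigrams, table))
    print(predict_next("zzz qqq", unigrams, table))
```

With this toy corpus, an input whose words never appear in the table ends up at the final fallback, which mirrors the highest-frequency behaviour described above.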

Performance, Limitations, Resources

The source code and application are accessible via the links below:

Screenshot