Capstone Project Final Presentation

Cvetan Veljanovski
19/03/2021

Description of the application

The application “Predictor” predicts the next word when a user types a series of words into the input field. Once the user has entered the words, the application suggests words that would complete the sentence. There are three predictions:

  1. Prediction based on the last two words in the input.
  2. Prediction based on the last three words in the input.
  3. Prediction based on the last four words in the input.

This application can be used for different purposes, as long as the need is to predict the next word from a series of words already entered.
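As an illustration of the three predictions listed above, here is a minimal sketch in Python. It is not the application's actual code: the function names `build_table` and `predict`, the toy corpus, and the choice to store each table as a mapping from a context of words to the counts of the words that follow it are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def build_table(sentences, context_len):
    """Map each context of `context_len` words to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - context_len):
            context = tuple(words[i:i + context_len])
            table[context][words[i + context_len]] += 1
    return table

def predict(tables, text):
    """Return one suggestion per context length, or None when the context was never seen."""
    words = text.lower().split()
    suggestions = {}
    for context_len, table in tables.items():
        context = tuple(words[-context_len:])
        followers = table.get(context)
        suggestions[context_len] = followers.most_common(1)[0][0] if followers else None
    return suggestions

# Toy corpus (hypothetical); the real application uses the SwiftKey data.
corpus = [
    "go to the store",
    "run to the store",
    "walk to the store",
    "we went to the beach",
    "we went to the beach",
]
tables = {n: build_table(corpus, n) for n in (2, 3, 4)}
result = predict(tables, "we went to the")
print(result)
```

With this toy corpus the two-word context ("to the") is followed most often by "store", while the three-word and four-word contexts are only ever followed by "beach", so the three predictions can differ for the same input.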

The design of the User Interface

[Figure: screenshot of the “Predictor” user interface]

The User Interface of the application “Predictor” is organized in two parts, as can be seen on the left:

  1. The left part contains the input field and an explanation of how the application works and how to use it.
  2. The right part shows the guessed next word and, below it, the other predicted words based on different criteria.

The logic of the application and its performance

  • The next word is guessed using the frequency of the word in combinations of two, three and four words.

  • The frequencies are calculated from text extracted from blogs, news and Twitter, which SwiftKey provided for this project.

  • When using the application you will notice that the word 'it' is given. This is the default prediction when there is no hint for guessing the next word because the combination of words is missing from the frequency tables.

  • Data preparation for this model normally takes around 30 minutes. It takes the original data (news, blogs and tweets), selects 50,000 random lines, and creates the frequency tables of two-, three- and four-word combinations.

  • The output of this process is the data used by this app. Its total size is 3 MB.

  • The app needs roughly 0.5 seconds on average to search the frequency tables for the next word.

  • Thank you for checking out my presentation!
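The data preparation and the default value described in the list above can be sketched roughly as follows. This is an assumption-laden illustration in Python, not the code behind the app: the helper names `sample_lines`, `ngram_counts` and `next_word`, the fixed random seed, and the toy input lines are hypothetical. It also assumes that a "combination of n words" includes the predicted word as its last element, which the presentation does not state.

```python
import random
from collections import Counter

DEFAULT_WORD = "it"  # default prediction named in the presentation

def sample_lines(lines, k, seed=0):
    """Pick up to k random lines; the seed only makes this sketch repeatable."""
    rng = random.Random(seed)
    return rng.sample(lines, min(k, len(lines)))

def ngram_counts(lines, n):
    """Count every run of n consecutive words across the given lines."""
    counts = Counter()
    for line in lines:
        words = line.lower().split()
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts

def next_word(counts, prefix):
    """Most frequent word completing `prefix` into a counted n-gram, else the default."""
    prefix = tuple(prefix)
    candidates = {gram[-1]: c for gram, c in counts.items() if gram[:-1] == prefix}
    return max(candidates, key=candidates.get) if candidates else DEFAULT_WORD

# Toy stand-in for the news, blogs and tweets text (hypothetical lines).
lines = ["i like tea a lot", "i like tea", "i like coffee"]
sampled = sample_lines(lines, 50_000)  # fewer lines than 50,000, so all are kept
freq_tables = {n: ngram_counts(sampled, n) for n in (2, 3, 4)}

print(next_word(freq_tables[3], ["i", "like"]))      # seen combination
print(next_word(freq_tables[3], ["never", "seen"]))  # unseen combination, falls back
```

Scanning every counted n-gram on each lookup keeps the sketch short; a table keyed by prefix would be the more natural choice for fast lookups on the real data.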