19/01/2020

The application

This app predicts the next word in a given sentence.

The algorithm uses the last few words of the sentence as context to predict the next word.

Usage:

  • type a few words in the text area and then press the ‘Predict word’ button

  • the application will suggest the next word of your sentence

For example, you can type ‘as a matter of’ or ‘to buy or not to’.

Since the algorithm has not been optimised, word prediction can take up to 45 seconds

Data prep and prediction algorithm

  • Data files contain tweets, news and blog entries. Only a small portion of them is used to create the prediction model

  • Texts containing profanity are excluded from model creation

  • The algorithm constructs a large set of n-grams. N-grams that appear fewer than 10 times are discarded from the model

  • The algorithm processes a sentence and looks for the longest n-gram that matches the last words of the sentence

  • If no match is found, the algorithm searches for a shorter n-gram, and so on. If no n-gram matches at all, the algorithm returns a null value
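
The steps above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the app's R code. The function names (build_model, predict_next), the whitespace tokenizer, the maximum n-gram length of 4, and the choice of the most frequent continuation as the prediction are all assumptions made for illustration; only the profanity exclusion, the count threshold of 10, and the back-off to shorter n-grams come from the description above.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the real app may clean text differently.
    return text.lower().split()


def build_model(texts, max_n=4, min_count=10, banned=frozenset()):
    """Count n-grams (n = 2..max_n) and map each context to its most frequent next word."""
    counts = Counter()
    for text in texts:
        words = tokenize(text)
        if any(w in banned for w in words):
            continue  # texts containing profanity are excluded from the model
        for n in range(2, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1

    best = {}
    for gram, count in counts.items():
        if count < min_count:
            continue  # discard n-grams that appear fewer than min_count times
        context, next_word = gram[:-1], gram[-1]
        if context not in best or count > best[context][1]:
            best[context] = (next_word, count)
    return {context: word for context, (word, _) in best.items()}


def predict_next(model, sentence, max_n=4):
    """Try the longest context first, then back off to shorter ones; None if nothing matches."""
    words = tokenize(sentence)
    for k in range(min(max_n - 1, len(words)), 0, -1):
        context = tuple(words[-k:])
        if context in model:
            return model[context]
    return None


# Toy corpus, invented purely for this example.
corpus = ["as a matter of fact"] * 10 + ["a matter of time"] * 3
model = build_model(corpus)
print(predict_next(model, "as a matter of"))  # fact
print(predict_next(model, "the end of"))      # fact (found after backing off to the context "of")
print(predict_next(model, "zebra"))           # None
```

Storing only the best continuation per context keeps the lookup to a handful of dictionary accesses per prediction; a model that keeps every continuation would allow several suggestions to be returned, at the cost of more memory.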

The implementation

Word Prediction is a Shiny application. It is based on two files:

ui.R

  • implements the presentation logic and draws the user interface (UI)
  • provides the interactivity with the user

server.R

  • loads the data and implements the word prediction algorithm
  • sends data to the UI

When the user submits a sentence, the server-side code runs and returns the predicted word to the UI.

References