13/04/2020

Predict Next Word App

This presentation describes a Shiny App that predicts the next word based on your input.

Built by Taras Poltorak for the Capstone Project of the Data Science Specialisation from Johns Hopkins University.

The app can be found at: https://tazpoltorak.shinyapps.io/WordPredictionApp/.

Summary

The objectives of this presentation are:

  • To describe the purpose of the app.
  • To explain the principles behind predicting the next word based on user input.
  • To describe the app’s interface.

Purpose of the app

This Shiny app predicts the next word based on the user's input. As the user types one or more words in the input field, the app offers the three words that the algorithm considers most likely to come next. The principles of the algorithm are explained in the next slide.

Principles

  • The predictive text model is based on a large corpus of text taken from blogs, Twitter and news.
  • This corpus was cleaned and formatted so that the frequency with which words follow one another could be counted and recorded.
  • To improve the efficiency and speed of the app, the corpus was sampled rather than used in full.
  • The algorithm is based on n-grams. An n-gram is a contiguous sequence of n items from a given sample of text or speech; in our case, words. For example, in the sentence ‘I am going to the shop’ the bi-grams (or 2-grams) would be ‘I am’, ‘am going’, ‘going to’, ‘to the’, ‘the shop’.
  • The algorithm analyses the user's input and, based on the recorded n-gram frequencies, suggests the next word.
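
The principles above can be illustrated with a minimal Python sketch. It is not the app's actual code: the app is built with Shiny, and this presentation does not state which n-gram order it uses or whether it applies smoothing or back-off. The function names (ngrams, build_model, predict_next), the four-sentence toy corpus and the choice of bi-grams are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def ngrams(words, n):
    """Return every contiguous run of n words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_model(sentences, n=2):
    """Map each (n-1)-word context to a Counter of the words that follow it.

    Assumes n >= 2. Cleaning is reduced to lower-casing and splitting on
    whitespace; a real corpus would need far more preparation.
    """
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for gram in ngrams(words, n):
            model[gram[:-1]][gram[-1]] += 1
    return model


def predict_next(model, text, n=2, k=3):
    """Return up to k words seen most often after the last n-1 words of text."""
    context = tuple(text.lower().split()[-(n - 1):])
    followers = model.get(context, Counter())
    return [word for word, _ in followers.most_common(k)]


# Hypothetical toy corpus standing in for the sampled blog, Twitter and news text.
corpus = [
    "I am going to the shop",
    "I am going to the park",
    "I am going to the shop again",
    "we went to a shop",
]

bigrams = ngrams("I am going to the shop".split(), 2)
model = build_model(corpus, n=2)
suggestions = predict_next(model, "I am going to the")  # ['shop', 'park']
```

Here bigrams holds the five pairs listed in the example above, and suggestions contains only two words because the toy corpus has only two distinct words after ‘the’. With a large corpus the same lookup would normally return a full set of three candidates.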

Interface

The app has a very simple interface. On the left-hand side there is a field labelled ‘Start typing here’. As the words of a sentence are typed, the app displays three suggestions to the right. This happens ‘on the fly’, so no ‘submit’ button is required.