Next Word Predictor

F. García
2018-12-05

Final Project Submission - Data Science Capstone

Objective

This is a prototype tool that makes typing easier for computer and mobile phone users by predicting the next word.

"Stupid Backoff" model

In general terms:

  • In any language, each sequence of n words (an n-gram) has an occurrence rate, which can be treated as its probability of appearing in a text. Some n-grams appear more frequently than others, especially because of grammatical structure.

  • The same applies to the words that follow a specific combination of words: some continuations are far more frequent than others.

  • The “Stupid Backoff” model bases its estimates on n-gram frequencies sampled from a large number of texts. It follows a parsimonious process: it first looks for a match for the longest available n-gram and, if none is found, backs off to progressively shorter n-grams until it finds a match.
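The back-off process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code behind the app: the function names (build_counts, score, predict), the maximum n-gram length of 3, and the back-off penalty of 0.4 (the value commonly used with this model) are all assumptions, since the slides do not state them.

```python
from collections import Counter

ALPHA = 0.4  # assumed back-off penalty; the slides do not state a value


def build_counts(tokens, max_n=3):
    """Count every n-gram of length 1..max_n in a list of tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total, alpha=ALPHA):
    """Stupid Backoff score of `word` following the tuple `context`."""
    penalty = 1.0
    while context:
        seen = counts[context + (word,)]
        if seen > 0:
            # Match found: relative frequency, discounted once per back-off.
            return penalty * seen / counts[context]
        context = context[1:]  # drop the oldest word and back off
        penalty *= alpha
    # No n-gram matched: fall back to the unigram frequency.
    return penalty * counts[(word,)] / total


def predict(text, counts, total, max_n=3, k=3):
    """Return the k highest-scoring next words for the typed text."""
    words = text.lower().split()
    context = tuple(words[max(0, len(words) - (max_n - 1)):])
    vocab = [gram[0] for gram in counts if len(gram) == 1]
    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda w: (-score(w, context, counts, total), w))[:k]


# Toy corpus, for illustration only.
tokens = "the cat sat on the mat the cat ate the fish".split()
counts = build_counts(tokens)
print(predict("on the", counts, total=len(tokens)))
```

The scores are relative frequencies discounted by the penalty at each back-off step, so they rank candidate words but do not form a true probability distribution.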

The App

You can access the application at the following link:

https://francisco-garcia.shinyapps.io/predictor/

This application runs on shinyapps.io, a hosting platform for interactive web apps built with the Shiny R package.

How to use it

Enter a text of three or four words and press the “Submit” button.