NextWordPredictor App

Farah M
December 2020

The App

Next Word Predictor is a Shiny app that predicts the next word from the text a user enters.

It was developed as the Capstone Project of the Data Science Specialization offered by Johns Hopkins University on Coursera and is accessible at: https://sarizzuz.shinyapps.io/NextWordPredictor/

The R code for the application is available at: https://github.com/sarizzuz/nextwordpredictor/tree/main/NextWordPredictor

The main repository, which contains the R code for generating the prediction models along with other materials created for the capstone project, is at: https://github.com/sarizzuz/nextwordpredictor

User Interface

The App has a simple and intuitive user interface.

On the “App” tab, the user enters a word or phrase in the “Input Box”.

Within a few seconds, the suggested next words appear in the “Output Box”.

[Image: Word Prediction App]

Under The Hood

Maximum likelihood estimation (MLE) is used to estimate the probabilities assigned to the n-gram models. The Stupid Backoff algorithm is then applied to predict the next word. Finally, Kneser-Ney smoothing is used when there are too few matching results.
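The MLE and back-off steps described above can be sketched roughly as follows. This is a minimal illustration in Python, not the app's R code: the toy counts, the function names score and predict, and the back-off factor of 0.4 (the value suggested in the original Stupid Backoff paper by Brants et al., 2007) are all assumptions made for the example. Kneser-Ney smoothing is not shown.

```python
from collections import Counter

ALPHA = 0.4  # assumed back-off factor; the app's actual value is not stated

# Hypothetical toy counts keyed by token tuples (not the app's real models).
COUNTS = Counter({
    ("i",): 4, ("love",): 3, ("you",): 2, ("like",): 1, ("cats",): 1,
    ("i", "love"): 3, ("i", "like"): 1, ("love", "you"): 2, ("love", "cats"): 1,
    ("i", "love", "you"): 2, ("i", "love", "cats"): 1,
})
TOTAL_UNIGRAMS = sum(c for g, c in COUNTS.items() if len(g) == 1)
VOCAB = [g[0] for g in COUNTS if len(g) == 1]


def score(word, context):
    """Stupid Backoff score of `word` following the token tuple `context`."""
    penalty = 1.0
    while context:
        full = COUNTS[context + (word,)]
        if full > 0:
            # MLE relative frequency: count(context + word) / count(context)
            return penalty * full / COUNTS[context]
        context = context[1:]  # drop the oldest word and back off
        penalty *= ALPHA       # each back-off step multiplies the score by ALPHA
    # No context left: fall back to the unigram relative frequency.
    return penalty * COUNTS[(word,)] / TOTAL_UNIGRAMS


def predict(text, k=3, max_order=4):
    """Return the k highest-scoring next words for the entered text."""
    # A quadgram model looks at no more than the last three words.
    context = tuple(text.lower().split())[-(max_order - 1):]
    ranked = sorted(VOCAB, key=lambda w: score(w, context), reverse=True)
    return ranked[:k]


print(predict("I love", k=2))
```

With these toy counts, "you" outranks "cats" after "I love" because the trigram "i love you" was seen more often. Note that Stupid Backoff produces relative scores rather than normalized probabilities, which is sufficient for ranking candidate words.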

For this application, the n-grams used are quadgrams, trigrams, bigrams, and unigrams. The n-gram models were created from samples of Twitter, blog, and news text taken from a corpus called HC Corpora.
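As a rough illustration of how n-gram counts of orders one through four might be built from sampled text, consider the Python sketch below. It is not the app's R pipeline: the tokenizer (lowercased runs of letters and apostrophes), the function names, and the two sample sentences are assumptions chosen for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes (a simplification)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(lines, max_order=4):
    """Count every n-gram of order 1..max_order, returning one Counter per order."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    for line in lines:
        tokens = tokenize(line)
        for n in counts:
            # Slide a window of n tokens across the line.
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical stand-in for lines sampled from the Twitter, blog, and news files.
sample = ["I love my cat", "I love my dog and my cat"]
counts = ngram_counts(sample)
print(counts[2].most_common(2))
```

Counting each line separately keeps n-grams from spanning two unrelated lines, which is one reasonable design choice among several.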

References Used