
Natural Language Processing Text Prediction

Thomas Hill III
2019 November 26

Introduction

Human beings can often guess the next word in a sentence or phrase, but machines and computers have a notoriously difficult time with the same task. The main reason is that humans have a wealth of experience that helps them eliminate swaths of unlikely next words and intuit upcoming ones.

Machines and computers (at least today) usually outdo human beings in logical-sequential processing speed, while humans still hold the advantage in certain forms of parallel processing, intuition, and context understanding.

Computers can, however, process large amounts of text data and use that processed data to predict text.

How it works

Preprocess - Load text data; make it usable for predictions.

  • Amass a large body of text to train the predictor.
  • Load all of the text into R.
  • Parse the text into \( n \)-grams ('phrases' that are \( n \) words long).
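
The preprocessing steps above can be sketched in base R. This is only an illustration under stated assumptions, not the app's actual code: the helper names `tokenize`, `make_ngrams`, and `ngram_table` are hypothetical, and the cleaning rule (keep letters, apostrophes, and spaces) is an assumed stand-in for whatever the app really does.

```r
# Hypothetical helper: lower-case the text and split it into word tokens.
# Assumption: only letters, apostrophes, and spaces are kept.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]+", " ", text)          # everything else becomes a space
  tokens <- strsplit(trimws(text), " +")[[1]]
  tokens[nzchar(tokens)]                        # drop empty strings
}

# Hypothetical helper: every run of n consecutive tokens, joined by spaces.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical helper: count each n-gram, most frequent first.
ngram_table <- function(text, n) {
  grams <- make_ngrams(tokenize(text), n)
  if (length(grams) == 0) {
    return(data.frame(ngram = character(0), count = integer(0),
                      stringsAsFactors = FALSE))
  }
  counts <- sort(table(grams), decreasing = TRUE)
  data.frame(ngram = names(counts), count = as.integer(counts),
             stringsAsFactors = FALSE)
}
```

For example, `ngram_table("The cat sat. The cat ran!", 2)` returns a data frame of the four distinct 2-grams in that text, with "the cat" (seen twice) in the first row. A real training corpus would be far larger and would likely need a more memory-efficient representation.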

Choosing a next word - Once input text is formatted…

  • Convert input to lower case; omit problem characters (trailing spaces, some punctuation, etc.)
  • Look up the input in the table of \( (n+1) \)-grams, where \( n \) is the number of input words. Use the word that the table says comes next.
  • If the input is not found, apply the “stupid backoff” process: remove the input's leading word and look up the shortened input in the next-smallest table.
  • Repeat the above step until there is a match or no tables remain.
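
The lookup-and-shorten loop described above might look roughly like the following base R sketch. Everything here is an assumption made for illustration: the toy `tables` data, the function names `clean_input` and `predict_next`, and the cleaning rule are hypothetical and not taken from the app. The sketch mirrors only the steps listed above and leaves out any scoring of candidate words.

```r
# Toy lookup tables (made-up data, not the app's). tables[[k]] is keyed by
# k preceding words and stores the next word taken from a (k+1)-gram.
tables <- list(
  c("the" = "cat", "cat" = "sat", "on" = "the"),   # from 2-grams
  c("the cat" = "sat", "sat on" = "the"),          # from 3-grams
  c("the cat sat" = "on")                          # from 4-grams
)

# Hypothetical cleaner: lower-case, keep letters/apostrophes/spaces, tokenize.
clean_input <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]+", " ", text)
  tokens <- strsplit(trimws(text), " +")[[1]]
  tokens[nzchar(tokens)]
}

# Hypothetical predictor: try the longest usable context first, then back off.
predict_next <- function(text, tables) {
  tokens <- clean_input(text)
  # Keep only as many trailing words as the largest table can use.
  tokens <- tail(tokens, length(tables))
  while (length(tokens) > 0) {
    key <- paste(tokens, collapse = " ")
    tbl <- tables[[length(tokens)]]
    if (key %in% names(tbl)) return(tbl[[key]])
    tokens <- tokens[-1]    # back off: drop the leading word
  }
  NA_character_             # no table contained a match
}
```

With these toy tables, `predict_next("A dog sat on", tables)` first tries "dog sat on" in the largest table, finds nothing, drops "dog", and then matches "sat on" in the next-smallest table. Input that matches no table at any length yields `NA`.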

Performance

  • This application will determine the next word for any number of input words.
  • The entire app fits within one gigabyte of space.
  • Once loaded, the app determines a next word almost immediately after the “Next Word” button is pressed.

How to Use

The app is located at https://thill3.shinyapps.io/Next-Word-Predictor/
Simply load the app and type in your input word or words. You can include any number of words and any combination of non-text characters (apostrophes, commas, etc.).

Once you click the “Next Word” button, the app will place the predicted word in the grey box.

The Next Word Prediction app working