Word Prediction with R and Shiny

Sean Rossouw
2019

Description of the Program

A demonstration of word prediction using R and Shiny

Corpus built from text scraped from blogs, news and Twitter (832 MB)
50% used as training data, cleaned and tokenised
Unigrams, bigrams and trigrams generated and trimmed to create lookup tables
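
The tokenising and n-gram counting steps above can be sketched in base R as follows. This is a minimal illustration, not the code used to build the app: the helper names `tokenise` and `ngram_table`, the cleaning rules and the `min_count` trimming threshold are all assumptions.

```r
# Hypothetical helper: lower-case the text, replace anything that is not a
# letter, apostrophe or space with a space, then split on whitespace.
tokenise <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]+", " ", text)
  words <- strsplit(text, "\\s+")[[1]]
  words[nzchar(words)]
}

# Hypothetical helper: count every run of n consecutive words and keep only
# the n-grams seen at least min_count times (the "trimming" step).
ngram_table <- function(words, n, min_count = 1) {
  if (length(words) < n) {
    return(data.frame(ngram = character(0), count = integer(0),
                      stringsAsFactors = FALSE))
  }
  starts <- seq_len(length(words) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(words[i:(i + n - 1)], collapse = " "),
                  character(1))
  counts <- sort(table(grams), decreasing = TRUE)
  counts <- counts[counts >= min_count]
  data.frame(ngram = as.character(names(counts)),
             count = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Toy input standing in for the real corpus.
words <- tokenise("The cat sat on the mat. The cat ran.")
unigrams <- ngram_table(words, 1)
bigrams <- ngram_table(words, 2)
trigrams <- ngram_table(words, 3)
```

On a corpus of the size described, a dedicated text-mining package would likely be faster. The sketch only shows the shape of the resulting lookup tables: one row per n-gram with its count.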

The Prediction Algorithm

The prediction function takes a string and an integer n as input
The last and second-to-last words of the string are used to search for matches in the lookup tables
Results are combined with adjustable weights, and the resulting score is used as a measure of the likelihood of a match
The prediction returns a list of the n most highly scored predictions for the input string
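
One possible reading of the steps above, sketched in base R. The function name `predict_next`, the toy lookup tables, the default weights and the use of relative frequency as the per-table score are all assumptions made for illustration. The app's actual scoring may differ.

```r
# Toy lookup tables (illustrative counts, not taken from the real corpus).
unigrams <- data.frame(ngram = c("the", "you", "day"),
                       count = c(50, 30, 10), stringsAsFactors = FALSE)
bigrams <- data.frame(ngram = c("of the", "of course", "of it"),
                      count = c(20, 8, 5), stringsAsFactors = FALSE)
trigrams <- data.frame(ngram = c("one of the", "one of those", "one of my"),
                       count = c(9, 4, 2), stringsAsFactors = FALSE)

# Hypothetical predictor: returns the n highest-scoring next words.
# The weights are assumed values; larger weights favour longer contexts.
predict_next <- function(input, n = 3,
                         weights = c(uni = 1, bi = 10, tri = 100)) {
  words <- strsplit(tolower(trimws(input)), "\\s+")[[1]]
  words <- words[nzchar(words)]

  # Score one table: take the final word of every n-gram starting with
  # `prefix`, weighted by its share of the matching counts.
  score_table <- function(tbl, prefix, weight) {
    hits <- tbl[startsWith(tbl$ngram, prefix), ]
    data.frame(word = sub("^.* ", "", hits$ngram),
               score = weight * hits$count / sum(hits$count),
               stringsAsFactors = FALSE)
  }

  # Unigrams always contribute; bigrams need one word of context,
  # trigrams need two.
  candidates <- score_table(unigrams, "", weights[["uni"]])
  if (length(words) >= 1) {
    prefix <- paste0(tail(words, 1), " ")
    candidates <- rbind(candidates,
                        score_table(bigrams, prefix, weights[["bi"]]))
  }
  if (length(words) >= 2) {
    prefix <- paste0(paste(tail(words, 2), collapse = " "), " ")
    candidates <- rbind(candidates,
                        score_table(trigrams, prefix, weights[["tri"]]))
  }

  # Add the weighted scores per word and return the top n words.
  totals <- aggregate(score ~ word, data = candidates, FUN = sum)
  totals <- totals[order(-totals$score), ]
  as.character(head(totals$word, n))
}

top_three <- predict_next("one of", 3)
```

With these toy tables, "the" receives a contribution from all three tables, so it ranks first for the input "one of". Changing the `weights` argument shifts how strongly the trigram matches dominate.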

How to use the program

Go to the Shiny app
Enter your text and select how many results you want returned with the slider
Click the Submit button
Try changing the context, or enter the first two words of a common phrase, to see how trigram results are weighted above bigram and unigram results

More information

Read the readme file at https://github.com/SeanRossouw/WordPredictionR/blob/master/README.md
Visit my GitHub page at https://github.com/SeanRossouw

Thanks to Jeff Leek, Roger Peng and Brian Caffo