Sean Rossouw
2019
Demonstration of word prediction using R and Shiny
Corpus generated by scraping text from blogs, news and Twitter (832 MB)
50% used as training data, then cleaned and tokenised
Unigrams, bigrams and trigrams generated and trimmed to create lookup tables
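The lookup-table step above can be sketched roughly as follows. This is a minimal Python illustration, not the app's R code: the function names `tokenise` and `ngram_table`, the regex-based cleaning, and the `min_count` trimming rule are all assumptions made for the example, since the slides do not say how the tables were cleaned or trimmed.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_table(tokens, n, min_count=1):
    """Count every n-gram, then drop those seen fewer than min_count times.

    The min_count threshold is an assumed trimming rule for illustration.
    """
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram: count for gram, count in counts.items() if count >= min_count}


# Tiny stand-in corpus; the real corpus is far larger.
corpus = "the cat sat on the mat and the cat ran"
tokens = tokenise(corpus)

unigrams = ngram_table(tokens, 1)
bigrams = ngram_table(tokens, 2, min_count=2)  # keep only repeated bigrams
trigrams = ngram_table(tokens, 3)
```

Each table maps a tuple of words to its count, so `bigrams` here keeps only `("the", "cat")`, the one pair that occurs twice. Trimming rare n-grams is one common way to keep lookup tables small enough for an interactive app.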
The prediction function takes a string and an integer n as input
The last and second-last words of the string are used to search for matches in the lookup tables
Results are added together with adjustable weights, and the resulting score is used as a measure of the likelihood of a match
The prediction returns a list of the n highest-scoring predictions for the input string
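The scoring idea above can be sketched as a weighted sum over the three tables. This is a hedged Python illustration rather than the app's actual R implementation: the toy tables, the `relative` and `predict` names, the default weights, and the use of relative frequencies (instead of, say, raw counts) are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy lookup tables: n-gram tuple -> count.
UNIGRAMS = {("the",): 5, ("you",): 4, ("day",): 2}
BIGRAMS = {("thank", "you"): 3, ("thank", "god"): 1, ("a", "day"): 1}
TRIGRAMS = {("happy", "new", "year"): 2, ("happy", "new", "day"): 1}


def relative(table, context):
    """Relative frequency of each word that follows `context` in `table`."""
    k = len(context)
    matches = {gram[-1]: c for gram, c in table.items() if gram[:k] == context}
    total = sum(matches.values())
    return {word: c / total for word, c in matches.items()}


def predict(text, n, weights=(0.1, 0.3, 0.6)):
    """Return the n highest-scoring next words for `text`.

    weights are (unigram, bigram, trigram); the values are assumed defaults.
    """
    words = text.lower().split()
    scores = defaultdict(float)
    tables = (UNIGRAMS, BIGRAMS, TRIGRAMS)
    for table, k, weight in zip(tables, (0, 1, 2), weights):
        if len(words) < k:
            continue  # not enough context words for this table
        context = tuple(words[len(words) - k:])  # last k words of the input
        for word, freq in relative(table, context).items():
            scores[word] += weight * freq
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


print(predict("happy new", 2))
print(predict("thank", 1))
```

Because the trigram weight is the largest, a match on the last two words dominates the ranking, while the unigram table still supplies fallback candidates when no longer context matches.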
Go to the Shiny app
Enter your text and select how many results you want returned with the slider
Click the Submit button
Try changing the context, or use the first two words of a common phrase, to see how trigram results are weighted above bigram and unigram results
Read the README file at https://github.com/SeanRossouw/WordPredictionR/blob/master/README.md
Visit my GitHub page at https://github.com/SeanRossouw
Thanks to Jeff Leek, Roger Peng and Brian Caffo