Enelen Brinshaw
An app that takes a string as input and predicts possible next words (the predictions are stemmed words).
The user can select up to 50 words for prediction.
The data was cleaned with the quanteda package, removing numbers, spaces, punctuation, hyphens, Twitter symbols and profanity (using a list of bad words used by Google). N-grams were then generated with quanteda and stored as data.tables, with each word of the n-gram in its own column (the word columns were keyed for faster access) and a column for the count of the n-gram in the data. This helped speed up the later process as well as save memory.
The app works in the following manner:
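A minimal sketch of how a keyed n-gram lookup with a simple backoff might look. This is an illustration under stated assumptions, not the app's actual code: the toy tables, the function name `predict_next`, its parameter `k`, and the fall-back-to-bigrams strategy are all hypothetical. Only the layout (one column per word, a count column, keys on the leading words) follows the description above.

```r
library(data.table)

# Toy n-gram tables (made-up rows): one column per word plus a count column.
trigrams <- data.table(
  w1 = c("go", "go", "go", "to", "to"),
  w2 = c("to", "to", "to", "the", "the"),
  w3 = c("the", "bed", "school", "store", "park"),
  n  = c(40L, 12L, 9L, 15L, 7L)
)
bigrams <- data.table(
  w1 = c("the", "the", "to"),
  w2 = c("store", "park", "the"),
  n  = c(30L, 20L, 90L)
)

# Key the leading-word columns so lookups use a binary search
# instead of scanning the whole table.
setkey(trigrams, w1, w2)
setkey(bigrams, w1)

# Hypothetical helper: return up to k candidate next words for the
# (already cleaned and tokenised) input words.
predict_next <- function(words, k = 3L) {
  words <- tail(words, 2L)
  if (length(words) == 2L) {
    # Keyed join on the last two words; nomatch = NULL drops misses.
    hits <- trigrams[.(words[1], words[2]), nomatch = NULL]
    if (nrow(hits) > 0L) return(head(hits[order(-n)]$w3, k))
  }
  # Assumed backoff: no trigram match, so try the last word alone.
  hits <- bigrams[.(tail(words, 1L)), nomatch = NULL]
  head(hits[order(-n)]$w2, k)
}

predict_next(c("go", "to"))   # "the" "bed" "school"
predict_next(c("at", "the"))  # "store" "park" (via the bigram table)
```

Keeping each word in its own keyed column means a prediction is a join on a few short strings rather than a pattern match over whole n-gram strings, which is consistent with the speed and memory benefits noted above.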
require(rbenchmark)
benchmark(nextWord("I am going to the"))[,2:5]
  replications elapsed relative user.self
1          100    0.71        1      0.62
As can be seen, the app performs near instantaneously when run locally: 100 calls take 0.71 seconds, or about 7 ms per prediction. When run on the Shiny server, network latency can slow the process.
In future, the performance can be improved by: