Gavin
April 19, 2018
The algorithm is quite straightforward:
To save on memory, I restricted the n-gram database to only those n-grams that were observed at least 5 times in the training dataset. This reduced its size roughly 100-fold and greatly reduced the running time.
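As a rough illustration of this pruning step, here is a minimal Python sketch. It is not the code behind the tool: the function names (`count_ngrams`, `prune`), the whitespace tokenization, and the toy corpus are all assumptions made for the example. Only the threshold of 5 comes from the description above.

```python
from collections import Counter

MIN_COUNT = 5  # threshold taken from the post; everything else here is illustrative


def count_ngrams(sentences, n):
    """Count every n-gram (stored as a tuple of words) across the sentences.

    Tokenization is a naive lowercase whitespace split, which is an
    assumption for this sketch.
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def prune(counts, min_count=MIN_COUNT):
    """Keep only the n-grams observed at least min_count times."""
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy corpus (hypothetical): one sentence seen 5 times, another seen twice.
corpus = ["the cat sat on the mat"] * 5 + ["the dog sat on the rug"] * 2

bigrams = count_ngrams(corpus, 2)
kept = prune(bigrams)

print(len(bigrams), "bigrams before pruning,", len(kept), "after")
```

Rare n-grams make up most of the distinct entries in a real corpus, so dropping them shrinks the lookup table considerably while discarding the least reliable evidence.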
The tool is available here: https://gavinmdouglas.shinyapps.io/coursera_capstone/
Using this tool is extremely easy: simply type a sentence in the box at the above link and the predicted next word will be shown below!