This presentation is submitted in fulfilment of the requirements for the Johns Hopkins University Data Science Specialization Capstone coursework, as delivered through Coursera.org.
Following are the objectives of this project:
Algorithm development follows the steps below:
Random sampling is performed on the provided corpus to obtain training, validation, and test sets.
The training set is then cleaned (removal of HTML tags, email addresses, Twitter handles, punctuation, etc.) and tokenized into N-grams.
Develop N-gram frequency tables and model for text.
Predict the next word based upon the algorithm developed.
Check prediction accuracy and, where it falls short, revise the algorithm and predict again.
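The steps above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the code behind the app. The function names, the 60/20/20 split, the regular expressions, the simple highest-order-first backoff, and the fallback word "the" are all assumptions made for illustration; the presentation does not state which model or parameters were actually used.

```python
import random
import re
from collections import Counter, defaultdict


def split_corpus(lines, seed=0, train=0.6, valid=0.2):
    """Randomly split lines into training, validation, and test sets (assumed ratios)."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    a = int(len(shuffled) * train)
    b = a + int(len(shuffled) * valid)
    return shuffled[:a], shuffled[a:b], shuffled[b:]


def clean(text):
    """Lowercase, strip HTML tags, emails, Twitter handles, punctuation; return tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)        # HTML tags
    text = re.sub(r"\S+@\S+\.\S+", " ", text)   # email addresses
    text = re.sub(r"@\w+", " ", text)           # Twitter handles
    text = re.sub(r"[^a-z'\s]", " ", text)      # punctuation, digits, other symbols
    return text.split()


def build_tables(lines, max_n=3):
    """For n = 2..max_n, map each (n-1)-word context to counts of the next word."""
    tables = {n: defaultdict(Counter) for n in range(2, max_n + 1)}
    for line in lines:
        tokens = clean(line)
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                tables[n][context][tokens[i + n - 1]] += 1
    return tables


def predict_next(phrase, tables, default="the"):
    """Try the longest context first and back off to shorter ones (assumed strategy)."""
    tokens = clean(phrase)
    for n in sorted(tables, reverse=True):
        if len(tokens) < n - 1:
            continue
        followers = tables[n].get(tuple(tokens[-(n - 1):]))
        if followers:
            return followers.most_common(1)[0][0]
    return default  # no context matched at any order


def accuracy(lines, tables):
    """Fraction of positions where the top prediction equals the actual next word."""
    hits = total = 0
    for line in lines:
        tokens = clean(line)
        for i in range(1, len(tokens)):
            total += 1
            hits += predict_next(" ".join(tokens[:i]), tables) == tokens[i]
    return hits / total if total else 0.0
```

A typical use would be to call `split_corpus` on the raw lines, pass the training portion to `build_tables`, and compare `accuracy` on the validation portion across different values of `max_n` before settling on a model.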
Type any phrase in the text box in the Shiny App.
The prediction algorithm (behind the app) will try to predict the next word.
The Shiny App
The app can be accessed from this link: