Quang V. Nguyen
8/23/15
My goal was to develop the ultimate natural language processing app using my specialized knowledge in data science. Tackling this challenge required everything I had learned to date from the 9 courses. The first step was to understand our data and our goals.
The results of this can be found here: http://rpubs.com/quangface/milestone_report
Having explored our data, the next step was to model and build the app.
You can find the app here: https://quangface.shinyapps.io/wordpredictor
The Word Predictor App is very easy to use. Here are the basics:
This app was built from a sample of the provided HC Corpora data, which includes text from blogs, news, and tweets. We then cleaned the sampled text (removed punctuation, numbers, and special characters, converted to lowercase, etc.).
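The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration in Python, not the app's actual code; the function name `clean_text` and the choice to drop apostrophes (so "don't" becomes "dont") are assumptions made for the example.

```python
import re


def clean_text(text: str) -> str:
    """Lowercase text and strip punctuation, numbers, and special characters."""
    text = text.lower()
    # Assumption: drop apostrophes so contractions stay one token ("don't" -> "dont").
    text = text.replace("'", "")
    # Replace anything that is not a lowercase letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_text("Hello, World! 123 #NLP"))
```

Replacing removed characters with a space, rather than deleting them, keeps words such as "news,tweets" from being fused into one token.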
Once that was complete, we tokenized the text into n-grams and used the Stupid Backoff model to score candidate next words.
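To make the idea concrete, here is a small Python sketch of n-gram counting and Stupid Backoff scoring. It is an illustration under stated assumptions, not the app's implementation: the names `build_counts`, `score`, and `predict` are hypothetical, the maximum n-gram order of 3 is assumed, and the backoff factor 0.4 is the value suggested in the original Stupid Backoff paper rather than anything stated here.

```python
from collections import Counter

ALPHA = 0.4  # assumed backoff factor (the value suggested by Brants et al., 2007)


def build_counts(tokens, max_n=3):
    """Count every n-gram of order 1..max_n, keyed by tuple of words."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total):
    """Stupid Backoff score of `word` following `context` (a tuple of words)."""
    penalty = 1.0
    context = tuple(context)
    while context:
        seen = counts.get(context + (word,), 0)
        if seen > 0:
            # Relative frequency at this order, discounted once per backoff step.
            return penalty * seen / counts[context]
        context = context[1:]  # drop the oldest word and back off
        penalty *= ALPHA
    # Unigram fallback: relative frequency of the word in the corpus.
    return penalty * counts.get((word,), 0) / total


def predict(context, counts, vocab, total, max_n=3):
    """Return the highest-scoring next word for the given context."""
    context = tuple(context)[-(max_n - 1):] if max_n > 1 else ()
    return max(vocab, key=lambda w: score(w, context, counts, total))


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    counts = build_counts(tokens)
    vocab = sorted(set(tokens))
    print(predict(["on", "the"], counts, vocab, len(tokens)))
```

The scores are relative frequencies with a fixed discount, not normalized probabilities, which keeps lookup cheap enough for an interactive app.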
This is a top word prediction app: https://quangface.shinyapps.io/wordpredictor
I hope you think so too!