Data Science Capstone

Hao YU
JUL. 6th 2016

Outline

For more detailed reference and source code, please visit: https://github.com/TeddyTiome/

  • Word Database
  • Phrase Processing and Prediction
  • Shiny Application

Word Database

To make natural language processing easier, I searched for existing databases of N-grams and profane words.

Following those websites, I compiled databases of 1- to 3-grams as well as a list of bad words.
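
As a rough illustration of what a 1- to 3-gram database holds, here is a minimal sketch in Python. It is not the code behind this project: the toy corpus, the `ngram_counts` helper, and the `bad_words` entries are all hypothetical, and the real databases came from external websites rather than being counted from scratch like this.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Toy corpus for illustration only; a real database would be far larger.
tokens = "i love data science and i love r".split()

unigrams = ngram_counts(tokens, 1)  # 1-grams: single words
bigrams = ngram_counts(tokens, 2)   # 2-grams: word pairs
trigrams = ngram_counts(tokens, 3)  # 3-grams: word triples

# Hypothetical bad-word list, stored as a set for fast membership tests.
bad_words = {"darn", "heck"}
```

Each table maps a tuple of words to the number of times it was seen, so `bigrams[("i", "love")]` gives the count for that pair.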

Phrase Processing and Prediction

With these databases, the work is much easier. The procedure is:

  • Clean the typed text by removing non-alphabetic characters.
  • Filter bad words and replace them with a default token tag.
  • Take the words in the phrase and predict the most probable next word.
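
The three steps above can be sketched roughly as follows, in Python for illustration. Everything here is an assumption rather than the project's actual code: the tiny `UNIGRAMS`, `BIGRAMS`, and `TRIGRAMS` tables, the `BAD_TOKEN` tag, and the function names are hypothetical, and the fallback from 3-grams to 2-grams to the most common single word is one plausible way to pick the most probable next word.

```python
import re

# Hypothetical stand-ins for the real databases (counts are made up).
BAD_WORDS = {"darn"}
BAD_TOKEN = "<censored>"  # assumed default tag for filtered words
UNIGRAMS = {"the": 100, "you": 60}
BIGRAMS = {("thank", "you"): 50, ("thank", "god"): 10, ("you", "are"): 30}
TRIGRAMS = {("thank", "you", "for"): 20, ("thank", "you", "very"): 15}


def clean(text):
    """Step 1: lowercase, drop non-alphabetic characters, split into words."""
    return re.sub(r"[^a-z\s]", " ", text.lower()).split()


def censor(tokens):
    """Step 2: replace bad words with the default tag."""
    return [BAD_TOKEN if t in BAD_WORDS else t for t in tokens]


def predict(text):
    """Step 3: return the most frequent continuation of the last words."""
    tokens = censor(clean(text))
    # Try the longest context first, then fall back to a shorter one.
    for table, k in ((TRIGRAMS, 2), (BIGRAMS, 1)):
        if len(tokens) < k:
            continue
        context = tuple(tokens[-k:])
        candidates = {g[-1]: c for g, c in table.items() if g[:-1] == context}
        if candidates:
            return max(candidates, key=candidates.get)
    # No matching context: fall back to the most common single word.
    return max(UNIGRAMS, key=UNIGRAMS.get)


print(predict("Thank you!"))
```

With these toy tables, a phrase ending in "thank you" is matched against the 3-gram table first; a single word falls through to the 2-gram table, and unknown input falls back to the most common word.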

Shiny Application

Sorry that my Shiny application is simple and concise.
So are the instructions:

  • Type a phrase, sentence, or word without punctuation.
  • The application then runs the prediction.
  • The predicted word appears in the last text box.

    (I would say it is lazily made and crude.)