PredictNext

Pingping
09/17/2020

Introduction

For more details on this Coursera project, please visit: https://www.coursera.org/learn/data-science-project/supplement/idhGA/syllabus.

Model building

  • First, a small portion of the dataset was selected so that the model would not take too long to run
  • The selected data was cleaned by removing profanity, numbers, shapes, etc.
  • The cleaned data was then used to build a basic n-gram model that predicts the next word
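
The steps above can be illustrated with a minimal Python sketch. It is not the code behind the app: it assumes a bigram (2-gram) model, and the profanity list, the toy corpus, and the function names clean, build_bigrams and predict_next are all hypothetical stand-ins.

```python
import re
from collections import Counter, defaultdict

# Hypothetical placeholder list; the list used for the app is not given.
PROFANITY = {"darn", "heck"}

def clean(text):
    """Lowercase the text, strip digits and symbols, drop listed words."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)  # keep letters, apostrophes, spaces
    return [w for w in text.split() if w not in PROFANITY]

def build_bigrams(sentences):
    """Count, for every word, the words that follow it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = clean(sentence)
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, phrase):
    """Return the most frequent follower of the last word, or None."""
    words = clean(phrase)
    if not words or words[-1] not in model:
        return None
    return model[words[-1]].most_common(1)[0][0]

# Toy corpus standing in for the sampled dataset (an assumption).
corpus = [
    "I love data science",
    "I love data analysis 101",
    "We love data science courses",
]
model = build_bigrams(corpus)
print(predict_next(model, "They love"))  # data
print(predict_next(model, "love data"))  # science
```

A longer n-gram with backoff to shorter ones would usually predict better; the bigram version is used here only to keep the sketch short.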

Running the app

  • On the Shiny app page, there is an input box where you type the phrase whose next word you want to predict
  • The sentence in red indicates that this Shiny app supports English words only
  • The predicted word updates automatically as your input phrase changes

Thanks!

  • Thanks to Coursera and Johns Hopkins University for providing such a great series of Data Science courses
  • Thanks to Jeff Leek, Roger Peng and Brian Caffo for their efforts in creating and presenting the course content