Marc Boulet
2018-02-05
The purpose of the Coursera JHU Data Science Specialization capstone project is to build an app that predicts the next word based on a user's text input.
This app uses a text prediction model with the following features:
The source content for the prediction app was obtained from heliohost.org on September 30, 2016, with profanity filtered out.
The text prediction model uses 100% of the corpus, with a filter for low-frequency word events in order to make the data compact enough to fit into a Shiny app.
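The low-frequency filter can be pictured with a minimal sketch. This is a Python illustration only, not the app's own code: the helper name `prune_rare`, the toy text, and the threshold of 2 are all assumptions made for the example, and the actual cutoff used by the app is not stated here.

```python
from collections import Counter

def prune_rare(counts, min_count):
    """Keep only the entries observed at least min_count times."""
    return Counter({item: c for item, c in counts.items() if c >= min_count})

# Toy text standing in for the real corpus (hypothetical data).
tokens = "to be or not to be that is the question".split()
counts = Counter(tokens)

# Dropping entries seen fewer than 2 times shrinks the table.
compact = prune_rare(counts, min_count=2)
print(len(counts), "entries before,", len(compact), "after")
```

The trade-off is that rare words can no longer be predicted, in exchange for a much smaller lookup table.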
The corpus was tokenized (or split) into n-grams, or word chunks.
For instance, the sentence “many miles to go before I sleep” would be divided into multiple n-grams:
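As a rough illustration of the split, here is a short Python sketch. It is not the app's own code: the helper name `ngrams` and the choice of n = 2 and n = 3 are assumptions made for this example, and words are separated on whitespace only.

```python
def ngrams(sentence, n):
    """Return every run of n consecutive words in the sentence, in order."""
    words = sentence.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "many miles to go before I sleep"

bigrams = ngrams(sentence, 2)   # two-word chunks
trigrams = ngrams(sentence, 3)  # three-word chunks

print(bigrams)
print(trigrams)
```

A sentence of seven words yields six two-word chunks and five three-word chunks, since each chunk starts one word later than the previous one.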
The app is designed for ease of use. Simply enter some text into the Input sentence box and the predicted words will appear on the right.
There are a few options to customize the text prediction model:
For more information, click on the Documentation button.