Enrique Estrada
November 28, 2018
This is the capstone project for the Coursera Data Science Specialization, which involved developing a word-prediction application in R/Shiny. We were provided with a corpus of text from blogs, Twitter, and news sources from HC Corpora, a collection of corpora for various languages that is freely available to download. We were required to use only the English texts.
This project is conducted with the support of the Johns Hopkins University and in cooperation with SwiftKey.
The application uses natural language processing techniques, namely n-grams, a Markov model, and Katz's back-off model, to perform text prediction.
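To make the back-off idea concrete, here is a minimal sketch in Python (the application itself was written in R, and this is not its code). It counts n-grams up to trigrams, then looks up the longest available context and falls back to shorter contexts when needed. It is a simplified back-off that ranks by raw counts and omits the probability discounting that Katz's model applies. The function names `build_ngram_tables` and `predict_next` and the toy sentence are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_ngram_tables(tokens, max_n=3):
    """Map each context tuple (0 to max_n - 1 words) to a Counter of next words."""
    tables = defaultdict(Counter)
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            *context, word = tokens[i:i + n]
            tables[tuple(context)][word] += 1
    return tables


def predict_next(tables, phrase, k=4, max_n=3):
    """Return up to k candidate next words, longest matching context first."""
    words = phrase.lower().split()
    suggestions = []
    # Start with the longest usable context, then back off to shorter ones.
    for size in range(min(max_n - 1, len(words)), -1, -1):
        context = tuple(words[len(words) - size:])
        for word, _count in tables.get(context, Counter()).most_common():
            if word not in suggestions:
                suggestions.append(word)
            if len(suggestions) == k:
                return suggestions
    return suggestions


# Toy corpus (made up for this example, not the HC Corpora data).
tokens = "the cat sat on the mat the cat ate the fish".split()
tables = build_ngram_tables(tokens)
print(predict_next(tables, "on the"))  # → ['mat', 'cat', 'fish', 'the']
```

The trigram context "on the" contributes "mat" first; the remaining slots are filled from the bigram context "the" and finally from overall word frequency. A full Katz model would instead discount the higher-order counts and redistribute that probability mass to the lower-order estimates.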
The series of steps to build the model were:
The application predicts the next word in a phrase or sentence. Up to four possible next-word predictions are shown, and you can click on any of them.
The selected word is added to your text, and the application then goes on to predict the next word.
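The suggest-and-select loop described above can be sketched as follows. This is an illustrative Python outline, not the Shiny implementation: the lookup table `NEXT_WORDS` and the helpers `suggest` and `accept` are hypothetical, and the candidate words are invented for the example.

```python
# Hypothetical toy lookup: last word typed -> ranked next-word candidates.
NEXT_WORDS = {
    "thanks": ["for", "so", "a", "to", "again"],
    "for": ["the", "a", "your", "all"],
    "the": ["first", "best", "same", "day"],
}


def suggest(text, k=4):
    """Return up to k candidate next words for the last word of text."""
    words = text.lower().split()
    if not words:
        return []
    return NEXT_WORDS.get(words[-1], [])[:k]


def accept(text, word):
    """Append the chosen suggestion to the text, as a click would."""
    return f"{text.rstrip()} {word}"


text = "Thanks"
for _ in range(3):
    options = suggest(text)
    if not options:
        break
    text = accept(text, options[0])  # simulate clicking the first suggestion

print(text)  # → Thanks for the first
```

Each pass shows at most four options, appends the chosen one, and then predicts again from the updated text, which mirrors the click-and-continue behaviour of the application.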