The goal of this project was to build a predictive model of English text. The user enters a word in a text box and the algorithm predicts the next word. The skills needed to complete this task include natural language processing and text mining. The app was built with Shiny in RStudio.
The source data for this project can be found at: https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip
My Shiny App can be found at: https://starscream.shinyapps.io/Final_Project/
To build the prediction algorithm, data was scraped from blogs, Twitter and the news. Several processing steps must be completed before the model can be built.
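As a rough illustration of what such processing can look like, the sketch below cleans a few lines of text and counts n-grams, which are the raw material for a back-off style model. It is written in Python for brevity rather than in the R used for the app, and the specific steps (lowercasing, dropping punctuation and digits, counting n-grams per line) are assumptions, not necessarily the steps used in this project. The names `tokenize`, `ngram_counts` and `sample` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes.

    Assumed cleaning rule for illustration: punctuation and digits are dropped.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(lines, n):
    """Count n-grams within each line; n-grams never span two lines."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Made-up sample lines standing in for the blog, Twitter and news text.
sample = ["I love New York.", "I love pizza!", "New York is big"]
unigrams = ngram_counts(sample, 1)
bigrams = ngram_counts(sample, 2)
```

With counts like these, the most frequent word following a given word can be looked up directly, for example by scanning `bigrams` for keys whose first element matches the input word.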
The next-word prediction model is based on the Katz back-off algorithm. This project’s algorithm contains: