Jill Beck
April 25, 2016
This presentation briefly covers an application that predicts the next word a user is likely to type.
This application can be used on all form factors to facilitate quicker typing of thoughts across social media, email, and other communication platforms.
The main objective of this project was to build a Shiny application in R that is able to predict the next word.
To complete this assignment, techniques such as data cleansing, exploratory analysis, and predictive modeling were used.
Three types of text data (blogs, news, and Twitter) were used to create the model.
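As an illustration of the data cleansing step, here is a minimal sketch in base R. The deck does not list the exact cleaning rules used, so the function name `clean_text` and the specific rules (lowercasing, keeping only letters and apostrophes, collapsing whitespace) are assumptions chosen for the example.

```r
# Hypothetical helper: normalise raw text before building n-grams.
# The rules below are illustrative assumptions, not the project's actual ones.
clean_text <- function(x) {
  x <- tolower(x)                 # treat "The" and "the" as the same word
  x <- gsub("[^a-z' ]", " ", x)   # replace digits, punctuation, and other symbols with spaces
  x <- gsub("\\s+", " ", x)       # collapse runs of whitespace into one space
  trimws(x)                       # drop leading and trailing spaces
}

cleaned <- clean_text("Hello,  World! It's 2016.")
print(cleaned)
```

A real pipeline would probably also filter profanity and handle non-English text, which this sketch leaves out.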
Word combinations of various lengths (n-grams) were then created from the cleaned data sets, and a predictive algorithm was applied to them to predict the next word. The final predictive model was configured to work as a Shiny application.
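The n-gram idea can be sketched in a few lines of base R. This is one common approach (count n-grams, then back off from trigrams to bigrams when no match is found), and it is not necessarily the algorithm used in the app. The function names `ngram_counts` and `predict_next` and the toy corpus are hypothetical.

```r
# Hypothetical helper: count every n-word sequence in a set of cleaned lines.
ngram_counts <- function(lines, n) {
  grams <- lapply(strsplit(lines, " ", fixed = TRUE), function(w) {
    if (length(w) < n) return(character(0))
    vapply(seq_len(length(w) - n + 1),
           function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  })
  # Most frequent n-grams first.
  sort(table(as.character(unlist(grams))), decreasing = TRUE)
}

# Hypothetical helper: simple backoff prediction.
# Try the last two words against the trigrams, then the last word against the bigrams.
predict_next <- function(phrase, bigrams, trigrams) {
  w <- strsplit(phrase, " ", fixed = TRUE)[[1]]
  for (k in c(2, 1)) {
    if (length(w) < k) next
    prefix <- paste0(paste(tail(w, k), collapse = " "), " ")
    tab <- if (k == 2) trigrams else bigrams
    candidates <- as.character(names(tab))
    hits <- candidates[startsWith(candidates, prefix)]
    # The table is sorted by frequency, so the first hit is the most common one.
    if (length(hits) > 0) return(substring(hits[1], nchar(prefix) + 1))
  }
  NA_character_  # no prediction available
}

# Toy corpus, assumed to be already cleaned.
corpus   <- c("i love new york", "i love new shoes", "i love new york city")
bigrams  <- ngram_counts(corpus, 2)
trigrams <- ngram_counts(corpus, 3)

print(predict_next("love new", bigrams, trigrams))   # matched by a trigram
print(predict_next("brand new", bigrams, trigrams))  # falls back to a bigram
```

On a corpus of realistic size, the frequency tables would normally be precomputed and pruned so that the Shiny application only has to do lookups at run time.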
The basic algorithm to produce a word prediction is as follows:
All text data used to build the frequency dictionary, which is then used to predict the next word, comes from a corpus known as HC Corpora.
This application is currently being hosted on shinyapps.io: https://jcbeck.shinyapps.io/PredictionAppEngine/
All background code and reporting for the entire Capstone project can be found at this GitHub repository: https://github.com/jcbeck/Capstone-Project
This pitch deck is located here: http://rpubs.com/jcbeck/capstonefinalproject