The idea behind this project is to develop an algorithm that predicts the next word of a text based on the previous words entered by the user. This work is the result of the Capstone Project of the Data Science Specialization, run by Johns Hopkins University in collaboration with SwiftKey.
The following presentation briefly explains how the model works, describes its predictive performance, and demonstrates the Shiny app that was developed and how it operates.
Three different English corpora were used to feed the model: one from blog posts, one from news articles and another from Twitter. The data was split into training and test sets, in 80% and 20% proportions respectively.
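As an illustration only, the 80/20 split described above could be sketched as follows. This is a Python sketch, not the project's own code: the function name `split_corpus`, the fixed seed, the line-level random sampling and the toy corpora are all assumptions made for the example, since the text does not say how the split was drawn.

```python
import random


def split_corpus(lines, train_fraction=0.8, seed=42):
    """Shuffle the lines of one corpus and split them into train and test sets.

    Hypothetical helper: the seed and the line-level shuffling are assumptions.
    """
    rng = random.Random(seed)      # local generator, so the split is reproducible
    shuffled = list(lines)         # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-ins for the three corpora (blogs, news, Twitter).
corpora = {
    "blogs": [f"blog line {i}" for i in range(10)],
    "news": [f"news line {i}" for i in range(10)],
    "twitter": [f"tweet {i}" for i in range(10)],
}

# Split each corpus separately so every source keeps the same 80/20 proportion.
splits = {name: split_corpus(lines) for name, lines in corpora.items()}

for name, (train, test) in splits.items():
    print(name, len(train), len(test))
```

Splitting each source on its own, rather than pooling everything first, keeps blogs, news and Twitter represented in the same proportion in both sets; whether the original analysis did this is not stated.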
The first part, consisting of data cleaning and exploratory analysis, is covered in a separate report that can be found here.