Logan J Travis
2014-12-14
Johns Hopkins University Data Science Capstone on Coursera
Project Goal
Predict the next word from arbitrary input text.
My Goal
Develop a model that needs little processing power, can easily add custom words, and learns from user input. I chose to work within these constraints to attain my goal:
I recognized two problems when investigating the Hans Christensen Corpora (HC Corpora):
My research led me to develop a four-step model built from a 10% sample of the HC Corpora:
In a few words? Not great!
I split my data into training and testing sets, but based on my Shiny application I anticipate accuracy below 10%. Give it a try.
However, my initial results made no sense. I have since adjusted the model parameters, especially for the next POS and similar sentence filters, to yield predictions that come closer to reasonable. I plan further tweaks after reviewing my fellow students' projects.
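For readers who want to see how a train/test accuracy check like the one mentioned above might work, here is a minimal sketch in Python. It is not the four-step model described in this deck. A plain most-frequent-bigram predictor stands in for it, and the function names, the 20% test fraction, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict


def split_lines(lines, test_fraction=0.2, seed=42):
    """Shuffle lines reproducibly and split them into (train, test)."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_bigrams(lines):
    """Count which word follows each word in the training lines."""
    following = defaultdict(Counter)
    for line in lines:
        words = line.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, prev_word):
    """Return the most frequent follower of prev_word, or None if unseen."""
    counts = following.get(prev_word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def accuracy(following, lines):
    """Fraction of held-out word pairs whose second word is predicted exactly."""
    hits = total = 0
    for line in lines:
        words = line.lower().split()
        for prev, nxt in zip(words, words[1:]):
            total += 1
            if predict_next(following, prev) == nxt:
                hits += 1
    return hits / total if total else 0.0
```

A typical use would be `train, test = split_lines(corpus_lines)`, then `model = train_bigrams(train)` and `accuracy(model, test)`, where `corpus_lines` is a hypothetical list of sentences. Scoring only on held-out lines keeps the estimate honest, because the predictor never sees the word pairs it is graded on.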