Alexander Lee
August 20, 2015
The problem: Devise a model that will predict a user's next desired word based on arbitrary text input
Givens:
My approach: Process and model the corpora in Python, then build the prediction logic in R
Tools:
Raw text data were first processed in Python as follows:
Algorithm logic:
Algorithm performance:
*In-sample text was randomly selected from the raw corpus data; out-of-sample text was randomly selected from arbitrary Google News / Twitter content outside the provided corpora
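
The in-sample versus out-of-sample comparison described in the note above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the bigram frequency model, the function names (`train_bigrams`, `predict_next`, `accuracy`), and the toy corpus are hypothetical stand-ins, not the model, code, or data used in this project. The idea is only to show one way of scoring a next-word predictor: hide the last word of each sampled text and check whether the model recovers it.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it (hypothetical bigram model)."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return followers


def predict_next(followers, text):
    """Return the most frequent follower of the last word, or None if unseen."""
    words = text.lower().split()
    if not words or words[-1] not in followers:
        return None
    return followers[words[-1]].most_common(1)[0][0]


def accuracy(followers, sentences, n_samples=100, seed=0):
    """Sample texts, hide the final word of each, and score the predictions."""
    rng = random.Random(seed)
    usable = [s for s in sentences if len(s.split()) >= 2]
    if not usable:
        return 0.0
    sample = [rng.choice(usable) for _ in range(n_samples)]
    hits = 0
    for s in sample:
        *context, target = s.lower().split()
        if predict_next(followers, " ".join(context)) == target:
            hits += 1
    return hits / len(sample)


# Toy stand-in for the provided corpora (assumption, not project data).
corpus = ["i love new york", "she moved to new york", "we love new music"]
model = train_bigrams(corpus)

# In-sample: texts drawn from the training corpus itself.
in_sample_acc = accuracy(model, corpus)
# Out-of-sample: texts the model has never seen.
out_sample_acc = accuracy(model, ["they visit new york", "he plays new games"])
```

Scoring both sets with the same function keeps the two numbers comparable; a large gap between them would suggest the model is memorizing the corpus rather than generalizing.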