S. Wu
April 2016
Mind Reading Robot uses NLP algorithms to finish your phrases in the form of a “Human vs. Robot” game with a very simple scoring system
You provide a phrase and the word you would follow it with (e.g., “Crazy like a” and “fox”)
The Robot gets three guesses at the next word
Play again and again, and see your wins and the Robot's wins over time
App Functionality
Prediction Algorithm
The app uses a 4-gram model with a backoff smoothing method. It first searches the highest-order N-gram table available (e.g., the 4-gram table if the search phrase consists of three or more words) and returns the top three “next word” candidates by frequency. If the evidence is insufficient, it then searches the next lower-order N-gram table, and so on down the orders (Jurafsky & Martin, 2014).
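The backoff search described above can be sketched roughly as follows. This is a minimal Python illustration, not the app's actual R code: the `TABLES` contents, the `predict_next` name, and the reading of "insufficient evidence" as "fewer than three distinct candidates found so far" are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy tables. Key n holds the (n+1)-gram table, mapping a context
# of n preceding words to a Counter of observed next words. Key 0 is the
# unigram table, whose only context is the empty tuple.
TABLES = {
    3: {("crazy", "like", "a"): Counter({"fox": 5, "dog": 2})},
    2: {("like", "a"): Counter({"boss": 9, "fox": 5, "dog": 4})},
    1: {("a",): Counter({"lot": 20, "few": 15, "little": 12})},
    0: {(): Counter({"the": 100, "to": 80, "and": 70})},
}


def predict_next(phrase, tables=TABLES, k=3):
    """Return up to k guesses, searching the highest-order table first."""
    words = phrase.lower().split()
    guesses = []
    highest = min(max(tables), len(words))  # longest usable context
    for n in range(highest, -1, -1):        # back off one order at a time
        context = tuple(words[len(words) - n:]) if n else ()
        candidates = tables.get(n, {}).get(context, Counter())
        # Assumption: "insufficient evidence" means fewer than k distinct
        # candidates so far, so lower orders only fill the remaining slots.
        for word, _count in candidates.most_common():
            if word not in guesses:
                guesses.append(word)
            if len(guesses) == k:
                return guesses
    return guesses
```

With these toy tables, `predict_next("Crazy like a")` takes "fox" and "dog" from the 4-gram table and backs off to the 3-gram table for the third guess.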
N-gram Data
A random 60% sample of the English blogs, news, and tweets corpus from the Coursera Capstone project was preprocessed and converted into N-gram tables using R’s quanteda package. Features that occurred only once in each N-gram table were removed as minimally advantageous to prediction, a decision that greatly improved app speed.
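The sampling and pruning steps can be illustrated with a short sketch. It is written in plain Python rather than with quanteda, and the function names, the whitespace tokenizer, and the fixed seed are assumptions made for the example; the app's real preprocessing is not shown in this deck.

```python
import random
from collections import Counter


def sample_lines(lines, fraction=0.6, seed=2016):
    """Draw a random sample of the corpus lines (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(lines, round(len(lines) * fraction))


def build_ngram_table(texts, n, min_count=2):
    """Count N-grams and drop those seen fewer than min_count times."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # simplistic tokenizer, for illustration
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    # min_count=2 removes N-grams that occurred only once.
    return {gram: c for gram, c in counts.items() if c >= min_count}
```

For example, building a trigram table from "crazy like a fox", "sly like a fox", and "crazy like a dog" keeps only the two trigrams that appear twice, which is where the size and speed savings come from.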
Anonymous Data Collection
Data is recorded to a private Google Sheet. This allows algorithm performance to be tracked, and will make it possible to customize the algorithm and improve its accuracy by training it on past user inputs (see future enhancements).
Accuracy Improvement
- Enabling real-time algorithm training with user inputs, without compromising system processing time
- Deploying sophisticated prediction methods such as Kneser-Ney smoothing and skip-grams
- Looking at trends in tracked responses. For example, present historical data shows that users are unlikely to choose prepositions, conjunctions, or delimiters as their “next word”. A future algorithm will include weights to improve prediction according to how humans play this game.
- Word association weighting
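One possible shape for the weighting idea in the list above is sketched here. This is purely illustrative: the word list, the `penalty` value, and the `rerank` function are hypothetical, and the deck does not specify how the future weights would actually be computed.

```python
# Illustrative set of prepositions and conjunctions (an assumption, not a
# list taken from the app's tracked data).
FUNCTION_WORDS = {"of", "in", "on", "at", "to", "for", "with", "and", "but", "or"}


def rerank(candidates, penalty=0.25, k=3):
    """Down-weight function words, then return the top k candidates.

    candidates maps each next-word candidate to its raw N-gram frequency.
    """
    scored = {
        word: count * (penalty if word in FUNCTION_WORDS else 1.0)
        for word, count in candidates.items()
    }
    # Sort by weighted score (descending), breaking ties alphabetically.
    return sorted(scored, key=lambda w: (-scored[w], w))[:k]
```

Given raw counts such as `{"of": 10, "fox": 4, "and": 8, "dog": 3}`, the content words "fox" and "dog" move ahead of "of" and "and" once the penalty is applied.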
Potential applications include
- An educational tool: variants of the app could train children or second language learners.
- User experience: the app could be used to elicit word associations in entertainment and social media spaces. For example, the Robot could provide the phrase, and the human the final word.
- Learning a specific group: Millennials, skating enthusiasts, dreamers etc. (could result in improved accuracy).
- Marketing research: “I drink coke when I”, “_____”