New Magic 8 Ball Predictor - Capstone Project

Joe Walters
January 24, 2016

Introduction

Introducing the new Magic 8 Ball Predictor, now featuring NLP technology. This new version uses text analytics in a Shiny app to interpret your phrase and predict the next word.

Picture of App

App Instructions & Overview

The Magic 8 Ball is as easy to use as a Google search box. In the box under the Magic 8 Ball, enter your phrase where it says “Enter phrase here”. Then shake and watch the magic happen. On the right, the Magic 8 Ball Predictor will present the top four words, ordered from most to least likely.

Picture of App

App Design Overview

The Magic 8 Ball Predictor looks up the most commonly used 2-grams and 3-grams that match the phrase entered. The n-gram dictionaries are built from samples of the news, blogs and twitter files. It looks for the most detailed match, starting with 3-grams and backing off to 2-grams if no match is found.
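The back-off lookup described above can be sketched in a few lines. This is a minimal illustration in Python rather than the app's actual Shiny/R code. The function names (`build_ngram_table`, `predict`) and the toy corpus are assumptions made for the example, not part of the original project.

```python
from collections import Counter, defaultdict

def build_ngram_table(tokens, n):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        prefix = tuple(tokens[i:i + n - 1])
        table[prefix][tokens[i + n - 1]] += 1
    return table

def predict(phrase, trigrams, bigrams, top_k=4):
    """Return up to top_k next-word candidates, most frequent first."""
    words = phrase.lower().split()
    # Most detailed match first: the last two words against the 3-gram table.
    if len(words) >= 2:
        hits = trigrams.get(tuple(words[-2:]))
        if hits:
            return [word for word, _ in hits.most_common(top_k)]
    # Back off: the last word alone against the 2-gram table.
    if words:
        hits = bigrams.get(tuple(words[-1:]))
        if hits:
            return [word for word, _ in hits.most_common(top_k)]
    return []

# Toy corpus (hypothetical); the app samples the news, blogs and twitter files.
corpus = "i love new york i love new shoes i love you new york city".split()
trigrams = build_ngram_table(corpus, 3)
bigrams = build_ngram_table(corpus, 2)

print(predict("I love", trigrams, bigrams))      # matched in the 3-gram table
print(predict("really new", trigrams, bigrams))  # falls back to the 2-gram table
```

The first call is answered by the 3-gram table. The second has no 3-gram match for "really new", so it falls back to the 2-gram table using only the last word.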

To make the app more widely usable, like SwiftKey, the search starts with dictionaries built from samples of the provided files. If a user enters words not covered by the provided texts, secondary dictionaries (1) of 2-grams and 3-grams from the Corpus of Contemporary American English (COCA) at http://www.wordfrequency.info are searched.

The app has been sized to work on most devices, requiring only 66.0 MB of space to house all dictionaries.

App Search Algorithm Overview & Reference

The program works through a search tree, trying the following steps in order until it finds a match.

  1. 3-grams from a 5% sample of the blogs, news and twitter files
  2. 2-grams from a 100% sample of the blogs, news and twitter files
  3. The 1 million most frequent 3-grams from the COCA library
  4. The 1 million most frequent 2-grams from the COCA library
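
The four steps above amount to an ordered cascade: each dictionary is tried in turn, and the first one with a match supplies the answer. The Python sketch below illustrates that control flow only. The `first_match` function, the tier labels, and the tiny hand-written dictionaries are hypothetical stand-ins for the app's real data, and each dictionary is assumed to already store its candidate words sorted by frequency.

```python
def first_match(phrase, tiers, top_k=4):
    """Walk the tiers in order; return (tier name, words) for the first hit."""
    words = tuple(phrase.lower().split())
    for name, n, table in tiers:
        prefix = words[-(n - 1):]
        if len(prefix) < n - 1:
            continue  # phrase is too short for this n-gram order
        ranked = table.get(prefix)
        if ranked:
            return name, ranked[:top_k]
    return None, []

# Each tier: (label, n-gram order, prefix -> words ranked by frequency).
# All entries here are made-up examples.
tiers = [
    ("sample 3-grams", 3, {("thanks", "for"): ["the", "your", "all", "being", "sharing"]}),
    ("sample 2-grams", 2, {("for",): ["the", "a"]}),
    ("COCA 3-grams", 3, {("quantum", "field"): ["theory"]}),
    ("COCA 2-grams", 2, {("field",): ["of", "is"], ("photon",): ["energy"]}),
]

print(first_match("Thanks for", tiers))
print(first_match("a quantum field", tiers))
print(first_match("photon", tiers))
```

Keeping the tiers in a plain ordered list makes the search order explicit and easy to change, for example to try the COCA 3-grams before the sample 2-grams.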

Reference
(1) Davies, Mark. (2011) N-grams data from the Corpus of Contemporary American English (COCA). Downloaded from http://www.ngrams.info on January 24, 2016.