Data Science Capstone: Natural Language Processing

Christopher Papanicolas

July 29, 2017

Introduction

Objectives

The Application

  1. Delete the word prediction from the input box. (An error will appear; ignore it.)
  2. Enter a sentence, phrase, or word in the box, and a prediction will be returned.
  3. The predicted word is chosen by its frequency in the index we built for our model.
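
As a rough illustration of frequency-based prediction, the sketch below builds a small word-pair index and returns the most frequent follower of the last word typed. It is a minimal sketch only, not the app's actual code: the names build_index and predict_next, the whitespace tokenization, the use of word pairs, and the sample sentences are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_index(lines):
    """Count, for each word, how often every other word follows it."""
    index = defaultdict(Counter)
    for line in lines:
        words = line.lower().split()  # assumed tokenization: lowercase, split on whitespace
        for prev, nxt in zip(words, words[1:]):
            index[prev][nxt] += 1
    return index


def predict_next(index, text):
    """Return the most frequent follower of the last word in text, or None."""
    words = text.lower().split()
    if not words or words[-1] not in index:
        return None  # empty input or unseen word: no prediction available
    return index[words[-1]].most_common(1)[0][0]


# Hypothetical sample standing in for the indexed corpus.
sample = [
    "thank you very much",
    "thank you for the help",
    "thank them for the help",
]
index = build_index(sample)
print(predict_next(index, "I want to thank"))
```

In this sample, "you" follows "thank" twice and "them" once, so "you" is the prediction. A word that never appears in the index yields no prediction, which is one reason a larger sample of the corpus matters.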

Modeling of App

Limitations and Future Work

Limitations

  1. Prediction is based strictly on probabilities
  2. Only a limited sample of the corpus was used, due to memory restrictions

Future Steps

  1. Index a larger sample of the corpus and create a larger dictionary
  2. Continue to find ways to improve performance and memory usage
  3. Utilize grammar structure and word associations
  4. Use models to remove noise from the data, and other models to improve accuracy
  5. Use a larger n-list to get greater precision