Predict Next Word - Capstone Project
Omer Shechter
25 January 2019
Project Summary
- This project is about understanding and building a predictive text model
- The goal is to use text data sets from blogs, news, and Twitter to build an NLP model
- The model takes a sentence or string as input and predicts the next word
- The last step is to build a data product, a Shiny app, that demonstrates the prediction

The Algorithm
- The algorithm used is Stupid Backoff
- All of the blog and news data was used; for Twitter, 85% of the data was used
- The data was cleaned (e.g. symbols, numbers, and punctuation were removed)
- N-gram sets of orders 1 to 5 were created
- The n-gram data was trimmed based on frequency to save memory
- The following R packages were used: quanteda, data.table, qdap
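- Stupid Backoff in brief: score each candidate word by its relative frequency after the longest matching context; if no match exists, drop the first context word, apply a fixed discount, and retry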
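The backoff rule can be written compactly as follows. This is the standard formulation of Stupid Backoff, not a transcription of this project's code. The discount $\alpha = 0.4$ is the value commonly used with the method; the value used in this project is not stated here, so treat it as an assumption. $N$ is the total number of words in the training data, and $w_{a}^{b}$ denotes the word sequence $w_a, \dots, w_b$.

```latex
S(w_i \mid w_{i-k+1}^{i-1}) =
\begin{cases}
  \dfrac{\operatorname{count}(w_{i-k+1}^{i})}{\operatorname{count}(w_{i-k+1}^{i-1})}
    & \text{if } \operatorname{count}(w_{i-k+1}^{i}) > 0 \\[1.5ex]
  \alpha \, S(w_i \mid w_{i-k+2}^{i-1})
    & \text{otherwise}
\end{cases}
\qquad
S(w_i) = \frac{\operatorname{count}(w_i)}{N}
```

The scores are relative frequencies rather than normalized probabilities, which is what keeps the method cheap to compute.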

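The steps listed under The Algorithm (cleaning, n-gram counting, frequency trimming, backoff scoring) can be sketched in a few lines of base R. This is a minimal illustration on a toy corpus, not the project's implementation: it avoids quanteda and data.table so that it runs on its own, and the function names (`clean_tokens`, `ngrams_of`, `build_model`, `predict_next`), the discount `alpha = 0.4`, and the `min_count` threshold are all assumptions made for the example. It uses n-grams up to order 3 because the corpus is tiny; the project uses orders 1 to 5.

```r
# Lowercase, replace anything that is not a letter, apostrophe or space, then split.
clean_tokens <- function(text) {
  text <- gsub("[^a-z' ]", " ", tolower(text))
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens[nzchar(tokens)]
}

# All n-grams of order n in one token vector, as space-separated strings.
ngrams_of <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  vapply(seq_len(length(tokens) - n + 1),
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# One named count vector per order 1..max_n; n-grams seen fewer than
# min_count times are dropped (the frequency trimming step).
build_model <- function(sentences, max_n = 3, min_count = 1) {
  token_list <- lapply(sentences, clean_tokens)
  lapply(seq_len(max_n), function(n) {
    grams <- unlist(lapply(token_list, ngrams_of, n = n))
    if (length(grams) == 0) return(setNames(integer(0), character(0)))
    counts <- table(grams)
    keep <- as.vector(counts >= min_count)
    setNames(as.integer(counts)[keep], names(counts)[keep])
  })
}

# Stupid Backoff: try the longest context first, then shorter ones,
# multiplying the score by alpha for every backoff step.
predict_next <- function(model, text, top = 5, alpha = 0.4) {
  tokens <- clean_tokens(text)
  scores <- numeric(0)
  discount <- 1
  for (k in seq(min(length(model) - 1, length(tokens)), 0)) {
    if (k == 0) {
      found <- model[[1]] / sum(model[[1]])          # unigram relative frequency
    } else {
      context <- paste(tail(tokens, k), collapse = " ")
      higher <- model[[k + 1]]
      hits <- higher[startsWith(names(higher), paste0(context, " "))]
      if (length(hits) == 0) {                       # context never seen: back off
        discount <- discount * alpha
        next
      }
      found <- hits / model[[k]][[context]]
      names(found) <- sub(".* ", "", names(hits))    # keep only the last word
    }
    # A word keeps the score from the highest order at which it was found.
    found <- found[!(names(found) %in% names(scores))] * discount
    scores <- c(scores, found)
    discount <- discount * alpha
  }
  head(sort(scores, decreasing = TRUE), top)
}

sentences <- c("I like green tea", "I like green apples",
               "I like green tea with honey", "She likes black tea")
model <- build_model(sentences, max_n = 3)
print(predict_next(model, "we like green"))  # "tea" ranks first with score 2/3
```

Returning the top five candidates mirrors the application: one main prediction plus four lower-priority alternatives. At the project's scale, the named vectors would typically be replaced by keyed tables so that context lookups stay fast.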
Next Word Predict - The Application
- The application is a Shiny app
- Type a sentence in the sidebar (on the left)
- Click the Predict button
- The predicted next word is shown in the blue window
- Four lower-priority alternatives are displayed in the green window
