Abdullah Albyati
11/25/2017
This presentation provides an overview of my algorithm for predicting the next word in a sentence. It is the capstone project for the Johns Hopkins University Data Science Specialization on coursera.org.
The goal of this capstone project is to build a Shiny application that is capable of predicting the next word based on user text input.
This project was completed in three phases:
Exploratory Analysis
Prediction model and Shiny App Creation
The prediction algorithm was created using a back-off n-gram model (up to 6-grams). The algorithm used a subset of the data, obtained by running the following code:
# Take a 20% random sample of each source to work with
# (blogs, twitter, and news are the character vectors read from the raw files)
set.seed(2017); ds.blogs <- sample(blogs, 0.2 * length(blogs))
set.seed(2017); ds.tweets <- sample(twitter, 0.2 * length(twitter))
set.seed(2017); ds.news <- sample(news, 0.2 * length(news))
The algorithm predicts the next word by trying the highest-order n-gram first, starting at 6, and working its way down to lower orders until it finds a match.
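The back-off lookup described above can be sketched roughly as follows. This is a minimal illustration and not the app's actual code: the toy `ngram_tables` list, the `predict_next` function, its `max_context` and `default` arguments, and the example phrases are all hypothetical. It assumes each table maps a context of n words to a single predicted word, so a 6-gram model corresponds to at most 5 words of context.

```r
# Hypothetical toy lookup tables, keyed by the number of context words.
# Names are contexts; values are the predicted next word.
# A real model would build these from the sampled corpus.
ngram_tables <- list(
  "3" = c("thanks for the" = "follow"),
  "2" = c("for the" = "first"),
  "1" = c("the" = "same")
)

predict_next <- function(text, tables, max_context = 5, default = "the") {
  # Lower-case the input and split it into words
  words <- strsplit(tolower(trimws(text)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) == 0) return(default)

  # Try the longest available context first, then back off to shorter ones
  longest <- min(max_context, length(words))
  for (n in longest:1) {
    context <- paste(tail(words, n), collapse = " ")
    tbl <- tables[[as.character(n)]]
    if (!is.null(tbl) && context %in% names(tbl)) {
      return(tbl[[context]])
    }
  }

  # No n-gram matched at any order: fall back to a default word
  default
}

predict_next("Thanks for the", ngram_tables)   # matches the 3-word context
predict_next("waiting for the", ngram_tables)  # backs off to the 2-word context
predict_next("over the", ngram_tables)         # backs off to the 1-word context
```

Keying the tables by context length keeps each back-off step a single named lookup. A fuller version would likely return several ranked candidates instead of one word.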
The application layout is as follows: