Capstone Project

25 February 2018

Overview

This project is part of the Coursera JHU Data Science Specialization Capstone Project. The full capstone project consists of:

- Exploratory analysis of the different corpora
- Building n-gram models of the different corpora
- Predicting the next word using our model
- Building an interactive Shiny app to use our model

My Model Description

Data cleansing (text lower-cased, digits and non-words removed)
Built n-gram models of the different corpora
NGram Tokenizer was used to break the text into words
Built 1-, 2-, 3- and 4-gram datasets sorted by frequency
Naive prediction: the most frequent matching words are predicted
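
The steps above can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: the function names (`clean`, `build_ngram_tables`, `predict_next`), the regular expression used for cleaning, and the choice to fall back from 4-grams to shorter n-grams when no match is found are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def clean(text):
    """Lower-case the text, drop digits and other non-letters, split into words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # assumed rule: keep only a-z and whitespace
    return text.split()


def build_ngram_tables(lines, max_n=4):
    """Count n-grams for n = 1..max_n, keyed by their (n-1)-word prefix."""
    tables = {n: defaultdict(Counter) for n in range(1, max_n + 1)}
    for line in lines:
        words = clean(line)
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                prefix = tuple(words[i:i + n - 1])
                next_word = words[i + n - 1]
                tables[n][prefix][next_word] += 1
    return tables


def predict_next(tables, phrase, k=3):
    """Return up to k most frequent next words, trying the longest context first."""
    words = clean(phrase)
    for n in range(max(tables), 0, -1):
        context_len = n - 1
        if context_len > len(words):
            continue  # not enough typed words for this n-gram order
        prefix = tuple(words[len(words) - context_len:]) if context_len else ()
        candidates = tables[n].get(prefix)
        if candidates:
            return [word for word, _ in candidates.most_common(k)]
    return []


# Toy corpus, invented for this example only.
corpus = ["I like green tea", "I like green apples", "I like green tea a lot"]
tables = build_ngram_tables(corpus)
print(predict_next(tables, "I like green"))  # → ['tea', 'apples']
```

Keying each table by its prefix makes prediction a single dictionary lookup per n-gram order, and `Counter.most_common` gives the frequency-sorted candidates directly.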

User Interface

![test](Capture.png)

App link