Capstone_Project

LyPu
6/6/2020

Requirements & Summary

The goal of this exercise is to create a product that highlights the prediction algorithm you have built and to provide an interface that others can access.
This presentation briefly covers the following:

1. An introduction to the algorithm

2. A description of the Shiny app and instructions for using it

Prediction Model

Dataset:

The training dataset comes from Coursera-SwiftKey.zip. It combines a 1% random sample of the English-language news, blogs, and Twitter datasets.
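
As a rough illustration of the sampling step, here is a minimal Python sketch. It is not the project's own code: the function name `sample_lines`, the seed, and the stand-in line lists are all hypothetical, and it assumes the three sources are pooled before sampling, which the slide does not specify.

```python
import random

def sample_lines(lines, fraction=0.01, seed=42):
    """Keep each line independently with probability `fraction`.

    A fixed seed makes the sample reproducible between runs.
    """
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]

# Hypothetical stand-ins for the English news, blogs, and Twitter files.
news = [f"news line {i}" for i in range(1000)]
blogs = [f"blog line {i}" for i in range(1000)]
twitter = [f"tweet {i}" for i in range(1000)]

# Pool the three sources, then keep roughly 1% of the lines.
training = sample_lines(news + blogs + twitter)
print(len(training), "of 3000 lines kept")
```

Because each line is kept independently, the sample size is only approximately 1% of the input; sampling each source separately would be an equally plausible reading of the description.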

Language Model:

Quadgrams, trigrams, bigrams, and unigrams are used to model the text, and Kneser-Ney smoothing is used to estimate the probability of each candidate next word. The few words with the highest probabilities are recommended using a backoff model.
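
The backoff idea can be sketched in a few lines of Python. This is a simplified illustration, not the app's implementation: the names `build_ngram_tables` and `predict_next` and the toy corpus are hypothetical, and candidates are scored by plain relative frequency within the matched context rather than by Kneser-Ney smoothing.

```python
from collections import Counter, defaultdict

def build_ngram_tables(sentences, max_order=4):
    """Map each context (a tuple of 0 to max_order-1 words) to next-word counts."""
    tables = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for n in range(max_order):      # n = number of context words
                if i - n < 0:
                    break
                tables[tuple(tokens[i - n:i])][word] += 1
    return tables

def predict_next(tables, text, top_k=3, max_order=4):
    """Try the longest context first and back off to shorter ones.

    Scores are raw relative frequencies (an assumption of this sketch),
    not Kneser-Ney probabilities.
    """
    tokens = text.lower().split()
    for n in range(max_order - 1, -1, -1):  # 3 context words, then 2, 1, 0
        if n > len(tokens):
            continue
        context = tuple(tokens[len(tokens) - n:])
        followers = tables.get(context)
        if followers:
            total = sum(followers.values())
            return [(w, c / total) for w, c in followers.most_common(top_k)]
    return []

# Toy corpus standing in for the sampled training text.
sentences = [
    "thank you for the follow",
    "thank you for the help",
    "thank you so much",
    "see you for lunch",
]
tables = build_ngram_tables(sentences)
print(predict_next(tables, "Thank you for the"))  # matched with 3 context words
print(predict_next(tables, "nice to see you"))    # backs off to 2 context words
print(predict_next(tables, "hello there"))        # backs off to unigram counts
```

A full Kneser-Ney model would additionally discount the observed counts and redistribute that mass using continuation counts from the lower-order n-grams, so unseen continuations would receive nonzero probability instead of being skipped.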

Shiny App - Description & Instruction

The Shiny app contains two tabs: App and Exploratory.

On the App page, there is a text input box and a word cloud.

Type words into the text input box, and a few words predicted as the next word, based on the training dataset, appear below it.
Below the text input box, a word cloud shows the predicted next words.

Shiny App - Description & Instruction

On the Exploratory page, there is an exploratory report with descriptive analysis of the original dataset.

This report comes from the milestone project, which can be found here.

The app has been deployed to the shinyapps.io server.