LyPu
6/6/2020
The goal of this exercise is to create a product that highlights the prediction algorithm you have built and to provide an interface that others can access.
This presentation briefly covers the following:
1. An introduction to the algorithm
2. A description of the Shiny app and instructions for using it
The training dataset comes from Coursera-SwiftKey.zip. It combines a 1% random sample from each of the English-language news, blogs and Twitter datasets.
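The sampling step can be sketched as follows. This is a minimal illustration in Python, not the code used for the app: the function names `sample_lines` and `build_training_set`, the seed, and the tiny in-memory corpora are all assumptions made for the example. Each line is kept with probability equal to the sampling fraction, so the result is approximately, not exactly, 1% of each source.

```python
import random


def sample_lines(lines, fraction=0.01, seed=2020):
    """Keep each line independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [line for line in lines if rng.random() < fraction]


def build_training_set(sources, fraction=0.01, seed=2020):
    """Sample every source at the same rate and concatenate the results."""
    combined = []
    for offset, lines in enumerate(sources):
        # Use a different seed per source so the sources are sampled independently.
        combined.extend(sample_lines(lines, fraction, seed + offset))
    return combined


# Hypothetical stand-ins for the news, blogs and Twitter files.
news = [f"news line {i}" for i in range(1000)]
blogs = [f"blog line {i}" for i in range(1000)]
tweets = [f"tweet {i}" for i in range(1000)]

training = build_training_set([news, blogs, tweets], fraction=0.01)
print(len(training), "lines sampled out of", len(news) + len(blogs) + len(tweets))
```

With the real data, the three lists would be replaced by the lines read from the corresponding text files.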
Text is modeled with quad-grams, tri-grams, bi-grams and uni-grams, and Kneser-Ney smoothing is used to estimate the probability of each candidate next word. Using a backoff model, the app recommends the few words with the highest probabilities.
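To illustrate the idea, here is a minimal sketch of interpolated Kneser-Ney smoothing restricted to bi-grams. It is a simplification written in Python for illustration and is not the app's implementation, which works with n-grams up to order four. The class name `BigramKN`, the discount of 0.75 and the toy sentence are assumptions. The probability of a word given the previous word is the discounted bi-gram count plus a weight times the word's continuation probability, that is, the share of distinct bi-gram types that end in that word.

```python
from collections import Counter, defaultdict


class BigramKN:
    """Interpolated Kneser-Ney estimates for bi-grams (illustrative only)."""

    def __init__(self, tokens, discount=0.75):
        self.d = discount
        self.bigrams = Counter(zip(tokens, tokens[1:]))
        self.context_count = Counter()      # how often each word occurs as a context
        self.followers = defaultdict(set)   # distinct words seen after each context
        predecessors = defaultdict(set)     # distinct contexts seen before each word
        for (prev, word), count in self.bigrams.items():
            self.context_count[prev] += count
            self.followers[prev].add(word)
            predecessors[word].add(prev)
        n_types = len(self.bigrams)
        # Continuation probability: fraction of bi-gram types ending in the word.
        self.p_cont = {w: len(p) / n_types for w, p in predecessors.items()}

    def prob(self, word, prev):
        cont = self.p_cont.get(word, 0.0)
        total = self.context_count[prev]
        if total == 0:
            # Unseen context: fall back to the lower-order estimate alone.
            return cont
        discounted = max(self.bigrams[(prev, word)] - self.d, 0.0) / total
        backoff_weight = self.d * len(self.followers[prev]) / total
        return discounted + backoff_weight * cont

    def predict(self, prev, k=3):
        """Return the k most probable next words, ties broken alphabetically."""
        ranked = sorted(self.p_cont, key=lambda w: (-self.prob(w, prev), w))
        return ranked[:k]


tokens = "the cat sat on the mat the cat ate the fish".split()
model = BigramKN(tokens)
print(model.predict("the", k=3))
```

Extending this sketch to higher orders would apply the same discount-and-interpolate step recursively, with each order falling back on the one below it.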
The Shiny app contains two tabs: App and Exploratory.
On the App page, there is an input text bar and a word cloud.
Type some words into the text bar, and a few words predicted as the next word, based on the training dataset, will appear below it.
Below the input text bar is a word cloud built from the predicted next words.
The Exploratory page presents an exploratory report with descriptive analysis of the original dataset.
The app has been deployed to the shinyapps.io server.