Dipak Nandeshwar
23rd August 2020
Capstone: Prediction of the next word
Welcome to my final submission for the Capstone Project.
The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others. For this project you must submit:
- A slide deck of no more than 5 slides, created with RStudio Presenter, pitching your algorithm and app as if you were presenting to your boss or an investor.
- A Shiny app that takes a phrase (multiple words) as input in a text box and outputs a prediction of the next word.
Producing this Shiny app required:
- Data: the Coursera-SwiftKey dataset, including News and Twitter examples to feed the model
- Software: R (RStudio is optional)
- Libraries: Shiny, tm, data.table, stringr, dplyr
- Internet connection
- Disclaimer: due to limited resources this is only a prototype, so not all tweets and news items are analysed, only a selected sample
This is the method followed to code the algorithm that predicts the next word:
- Read the data from the Coursera-SwiftKey dataset
- Format the data to create a Corpus: remove punctuation, meaningless words (prepositions, articles…), extra whitespace…
- Create ngram models fed with the generated Corpus
- Use the ngram models to predict the next word
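The steps above can be sketched in a few lines of base R. This is only an illustration, not the code behind the app: the helper names (`tokenize`, `build_ngrams`, `predict_next`), the four-sentence toy corpus and the short stop-word list are all assumptions, and base R is used in place of tm so the example runs without extra packages. Falling back from trigrams to bigrams to single words when a longer context is not found is also an assumption about how the ngram models are combined.

```r
# Toy stand-in for the sampled Coursera-SwiftKey lines (hypothetical data).
corpus <- c(
  "I love the sunny weather today",
  "I love the rainy weather too",
  "We love the sunny weather",
  "The sunny weather is great"
)
# Tiny illustrative stop-word list (articles, prepositions...); not the list used by tm.
stop_words <- c("the", "a", "an", "of", "in", "is")

# Lower-case, replace punctuation and digits with spaces, collapse whitespace, split into words.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]", " ", text)
  text <- gsub("\\s+", " ", trimws(text))
  if (nchar(text) == 0) return(character(0))
  strsplit(text, " ", fixed = TRUE)[[1]]
}

# Count every n-word sequence and store it as (prefix, next word, frequency).
build_ngrams <- function(lines, n, stop_words = character(0)) {
  grams <- unlist(lapply(lines, function(line) {
    words <- tokenize(line)
    words <- words[!words %in% stop_words]
    if (length(words) < n) return(character(0))
    starts <- seq_len(length(words) - n + 1)
    vapply(starts, function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  if (length(grams) == 0) {
    return(data.frame(prefix = character(0), word = character(0),
                      freq = integer(0), stringsAsFactors = FALSE))
  }
  counts <- table(grams)
  data.frame(
    prefix = sub(" ?[^ ]+$", "", names(counts)),  # everything but the last word
    word   = sub("^.* ", "", names(counts)),      # the last word
    freq   = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Try the longest context first, then fall back to shorter ones.
predict_next <- function(phrase, models, stop_words = character(0)) {
  words <- tokenize(phrase)
  words <- words[!words %in% stop_words]
  for (n in rev(seq_along(models))) {
    k <- n - 1                                    # words of context an n-gram needs
    if (length(words) < k) next
    prefix <- paste(tail(words, k), collapse = " ")
    hits <- models[[n]][models[[n]]$prefix == prefix, ]
    if (nrow(hits) > 0) return(hits$word[which.max(hits$freq)])
  }
  NA_character_
}

# models[[1]] = single words, models[[2]] = bigrams, models[[3]] = trigrams
models <- lapply(1:3, function(n) build_ngrams(corpus, n, stop_words))

predict_next("We love the", models, stop_words)  # "sunny"
predict_next("so rainy", models, stop_words)     # "weather" (found via the bigram model)
```

Because stop words are removed from both the corpus and the typed phrase, "We love the" is matched as the context "we love". A phrase with no known context falls through to the single-word table and returns the most frequent word overall.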
The Shiny app lets you:
- Select which data to base the prediction on (Select origin data for: Twitter or News)
- Enter the text whose next word you want to predict: Predict Next Word
- Use the ngram algorithm to predict the next word: Predict Next Word (note that non-empty text in Predict Next Word is required)
- Clear the app: clean button
- See the predicted word: Next word.
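A minimal sketch of how these controls could be wired together in Shiny is shown below. It is not the source of the published app: the input IDs (`source`, `phrase`, `predict`, `clear`, `next_word`), the layout, and the stub predictor with its tiny hard-coded lookup table are assumptions made for illustration, and only the visible labels are taken from the list above.

```r
library(shiny)

# Hypothetical stand-in for the real ngram models: one tiny lookup table per source.
bigrams <- list(
  Twitter = c(good = "morning", happy = "birthday"),
  News    = c(prime = "minister", last = "year")
)

# Stub predictor: looks up the last typed word in the table of the chosen source.
predict_next_stub <- function(phrase, source) {
  words <- strsplit(tolower(trimws(phrase)), "\\s+")[[1]]
  if (length(words) == 0) return("")
  hit <- bigrams[[source]][tail(words, 1)]
  if (is.na(hit)) "" else unname(hit)
}

ui <- fluidPage(
  titlePanel("Prediction of the next word"),
  radioButtons("source", "Select origin data for:", choices = c("Twitter", "News")),
  textInput("phrase", "Predict Next Word"),
  actionButton("predict", "Predict Next Word"),
  actionButton("clear", "Clean"),
  h4("Next word"),
  textOutput("next_word")
)

server <- function(input, output, session) {
  result <- reactiveVal("")

  # Predict only when the button is pressed and the text box is not empty.
  observeEvent(input$predict, {
    req(nzchar(trimws(input$phrase)))
    result(predict_next_stub(input$phrase, input$source))
  })

  # The clean button empties both the text box and the displayed prediction.
  observeEvent(input$clear, {
    updateTextInput(session, "phrase", value = "")
    result("")
  })

  output$next_word <- renderText(result())
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```

Keeping the prediction in a `reactiveVal` lets both buttons update the same output, and `req()` stops the prediction silently when the text box is empty.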