Step 1: Read the Twitter file from the Coursera-SwiftKey dataset, which is the largest of the provided corpora.
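A minimal sketch of this step, assuming the standard Coursera-SwiftKey folder layout (the path below is an assumption; adjust it to your local setup):

```r
# Read the Twitter corpus; a binary connection with skipNul = TRUE
# avoids warnings from embedded nul characters in the raw file.
con <- file("final/en_US/en_US.twitter.txt", open = "rb")
twitter_lines <- readLines(con, encoding = "UTF-8", skipNul = TRUE)
close(con)
length(twitter_lines)  # total number of tweets read
```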
Step 2: Initialize the object with ModelGenerator$new, which, for performance reasons, samples 0.1% of the Twitter file and cleans the data, generating sample-clean.txt with 2,361 lines. The algorithm does not stem the words; it keeps the original words, with a minimum frequency of one.
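A sketch of this initialization; the constructor arguments shown (input_file, sample_rate, output_file) are hypothetical names for illustration, not the documented ModelGenerator API:

```r
# Hypothetical argument names: the actual ModelGenerator$new signature
# may differ. Internally, the step samples 0.1% of the tweets, cleans
# the text, and keeps original (unstemmed) words with frequency >= 1.
set.seed(1234)  # make the 0.1% sample reproducible
generator <- ModelGenerator$new(
  input_file  = "final/en_US/en_US.twitter.txt",
  sample_rate = 0.001,               # 0.1% of the tweets
  output_file = "sample-clean.txt"   # ~2,361 cleaned lines
)
```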
Step 3: Generate the 4-gram model by applying the generate_model method of the object created in Step 2. The output is the file def-model-twitter.RDS.
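A sketch of this step, assuming generate_model needs no further arguments and that the resulting 4-gram model is persisted with saveRDS (the actual method signature may differ):

```r
# Build the 4-gram model from the cleaned sample and save it to disk
# so the Shiny app can load it later without redoing this work.
model <- generator$generate_model()
saveRDS(model, "def-model-twitter.RDS")
```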
Step 4: Predict the next word for the sentence entered by the user, applying the predict_word method to the 4-gram model. Additionally, the app shows four more predicted candidate words.
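A sketch of the prediction call, assuming predict_word takes the loaded 4-gram model, the user's phrase, and the number of extra candidates; these parameter names are assumptions for illustration:

```r
# Load the precomputed 4-gram model and predict the next word plus
# four additional candidate words for a sample phrase.
model <- readRDS("def-model-twitter.RDS")
predict_word(model, phrase = "thanks for the", n_candidates = 4)
```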
Selecting Data and Printing Outcomes
The user chooses a sentence, and the app predicts its next word plus four more candidate words.
The outcome is printed in the side box.
I run steps 1, 2, and 3 separately, outside the app, for best performance, because this process takes significant time to execute. It generates the file def-model-twitter.RDS, so the Shiny app uses only this file as input; it contains the 4-gram model used to predict words.
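A minimal Shiny sketch of this design: only def-model-twitter.RDS is loaded at startup, and the prediction appears in the side box. The input names and the predict_word call are illustrative assumptions, not the app's exact code:

```r
library(shiny)

# Load the 4-gram model built offline in steps 1-3.
model <- readRDS("def-model-twitter.RDS")

ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      textInput("phrase", "Type a sentence:"),
      verbatimTextOutput("prediction")   # side box with the outcome
    ),
    mainPanel()
  )
)

server <- function(input, output) {
  output$prediction <- renderPrint({
    req(input$phrase)
    # Assumed signature: next word plus four more candidates.
    predict_word(model, input$phrase, n_candidates = 4)
  })
}

shinyApp(ui, server)
```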