1/2/2020

Overview

I have created a Shiny app that predicts the next word, trained on text from Twitter, blogs, and the news. Feel free to try it! https://ainsleehampel.shinyapps.io/TestFinal/

  • The goal of the Shiny app is to predict the next word.

The Shiny App

  • The Shiny app is very user-friendly.
  • Just start typing in the word box and the app will start guessing the next word.
  • If the app cannot find a match for the next word, it will say "No match found, not found in data it was trained on".
  • The app even comes with easy-to-follow instructions.
  • The app can easily be integrated into your product for your customers.

Collecting the Data

Real-world data is rarely clean. After briefly looking over the data, I knew it needed some work.

  • The app uses the Twitter, news, and blog data together for a larger sample size. The larger sample exposes the app to more phrases.
  • The text mixes lowercase and uppercase letters.
  • The text also contains punctuation and numbers.
## [1] "How are you? Btw thanks for the RT. You gonna be in DC anytime soon? Love to see you. Been way, way too long."  
## [2] "When you meet someone special... you'll know. Your heart will beat more rapidly and you'll smile for no reason."
## [3] "they've decided its more fun if I don't."                                                                       
## [4] "So Tired D; Played Lazer Tag & Ran A LOT D; Ughh Going To Sleep Like In 5 Minutes ;)"                           
## [5] "Words from a complete stranger! Made my birthday even better :)"

Cleaning the Data

  • First, I combined the twitter, news, and blog data.

  • To clean the data, I converted all letters to lowercase and removed numbers and punctuation.
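
The cleaning steps above can be sketched in a few lines of base R. This is only an illustration, not the code behind the app: the helper name `clean_text` and the three toy input vectors are assumptions made for the example, and the real app may well have used a text-mining package instead.

```r
# Hypothetical helper (not from the original app): lowercases text and
# strips numbers and punctuation, using base R only.
clean_text <- function(lines) {
  lines <- tolower(lines)                   # fold everything to lowercase
  lines <- gsub("[0-9]+", "", lines)        # remove numbers
  lines <- gsub("[[:punct:]]", "", lines)   # remove punctuation
  lines <- gsub("\\s+", " ", lines)         # collapse runs of whitespace
  trimws(lines)                             # drop leading/trailing spaces
}

# Toy stand-ins for the three sources (assumed, for illustration only)
twitter <- c("Words from a complete stranger! Made my birthday even better :)")
news    <- c("Going To Sleep Like In 5 Minutes ;)")
blogs   <- c("they've decided its more fun if I don't.")

# Combine the sources first, then clean the combined corpus
corpus  <- c(twitter, news, blogs)
cleaned <- clean_text(corpus)
print(cleaned)
```

Note that deleting punctuation outright turns "don't" into "dont". Whether that is desirable depends on how the n-grams are matched against what the user types.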

Building the Algorithm

  • I used n-grams: bigrams, trigrams, and quadgrams.
  • By default, the model looks at the 4-gram first and falls back to the 3-gram or the 2-gram when it finds no match.
  • Next steps: Give words sentiment values ("hate" has a negative connotation and "amazing" has a positive connotation). With this idea, you could estimate the emotions of the user!
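
To make the fall-back idea concrete, here is a minimal sketch in base R. It is not the app's actual code: the function names `ngram_pairs` and `predict_next`, the toy corpus, and the exact fall-back order (4-gram, then 3-gram, then 2-gram) are all assumptions made for illustration.

```r
# Hypothetical helper: for every n-gram in the text, record the first
# n-1 words (the prefix) and the word that follows them.
ngram_pairs <- function(lines, n) {
  prefix <- character(0)
  nxt <- character(0)
  for (line in lines) {
    words <- strsplit(line, " +")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < n) next
    for (i in seq_len(length(words) - n + 1)) {
      prefix <- c(prefix, paste(words[i:(i + n - 2)], collapse = " "))
      nxt <- c(nxt, words[i + n - 1])
    }
  }
  list(prefix = prefix, nxt = nxt)
}

# Hypothetical predictor: try the 4-gram model first, then fall back to
# the 3-gram and the 2-gram (assumed order) until a match is found.
predict_next <- function(input, models) {
  words <- strsplit(tolower(trimws(input)), " +")[[1]]
  words <- words[nzchar(words)]
  for (n in c(4, 3, 2)) {
    if (length(words) < n - 1) next          # not enough typed words yet
    key <- paste(tail(words, n - 1), collapse = " ")
    m <- models[[as.character(n)]]
    candidates <- m$nxt[m$prefix == key]
    if (length(candidates) > 0) {
      counts <- table(candidates)            # most frequent follower wins
      return(names(counts)[which.max(counts)])
    }
  }
  "No match found, not found in data it was trained on"
}

# Toy, already-cleaned corpus (assumed, for illustration only)
corpus <- c("thanks for the follow", "thanks for the rt",
            "thanks for the rt", "love to see you")
models <- list("4" = ngram_pairs(corpus, 4),
               "3" = ngram_pairs(corpus, 3),
               "2" = ngram_pairs(corpus, 2))

print(predict_next("thanks for the", models))  # matched by the 4-gram model
print(predict_next("hope to see", models))     # falls back to the 3-gram model
print(predict_next("banana", models))          # no match at any level
```

Storing raw prefix and next-word pairs keeps the sketch short. A real app trained on a large corpus would more likely precompute frequency tables once and look them up at typing time.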