What's Next (Text Prediction Application)

Omayma Said
30 Dec 2016

About What's Next

What's Next is a Shiny application that uses NLP models for text prediction. It provides two options as shown in the screenshots:

  • Next Word Prediction

  • Trigram Model Probability

Fig.1-What’s Next Tabs

- What's Next on shinyapps.io: HERE
- Created by OmaymaS

Getting Started With What's Next

Next Word Prediction

Enter text to get the top three predicted words based on your input, similar to the keyboards on mobile phones and other gadgets.
Input                             W1      W2     W3
here we go                        to      on     again
I'd like to see                   you     the    what
I'd like to say thanks for the    follow  first  RT
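
The lookup behind a table like this can be sketched in a few lines. This is a minimal illustration, not the application's code: it ranks candidates by raw counts and falls back from a two-word context to a one-word context to overall word frequency. The names `build_followers` and `predict_next` and the toy corpus are assumptions made for the example.

```python
from collections import Counter, defaultdict

def build_followers(tokens, max_context=2):
    """Map each context (the 0 to max_context preceding words) to a Counter of next words."""
    followers = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for n in range(max_context + 1):
            if i - n >= 0:
                followers[tuple(tokens[i - n:i])][word] += 1
    return followers

def predict_next(text, followers, k=3, max_context=2):
    """Return up to k candidate next words, trying the longest context first."""
    words = text.lower().split()
    predictions = []
    for n in range(max_context, -1, -1):
        if n > len(words):
            continue
        context = tuple(words[len(words) - n:])  # empty tuple when n == 0
        for word, _ in followers.get(context, Counter()).most_common():
            if word not in predictions:
                predictions.append(word)
            if len(predictions) == k:
                return predictions
    return predictions

# Toy corpus (hypothetical); the application itself was trained on a much larger dataset.
corpus = "here we go again here we go to the park here we go on".split()
followers = build_followers(corpus)
top3 = predict_next("here we go", followers)
```

With this toy corpus, `top3` holds the three words that followed "we go" in the text. A real model would rank candidates by smoothed probabilities rather than raw counts.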

Trigram Model Probability

Enter three words (a trigram) to get their probability.

Input                Output
here we go           P(Here we go)
and we can do the    P(can do the)
the first            Not found in the trigram model
  • If the input has more than three words, only the last three are considered.
  • If the input has fewer than three words, or if the trigram is not found in the model, the user gets the message "Not found in the trigram model".
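
The two input rules above can be sketched as follows. This is an illustrative sketch only: `trigram_probability` is a hypothetical name, the probability values are made up, and lowercasing the input is an assumption rather than something the slides state.

```python
NOT_FOUND = "Not found in the trigram model"

def trigram_probability(text, trigram_probs):
    """Look up the probability of the last three words of `text`.

    `trigram_probs` maps (w1, w2, w3) tuples to probabilities.
    """
    words = text.lower().split()  # assumption: the lookup is case-insensitive
    if len(words) < 3:
        return NOT_FOUND
    return trigram_probs.get(tuple(words[-3:]), NOT_FOUND)

# Made-up probabilities, for illustration only.
trigram_probs = {
    ("here", "we", "go"): 0.12,
    ("can", "do", "the"): 0.03,
}

long_input = trigram_probability("and we can do the", trigram_probs)   # uses "can do the"
short_input = trigram_probability("the first", trigram_probs)          # fewer than three words
```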

Technical Details (Katz's backoff Model) 1/2

Using 70% of the given dataset:
  • Data was cleaned by removing symbols, profanity, extra spaces, non-Latin characters, and certain words/letters.
  • Unigram, bigram, and trigram data tables were generated with the count of each entry.
  • Good-Turing discounts were calculated for Nr<=5 (Nr: frequency of frequency).
  • Probability is calculated using Katz’s Backoff model as shown in the example in Fig-2.
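
The counting and discounting steps can be sketched as below. The slides do not show which discount formula was used, so this sketch assumes the form commonly paired with Katz's backoff, d_r = (r*/r - t) / (1 - t), with r* = (r + 1) N_{r+1} / N_r and t = (k + 1) N_{k+1} / N_1, applied for counts r <= k = 5. The function names and the frequency-of-frequency numbers are hypothetical.

```python
from collections import Counter

def frequency_of_frequency(tokens, n):
    """Count n-grams, then count how many distinct n-grams occur r times (N_r)."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts, Counter(counts.values())

def katz_discounts(freq_of_freq, k=5):
    """Return {r: d_r} for 1 <= r <= k from the N_r table."""
    n1 = freq_of_freq.get(1, 0)
    tail = (k + 1) * freq_of_freq.get(k + 1, 0) / n1 if n1 else 1.0
    discounts = {}
    for r in range(1, k + 1):
        n_r, n_next = freq_of_freq.get(r, 0), freq_of_freq.get(r + 1, 0)
        if tail >= 1.0 or n_r == 0 or n_next == 0:
            discounts[r] = 1.0  # assumption: no discount when the N_r table is too sparse
            continue
        r_star = (r + 1) * n_next / n_r  # Good-Turing adjusted count
        discounts[r] = (r_star / r - tail) / (1 - tail)
    return discounts

# N_r for a tiny example text.
bigram_counts, bigram_nr = frequency_of_frequency("here we go here we".split(), 2)

# Hypothetical N_r table, large enough for every discount to be defined.
example_nr = {1: 100, 2: 40, 3: 20, 4: 12, 5: 8, 6: 5}
discounts = katz_discounts(example_nr)
```

Counts above k are treated as reliable and left undiscounted, which is why only r <= 5 appears in the result.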

Advantage: Reactive

Tradeoffs: Accuracy vs. Speed

Areas for Improvement: Application Startup Speed

Technical Details (Katz's backoff Model) 2/2

Fig.2- P(here we go) using Katz’s Backoff
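
A computation like P(go | here we) can be sketched as follows. This is a minimal illustration of Katz's backoff under stated assumptions, not the application's implementation: the class `KatzTrigram`, the toy corpus, and the flat discount of 0.75 for counts up to 5 are all placeholders (a real model would plug in Good-Turing discounts).

```python
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

class KatzTrigram:
    def __init__(self, tokens, discount):
        self.uni = ngram_counts(tokens, 1)
        self.bi = ngram_counts(tokens, 2)
        self.tri = ngram_counts(tokens, 3)
        self.total = len(tokens)
        self.discount = discount  # maps a raw count r to d_r in (0, 1]

    def p_unigram(self, w):
        return self.uni[(w,)] / self.total

    def p_bigram(self, v, w):
        """P(w | v): discounted estimate if (v, w) was seen, else back off to P(w)."""
        c_v = self.uni[(v,)]
        if c_v == 0:
            return self.p_unigram(w)
        c_vw = self.bi[(v, w)]
        if c_vw > 0:
            return self.discount(c_vw) * c_vw / c_v
        seen = [x for (a, x) in self.bi if a == v]
        # Probability mass freed by discounting, shared among unseen words.
        left = 1 - sum(self.discount(self.bi[(v, x)]) * self.bi[(v, x)] / c_v for x in seen)
        denom = 1 - sum(self.p_unigram(x) for x in seen)
        return left / denom * self.p_unigram(w) if denom > 0 else 0.0

    def p_trigram(self, u, v, w):
        """P(w | u v): discounted estimate if (u, v, w) was seen, else back off to P(w | v)."""
        c_uv = self.bi[(u, v)]
        if c_uv == 0:
            return self.p_bigram(v, w)
        c_uvw = self.tri[(u, v, w)]
        if c_uvw > 0:
            return self.discount(c_uvw) * c_uvw / c_uv
        seen = [x for (a, b, x) in self.tri if (a, b) == (u, v)]
        left = 1 - sum(self.discount(self.tri[(u, v, x)]) * self.tri[(u, v, x)] / c_uv for x in seen)
        denom = 1 - sum(self.p_bigram(v, x) for x in seen)
        return left / denom * self.p_bigram(v, w) if denom > 0 else 0.0

# Toy corpus and placeholder discount (both hypothetical).
tokens = "here we go again here we go to the park here we go on".split()
model = KatzTrigram(tokens, discount=lambda r: 0.75 if r <= 5 else 1.0)
p_here_we_go = model.p_trigram("here", "we", "go")
```

The mass removed from seen trigrams by the discount is exactly what gets redistributed to unseen continuations through the lower-order model, so the probabilities for a given context still sum to one.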