2023-06-20

Building a predictive text model

  • Create an algorithm for predicting the next word given one or more words as input using NLP

  • A large corpus of blog, news and twitter data was loaded and analyzed

  • N-grams were extracted from the corpus and then used for building the predictive model

  • Various methods of improving the prediction accuracy and speed were explored

Algorithm

  • N-gram model strategy was used

  • Dataset was cleaned, lower-cased, removing links, twitter handles, punctuations, numbers and extra whitespaces, etc

Shiny App

  • Provides a text input box for user to type a sentence.

  • Detects words typed and predicts the next word reactively.

  • Shows the time needed to predict the next word

  • Shows word guessed using last two words, three words, and four words

  • Predicts using the longest, most frequent, matching N-gram

App and resourses