Text prediction

Theofilos Papapanagiotou
7 December 2014

About this product

Based on what the user has typed so far, predict the next word they want to type.


The capstone project required to finish the data science specialization organized by:

Product features

Quantity of data that brings quality:

  • Dataset of 1 million blog entries, 1 million news stories and 2 million tweets.

Speed of prediction that holds the user's interest:

  • Erroneous and low-probability n-grams are removed.
  • 2-, 3- and 4-grams are stored as hashes to reduce the model size.
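
The pruning and hashing steps can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the names `ngrams`, `hash_context` and `build_table`, the `min_count` threshold and the choice of CRC32 as the hash are all assumptions made for the example.

```python
import zlib
from collections import Counter

def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))

def hash_context(words):
    """Map a tuple of words to a 32-bit integer (CRC32 is an assumed choice)."""
    return zlib.crc32(" ".join(words).encode("utf-8"))

def build_table(tokens, n, min_count=2):
    """Count n-grams, drop rare ones, and key the rest by hashed context."""
    counts = Counter(ngrams(tokens, n))
    table = {}
    for gram, count in counts.items():
        if count < min_count:  # prune low-frequency n-grams
            continue
        context, word = gram[:-1], gram[-1]
        table.setdefault(hash_context(context), {})[word] = count
    return table

tokens = "i love you and i love you but i love cake".split()
table = build_table(tokens, 3)
print(table)
```

Storing an integer instead of the context words keeps each key at a fixed size, at the cost of a small chance that two different contexts collide.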

Accuracy of prediction that engages users:

  • Frequency of words in n-grams
  • Prediction based on the combined probability of the last words.
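
One way to combine the evidence from the last words is a weighted sum of relative frequencies. The sketch below assumes linear interpolation over the last one, two and three typed words; the slides do not say which combination rule the app uses, and the function names `train` and `predict`, the weights and the toy corpus are purely illustrative.

```python
from collections import Counter, defaultdict

def train(tokens, max_n=4):
    """Return {context tuple: Counter of next words} for 2- to max_n-grams."""
    model = defaultdict(Counter)
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            *context, word = tokens[i:i + n]
            model[tuple(context)][word] += 1
    return model

def predict(model, typed, weights=(0.2, 0.3, 0.5)):
    """Score candidates by a weighted sum of their relative frequencies
    after the last 1, 2 and 3 typed words (the weights are assumptions)."""
    words = typed.lower().split()
    scores = Counter()
    for k, weight in enumerate(weights, start=1):
        if len(words) < k:
            break  # not enough typed words for a longer context
        followers = model.get(tuple(words[-k:]))
        if not followers:
            continue  # this context was never seen in training
        total = sum(followers.values())
        for word, count in followers.items():
            scores[word] += weight * count / total
    return scores.most_common(1)[0][0] if scores else None

corpus = "thank you very much . thank you for coming . see you very soon".split()
model = train(corpus)
print(predict(model, "thank you"))
```

Longer contexts get larger weights here because a match on three words is usually more informative than a match on one; an unseen context simply contributes nothing to the score.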

N-gram training and prediction data flow

[Figure: n-gram training and prediction data flow]

Powerful engine, simple UI.




User experience

Love the simplicity

https://theofpa.shinyapps.io/capstoneapp/