Word Predictor App

Kanu Dutta
15 April 2016

Background

  • Most communication today happens through email, social media, and messaging apps
  • Texting is strenuous; word prediction apps make typing swift and easy
  • This app predicts the next word to be typed
  • An ingenious yet simple algorithm drives the engine
  • The conceptual basis for the model is natural language processing
  • R's memory limitations are well known, so the model tables must stay small
  • Focus is on speed and accuracy: a balanced approach, in a nutshell

Algorithm driving the engine

  • Kneser-Ney Smoothing
  • Modified version of absolute discounting algorithm
  • Calculates the probability distribution based on histories, effectively smoothing out the probabilities
  • Diversity of histories is important

  • A word like 'york' or 'francisco' should get a low unigram probability when backing off from bigram to unigram, because each almost always follows a single word ('new' or 'san')
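
The idea above can be illustrated with a minimal Python sketch of interpolated Kneser-Ney for bigrams. This is not the app's own code: the function names, the toy corpus, and the discount of 0.75 are all assumptions made for illustration. The continuation probability counts how many distinct words precede a word, rather than how often the word occurs.

```python
from collections import Counter, defaultdict

def kneser_ney_bigram(tokens, discount=0.75):
    """Build interpolated Kneser-Ney bigram estimates from a token list.

    Returns (prob, continuation), where prob(word, prev) is the smoothed
    bigram probability and continuation(word) is the lower-order estimate.
    The discount value is an assumed, commonly used default.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_counts = Counter()        # how often each word starts a bigram
    followers = defaultdict(set)      # distinct words seen after a history
    predecessors = defaultdict(set)   # distinct histories seen before a word
    for (prev, word), count in bigrams.items():
        history_counts[prev] += count
        followers[prev].add(word)
        predecessors[word].add(prev)
    bigram_types = len(bigrams)

    def continuation(word):
        # Share of distinct bigram types that end in this word.
        return len(predecessors[word]) / bigram_types

    def prob(word, prev):
        if history_counts[prev] == 0:
            return continuation(word)  # unseen history: back off entirely
        discounted = max(bigrams[(prev, word)] - discount, 0) / history_counts[prev]
        backoff_weight = discount * len(followers[prev]) / history_counts[prev]
        return discounted + backoff_weight * continuation(word)

    return prob, continuation

# Toy corpus: 'york' is frequent but only ever follows 'new'.
tokens = "new york new york new york the city a city".split()
prob, continuation = kneser_ney_bigram(tokens)
york_score = continuation("york")
city_score = continuation("city")
total_after_new = sum(prob(w, "new") for w in set(tokens))
```

In this toy corpus 'york' occurs more often than 'city', yet 'city' gets the higher continuation score because it follows two different words.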

Model Generation

  • Data is scrubbed and munged
  • Unigram, bigram, and higher-order (trigram) tables are created
  • Probabilities are calculated using Kneser-Ney smoothing
  • Words are mapped to numbers using a hash function to reduce the size of the table objects
  • N-gram tables are split into several smaller parts for faster access
  • The top three words, in decreasing order of probability, are suggested
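
The hashing and top-three lookup described above might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the app's implementation: CRC32 stands in for whatever hash function the app uses, the names `word_id`, `build_table`, and `suggest` are hypothetical, and the probabilities are invented.

```python
import zlib
from heapq import nlargest

def word_id(word):
    """Map a word to a stable 32-bit integer (CRC32 is an assumed stand-in)."""
    return zlib.crc32(word.lower().encode("utf-8"))

def build_table(rows):
    """Build a lookup table keyed by hashed history.

    rows is an iterable of (history_words, next_word, probability).
    Returns the table plus a reverse map from integer id back to the word.
    """
    table, vocab = {}, {}
    for history, word, p in rows:
        key = tuple(word_id(w) for w in history)
        table.setdefault(key, []).append((p, word_id(word)))
        vocab[word_id(word)] = word
    return table, vocab

def suggest(table, vocab, history, k=3):
    """Return up to k next words for a history, most probable first."""
    key = tuple(word_id(w) for w in history)
    best = nlargest(k, table.get(key, []))
    return [vocab[i] for _, i in best]

# Invented probabilities for the history 'new'.
rows = [
    (("new",), "york", 0.60),
    (("new",), "car", 0.20),
    (("new",), "jersey", 0.15),
    (("new",), "idea", 0.05),
]
table, vocab = build_table(rows)
top_three = suggest(table, vocab, ["New"])
```

Storing integers instead of strings keeps each table row small. A 32-bit hash can collide on a large vocabulary, so a real table would need a way to detect or tolerate collisions.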

How Stuff Works 1/2

  • The user types in the 'Type here' box on the left side
  • The box must be cleared before entering new text
  • Maximum of eight words per sentence
  • Five words are suggested on the right-hand side
  • Instant response, lean app

How Stuff Works 2/2

  • A hash function is applied to the input text, and the result is passed to the prediction model.

  • When the input box is empty, the most probable first word is returned from the unigram table.

  • While a word is being typed, several possible completions are offered dynamically.

  • For example, if 'kno' is typed, then 'knowledge', 'know', and 'knows' will be suggested.
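
The empty-box and partial-word behaviours can be sketched together in Python, assuming completions are ranked by unigram probability. The app's actual ranking rule is not stated here, so `complete` is a hypothetical helper and the probabilities are invented for illustration.

```python
def complete(prefix, unigram_probs, k=3):
    """Return up to k words starting with prefix, most probable first.

    An empty prefix matches every word, so the result is simply the most
    probable words in the unigram table.
    """
    prefix = prefix.lower()
    matches = [w for w in unigram_probs if w.startswith(prefix)]
    return sorted(matches, key=lambda w: unigram_probs[w], reverse=True)[:k]

# Invented unigram probabilities.
unigram_probs = {
    "the": 0.05,
    "know": 0.004,
    "kind": 0.002,
    "knowledge": 0.001,
    "knows": 0.0008,
}
partial = complete("kno", unigram_probs)
first_word = complete("", unigram_probs, k=1)
```

A linear scan like this is fine for a toy table. For a large vocabulary, a sorted word list or a trie would likely be needed to keep the response instant.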

Thank You