Word Predictor App

Kanu Dutta
15 April 2016

kanudutta@gmail.com

App: https://kanudutta.shinyapps.io/word/

Background

Communication these days happens mostly through email, social media, and messaging apps
Texting is strenuous; word prediction apps make typing swift and easy
This app predicts the next word to be typed
An ingenious yet simple algorithm drives the engine
The conceptual basis for model building is natural language processing
R's memory limitations are well known, which constrains the size of the model
The focus is on speed and accuracy: in a nutshell, a balanced approach

Algorithm driving the engine

Kneser-Ney Smoothing
Modified version of the absolute discounting algorithm
Calculates the probability distribution based on histories, effectively smoothing out probabilities
Diversity of histories is important
A word like "York" or "Francisco" should get a low unigram probability when backing off from bigram to unigram, because each almost always follows one specific word ("New" or "San")
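
The "diversity of histories" idea can be sketched in a few lines. The example below is a minimal illustration in Python (not the app's R code) of interpolated Kneser-Ney for bigrams; the function names, the toy corpus, and the discount of 0.75 are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict

def kneser_ney_bigram(tokens, discount=0.75):
    """Build interpolated Kneser-Ney bigram estimates from a token list.

    Returns (p, p_continuation). The names and the default discount are
    assumptions for this sketch, not taken from the app.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_count = Counter()       # how often each word occurs as a history
    followers = defaultdict(set)    # distinct words seen after each history
    preceders = defaultdict(set)    # distinct histories seen before each word
    for (prev, word), n in bigrams.items():
        history_count[prev] += n
        followers[prev].add(word)
        preceders[word].add(prev)
    n_bigram_types = len(bigrams)

    def p_continuation(word):
        # Share of distinct bigram types that end in `word`:
        # a word with many different histories scores high.
        return len(preceders[word]) / n_bigram_types

    def p(word, prev):
        if history_count[prev] == 0:
            # Unseen history: fall back entirely to the continuation estimate.
            return p_continuation(word)
        discounted = max(bigrams[(prev, word)] - discount, 0) / history_count[prev]
        backoff_weight = discount * len(followers[prev]) / history_count[prev]
        return discounted + backoff_weight * p_continuation(word)

    return p, p_continuation

# Toy corpus (invented): "york" and "tea" both occur twice, but "york"
# only ever follows "new", while "tea" follows two different words.
tokens = "i like new york . we like new york . i like tea . we want tea".split()
p, p_cont = kneser_ney_bigram(tokens)
york_score = p_cont("york")
tea_score = p_cont("tea")
```

In this toy corpus "york" and "tea" have the same raw count, yet `tea_score` comes out higher than `york_score` because "tea" has more distinct histories. That is the effect described above for "York" and "Francisco".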

Model Generation

Data is scrubbed and munged
Unigram, bigram, and higher-order (trigram) tables are created
Probabilities are calculated using Kneser-Ney smoothing
Words are mapped to numbers using a hash function to reduce the size of the table object
N-gram tables are split into several smaller parts for faster access
The top three words, in decreasing order of probability, are suggested
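
The lookup side of these steps might look roughly like the sketch below. It is written in Python for illustration rather than in the app's R, and every detail is an assumption: CRC32 stands in for whatever hash function the app uses, the table layout is hypothetical, and the probabilities are invented rather than computed by Kneser-Ney.

```python
import zlib
from collections import defaultdict

def word_id(word):
    # Stand-in hash (an assumption): CRC32 gives a stable 32-bit integer
    # per word. Collisions are possible but rare.
    return zlib.crc32(word.encode("utf-8"))

def build_table(scored_ngrams):
    """Index (history_words, next_word, probability) rows by hashed history."""
    table = defaultdict(list)
    for history, next_word, prob in scored_ngrams:
        key = tuple(word_id(w) for w in history)
        table[key].append((prob, next_word))
    # Sort once at build time so each lookup is just a slice.
    for key in table:
        table[key].sort(reverse=True)
    return table

def top_k(table, history, k=3):
    """Return up to k next words for a history, most probable first."""
    key = tuple(word_id(w) for w in history)
    return [word for _, word in table.get(key, [])[:k]]

# Invented bigram rows, for illustration only.
rows = [
    (("new",), "york", 0.60),
    (("new",), "car", 0.20),
    (("new",), "year", 0.15),
    (("new",), "idea", 0.05),
]
bigram_table = build_table(rows)
suggestions = top_k(bigram_table, ["new"])
```

Keeping one such table per n-gram order is one way to keep each lookup small. For simplicity the sketch stores the predicted word as a string and hashes only the history.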

How Stuff Works 1/2

Type the input in the 'Type here' box on the left side
Clear the box before entering new text
Maximum of eight words per sentence
Five words are suggested on the right-hand side
Instant response, lean app

How Stuff Works 2/2

A hash function is applied to the input text, and the result is passed to the prediction model.
When the input box is empty, the most probable first word is returned from the unigram table.
While a word is being typed, several completions of it are offered dynamically; for example, if "kno" is typed, then "knowledge", "know", and "knows" are suggested.
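
Completing a partially typed word can be sketched as a prefix filter over the unigram table. The Python below is an illustration rather than the app's R code; `complete_prefix` is a hypothetical helper and the probabilities are invented.

```python
def complete_prefix(prefix, unigram_probs, k=3):
    """Return up to k words starting with `prefix`, most probable first.

    An empty prefix matches every word, so the same call also covers the
    empty-input case by returning the most probable unigrams.
    """
    matches = [w for w in unigram_probs if w.startswith(prefix)]
    return sorted(matches, key=lambda w: unigram_probs[w], reverse=True)[:k]

# Invented unigram probabilities, for illustration only.
unigram_probs = {
    "the": 0.05,
    "know": 0.004,
    "kind": 0.002,
    "knowledge": 0.001,
    "knows": 0.0005,
}
completions = complete_prefix("kno", unigram_probs)
first_word = complete_prefix("", unigram_probs, k=1)
```

Here `completions` holds "know", "knowledge", and "knows"; their order depends entirely on the invented probabilities. A linear scan is fine for a sketch, though a sorted vocabulary or a trie would scale better for a large table.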

Thank You