WordPsychic

Apurv Kaushal
24-02-2019

An n-gram model approach to predicting the next word

App Details

Word Teller is a web app for predicting the next word of a sentence.

It is intended for messaging apps and other text-entry interfaces.

It is built on an n-gram model under the Markov assumption: the next word depends only on a fixed number of preceding words. Salient features of the development process:

  • N = 3: a trigram model that predicts the next word from the last two words of the sentence.
  • Trained on English-language text from news, blogs and Twitter data sets.
  • The training set used for development contains around 840,000 English sentences.
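
The trigram idea above can be sketched in a few lines of Python. This is a minimal illustration and not the app's actual code: the function names `train_trigrams` and `predict_next`, the whitespace tokenizer and the toy corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_trigrams(sentences):
    """Count which word follows each pair of consecutive words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # naive whitespace tokenizer (assumption)
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def predict_next(counts, text, k=1):
    """Return up to k most frequent followers of the last two words of text."""
    words = text.lower().split()
    if len(words) < 2:
        return []  # a trigram model needs two words of context
    context = tuple(words[-2:])
    followers = counts.get(context, Counter())
    return [word for word, _ in followers.most_common(k)]


# Toy corpus, made up for illustration only.
corpus = ["how are you doing today", "how are you feeling", "how are you doing"]
counts = train_trigrams(corpus)
suggestions = predict_next(counts, "so how are you", k=2)
```

Only the last two words of the input are used as context, which is what the Markov assumption amounts to for N = 3. A context never seen in training yields an empty list here; a real system would typically back off to shorter contexts.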

App Interface

The Word Teller app has two inputs:

  • Enter your words here: a text box where the user enters the sentence fragment.
  • How many words recommended: the number of candidate next words the user wants to see.

App Instructions & Features

After filling in the two inputs, the user clicks the submit button. The default input values are “hi you” and 1.

Using these two inputs, the app predicts next words in each of two categories:

  • StopWords: frequently occurring words of the language
  • Non StopWords: all other words of the language

For example, with “hi you” and 1 as the input, the app returns the single most likely next word in each category: one stop word and one non stop word.
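
The split into the two categories could look roughly like the sketch below. The stop word list, the function name `top_k_by_category` and the follower counts are hypothetical and chosen only to illustrate the idea; the app's real stop word list and counts are not given here.

```python
from collections import Counter

# Tiny illustrative stop word list (assumption); a real list would be much longer.
STOP_WORDS = {"the", "a", "to", "and", "are", "is", "for", "in"}


def top_k_by_category(follower_counts, k=1, stop_words=STOP_WORDS):
    """Split candidate next words into stop words and non stop words,
    then return the k most frequent words of each category."""
    stop = Counter()
    non_stop = Counter()
    for word, count in follower_counts.items():
        if word in stop_words:
            stop[word] = count
        else:
            non_stop[word] = count
    return {
        "stop_words": [w for w, _ in stop.most_common(k)],
        "non_stop_words": [w for w, _ in non_stop.most_common(k)],
    }


# Hypothetical follower counts for the context "hi you" (made-up numbers).
followers = Counter({"are": 5, "guys": 3, "the": 2, "look": 1})
result = top_k_by_category(followers, k=1)
```

Ranking the two categories separately keeps very frequent stop words from crowding out more informative suggestions.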

App Performance

Prediction accuracy when tested on:

  • 1,000 test set examples: 58.5%
  • 10,000 test set examples: 28%
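
How the accuracy figures were computed is not stated. One common way to score a next-word predictor is top-1 accuracy, sketched below under that assumption; the `accuracy` function, the stand-in predictor and the toy examples are all hypothetical.

```python
def accuracy(predict, examples):
    """Fraction of examples whose true next word equals the top prediction.

    predict: function mapping a context string to a list of candidate words.
    examples: list of (context, true_next_word) pairs.
    """
    if not examples:
        return 0.0
    hits = 0
    for context, target in examples:
        candidates = predict(context)
        if candidates and candidates[0] == target:
            hits += 1
    return hits / len(examples)


# Toy stand-in predictor and test set (both made up for illustration).
LOOKUP = {"how are": ["you"], "thank you": ["very"]}


def toy_predict(context):
    return LOOKUP.get(context, [])


toy_examples = [
    ("how are", "you"),
    ("thank you", "so"),
    ("see you", "soon"),
    ("how are", "you"),
]
score = accuracy(toy_predict, toy_examples)
```

Contexts for which the predictor returns nothing count as misses in this sketch.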

Bias & Variance Check

  • Train set error on 10,000 examples: 31%
  • Test set error on 10,000 examples: 28%