Next Word App

Ricardo S. Carvalho
11 June 2016

Capstone Project

Data Science Specialization

How does the app calculate the probability of the next word?

Why does the app use Modified Kneser-Ney Smoothing?

Because it is one of the best known backoff smoothing methods. Reference: http://www.cs.jhu.edu/~jason/465/PDFSlides/lect05-smoothing.pdf

How does the app deal with unknown trigrams, bigrams, or unigrams?

It backs off from unknown trigrams to bigrams and from unknown bigrams to unigrams. At the unigram level it uses only known unigrams.
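
The backoff order described above can be sketched as follows. This is only an illustration: the table names (TRIGRAMS, BIGRAMS, UNIGRAMS), their contents, and the suggest function are hypothetical and are not taken from the app.

```python
from typing import Dict, List, Tuple

# Hypothetical pre-ranked tables: context words -> suggestions, best first.
TRIGRAMS: Dict[Tuple[str, str], List[str]] = {
    ("thanks", "for"): ["the", "your", "being"],
}
BIGRAMS: Dict[Tuple[str], List[str]] = {
    ("for",): ["the", "a", "your"],
    ("of",): ["the", "a", "my"],
}
UNIGRAMS: List[str] = ["the", "to", "and"]  # known unigrams only


def suggest(text: str, k: int = 3) -> List[str]:
    """Return up to k next-word suggestions, backing off to shorter contexts."""
    words = text.lower().split()
    # Try the last two words first (trigram context).
    if len(words) >= 2:
        hit = TRIGRAMS.get((words[-2], words[-1]))
        if hit:
            return hit[:k]
    # Back off to the last word only (bigram context).
    if words:
        hit = BIGRAMS.get((words[-1],))
        if hit:
            return hit[:k]
    # Back off to the known unigrams.
    return UNIGRAMS[:k]
```

With these toy tables, suggest("Thanks for") is answered from the trigram table, suggest("a lot of") falls back to the bigram table, and an input ending in an unseen word falls back to the unigrams.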

The only difference between the modified version and standard Kneser-Ney smoothing is that the discount is different for each n-gram.
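
One common way to obtain different discounts is the Chen and Goodman formulation, where the discount depends on whether an n-gram was seen once, twice, or three or more times. The sketch below assumes that formulation and estimates the three discounts from counts-of-counts. The function name and the toy counts are hypothetical, and the app's actual discount values are not given here.

```python
from collections import Counter
from typing import Dict, Tuple


def modified_kn_discounts(
    ngram_counts: Dict[Tuple[str, ...], int]
) -> Tuple[float, float, float]:
    """Estimate the discounts (D1, D2, D3+) from counts-of-counts n1..n4."""
    # n[c] is the number of distinct n-grams seen exactly c times.
    n = Counter(ngram_counts.values())
    n1, n2, n3, n4 = n[1], n[2], n[3], n[4]
    if min(n1, n2, n3) == 0:
        raise ValueError("need n-grams with counts 1, 2 and 3 to estimate discounts")
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1        # discount for n-grams seen once
    d2 = 2 - 3 * y * n3 / n2        # discount for n-grams seen twice
    d3_plus = 3 - 4 * y * n4 / n3   # discount for n-grams seen 3+ times
    return d1, d2, d3_plus


# Toy bigram counts (assumed data): four seen once, two twice, one 3x, one 4x.
toy_counts = {
    ("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 2, ("e", "f"): 2,
    ("f", "g"): 3,
    ("g", "h"): 4,
}
d1, d2, d3_plus = modified_kn_discounts(toy_counts)
```

Standard Kneser-Ney would subtract one fixed discount from every seen n-gram. Here the amount subtracted depends on the count.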

How to start using?

Start by typing any text in the input box on the left side of this page.
Right above the input, the app shows three suggestions for the next word based on the text provided.

What are these suggestions for the next word?

They are, from LEFT to RIGHT, the most probable words you would type next based on the input provided.
Therefore, the FIRST WORD on the LEFT is the MOST PROBABLE WORD you would type based on the input provided.

Why is the app so fast at showing the results?

It loads pre-computed probabilities, so it does not recalculate them every time; it just performs a fast lookup.
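
A minimal sketch of this precompute-then-lookup idea, assuming the smoothed probabilities are already available as a table. The names PROBS, build_lookup, and top_suggestions, and all probability values, are made up for illustration and do not come from the app.

```python
import heapq
from typing import Dict, List

# Hypothetical smoothed probabilities: context -> {next word: probability}.
PROBS: Dict[str, Dict[str, float]] = {
    "thanks for": {"the": 0.41, "your": 0.22, "being": 0.09, "all": 0.05},
    "for": {"the": 0.30, "a": 0.12, "your": 0.04},
}


def build_lookup(probs: Dict[str, Dict[str, float]], k: int = 3) -> Dict[str, List[str]]:
    """Offline step: keep only the k most probable words per context, best first."""
    return {
        context: heapq.nlargest(k, candidates, key=candidates.get)
        for context, candidates in probs.items()
    }


# Computed once, before any user input arrives.
LOOKUP = build_lookup(PROBS)


def top_suggestions(context: str) -> List[str]:
    """Query step: a single dictionary lookup, with no probability arithmetic."""
    return LOOKUP.get(context, [])
```

Because the ranking is done once in build_lookup, each keystroke costs only one dictionary access, and the returned list is already ordered from most to least probable.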

What is the novel approach here?

Modified Kneser-Ney Smoothing combined with backoff, plus very fast results for the suggested next-word predictions.