Elena Chernousova
07/11/2016
Building an N-gram model: a probability distribution over strings that attempts to reflect how frequently a string occurs as a sentence, based on the counts in a training set.
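To make the idea of "probability from counts" concrete, here is a minimal sketch in Python of an unsmoothed bigram model. It is not the app's actual code: the function names, the `<s>` and `</s>` boundary markers, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Count contexts and bigrams; <s> and </s> mark sentence boundaries (assumed convention)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts

def sentence_probability(sentence, context_counts, bigram_counts):
    """Multiply the relative frequencies count(prev, word) / count(prev) along the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob

# Toy training set (made up for this example).
contexts, bigrams = train_bigram_model(["the cat sat", "the dog sat"])
print(sentence_probability("the cat sat", contexts, bigrams))   # 0.5
print(sentence_probability("the bird sat", contexts, bigrams))  # 0.0
```

The second result shows why raw counts are not enough: any sentence containing an unseen word pair gets probability zero, which is the problem smoothing addresses.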
Applying smoothing: probabilities are calculated by taking into account the order of words and their context. The unigram probability used should not be proportional to the number of occurrences of a word, but instead to the number of different words that it follows.
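The unigram estimate described here corresponds to what is usually called the continuation probability in Kneser-Ney smoothing. A minimal sketch, assuming the training data is already available as (previous word, word) pairs; the function name and the toy data are hypothetical and only illustrate the counting:

```python
from collections import defaultdict

def continuation_probabilities(bigrams):
    """Score each word by how many distinct words it follows,
    normalized by the number of distinct bigram types."""
    predecessors = defaultdict(set)
    for prev, word in bigrams:
        predecessors[word].add(prev)
    total_types = sum(len(p) for p in predecessors.values())
    return {w: len(p) / total_types for w, p in predecessors.items()}

# "francisco" is frequent but only ever follows "san";
# "cat" is rarer but follows three different words.
pairs = [("san", "francisco")] * 5 + [("the", "cat"), ("a", "cat"), ("my", "cat")]
print(continuation_probabilities(pairs))
# {'francisco': 0.25, 'cat': 0.75}
```

With plain occurrence counts "francisco" would outrank "cat" (5 versus 3); counting distinct preceding words reverses that ranking.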
Resources used: NLP Lunch Tutorial: Smoothing by Bill MacCartney, 21 April 2005, http://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
Enter one or two words into the text field to get a prediction.
Expected result: the app shows the five most probable next words if a prediction can be made from your input; otherwise it shows the most probable words overall, computed regardless of the input.
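The expected behaviour can be sketched as a lookup with fallback. This is only an illustration under stated assumptions, not the app's implementation: it uses raw counts without smoothing, and the names `build_tables` and `predict`, the `max_context` parameter, and the toy sentences are all hypothetical.

```python
from collections import Counter, defaultdict

def build_tables(sentences, max_context=2):
    """Count single words, and which words follow each one- or two-word context."""
    unigram = Counter()
    following = defaultdict(Counter)  # context tuple -> counts of next words
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigram.update(tokens)
        for i, word in enumerate(tokens):
            for n in range(1, max_context + 1):
                if i - n >= 0:
                    following[tuple(tokens[i - n:i])][word] += 1
    return unigram, following

def predict(text, unigram, following, k=5, max_context=2):
    """Try the longest known context first; fall back to the overall top words."""
    tokens = text.lower().split()
    for n in range(min(max_context, len(tokens)), 0, -1):
        context = tuple(tokens[-n:])
        if context in following:
            return [w for w, _ in following[context].most_common(k)]
    return [w for w, _ in unigram.most_common(k)]

unigram, following = build_tables(
    ["i like green tea", "i like black coffee", "you like green apples"]
)
print(predict("like", unigram, following))   # ['green', 'black']
print(predict("zebra", unigram, following))  # unseen input: overall top five words
```

An unseen input such as "zebra" matches no stored context, so the function returns the most frequent words overall, mirroring the fallback described above.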