Cristian Santa
January 2016
The predictive model is based on Katz's back-off model. Essentially, this means that if an n-gram has been seen more than k times in training, the conditional probability of a word given its history is proportional to the maximum likelihood estimate of that n-gram. Otherwise, the conditional probability is equal to the back-off conditional probability of the (n-1)-gram.
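The back-off rule described above can be illustrated with a minimal sketch. It is not the application's actual code: it is restricted to bigrams backing off to unigrams, it substitutes a fixed absolute discount for the Good-Turing discounting used in Katz's original formulation, and the names train_counts, backoff_prob, predict_next, k and discount are hypothetical.

```python
from collections import Counter


def train_counts(tokens):
    """Count unigrams and bigrams in a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def backoff_prob(word, prev, unigrams, bigrams, k=0, discount=0.5):
    """Simplified back-off estimate of P(word | prev).

    Assumption: a fixed absolute discount stands in for Good-Turing.
    """
    total = sum(unigrams.values())
    # Counts of every word observed right after `prev`.
    followers = {w: c for (p, w), c in bigrams.items() if p == prev}
    hist_count = sum(followers.values())
    if hist_count == 0:
        # History never seen as a bigram prefix: use the unigram estimate.
        return unigrams[word] / total
    # Bigrams seen more than k times keep a discounted ML estimate.
    seen = {w: c for w, c in followers.items() if c > k}
    if word in seen:
        return (seen[word] - discount) / hist_count
    # Probability mass freed by discounting (and by bigrams seen <= k times).
    leftover = 1.0 - sum((c - discount) / hist_count for c in seen.values())
    # Unigram mass of the words that must share the leftover.
    unseen_mass = sum(c for w, c in unigrams.items() if w not in seen) / total
    if unseen_mass == 0:
        return 0.0
    return leftover * (unigrams[word] / total) / unseen_mass


def predict_next(prev, unigrams, bigrams):
    """Return the vocabulary word with the highest back-off probability."""
    return max(unigrams, key=lambda w: backoff_prob(w, prev, unigrams, bigrams))


tokens = "the cat sat on the mat the cat ran".split()
unigrams, bigrams = train_counts(tokens)
print(predict_next("the", unigrams, bigrams))
```

On this toy corpus the probabilities assigned to all vocabulary words after a given history sum to one, which is the property the back-off weight is meant to preserve. A full implementation would extend the same recursion to higher-order n-grams.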
The goal of this application is to predict the next word in a sentence that the user types in a text box. The dataset used for this app is the English portion of a collection of corpora called HC Corpora, divided into three sources: News, Blogs and Twitter.
The corpora were collected from numerous webpages, with the aim of obtaining a varied and comprehensive corpus of current usage of the respective language.