Metin Turgal
18.04.2016
This application predicts the next word of the text typed by the user, using a language model built on a sample of Twitter, blog and news data.
The language model is built and applied in the following steps:
1. Take manageable samples of the data
2. Build 2-grams, 3-grams and 4-grams with their frequencies
3. Sort them according to the frequencies
4. When predicting, the longest matching n-gram takes priority; among matches of that length, the most frequent one is chosen.
5. If no n-gram matches, the most frequent word is given as the prediction.
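The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual code: the function names (tokenize, build_model, predict_next), the whitespace tokenizer and the tiny sample corpus are all hypothetical, and the real model was built on much larger Twitter, blog and news samples.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real pipeline
    # would also strip punctuation, numbers, profanity and so on.
    return text.lower().split()


def build_model(lines, orders=(2, 3, 4)):
    """Count single words and n-grams (steps 1 to 3).

    tables[n] maps an (n-1)-word context tuple to a Counter of the
    words that followed that context in the sample.
    """
    unigrams = Counter()
    tables = {n: defaultdict(Counter) for n in orders}
    for line in lines:
        words = tokenize(line)
        unigrams.update(words)
        for n in orders:
            for i in range(len(words) - n + 1):
                context = tuple(words[i:i + n - 1])
                next_word = words[i + n - 1]
                tables[n][context][next_word] += 1
    return unigrams, tables


def predict_next(text, unigrams, tables):
    """Try the longest n-gram first, then shorter ones (steps 4 and 5)."""
    words = tokenize(text)
    for n in sorted(tables, reverse=True):
        if len(words) < n - 1:
            continue  # not enough typed words for this n-gram order
        context = tuple(words[-(n - 1):])
        followers = tables[n].get(context)
        if followers:
            # Most frequent continuation of the longest matching context.
            return followers.most_common(1)[0][0]
    # No n-gram matched: fall back to the most frequent single word.
    return unigrams.most_common(1)[0][0] if unigrams else None


# Hypothetical sample data, standing in for the sampled corpus.
sample = [
    "thanks for the follow",
    "thanks for the follow",
    "thanks for the help",
    "looking forward to the weekend",
]
unigrams, tables = build_model(sample)

print(predict_next("Thanks for the", unigrams, tables))    # follow
print(predict_next("something unseen", unigrams, tables))  # the
```

In the first call the 4-gram context "thanks for the" is found, so its most frequent continuation is returned. In the second call no 2-, 3- or 4-gram matches, so the sketch falls back to the most frequent word in the sample.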
The application interface is simple: the user enters text and, upon pressing the submit button, the application shows the next word predicted by the model.
Although the prediction can be based on 2-word, 3-word and 4-word groups, only the aggregated best result is shown, to keep the interface lean and simple. The application is largely self-explanatory, but instructions are also provided in a separate tab.