Takes as input a phrase (multiple words) in a text box;
Outputs a prediction of the next word in another text box; and
Works only with text in English.
How does it work?
How was it built?
First, a data sample was created from the imported data available to the project;
Then this sample was cleaned by converting it to lowercase and removing punctuation, links, extra whitespace, profanity, numbers and all kinds of special characters;
The data sample was also tokenized into bi-, tri- and quadgrams;
The n-gram term frequency matrices were then converted into frequency dictionaries;
The resulting data frames are used to predict the next word for the input text, based on the frequencies in the underlying n-gram dictionaries.
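
The build steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function names (clean, build_tables, predict_next), the placeholder profanity list and the sample sentences are all hypothetical, and the rule of trying the longest n-gram first and falling back to shorter ones is one plausible way to combine the dictionaries rather than the method confirmed by the project.

```python
import re
from collections import Counter, defaultdict

# Hypothetical placeholder; the project's real profanity list is not given.
PROFANITY = {"darn"}

def clean(text):
    """Lowercase the text, drop links, punctuation, numbers, special
    characters and profanity, and return the remaining words."""
    text = text.lower()
    text = re.sub(r"(https?://|www\.)\S+", " ", text)  # links
    text = re.sub(r"[^a-z\s]", " ", text)              # everything but letters
    return [w for w in text.split() if w not in PROFANITY]

def build_tables(lines, orders=(2, 3, 4)):
    """For each n-gram order, map the first n-1 words to a Counter
    of the words that followed them in the sample."""
    tables = {n: defaultdict(Counter) for n in orders}
    for line in lines:
        words = clean(line)
        for n in orders:
            for i in range(len(words) - n + 1):
                prefix = tuple(words[i:i + n - 1])
                tables[n][prefix][words[i + n - 1]] += 1
    return tables

def predict_next(phrase, tables):
    """Assumed strategy: try the longest n-gram first, then fall back
    to shorter ones; return None when no dictionary has a match."""
    words = clean(phrase)
    for n in sorted(tables, reverse=True):
        if len(words) < n - 1:
            continue  # phrase too short for this n-gram order
        followers = tables[n].get(tuple(words[-(n - 1):]))
        if followers:
            return followers.most_common(1)[0][0]
    return None

# Made-up sample data, for illustration only.
sample = [
    "I love data science.",
    "I love data analysis!",
    "I love data science 2 much",
    "Visit http://example.com for data science",
]
tables = build_tables(sample)
print(predict_next("I love data", tables))
```

Keeping one dictionary per n-gram order makes each lookup a single key access, so the prediction stays fast even when the sample is large.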