Vaz, Thiago
2018, March 14th
There are two trends changing the way we deal with communication: (i) easy access to information and (ii) new channels spreading this info in a way we've never seen before.
To make communication easier, intelligent systems are being developed to take unnecessary chores off our hands or to speed up common activities, mostly by drawing on a database of prior knowledge.
To achieve this goal, the app is based on a model built in the following steps:
Data was acquired from (a sample of) Twitter, newspapers and blogs
Data was pre-processed/cleaned to enable better analysis and to avoid undesirable output (such as profanity)
We calculated the n-grams: contiguous sequences of n items from the given sample of text
We created a Katz back-off model, which estimates the conditional probability of a word given its history in the n-gram, “backing off” to shorter histories when the data for a longer one is too sparse.
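The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the app's actual code: the function names, the placeholder profanity list and the tiny corpus are all hypothetical, and the scoring is a simplified "stupid back-off" with a fixed weight of 0.4 rather than a full Katz model, which would add Good-Turing discounting and normalised back-off weights.

```python
import re
from collections import Counter

# Hypothetical placeholder; a real app would load a proper profanity list.
PROFANITY = {"darn"}

def tokenize(text, banned=PROFANITY):
    """Lowercase the text, keep runs of letters/apostrophes, drop banned words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in banned]

def count_ngrams(tokens, max_n=3):
    """Return {n: Counter of n-gram tuples} for n = 1..max_n."""
    return {
        n: Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        for n in range(1, max_n + 1)
    }

def predict(history, counts, max_n=3, alpha=0.4, top=3):
    """Rank candidate next words, backing off to shorter histories.

    Simplified back-off (an assumption, not full Katz): a word seen after
    the longest matching history keeps its relative frequency; each step
    down to a shorter history multiplies the score by alpha.
    """
    scores = {}
    weight = 1.0
    for n in range(max_n, 0, -1):
        if len(history) < n - 1:
            continue  # not enough typed words for this order yet
        context = tuple(history[len(history) - (n - 1):])
        # Linear scan for clarity; a real app would index n-grams by context.
        matches = {g: c for g, c in counts[n].items() if g[:-1] == context}
        total = sum(matches.values())
        for gram, c in matches.items():
            # setdefault keeps the score from the longest history that saw the word.
            scores.setdefault(gram[-1], weight * c / total)
        weight *= alpha
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]

corpus = "I love data science. I love data analysis. I love coffee."
counts = count_ngrams(tokenize(corpus))
print(predict(["i", "love"], counts))
```

With this toy corpus, the history "i love" is followed twice by "data" and once by "coffee", so "data" is ranked first. A history never seen as a trigram context falls through to the bigram and unigram counts instead of returning nothing.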
Considering the experience the user will have with the app, I made some choices (such as reducing the n-gram size) to get better performance (avoiding lag between one word and the next).
For this model in particular, performance depends on the size of the n-grams.
However, the main use case (predicting through a typing process) requires almost real-time responses.
To address this issue and balance accuracy against response time, as further improvements I would test new models/frameworks (Markov models and RNNs) deployed on a scalable infrastructure.
There is another “tab”, called “About”, where the user can find additional information about the project.
Product Screenshot