Vocabulary that fits every occasion

You are adaptable, and so is your writing

Your writing reflects who you are. It naturally adapts to the diverse situations of your life.

When you are writing a scientific paper, the word choices you make are different from those in a letter to your significant other. Yet existing next-word predictors don't take into account the kind of communication you are engaged in.

Wouldn't it be great if the words suggested while you type were just right for the moment?

A next-word predictor ready to follow your lead

Next is a word prediction tool that helps you write more effectively. It keeps separate data sets for different kinds of written communication:

  • Tweets;
  • Blog posts;
  • News articles;
  • And much more to come!

No more slang when writing an important business email. No more jargon when sending an SMS to your kids.

Using Deep Learning technology

Most word predictors are based on n-grams and are therefore not context-aware: they condition only on the last few words typed.

I bought an apple. I am eating the ___
In this example, “apple” is more likely than “banana”, but an n-gram model may fail to detect this: the decisive clue, “I bought an apple”, lies outside its short window of preceding words.
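
To make the limitation concrete, here is a toy trigram predictor (a hypothetical sketch; the two-line corpus and whitespace tokenization are invented for illustration). Because it conditions only on the previous two words, the first sentence cannot break the tie:

```python
from collections import Counter, defaultdict

# A toy trigram model: the next word depends only on the previous two words.
def train_trigrams(corpus):
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = line.lower().split()
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            counts[(a, b)][c] += 1
    return counts

corpus = [
    "i bought an apple . i am eating the apple .",
    "i bought a banana . i am eating the banana .",
]
model = train_trigrams(corpus)

# The model sees only ("eating", "the"); the earlier "I bought an apple"
# is invisible to it, so both fruits look equally likely.
print(model[("eating", "the")])  # Counter({'apple': 1, 'banana': 1})
```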

Neural networks can exploit this wider context through a technique called long short-term memory (LSTM) [1].
Figure: a recurrent neural network gives itself feedback from past steps, so earlier words can influence the current prediction. (Source: [1])
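
The reference implementation in [1] uses Torch; as a rough sketch of the same idea in PyTorch (a minimal illustration with invented sizes, not Next's actual architecture), the recurrent hidden state lets every earlier word in the context influence the prediction:

```python
import torch
from torch import nn

# Minimal next-word LSTM sketch: the recurrent hidden state carries
# information from all earlier words, not just the last n-1 of them.
class NextWordLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        x = self.embed(token_ids)      # (batch, seq_len, embed_dim)
        h, _ = self.lstm(x)            # (batch, seq_len, hidden_dim)
        return self.out(h[:, -1, :])   # logits over the vocabulary

# One context of 15 token ids in, scores for the next word out.
logits = NextWordLSTM(vocab_size=10_000)(torch.randint(0, 10_000, (1, 15)))
```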

LSTM and Embedding

Embedding is a technique for converting words into numeric vectors. With this representation, we may notice that “dog” and “cat” are similar terms. The vectors are learned during the training phase.

Figure: word embeddings are stored in a lookup table. Given a word, its vector of numbers is returned; given a sentence, a matrix with one vector per word is returned. (Source: [1])
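
In code, that lookup table is simply an embedding layer; a toy illustration (the vocabulary, dimensions, and untrained vectors here are invented):

```python
import torch
from torch import nn

vocab = {"dog": 0, "cat": 1, "apple": 2}                 # toy vocabulary
table = nn.Embedding(num_embeddings=len(vocab), embedding_dim=5)

word_vector = table(torch.tensor(vocab["dog"]))          # one word -> one vector
sentence = torch.tensor([vocab["dog"], vocab["cat"], vocab["apple"]])
matrix = table(sentence)                                 # sentence -> one vector per word

print(word_vector.shape, matrix.shape)                   # torch.Size([5]) torch.Size([3, 5])
```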

Model Training and Next Steps

  • The recurrent neural network was trained using the first 100,000 lines of the blogs and news data sets, and the first 200,000 lines of the Twitter data set.
  • We used a context of 15 words and a network with two hidden layers of 100 neurons each (a simplified training loop is sketched after this list).
  • Next steps are scaling up the model and building an app that is aware of both the application in use and the interlocutor, so that predictions can be even more targeted.
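
A simplified version of that training loop, reusing the NextWordLSTM sketch above (the random batches are placeholders standing in for tokenized blog, news, and Twitter lines):

```python
import torch
from torch import nn

model = NextWordLSTM(vocab_size=10_000, hidden_dim=100)  # sketch defined earlier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(1_000):
    contexts = torch.randint(0, 10_000, (32, 15))  # 32 windows of 15 words
    targets = torch.randint(0, 10_000, (32,))      # the word following each window
    loss = loss_fn(model(contexts), targets)       # cross-entropy on next-word logits
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```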

[1] Soumith Chintala and Wojciech Zaremba. Understanding Natural Language with Deep Neural Networks Using Torch. http://devblogs.nvidia.com/parallelforall/understanding-natural-language-deep-neural-networks-using-torch/