NLP prediction model

Leon Joshua Gensel

Data Science Specialization Capstone Project

Motivation

Someone using their smartphone is probably one of the most common sights in public. Waiting for someone, using public transport or sitting in Uni… no matter what, people use their smartphones, on average even over 3 hours each day (Source). A lot of that time is spent typing: messaging or looking something up. Studies have shown that intelligent text entry techniques positively correlate with higher typing speed. This means that consumers should have a high demand for typing features that make their lives easier and save time. One such feature is auto-complete, which is already commonly used and exactly what this app is good at!

Model characteristics

  • instant text prediction
  • small memory usage (~20 MB RAM)
  • around 16% accuracy in word prediction on test set
  • quantifies probabilities/uncertainty

How the app works:

  • The model is an n-gram language model with up to n=4

  • Word probability is calculated as follows:

\[ Pr(w_{i}|w_{i-1}) = \frac {count(w_{i-1},w_{i})} {count(w_{i-1})} \]
\[ \text{where } w_{i} \text{ is the last word of the n-gram and } w_{i-1} \text{ stands for the } n-1 \text{ preceding words.} \]


  • The model uses a corpus of over 1 billion lines as data
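
To make the formula above concrete, here is a minimal Python sketch of a count-based n-gram predictor. It is an illustration only, not the app's actual code: the function names `build_counts` and `next_word_probs`, the toy input, and the rule of falling back to a shorter context when the longer one was never seen are all assumptions made for this example.

```python
from collections import Counter


def build_counts(sentences, max_n=4):
    """Count every n-gram of length 1..max_n in the tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def next_word_probs(counts, context, max_n=4):
    """Return {word: count(context, word) / count(context)}.

    Uses the longest suffix of `context` (at most max_n - 1 words) that has
    at least one observed continuation. Falling back to shorter contexts is
    an assumption of this sketch, not a documented feature of the app.
    """
    context = tuple(context)[-(max_n - 1):] if max_n > 1 else ()
    while context:
        denom = counts.get(context, 0)
        if denom:
            k = len(context)
            probs = {
                ngram[-1]: c / denom
                for ngram, c in counts.items()
                if len(ngram) == k + 1 and ngram[:k] == context
            }
            if probs:
                return probs
        context = context[1:]  # drop the oldest word and retry
    # No usable context: fall back to unigram relative frequencies.
    total = sum(c for g, c in counts.items() if len(g) == 1)
    return {g[0]: c / total for g, c in counts.items() if len(g) == 1}


# Toy usage with a hypothetical three-sentence corpus.
sentences = [
    ["i", "love", "you"],
    ["i", "love", "cake"],
    ["i", "like", "cake"],
]
counts = build_counts(sentences)
suggestions = next_word_probs(counts, ["i"])
best = max(suggestions, key=suggestions.get)
```

The sketch scans all counts on every query, which is fine for a toy corpus; a model that has to answer instantly on a large corpus would more plausibly store the counts in a lookup table keyed by context.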

How you can use it

  • Just click here
  • Enter the prompt you want auto-completed. Done!
  • If you want to see the other words that were in the running, which n-grams are considered and the probability of each suggestion, you can turn on Diagnostics

Thanks for reading! :)