Word Prediction Tool
Helping people to communicate better and faster
Coursera Data Science Capstone Project
C. Werneck
May 4, 2018
The subject matter
When typing on computers, tablets and cell phones, people want to quickly find the words to send their messages.
Building a tool for that task, to be incorporated into mobiles, tablets and other digital user interfaces, is not trivial. It must consider the Corpus of words usually typed by users, and evolve over time with the particular vocabulary each one uses.
This Capstone Project proposes a tool to perform the task, improving mobile typing with techniques from the Data Science courses.
The tool's backstage
The approach adopted
To create the Corpus of text, the initial data set was taken from the Coursera partner SwiftKey, which provided texts from approximately 0.8 million blogs, 1.5 million news items and 2 million tweets. These texts were cleaned to remove unwanted characters.
Using an idea borrowed from Linguistics, we took a statistically significant percentage of the phrases, organized them into n-grams, calculated the frequency of each n-gram, and stored the results as tables in a database to be searched by the application.
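The cleaning step can be illustrated with a minimal sketch. The exact rules used in the project are not stated here, so the choices below (lowercasing, keeping only letters and apostrophes, collapsing whitespace) and the function name `clean_text` are assumptions for illustration only.

```python
import re

def clean_text(line: str) -> str:
    """Lowercase a line and keep only letters, apostrophes and single spaces.

    Assumed cleaning rules for illustration; the project's rules may differ.
    """
    line = line.lower()
    # Replace anything that is not a letter, apostrophe or whitespace with a space.
    line = re.sub(r"[^a-z'\s]", " ", line)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", line).strip()

print(clean_text("Hello, World!! It's 2018 :)"))  # prints: hello world it's
```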
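As a rough illustration of the n-gram step, the sketch below counts how often each sequence of n consecutive words appears in a few sample sentences. The helper `ngram_counts` and the sample sentences are hypothetical; the project's own tables were built from the sampled Corpus.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count every run of n consecutive words across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        # Slide a window of width n over the words of the sentence.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

# Hypothetical, already cleaned sample sentences.
sample = ["i want to go home", "i want to eat", "i want a break"]
bigrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)

print(bigrams[("i", "want")])         # prints: 3
print(trigrams[("i", "want", "to")])  # prints: 2
```

Each counter can then be written out as a table of n-gram and frequency, which is the form the application searches.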
How to use it
The app we propose is simple and intuitive to use.
Access https://cwerneck.shinyapps.io/DS_CP_NextWord/
As you type, the app suggests the three most statistically relevant words to follow the words typed so far, according to the initial Corpus. You can click on one of the words to add it to the text. Typing a space marks the beginning of the next word.
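One way such a lookup could work is sketched below. It is not the app's actual code: falling back from longer to shorter contexts is an assumption, and the names `build_table` and `suggest` and the sample sentences are hypothetical.

```python
from collections import Counter, defaultdict

def build_table(sentences, max_n=3):
    """Map each context of 0 to max_n - 1 preceding words to next-word counts."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            for size in range(max_n):
                if i - size < 0:
                    break  # not enough preceding words for this context size
                table[tuple(words[i - size:i])][word] += 1
    return table

def suggest(table, typed, k=3, max_n=3):
    """Return up to k next-word suggestions, trying the longest context first."""
    words = typed.lower().split()
    suggestions = []
    for size in range(max_n - 1, -1, -1):
        if size > len(words):
            continue
        context = tuple(words[len(words) - size:])
        # Most frequent followers of this context, skipping words already suggested.
        for word, _ in table.get(context, Counter()).most_common():
            if word not in suggestions:
                suggestions.append(word)
            if len(suggestions) == k:
                return suggestions
    return suggestions

# Hypothetical, already cleaned sample sentences.
table = build_table(["i want to go home", "i want to eat", "i want a break"])
print(suggest(table, "i want"))
```

With this sample, "to" ranks first after "i want" because it follows that pair twice, while "a" follows it once. The empty context acts as a last resort, filling any remaining slots with the most frequent single words.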
It is very cool but...
As the app was being developed, some ideas for future versions came up.
- To help people write longer texts, the app can be turned into a kind of thesaurus, suggesting words that are usually used together.
- To help people in hospitals and police stations fill in forms that require a narrative.
- A Glossary with the main words used by the user may be added to the original Corpus, and another app can periodically reduce the size of the Corpus, eliminating the “never used” words.
New functionalities will keep helping people to communicate better and faster!
Last words
Word Prediction is a fascinating subject, and some links were consulted to help in the conception of the app.