WordSage

Ziwen Li
27/01/2021

WordSage is a Shiny app that predicts the next word based on user input

It enables faster typing on a daily basis when implemented in mobile phone keyboards.
It helps people with communication disorders express themselves better when implemented in augmentative and alternative communication (AAC) devices.
It is my final project for the Data Science Capstone course of the Data Science Specialisation.
Link to WordSage

The corpus was generated by combining and sampling three English-language corpora (Twitter, news and blogs) provided by the capstone project.
N-grams (2-grams, 3-grams and 4-grams) were generated from the combined corpus with a profanity check.
A frequency-based ranking was generated from the frequency of each feature in each n-gram. The top 100,000 ranked features were included in the final n-gram database.
User input is cleaned and, if necessary, trimmed down to a 3-gram before searching for a match.
The search runs from the highest-order n-grams down to the lowest until a hit is found. If no hit is found, the app returns “Sorry, we need more data to process your request”.
The top predicted phrases are listed as alternatives, and the predicted word is displayed.
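
The steps above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the app's actual R code. All names (tokenize, build_tables, predict), the one-word placeholder profanity list, applying the top-100,000 cut to each n-gram order separately, and keeping the last three words of the input are assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

FALLBACK = "Sorry, we need more data to process your request"
PROFANITY = {"darn"}  # placeholder list, assumed for illustration only


def tokenize(text):
    """Lower-case the text and keep only alphabetic tokens (and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def build_tables(sentences, orders=(2, 3, 4), top_k=100_000):
    """Count n-grams, skip those containing a flagged word, and keep the
    top_k most frequent n-grams of each order (an assumption)."""
    tables = {}
    for n in orders:
        counts = Counter()
        for sentence in sentences:
            tokens = tokenize(sentence)
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                if PROFANITY.isdisjoint(gram):
                    counts[gram] += 1
        # Map each (n-1)-word prefix to its next words, most frequent first.
        table = defaultdict(list)
        for gram, count in counts.most_common(top_k):
            table[gram[:-1]].append((gram[-1], count))
        tables[n] = dict(table)
    return tables


def predict(text, tables, n_alternatives=3):
    """Search from the highest-order table down to the lowest and stop at
    the first prefix that has a match."""
    tokens = tokenize(text)[-3:]  # trim the input to at most a 3-gram
    for n in sorted(tables, reverse=True):
        k = n - 1  # an n-gram table is keyed by the preceding n-1 words
        if len(tokens) < k:
            continue
        candidates = tables[n].get(tuple(tokens[-k:]))
        if candidates:
            words = [word for word, _ in candidates[:n_alternatives]]
            return words[0], words
    return FALLBACK, []


corpus = [
    "thank you very much",
    "thank you very much",
    "thank you so much",
    "see you very soon",
]
tables = build_tables(corpus)
best, alternatives = predict("I told you", tables)
print(best, alternatives)
```

In this toy corpus, "I told you" has no match in the 4-gram or 3-gram tables, so the search falls through to the 2-gram table and ranks the words that follow "you" by frequency.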

WordSage uses a slick design that keeps the input and output clear and eye-catching.
It echoes the user's input, returns the word with the highest frequency rank and provides the top predicted features as alternatives.
It shows a progress indicator while the database is loading.

I would like to thank all the instructors and the learning community for their kind guidance and support throughout the specialisation.