WordSage

Ziwen Li
27/01/2021

Concept

WordSage is a Shiny app that predicts the next word based on user input.

  • It enables faster typing on a daily basis when implemented in mobile phone keyboards.
  • It helps people with communication disorders express themselves better when implemented in augmentative and alternative communication (AAC) devices.
  • It is my final project for the Data Science Capstone course of the Data Science Specialisation.
  • Link to WordSage

Algorithm

  • The corpus was generated by combining and sampling three English-language corpora (Twitter, news and blogs) provided by the capstone project.
  • N-grams (2-grams, 3-grams and 4-grams) were generated from the combined corpus, with a profanity check applied.
  • A frequency-based ranking was generated from the frequency of each feature in each n-gram. The top 100,000 ranked features were included in the final n-gram database.
  • User input is cleaned and, if necessary, trimmed down to a 3-gram before searching for a match.
  • The search proceeds from the highest-order n-grams to the lowest until a hit is found. If no hit is found, the app returns “Sorry, we need more data to process your request”.
  • The top predicted phrases are listed as alternatives, and the predicted word is displayed.
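
The steps above can be sketched in a few lines. This is a minimal illustration in Python, not the code behind WordSage (the app itself is built with Shiny). The names `tokenize`, `build_tables` and `predict`, the one-word placeholder profanity list, the tiny corpus, and the choice to apply the top-100,000 cut-off separately to each n-gram order are all assumptions made for the example.

```python
import re
from collections import Counter

PROFANITY = {"darn"}  # hypothetical placeholder; a real list would be much longer
NO_MATCH = "Sorry, we need more data to process your request"


def tokenize(text):
    """Lowercase the text and keep only alphabetic words (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def build_tables(lines, orders=(2, 3, 4), top_k=100_000):
    """Count n-grams of each order, skip profane ones, keep the top_k most frequent."""
    tables = {}
    for n in orders:
        counts = Counter()
        for line in lines:
            words = tokenize(line)
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                if not PROFANITY.intersection(gram):
                    counts[gram] += 1
        # Assumption: the cut-off is applied per n-gram order.
        tables[n] = dict(counts.most_common(top_k))
    return tables


def predict(text, tables, max_alternatives=3):
    """Search from the highest-order table to the lowest until a hit is found."""
    words = tokenize(text)[-3:]  # trim the input to at most a 3-gram
    for n in sorted(tables, reverse=True):
        context = tuple(words[-(n - 1):])
        if len(context) < n - 1:
            continue  # input too short for this order, back off
        hits = Counter({gram[-1]: count
                        for gram, count in tables[n].items()
                        if gram[:-1] == context})
        if hits:
            # The first element is the predicted word, the rest are alternatives.
            return [word for word, _ in hits.most_common(max_alternatives)]
    return NO_MATCH


corpus = [
    "I love data science",
    "I love data analysis",
    "I love data science a lot",
    "we love data",
]
tables = build_tables(corpus)
print(predict("I love data", tables))  # ['science', 'analysis']
```

A linear scan over each table keeps the sketch short; a database of 100,000 features per order would more plausibly be indexed by context so that each lookup is a single keyed access.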

User interface

  • WordSage uses a sleek design that keeps the input and output clear and eye-catching.
  • It echoes the user's input, returns the word with the highest frequency rank and provides the top predicted features as alternatives.
  • It provides a progress indicator when loading the database.
    [Image: User interface]

Acknowledgement

I would like to thank all the instructors and the learning community for their kind guidance and support throughout the specialisation.