Capstone Project: Next Word Prediction App

Fabio Turetto Rodrigues

Problem and Goal

Typing on mobile devices can be slow and error-prone.
Our goal is to build an app that predicts the next word of a given phrase, helping users type faster and more accurately.

Data and Preprocessing

  • Data source: SwiftKey dataset (Blogs, News, Twitter)
  • Cleaning steps:
    • Lowercasing, removing punctuation and numbers
    • Removing stopwords and profanity
    • Tokenization and stemming
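
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the word lists STOPWORDS and PROFANITY are tiny hypothetical placeholders, naive_stem is a crude stand-in for a real stemmer, and the order of the steps is an assumption.

```python
import re

# Hypothetical, deliberately tiny word lists; a real app would load fuller ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is"}
PROFANITY = {"darn"}


def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean(text):
    """Lowercase, strip non-letters, tokenize, filter, and stem a string."""
    text = text.lower()
    # Replace punctuation, digits, and anything else outside a-z with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    # Drop stopwords and profanity before stemming.
    tokens = [t for t in tokens if t not in STOPWORDS and t not in PROFANITY]
    return [naive_stem(t) for t in tokens]
```

For example, clean("The 3 dogs are running, darn it!") returns ["dog", "are", "runn", "it"], which also shows how rough the placeholder stemmer is.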

Prediction Algorithm

We use a simple backoff n-gram model:

  • Try to match last 3 words → trigram
  • If not found, use last 2 → bigram
  • If still not found, use last word → unigram
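
The backoff lookup described above might look like the sketch below. It assumes the model is stored as a table from a context (the last 3, 2, or 1 words, matching the three steps the slide labels trigram, bigram, and unigram) to counts of the words that followed it. The names build_tables, predict, and MAX_CONTEXT are hypothetical, and a real app would likely add smoothing or score discounting.

```python
from collections import Counter, defaultdict

MAX_CONTEXT = 3  # longest context tried first (assumption: 3 words, as on the slide)


def build_tables(sentences, max_context=MAX_CONTEXT):
    """Map each context of 1..max_context words to a Counter of next words."""
    tables = defaultdict(Counter)
    for tokens in sentences:
        for i in range(1, len(tokens)):
            for k in range(1, max_context + 1):
                if i - k >= 0:
                    tables[tuple(tokens[i - k:i])][tokens[i]] += 1
    return tables


def predict(tables, phrase_tokens, max_context=MAX_CONTEXT):
    """Return the most frequent next word, backing off to shorter contexts."""
    for k in range(min(max_context, len(phrase_tokens)), 0, -1):
        context = tuple(phrase_tokens[-k:])
        if context in tables:
            return tables[context].most_common(1)[0][0]
    return None  # no context of any length was seen in training


# Toy training data, invented purely for illustration.
sentences = [
    ["i", "want", "to", "eat", "pizza"],
    ["you", "want", "to", "go", "home"],
    ["we", "like", "to", "go", "out"],
    ["they", "need", "to", "go", "now"],
]
tables = build_tables(sentences)
```

With this toy data, predict(tables, ["i", "want", "to"]) matches the full three-word context and returns "eat", while predict(tables, ["dogs", "try", "to"]) finds no three-word or two-word match and backs off to the single word "to", returning "go".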

About the App

Summary and Future Work

  • Shows potential of n-gram modeling
  • Future: personalization, LSTM, BERT