Problem Statement

Typing on mobile is slow and error-prone.
The goal of this app is to predict the next word a user is likely to type, based on an input phrase.
This can improve typing speed, accuracy, and user experience.

Data and Method

  • Dataset: SwiftKey corpus (blogs, news, Twitter)
  • Cleaning: convert to lowercase; remove punctuation, numbers, profanity, and stopwords
  • N-gram models: 1- to 4-gram with RWeka
  • Storage: Frequency tables in .Rds format
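
The steps above can be sketched in a few lines of R. This is a minimal illustration, not the app's actual code: the slides use RWeka for tokenization, while this sketch swaps in a small base-R helper so it runs without extra packages. The names clean_text and ngram_table, the drop_words argument, and the sample sentence are all hypothetical.

```r
# Hypothetical helper: lowercase, strip non-letters, split into tokens,
# and drop any words listed in drop_words (e.g. stopwords or profanity).
clean_text <- function(x, drop_words = character(0)) {
  x <- tolower(x)
  x <- gsub("[^a-z ]+", " ", x)               # punctuation and digits become spaces
  tokens <- strsplit(trimws(x), " +")[[1]]
  tokens[nzchar(tokens) & !(tokens %in% drop_words)]
}

# Hypothetical helper: count n-grams and return a frequency table,
# most frequent first. Stands in for the RWeka tokenizer step.
ngram_table <- function(tokens, n) {
  if (length(tokens) < n) {
    return(data.frame(ngram = character(0), freq = integer(0),
                      stringsAsFactors = FALSE))
  }
  starts <- seq_len(length(tokens) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
                  character(1))
  counts <- sort(table(grams), decreasing = TRUE)
  data.frame(ngram = names(counts), freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Toy input (made up for illustration, not taken from the corpus).
tokens  <- clean_text("The cat sat. The cat ran 2 miles!")
bigrams <- ngram_table(tokens, 2)

# Store and reload the table in .Rds format.
path <- tempfile(fileext = ".rds")
saveRDS(bigrams, path)
reloaded <- readRDS(path)
```

On a corpus the size of the SwiftKey data, a loop like this would likely be too slow, which is one reason to use a dedicated tokenizer; the sketch only shows the shape of the resulting frequency table.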

Prediction Algorithm

  • Backoff strategy:
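    • Look up the last three words of the input in the 4-gram table
    • If there is no match, back off to the 3-gram table, then the 2-gram table
    • If nothing matches, fall back to unigram frequencies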
  • Fast and simple lookup based on string matching
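
The backoff lookup can be sketched as follows. This is an illustration under stated assumptions rather than the app's actual implementation: the function name predict_next, the table layout (context, nextword, freq columns), and all counts are invented for the example. It also assumes the highest-frequency match wins at each level.

```r
# Hypothetical frequency tables keyed by n-gram order. `context` holds the
# leading n-1 words and `nextword` the word that follows. Counts are made up.
tables <- list(
  `4` = data.frame(context = "happy new year", nextword = "everyone",
                   freq = 5, stringsAsFactors = FALSE),
  `3` = data.frame(context = c("new year", "new year"),
                   nextword = c("eve", "resolution"),
                   freq = c(9, 4), stringsAsFactors = FALSE),
  `2` = data.frame(context = c("year", "york"),
                   nextword = c("ago", "city"),
                   freq = c(20, 15), stringsAsFactors = FALSE),
  `1` = data.frame(context = "", nextword = "time",
                   freq = 100, stringsAsFactors = FALSE)
)

predict_next <- function(phrase, tables, max_n = 4) {
  words <- strsplit(trimws(tolower(phrase)), " +")[[1]]
  words <- words[nzchar(words)]
  # Start at the highest order the input can support, then back off.
  n <- min(max_n, length(words) + 1)
  while (n >= 2) {
    context <- paste(tail(words, n - 1), collapse = " ")
    tab <- tables[[as.character(n)]]
    hits <- tab[tab$context == context, ]      # exact string match
    if (nrow(hits) > 0) {
      return(hits$nextword[which.max(hits$freq)])
    }
    n <- n - 1
  }
  # Nothing matched: fall back to the most frequent unigram.
  uni <- tables[["1"]]
  uni$nextword[which.max(uni$freq)]
}

predict_next("Happy new year", tables)     # 4-gram match: "everyone"
predict_next("a whole new year", tables)   # backs off to 3-gram: "eve"
predict_next("last year", tables)          # backs off to 2-gram: "ago"
predict_next("something unseen", tables)   # unigram fallback: "time"
```

In a real app the input phrase would go through the same cleaning as the training data before lookup, so that contexts match the stored strings.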

The Shiny App

Conclusion

  • Working prototype of a word predictor
  • Core logic implemented in base R
  • Potential for future enhancements:
    • smoothing
    • personalization
    • neural models / transformer-based approaches