January 13, 2026

Executive Summary

  • This application provides a real-time text prediction experience.
  • As users type, the algorithm suggests the most likely next word.
  • The goal is to improve typing speed and efficiency on mobile or web interfaces.

The Prediction Algorithm

  • The app uses an N-gram back-off model.
  • Trigrams: It first looks at the last two words typed to find a match.
  • Bigrams: If no trigram exists, it looks at the last single word.
  • Unigrams: If no bigram matches either, it falls back to the most frequent word in the corpus (“the”).
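
The back-off order above can be sketched in a few lines. This is only an illustration in Python, not the app's actual code (the app is deployed as a Shiny app, so its implementation is presumably in R). The names `build_tables` and `predict_next` are hypothetical, and the sketch assumes a non-empty token list and raw counts with no smoothing.

```python
from collections import Counter


def build_tables(tokens):
    """Count unigrams, bigrams, and trigrams from a list of tokens (hypothetical helper)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return unigrams, bigrams, trigrams


def predict_next(text, unigrams, bigrams, trigrams):
    """Return one suggested next word using trigram -> bigram -> unigram back-off."""
    words = text.lower().split()

    # Trigram level: match on the last two words typed.
    if len(words) >= 2:
        context = (words[-2], words[-1])
        candidates = {t[2]: c for t, c in trigrams.items() if t[:2] == context}
        if candidates:
            return max(candidates, key=candidates.get)

    # Bigram level: match on the last word only.
    if words:
        candidates = {b[1]: c for b, c in bigrams.items() if b[0] == words[-1]}
        if candidates:
            return max(candidates, key=candidates.get)

    # Unigram level: fall back to the most frequent word overall.
    return unigrams.most_common(1)[0][0]
```

With a toy corpus such as "the cat sat on the mat the cat sat on the sofa the dog ran", the input "the cat" is resolved at the trigram level ("sat"), "hello the" backs off to the bigram level ("cat"), and an empty input falls through to the most frequent word ("the"). Scanning every table entry on each call is fine for a sketch; a real app would more likely index the tables by context for fast lookup.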

Data Cleaning & Processing

  • The training data was sampled from a large corpus of blog, news, and Twitter text.
  • Text was converted to lowercase and punctuation was removed.
  • N-gram frequency tables were precomputed and saved as RDS files, so the app only performs lookups at run time and stays fast and responsive.
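
As a rough illustration of the cleaning and counting steps above, here is a minimal Python sketch. It is not the app's own pipeline. `clean_text` and `ngram_counts` are hypothetical names, and details the slides do not specify are assumptions: punctuation is simply deleted (so "it's" becomes "its"), and n-grams are counted within each line rather than across line boundaries.

```python
import re
from collections import Counter


def clean_text(line):
    """Lowercase a line, delete punctuation, and collapse runs of whitespace."""
    lowered = line.lower()
    # Remove every character that is not a letter, digit, or whitespace.
    no_punct = re.sub(r"[^\w\s]|_", "", lowered)
    return " ".join(no_punct.split())


def ngram_counts(lines, n):
    """Build a frequency table of n-grams (as tuples) over an iterable of raw lines."""
    counts = Counter()
    for line in lines:
        tokens = clean_text(line).split()
        # zip over n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts
```

For example, `clean_text("Hello, World! It's 2026.")` returns `"hello world its 2026"`. Tables built this way can be computed once offline and serialized, which is the role the RDS files play in the app.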

How to Use the App

  • Navigate to the ShinyApps Link.
  • Type your phrase into the text input box on the left.
  • The prediction appears instantly in blue text on the right.
  • The app is designed to be lightweight and to work in any browser.

Future Improvements

  • Implement Kneser-Ney smoothing for better accuracy on rare words.
  • Expand the dictionary to include more diverse vocabulary.
  • Add a feature that suggests the three most likely next words instead of just one.
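
One possible shape for the multi-word suggestion feature is sketched below in Python. This is an assumption about how it could work, not a planned design. The hypothetical `predict_top_k` takes candidates from the trigram level first, then fills any remaining slots from the bigram and unigram levels, skipping duplicates. The tables are assumed to be `Counter` objects keyed by word tuples (bigrams, trigrams) or plain words (unigrams).

```python
from collections import Counter


def predict_top_k(text, unigrams, bigrams, trigrams, k=3):
    """Return up to k suggested next words, backing off level by level."""
    words = text.lower().split()
    suggestions = []

    def add_ranked(candidates):
        # Append candidates in descending count order, without duplicates.
        for word, _count in candidates.most_common():
            if len(suggestions) >= k:
                break
            if word not in suggestions:
                suggestions.append(word)

    # Trigram level: continuations of the last two words.
    if len(words) >= 2:
        context = (words[-2], words[-1])
        add_ranked(Counter({t[2]: c for t, c in trigrams.items() if t[:2] == context}))

    # Bigram level: continuations of the last word.
    if len(suggestions) < k and words:
        add_ranked(Counter({b[1]: c for b, c in bigrams.items() if b[0] == words[-1]}))

    # Unigram level: most frequent words overall.
    if len(suggestions) < k:
        add_ranked(unigrams)

    return suggestions
```

Because lower levels only fill slots the higher levels left empty, the first suggestion is the same word the single-word back-off would pick, so the current behavior is preserved as a special case with `k=1`.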