SwiftPredict: Next Word Prediction

March 22, 2026

Introduction

Typing on mobile devices is slow and error-prone. SwiftPredict addresses this by suggesting the next word in real time as the user types.

The model was built using the HC Corpora dataset, comprising millions of lines from Twitter, Blogs, and News.

Sampling: A 1% random sample of the total data was used so the model fits within Shiny's memory limits.
Cleaning: Used stringi and tidytext to remove numbers, punctuation, and profanity.
Tokenization: Created N-gram frequency tables (Bigrams, Trigrams, and Quadgrams).
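
The three steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the app's R code (which uses stringi and tidytext); the function names, the `BANNED` word list, and the fixed seed are all assumptions made for the example.

```python
import random
import re
from collections import Counter

# Hypothetical stand-in for a profanity list; the real list is not given here.
BANNED = {"darn"}

def sample_lines(lines, fraction=0.01, seed=42):
    """Keep roughly `fraction` of the lines, reproducibly."""
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]

def clean(line):
    """Lowercase, drop digits and punctuation, and remove banned words."""
    line = re.sub(r"[^a-z\s']", " ", line.lower())
    tokens = [w.strip("'") for w in line.split()]
    return [w for w in tokens if w and w not in BANNED]

def ngram_counts(token_lists, n):
    """Count every run of n consecutive tokens within each line."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

docs = [clean("The cat sat."), clean("The cat ran!")]
bigrams = ngram_counts(docs, 2)      # ("the", "cat") appears twice
trigrams = ngram_counts(docs, 3)
quadgrams = ngram_counts(docs, 4)    # empty here: both lines have only 3 tokens
```

Counting n-grams per line, rather than over one concatenated stream, avoids creating n-grams that span two unrelated tweets or sentences.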

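We implemented the Stupid Backoff algorithm (Brants et al., 2007), a simple scoring scheme chosen for its speed in an interactive web app. If the longest available context has no matching n-gram, the model backs off to a shorter context and multiplies the score by a fixed factor (0.4 in the original paper) instead of computing a normalized probability.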
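
A minimal sketch of how Stupid Backoff scoring works, written in Python for illustration rather than taken from the app's R code. The names `build_counts`, `score`, and `predict` are hypothetical, and the back-off factor of 0.4 is the value suggested by Brants et al. (2007); the app's actual setting is not stated here.

```python
from collections import Counter

ALPHA = 0.4  # back-off factor from Brants et al. (2007); assumed, not confirmed

def build_counts(sentences, max_n=4):
    """Return {n: Counter of n-gram tuples} for n = 1..max_n."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for tokens in sentences:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return counts

def score(word, context, counts, alpha=ALPHA):
    """Stupid Backoff score of `word` given the preceding words in `context`."""
    max_n = max(counts)
    context = tuple(context)[-(max_n - 1):] if max_n > 1 else ()
    penalty = 1.0
    while context:
        n = len(context) + 1
        seen = counts[n].get(context + (word,), 0)
        if seen > 0:
            # Relative frequency at this order, discounted once per back-off.
            return penalty * seen / counts[n - 1][context]
        context = context[1:]  # drop the oldest word and try a shorter context
        penalty *= alpha
    total = sum(counts[1].values())
    return penalty * counts[1].get((word,), 0) / total

def predict(phrase, counts, alpha=ALPHA):
    """Return the vocabulary word with the highest score after `phrase`."""
    context = tuple(phrase.lower().split())
    vocab = [w for (w,) in counts[1]]
    if not vocab:
        return None
    return max(vocab, key=lambda w: score(w, context, counts, alpha))

corpus = [["i", "love", "you"], ["i", "love", "you"],
          ["i", "love", "pizza"], ["we", "like", "pizza"]]
counts = build_counts(corpus)
best = predict("I love", counts)
```

Scoring every vocabulary word is fine for a toy corpus; a deployed app would more likely look up only the candidates that follow the matched context in precomputed frequency tables.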

The application is hosted on shinyapps.io and features a minimalist design.

Input: Reactive text box that captures user phrases.
Output: Instantaneous display of the predicted next word.
Optimization: Data is stored in compressed .rds files, ensuring the app loads in under 2 seconds.

[Insert App Screenshot Here]

SwiftPredict demonstrates that an n-gram language model can be deployed as a lightweight, responsive web application.

Live App: [Link to your shinyapps.io]
Source Code: [Link to your GitHub]
Next Steps: Integrating “Katz Back-off” for higher accuracy and expanding the dictionary.

Thank you!