2026-01-11
Problem Overview
- Modern typing applications assist users by predicting the next word.
- These systems improve typing speed and user experience.
- This project demonstrates a simple next-word prediction system using R.
Dataset and Motivation
- The approach is inspired by the SwiftKey text dataset (blogs, news, Twitter).
- To keep deployment lightweight, the app uses a small representative corpus instead of the full dataset.
- This keeps the app fast to load and responsive on shinyapps.io.
Prediction Algorithm
- Text is converted to lowercase and tokenized.
- Bigrams (two-word sequences) are generated.
- The last word of the user's input is matched against the first word of each bigram.
- Among the matching bigrams, the most frequent second word is returned as the prediction.
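
The steps above can be sketched in base R as follows. This is a minimal illustration, not the app's actual code: the toy `corpus` and the helper names `tokenize`, `build_bigrams`, and `predict_next` are assumptions made for the example, and ties between equally frequent words are broken alphabetically.

```r
# Toy corpus: an assumed stand-in for the small representative corpus.
corpus <- c(
  "I love data science",
  "I love R programming",
  "data science is fun",
  "I love data"
)

# Lowercase the text and split it into word tokens.
tokenize <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words[nzchar(words)]
}

# Count every (first, second) word pair, most frequent first.
build_bigrams <- function(lines) {
  pairs <- lapply(lines, function(line) {
    w <- tokenize(line)
    if (length(w) < 2) return(NULL)
    data.frame(first = head(w, -1), second = tail(w, -1),
               stringsAsFactors = FALSE)
  })
  pairs <- do.call(rbind, Filter(Negate(is.null), pairs))
  counts <- aggregate(n ~ first + second, data = cbind(pairs, n = 1), FUN = sum)
  counts[order(-counts$n, counts$second), ]
}

# Return the most frequent follower of the last word, or NA if none is known.
predict_next <- function(phrase, bigrams) {
  w <- tokenize(phrase)
  if (length(w) == 0) return(NA_character_)
  matches <- bigrams[bigrams$first == w[length(w)], ]
  if (nrow(matches) == 0) return(NA_character_)
  as.character(matches$second[1])
}

bigrams <- build_bigrams(corpus)
predict_next("I love", bigrams)  # "data" ("love data" occurs twice, "love r" once)
```

A word that never appears as the first word of a bigram yields `NA`, so a real app would likely need a fallback such as a most-common-word default.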
Shiny Application
- The user enters a phrase of one or more words; only the last word is used for prediction.
- On clicking Predict, the app computes the next word.
- The output is a single predicted word.
- The app is lightweight, responsive, and easy to use.
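
A rough sketch of how such an app could be wired together is shown below. It assumes the `shiny` package is installed. The input and output IDs, the layout, and the hard-coded `next_word_table` with its helper `lookup_next_word` are hypothetical placeholders, not taken from the deployed app, which would look up predictions in its bigram table instead.

```r
library(shiny)

# Hypothetical stand-in for the bigram model: a tiny hard-coded lookup
# from a word to its most frequent follower.
next_word_table <- c(i = "love", love = "data", data = "science")

lookup_next_word <- function(phrase) {
  words <- unlist(strsplit(tolower(trimws(phrase)), "\\s+"))
  words <- words[nzchar(words)]
  if (length(words) == 0) return("(enter a phrase)")
  guess <- next_word_table[words[length(words)]]
  if (is.na(guess)) "(no prediction)" else unname(guess)
}

ui <- fluidPage(
  titlePanel("Next-Word Prediction"),
  textInput("phrase", "Enter a phrase:"),
  actionButton("predict", "Predict"),
  h4("Predicted next word:"),
  verbatimTextOutput("prediction")
)

server <- function(input, output) {
  # eventReactive recomputes only when the Predict button is clicked.
  prediction <- eventReactive(input$predict, {
    lookup_next_word(input$phrase)
  })
  output$prediction <- renderText(prediction())
}

app <- shinyApp(ui, server)
# In an interactive session, runApp(app) starts the application.
```

Tying the computation to the button with `eventReactive` means the prediction updates on click rather than on every keystroke, which matches the Predict workflow described above.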
Conclusion
- Demonstrates core Natural Language Processing concepts.
- Efficient and suitable for real-time prediction.
- Can be extended to larger datasets and higher-order n-grams.
- Built with R Shiny, deployed on shinyapps.io, and presented via RPubs.