Project Overview
- Developed a web-based Next Word Predictor App using R and Shiny.
- Built on Natural Language Processing and n-gram frequency models.
- Data source: SwiftKey corpus (blogs, news, Twitter).
- Live at: shinyapps.io
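For readers unfamiliar with Shiny, the overall shape of such an app might look like the sketch below. It is only an illustration, not the deployed app's code: the input and output IDs (phrase, prediction) and the function predict_next_stub are hypothetical, and the stub stands in for the real n-gram lookup.

```r
library(shiny)

# Hypothetical placeholder: the real app would call its n-gram lookup here.
predict_next_stub <- function(text) {
  if (nchar(trimws(text)) == 0) "the" else "(prediction)"
}

# UI: one text box for the phrase, one text output for the predicted word.
ui <- fluidPage(
  titlePanel("Next Word Predictor"),
  textInput("phrase", "Type a phrase:"),
  textOutput("prediction")
)

# Server: re-run the prediction whenever the typed phrase changes.
server <- function(input, output, session) {
  output$prediction <- renderText({
    predict_next_stub(input$phrase)
  })
}

# Build the app object; in an interactive session, runApp(app) launches it.
app <- shinyApp(ui = ui, server = server)
```

Because renderText is reactive, the prediction updates as the user types, with no explicit submit button needed.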
How the Model Works
- Corpus loaded and cleaned with tm and stringr packages.
- Created unigrams, bigrams, trigrams using tidytext.
- Prediction logic:
  - 3+ word input → trigram match
  - 2 words → bigram match
  - 1 word or no match → unigram fallback
- Fast lookup using frequency tables saved as .RDS
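The steps above could be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the toy corpus, the helper names ngram_table and predict_next, and the file name trigrams.rds are all assumptions. The backoff follows a common convention (last two words against trigrams, then last word against bigrams, then the most frequent unigram), which may differ from the app's exact thresholds. The tm cleaning step is omitted here; unnest_tokens lowercases text and strips punctuation by default.

```r
library(dplyr)
library(tidytext)
library(stringr)

# Toy corpus standing in for the cleaned SwiftKey text (illustrative only).
corpus <- tibble(text = c(
  "thank you so much",
  "thank you so much for the follow",
  "thank you for the follow",
  "see you soon"
))

# Frequency table of n-grams of size n, most frequent first.
ngram_table <- function(data, n) {
  data %>%
    unnest_tokens(ngram, text, token = "ngrams", n = n) %>%
    filter(!is.na(ngram)) %>%
    count(ngram, sort = TRUE)
}

unigrams <- corpus %>%
  unnest_tokens(word, text) %>%
  count(word, sort = TRUE)
bigrams  <- ngram_table(corpus, 2)
trigrams <- ngram_table(corpus, 3)

# Save and reload one table as .RDS (hypothetical file name, temp directory).
rds_path <- file.path(tempdir(), "trigrams.rds")
saveRDS(trigrams, rds_path)
trigrams <- readRDS(rds_path)

# Backoff lookup: trigram, then bigram, then the top unigram.
predict_next <- function(input, unigrams, bigrams, trigrams) {
  words <- str_split(str_squish(str_to_lower(input)), " ")[[1]]
  words <- words[words != ""]
  k <- length(words)

  if (k >= 2) {
    # Match trigrams whose first two words equal the last two typed words.
    prefix <- paste0(paste(words[(k - 1):k], collapse = " "), " ")
    hit <- trigrams$ngram[startsWith(trigrams$ngram, prefix)]
    if (length(hit) > 0) return(word(hit[1], -1))
  }
  if (k >= 1) {
    # Match bigrams whose first word equals the last typed word.
    prefix <- paste0(words[k], " ")
    hit <- bigrams$ngram[startsWith(bigrams$ngram, prefix)]
    if (length(hit) > 0) return(word(hit[1], -1))
  }
  # Nothing matched (or empty input): fall back to the top unigram.
  unigrams$word[1]
}

predict_next("Thank you", unigrams, bigrams, trigrams)
```

Since the tables are sorted by count, the first matching row is the most frequent continuation, so each lookup reduces to a prefix filter on a table already held in memory.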
Conclusion & Future Work
- Great proof-of-concept of applied NLP in R
- Can be expanded to:
  - Suggest multiple predictions
  - Add visualizations (word cloud, frequency charts)
  - Use advanced models (LSTM, BERT)
- A strong foundation for AI-powered typing tools
- Ready for production: just needs more data 🚀
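As one possible direction for the "suggest multiple predictions" item, a top-k lookup might look like the sketch below. Everything here is an assumption for illustration: the table layout (prev, nxt, freq), the made-up counts, and the function name top_predictions are hypothetical and not taken from the app.

```r
library(dplyr)

# Illustrative bigram table with made-up counts (not from the real corpus).
bigram_counts <- tibble(
  prev = c("thank", "thank", "thank", "see"),
  nxt  = c("you", "goodness", "heavens", "you"),
  freq = c(50L, 7L, 2L, 12L)
)

# Return up to k candidate next words for prev_word, most frequent first.
top_predictions <- function(prev_word, table, k = 3) {
  table %>%
    filter(prev == tolower(prev_word)) %>%
    slice_max(freq, n = k, with_ties = FALSE) %>%
    pull(nxt)
}

top_predictions("thank", bigram_counts, k = 2)
```

An unseen word yields an empty character vector, which a caller could treat as the signal to fall back to unigrams.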