## Project Overview
This project builds a text-prediction application that suggests the
next word using N-gram language models.
### Objective
Provide fast and accurate next-word predictions through an
interactive Shiny application.
## Exploratory Data Analysis
### Dataset Summary
| Source | Count |
|---|---|
| Blogs | 899,288 |
| News | 1,010,206 |
| Twitter | 2,360,148 |
### Key Findings
- Word frequencies follow Zipf’s Law.
- Frequent bigrams and trigrams improve prediction accuracy.
- A small vocabulary covers most text.
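
The coverage finding can be checked with a short calculation. The following is a minimal Python sketch, not the project's own code: the function name `words_needed_for_coverage` and the toy corpus are assumptions made for illustration. It counts how many of the most frequent distinct words are needed to account for a given share of all tokens.

```python
from collections import Counter


def words_needed_for_coverage(tokens, target):
    """Return how many top-ranked distinct words cover `target` (0..1) of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    # Walk the vocabulary from most to least frequent, accumulating token share.
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return rank
    return len(counts)


# Toy corpus, invented for illustration only (not the project data).
tokens = "the cat sat on the mat and the dog sat on the rug".split()
print(words_needed_for_coverage(tokens, 0.5))  # 3 of the 8 distinct words
```

On a Zipf-distributed corpus the returned rank grows slowly until the target approaches 1.0, which is what makes pruning rare words an attractive way to shrink the model.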
## Prediction Algorithm
### Model
- Unigram
- Bigram
- Trigram
- Backoff Strategy
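
As an illustration of how the unigram, bigram, and trigram tables listed above might be built, here is a minimal Python sketch. It assumes the text is already tokenized into lowercase words; `build_ngram_table` is a hypothetical helper, and the sample sentence is invented rather than taken from the project data.

```python
from collections import Counter, defaultdict


def build_ngram_table(tokens, n):
    """Map each (n-1)-word context tuple to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])  # empty tuple when n == 1
        next_word = tokens[i + n - 1]
        table[context][next_word] += 1
    return table


# Invented sample text, for illustration only.
tokens = "thank you for the help and thank you for coming".split()
unigrams = build_ngram_table(tokens, 1)  # single context: ()
bigrams = build_ngram_table(tokens, 2)   # contexts are 1-word tuples
trigrams = build_ngram_table(tokens, 3)  # contexts are 2-word tuples

print(trigrams[("thank", "you")].most_common(1))  # [('for', 2)]
```

Keying every table by a context tuple keeps the three models structurally identical, so a backoff lookup only has to shorten the context and try again.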
### Flow
Input → Trigram → Bigram → Unigram → Prediction
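
The flow above can be sketched as a simple lookup chain. This is a hedged Python illustration, not the app's actual implementation: `predict_next` is a hypothetical name, the tables are plain dictionaries, and every count in them is invented. It picks the most frequent continuation and uses no smoothing or discounting.

```python
def predict_next(phrase, trigrams, bigrams, unigrams):
    """Return the most frequent next word, backing off trigram -> bigram -> unigram."""
    words = phrase.lower().split()

    # Try the longest available context first, then shorter ones.
    for table, size in ((trigrams, 2), (bigrams, 1)):
        if len(words) >= size:
            candidates = table.get(tuple(words[-size:]))
            if candidates:
                return max(candidates, key=candidates.get)

    # Last resort: the single most frequent word overall.
    if not unigrams:
        return None
    return max(unigrams, key=unigrams.get)


# Hypothetical toy tables; all counts are made up for illustration.
trigrams = {("one", "of"): {"the": 5, "my": 2},
            ("thank", "you"): {"for": 4, "so": 1}}
bigrams = {("to",): {"be": 6, "the": 3}, ("of",): {"the": 9}}
unigrams = {"the": 20, "to": 12, "be": 7}

print(predict_next("one of", trigrams, bigrams, unigrams))    # the
print(predict_next("going to", trigrams, bigrams, unigrams))  # be (bigram backoff)
print(predict_next("hello", trigrams, bigrams, unigrams))     # the (unigram backoff)
```

Because each step is a dictionary lookup, response time stays roughly constant regardless of corpus size, which suits an interactive app.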
## Shiny Application
### Features
- The user enters a phrase.
- The app predicts the next word.
- The response is returned in real time.
### Example
- one of → the
- going to → be
- thank you → for
## Results and Conclusion
### Future Work
- 4-gram models
- Better smoothing
- Larger datasets