Next-Word Prediction Using N-gram Language Model
2026-03-10
5-gram Language Model with Stupid Backoff
Model Statistics:
Interactive Shiny Application
Real-Time Prediction: Enter text, get instant predictions with confidence scores
Top 5 Results: View the five most likely next words, each with a confidence bar
Performance Metrics: See n-gram level and response time
Clean Predictions: Filters out stopwords and returns only meaningful words
1. Enter Text - Type a phrase (e.g., “I’m going to”)
2. Click Predict - Press the button or pick one of the example sentences
3. View Results - Top 5 predictions with confidence percentages
4. Explore Details - Check n-gram level, response time, model stats
Process: Clean text → Extract context → Search tables → Apply backoff → Filter → Return predictions
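The process above can be sketched in a few lines of Python. This is a minimal illustration, not the app's code: the backoff factor of 0.4 is the value commonly used with Stupid Backoff and is assumed here, and `STOPWORDS`, `tokenize`, `build_tables`, and `predict` are hypothetical names introduced for the example.

```python
import re
from collections import Counter, defaultdict

ALPHA = 0.4      # assumed backoff factor; the slides do not state the value used
MAX_ORDER = 5    # 5-gram model: at most 4 words of context
STOPWORDS = {"the", "a", "an", "of", "to", "and"}  # hypothetical, tiny list


def tokenize(text):
    """Clean text: lowercase, keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_tables(sentences, max_order=MAX_ORDER):
    """Map each context (0 to max_order-1 preceding words) to next-word counts."""
    tables = defaultdict(Counter)
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, word in enumerate(tokens):
            for k in range(max_order):
                if i - k < 0:
                    break
                tables[tuple(tokens[i - k:i])][word] += 1
    return tables


def predict(text, tables, top_k=5, max_order=MAX_ORDER):
    """Return up to top_k (word, score) pairs, best first."""
    tokens = tokenize(text)
    scores = {}
    penalty = 1.0
    # Search tables from the longest usable context down to the empty context.
    for k in range(min(max_order - 1, len(tokens)), -1, -1):
        context = tuple(tokens[len(tokens) - k:])
        followers = tables.get(context)
        if followers:
            # Simplification: the denominator counts the context only where
            # it is followed by another word.
            total = sum(followers.values())
            for word, count in followers.items():
                # Filter stopwords; keep the score from the highest order seen.
                if word in STOPWORDS or word in scores:
                    continue
                scores[word] = penalty * count / total
        penalty *= ALPHA  # apply backoff: each shorter context is discounted
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


corpus = [
    "i am going to sleep",
    "i am going to sleep now",
    "i am going to work",
    "you are going to win",
]
tables = build_tables(corpus)
result = predict("I'm going to", tables)
```

Because the scores are relative frequencies scaled by powers of the backoff factor, they are not true probabilities. Turning them into the confidence percentages shown in the app would need an extra normalization step, which is left out here.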
Training Data: 50% OpenSubtitles, 20% Movies, 15% Twitter, 10% Blogs, 5% News
Evaluation:
- 90% accuracy (multiple choice)
- 100% meaningful predictions
- 20% accuracy (open prediction)
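The gap between the two accuracy figures is easier to read with the two measures side by side. The slides do not describe the evaluation procedure, so the sketch below is an assumption: open prediction counts a hit when the top-ranked word equals the true next word, and multiple choice counts a hit when the true word scores highest among a short list of candidates. All names and the toy scores are hypothetical.

```python
def open_accuracy(cases, predict_fn):
    """Share of cases where the top-1 prediction equals the true next word."""
    hits = 0
    for context, truth in cases:
        predictions = predict_fn(context)
        if predictions and predictions[0] == truth:
            hits += 1
    return hits / len(cases)


def multiple_choice_accuracy(cases, score_fn):
    """Share of cases where the true word scores highest among the options."""
    hits = 0
    for context, options, truth in cases:
        best = max(options, key=lambda word: score_fn(context, word))
        if best == truth:
            hits += 1
    return hits / len(cases)


# Toy stand-in for the model: fixed scores that ignore the context.
TOY_SCORES = {"sleep": 0.20, "work": 0.10, "win": 0.05}


def toy_score(context, word):
    return TOY_SCORES.get(word, 0.0)


def toy_predict(context):
    return sorted(TOY_SCORES, key=TOY_SCORES.get, reverse=True)


open_cases = [("i am going to", "sleep"), ("i am going to", "work")]
choice_cases = [
    ("i am going to", ["work", "win"], "work"),
    ("i am going to", ["sleep", "banana"], "sleep"),
]

open_acc = open_accuracy(open_cases, toy_predict)
choice_acc = multiple_choice_accuracy(choice_cases, toy_score)
```

In the toy data the model misses "work" in open prediction because "sleep" outranks it, yet picks "work" correctly when the only alternative is "win". Choosing among a few candidates is a much easier task than ranking the whole vocabulary, which is consistent with the multiple-choice figure being far higher.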