What Am I Gonna Say?

M M

2026-01-04

🔍 What It Is

  • A Shiny app that predicts the next word given a user’s input
  • Trained on blog, news, and Twitter text from online datasets
  • Uses a stupid backoff n-gram model for fast and simple prediction
  • Accepts an input of 1–2 words and returns the 3 most likely next words

🧠 How It Works

  • Trigram model: predicts from the last 2 words
  • Bigram model: backs off to the last word
  • Unigram fallback: the most frequent words overall
  • Scoring (each backoff level multiplies the score by 0.4):
    • Trigrams × 1.0
    • Bigrams × 0.4
    • Unigrams × 0.16 (0.4 × 0.4)
  • Returns top 3 candidates by cumulative score
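The scoring above can be sketched in a few lines. This is a minimal illustration in Python, not the app's own code: the names `train`, `predict`, and `WEIGHTS` are hypothetical, the corpus is a toy example, and "cumulative score" is read here as the sum of each candidate's weighted relative frequency across the three levels. Strict Stupid Backoff would instead consult a lower level only when the higher one has no match.

```python
from collections import Counter, defaultdict

# Level weights taken from the slide: trigram, bigram, unigram.
WEIGHTS = {3: 1.0, 2: 0.4, 1: 0.16}


def train(lines):
    """Count unigrams and map each 1- and 2-word context to next-word counts."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    trigrams = defaultdict(Counter)
    for line in lines:
        tokens = line.lower().split()
        unigrams.update(tokens)
        for a, b in zip(tokens, tokens[1:]):
            bigrams[a][b] += 1
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            trigrams[(a, b)][c] += 1
    return unigrams, bigrams, trigrams


def predict(text, model, k=3):
    """Return the k candidates with the highest summed weighted score."""
    unigrams, bigrams, trigrams = model
    tokens = text.lower().split()
    scores = Counter()

    def add(counts, weight):
        # Add weight * relative frequency for every candidate at this level.
        total = sum(counts.values())
        for word, count in counts.items():
            scores[word] += weight * count / total

    if len(tokens) >= 2:
        add(trigrams.get(tuple(tokens[-2:]), Counter()), WEIGHTS[3])
    if tokens:
        add(bigrams.get(tokens[-1], Counter()), WEIGHTS[2])
    add(unigrams, WEIGHTS[1])
    return [word for word, _ in scores.most_common(k)]


# Toy corpus, for illustration only.
corpus = [
    "how are you", "how are you", "how are we",
    "where are they", "where are they", "you are here",
]
model = train(corpus)
print(predict("how are", model))  # ['you', 'we', 'they']
```

With an unseen context such as `predict("zzz qqq", model, k=1)`, only the unigram level contributes, so the sketch falls back to the most frequent word in the toy corpus.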

📊 Why It’s Awesome

  • Training corpus: 4 million+ lines
  • Speed: ~50 ms average prediction time in Shiny
  • Accuracy (manual eval):
    • Top-1 match: ~18%
    • Top-3 contains correct word: ~37%
    • Perplexity (approx.): ~140–180
  • Model size: <20 MB for deployment
  • 💡 Can be enhanced with smoothing, confidence bars, or word embeddings!
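The figures above come from a manual evaluation. As a rough sketch of how top-1 and top-3 rates like these could be computed automatically on held-out text, the hypothetical helper below slides over each line, feeds the two preceding words to any predictor, and checks whether the true next word is among the guesses. The function name, the `predict_fn` interface, and the sample data are assumptions for illustration.

```python
def top_k_accuracy(predict_fn, held_out_lines, k=3):
    """Return (top-1 rate, top-k rate) for a next-word predictor.

    predict_fn takes a string of context words and returns a ranked
    list of candidate next words (an assumed interface).
    """
    top1 = topk = total = 0
    for line in held_out_lines:
        tokens = line.lower().split()
        for i in range(2, len(tokens)):
            context = " ".join(tokens[i - 2:i])
            guesses = list(predict_fn(context))[:k]
            total += 1
            top1 += guesses[:1] == [tokens[i]]  # best guess is correct
            topk += tokens[i] in guesses        # correct word is in the top k
    if total == 0:
        return 0.0, 0.0
    return top1 / total, topk / total


# Stand-in predictor that always returns the same three words.
def fixed_guess(context):
    return ["you", "we", "they"]


sample = ["how are you", "how are they", "how are here"]
top1_rate, top3_rate = top_k_accuracy(fixed_guess, sample)
print(round(top1_rate, 2), round(top3_rate, 2))  # 0.33 0.67
```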

🖥️ Live Demo: Shiny App

  • Input: how are
  • Output:
    1. you
    2. we
    3. they
  • Input: let's go
  • Output:
    1. ahead
    2. back
    3. now

Simple, fast, intuitive — designed for fun interaction!