Next Word Prediction App

2025-09-01

Slide 1: Problem & Data

Goal: predict the next word given a user-typed phrase.
Data: SwiftKey English corpora (blogs, news, Twitter).
For the prototype we sampled 10,000 lines from each source.
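
The sampling step could look roughly like the sketch below. It is base R only; the helper name `sample_lines`, the seed, and the temporary demo file are assumptions for illustration and are not taken from the project code.

```r
# Hypothetical helper: draw up to n lines at random from one corpus file.
sample_lines <- function(path, n = 10000, seed = 1234) {
  lines <- readLines(path, encoding = "UTF-8", skipNul = TRUE, warn = FALSE)
  set.seed(seed)  # fixed seed so the sample is reproducible
  lines[sample.int(length(lines), min(n, length(lines)))]
}

# Demo on a temporary file so the sketch runs without the real corpus.
demo_path <- tempfile(fileext = ".txt")
writeLines(paste("line", 1:50), demo_path)
demo_sample <- sample_lines(demo_path, n = 10)
length(demo_sample)
```

In the real pipeline the same helper would be called once per source file (blogs, news, Twitter) with the default n = 10000.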

Slide 2: Model

Approach: Back-off n-gram model (trigram -> bigram -> unigram).
Implemented in R with tidytext and dplyr.
Fast, explainable, and small enough in memory to run inside a Shiny app.
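
To make the back-off idea concrete, here is a minimal sketch in base R. It is not the project's tidytext/dplyr code: the function names (`tokenize`, `count_ngrams`, `build_model`, `best_continuation`, `predict_next`) and the four-line toy corpus are assumptions for illustration. The predictor tries the last two words against the trigram counts, then the last word against the bigram counts, and finally falls back to the most frequent single word.

```r
# Split text into lowercase word tokens (letters and apostrophes only).
tokenize <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words[nzchar(words)]
}

# Count every n-gram of size n, line by line.
count_ngrams <- function(lines, n) {
  grams <- as.character(unlist(lapply(lines, function(line) {
    w <- tokenize(line)
    if (length(w) < n) return(character(0))
    starts <- seq_len(length(w) - n + 1)
    vapply(starts, function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  })))
  table(grams)
}

build_model <- function(lines) {
  list(uni = count_ngrams(lines, 1),
       bi  = count_ngrams(lines, 2),
       tri = count_ngrams(lines, 3))
}

# Most frequent word following `prefix` in `counts`, or NA if unseen.
best_continuation <- function(counts, prefix) {
  if (length(counts) == 0) return(NA_character_)
  hits <- counts[startsWith(names(counts), paste0(prefix, " "))]
  if (length(hits) == 0) return(NA_character_)
  sub(".* ", "", names(hits)[which.max(hits)])  # keep only the last word
}

# Back-off: trigram -> bigram -> unigram.
predict_next <- function(model, phrase) {
  w <- tokenize(phrase)
  k <- length(w)
  if (k >= 2) {
    guess <- best_continuation(model$tri, paste(w[k - 1], w[k]))
    if (!is.na(guess)) return(guess)
  }
  if (k >= 1) {
    guess <- best_continuation(model$bi, w[k])
    if (!is.na(guess)) return(guess)
  }
  names(model$uni)[which.max(model$uni)]  # most common word overall
}

# Toy corpus (made up for this sketch).
corpus <- c("i love new york", "i love new shoes",
            "i love new york city", "new york is big")
model <- build_model(corpus)
p_tri <- predict_next(model, "I love new")      # trigram hit: "york"
p_bi  <- predict_next(model, "welcome to new")  # backs off to bigram: "york"
p_uni <- predict_next(model, "hello")           # backs off to unigram: "new"
```

This sketch picks the raw most frequent continuation at each level and applies no discounting or smoothing.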

Slide 3: Performance

Simple evaluation: top-1 accuracy on a small held-out set (an example result is shown in the report).
Prediction time: <1 second per query on a typical laptop.
Trade-off: we chose a simple n-gram model over more advanced ML (e.g. an RNN), which suits a prototype and fast deployment.
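
The evaluation above can be sketched as follows: hide the last word of each held-out phrase, predict it from the rest, and report the share of exact matches. The helper `top1_accuracy`, the stand-in predictor `always_york`, and the three held-out phrases are assumptions made up for this example; they are not the report's code or data.

```r
# Top-1 accuracy: share of phrases whose predicted word equals the true last word.
top1_accuracy <- function(predict_fn, lines) {
  hits <- 0
  total <- 0
  for (line in lines) {
    w <- strsplit(tolower(line), "\\s+")[[1]]
    w <- w[nzchar(w)]
    if (length(w) < 2) next                       # need context plus a target
    context <- paste(w[-length(w)], collapse = " ")
    total <- total + 1
    hits <- hits + identical(predict_fn(context), w[length(w)])
  }
  if (total == 0) NA_real_ else hits / total
}

# Stand-in predictor (hypothetical): always guesses "york".
always_york <- function(context) "york"

held_out <- c("i love new york", "see you in new york", "thanks for the follow")
acc <- top1_accuracy(always_york, held_out)       # 2 of 3 phrases end in "york"

# Wall-clock time for a single query, in seconds.
elapsed <- system.time(always_york("i love new"))[["elapsed"]]
```

Passing the real model's prediction function as `predict_fn` would give the accuracy and per-query timing for the actual app.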

Slide 4: The App

Shiny app: text input, single-word prediction output.
Includes example test phrases.
Easy to deploy on shinyapps.io and share.
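
A minimal sketch of such an app, assuming the shiny package is installed. The input/output ids, the labels, and the placeholder `predict_next` (which always answers "the") are hypothetical; the real app would call the n-gram model instead.

```r
library(shiny)

# Placeholder predictor (hypothetical); swap in the real model lookup here.
predict_next <- function(phrase) {
  if (!nzchar(trimws(phrase))) "" else "the"
}

# One text box for the phrase, one text field for the predicted word.
ui <- fluidPage(
  titlePanel("Next Word Prediction"),
  textInput("phrase", "Type a phrase:", value = "thanks for"),
  strong("Predicted next word:"),
  textOutput("prediction")
)

server <- function(input, output, session) {
  # Re-runs automatically whenever the text input changes.
  output$prediction <- renderText(predict_next(input$phrase))
}

app <- shinyApp(ui, server)
# Run locally with: runApp(app)
```

Publishing to shinyapps.io is usually done with `rsconnect::deployApp()` after setting up account credentials.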

Slide 5: Demo & Next Steps

Demo five test phrases and show predictions.
Next steps: use a larger model, return top-3 predictions, add Kneser-Ney smoothing to the back-off scheme, and deploy the updated app.