Predicting Your Next Word App

Sathvik

Introduction

This application predicts the next word of a phrase using a backoff n-gram language model trained on English text from blogs, news articles, and Twitter.

The goal is to demonstrate a data product, deployed with Shiny, that applies natural language processing to make predictions in real time.

The Algorithm

We built a backoff n-gram model:

  • Trigram Model: Uses the last two words to predict the next word.
  • Bigram Model: If no trigram matches, uses the last word only.
  • Fallback: If no bigram matches either, defaults to “the”, the most frequent word in English.
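
The three backoff levels above can be sketched in base R as follows. This is a minimal illustration, not the app's actual code: the toy `trigrams` and `bigrams` tables, the `tokenize()` helper, and the `predict_next()` function name are all assumptions made for the example.

```r
# Toy n-gram tables (hypothetical data, not the app's real tables).
# Each row maps a context to its single most likely next word.
trigrams <- data.frame(
  w1   = c("i", "thank", "one"),
  w2   = c("love", "you", "of"),
  pred = c("you", "for", "the"),
  stringsAsFactors = FALSE
)
bigrams <- data.frame(
  w1   = c("love", "you", "of"),
  pred = c("it", "are", "the"),
  stringsAsFactors = FALSE
)

# Lowercase the phrase, replace anything that is not a letter,
# apostrophe, or space, then split on whitespace.
tokenize <- function(phrase) {
  cleaned <- gsub("[^a-z' ]", " ", tolower(phrase))
  words <- strsplit(cleaned, " +")[[1]]
  words[nzchar(words)]
}

predict_next <- function(phrase, tri = trigrams, bi = bigrams,
                         fallback = "the") {
  words <- tokenize(phrase)
  n <- length(words)

  # 1. Trigram level: match on the last two words.
  if (n >= 2) {
    hit <- tri$pred[tri$w1 == words[n - 1] & tri$w2 == words[n]]
    if (length(hit) > 0) return(hit[1])
  }

  # 2. Bigram level: match on the last word only.
  if (n >= 1) {
    hit <- bi$pred[bi$w1 == words[n]]
    if (length(hit) > 0) return(hit[1])
  }

  # 3. Nothing matched: return the default word.
  fallback
}

predict_next("I love")          # trigram match: "you"
predict_next("We really love")  # no trigram, bigram match: "it"
predict_next("xyzzy")           # no match at any level: "the"
```

Storing only the top prediction per context keeps each lookup to a simple equality filter; a fuller version might keep several ranked candidates per context instead.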

Data source:

  • A 3% sample of en_US.blogs.txt, en_US.news.txt, and en_US.twitter.txt.
  • Cleaned, then tokenized with tidytext.
  • Only the most common n-grams are kept, to reduce memory use and speed up lookups.
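
The preparation steps above (sample, clean, tokenize, keep the most common n-grams) might look roughly like this. The app itself tokenizes with tidytext; this sketch deliberately swaps in base R so it runs without extra packages. The tiny inline corpus and the helper names `sample_lines()`, `clean_tokens()`, `ngrams()`, and `top_ngrams()` are assumptions for illustration only.

```r
set.seed(42)  # make the random sample reproducible

# Stand-in corpus (hypothetical). In a real pipeline these lines would
# come from readLines() on the three en_US.*.txt files.
lines <- c(
  "Thank you for the follow!",
  "Thank you for the update.",
  "I love you for the record",
  "One of the best days ever",
  "Thank you for coming"
)

# Keep a random fraction of the lines (at least one line).
sample_lines <- function(x, fraction) {
  x[sample.int(length(x), size = ceiling(length(x) * fraction))]
}

# Lowercase, drop punctuation and digits, split into words.
clean_tokens <- function(line) {
  words <- strsplit(gsub("[^a-z' ]", " ", tolower(line)), " +")[[1]]
  words[nzchar(words)]
}

# All n-grams of one token vector, as space-joined strings.
ngrams <- function(words, n) {
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Count n-grams across all lines and keep the top_k most frequent.
top_ngrams <- function(lines, n, top_k) {
  all_grams <- unlist(lapply(lines, function(l) ngrams(clean_tokens(l), n)))
  counts <- sort(table(all_grams), decreasing = TRUE)
  head(data.frame(ngram = names(counts),
                  count = as.integer(counts),
                  stringsAsFactors = FALSE),
       top_k)
}

# The deck uses fraction = 0.03; the toy corpus is so small that the
# whole thing is kept here.
sampled <- sample_lines(lines, fraction = 1)
top_ngrams(sampled, n = 3, top_k = 3)
```

Pruning to the `top_k` rows is what trades a little coverage for a much smaller lookup table.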

The App

Try it out: https://sathvikreddy.shinyapps.io/capstoneshinyapp/

Instructions:

  1. Enter a phrase into the input box.
  2. Click “Predict”.
  3. The app shows the predicted next word in real time.
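
A Shiny app with this type-then-click flow could be wired up roughly as below. This is a sketch under assumptions, not the deployed app's source: the input and output IDs (`phrase`, `predict`, `next_word`) and the stub `predict_next()` are made up for the example, and the real app would call its n-gram lookup instead of the stub.

```r
library(shiny)

# Hypothetical stand-in for the model lookup, so the sketch is
# self-contained. It knows one context and otherwise returns "the".
predict_next <- function(phrase) {
  words <- strsplit(tolower(trimws(phrase)), " +")[[1]]
  words <- words[nzchar(words)]
  if (length(words) > 0 && words[length(words)] == "thank") "you" else "the"
}

ui <- fluidPage(
  titlePanel("Next Word Prediction"),
  textInput("phrase", "Enter a phrase:"),
  actionButton("predict", "Predict"),
  textOutput("next_word")
)

server <- function(input, output, session) {
  # eventReactive() recomputes only when the Predict button is clicked,
  # not on every keystroke.
  prediction <- eventReactive(input$predict, predict_next(input$phrase))
  output$next_word <- renderText(prediction())
}

app <- shinyApp(ui, server)

# Launch only in an interactive session so the script can be sourced safely.
if (interactive()) runApp(app)
```

Tying the prediction to the button click keeps the server from redoing the lookup while the user is still typing.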

The model is loaded from .rds files for efficiency and uses little memory once deployed.
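
As a rough illustration of the .rds round trip, the n-gram tables can be written once offline and read back when the app starts. The file name `bigrams.rds`, the temporary directory, and the toy table are assumptions for this sketch.

```r
# Toy lookup table (hypothetical data).
bigrams <- data.frame(
  w1   = c("love", "you"),
  pred = c("it", "are"),
  stringsAsFactors = FALSE
)

# Offline step: serialize the table to a compressed .rds file.
model_path <- file.path(tempdir(), "bigrams.rds")  # hypothetical file name
saveRDS(bigrams, model_path)

# App start-up: read the table back into memory in a single call.
model <- readRDS(model_path)

all.equal(model, bigrams)                 # the round trip preserves the table
format(object.size(model), units = "Kb")  # in-memory footprint of the table
```

In a Shiny app, a call like `readRDS()` is usually placed outside the server function so the table is loaded once per R process rather than once per user session.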

User Experience

The app is designed to be:

  • Simple: Just type and click.
  • Fast: Predicts instantly for short phrases.
  • Accessible: Works on mobile or desktop.
  • Smart: Uses a backoff model built from patterns in real English text.

This app could be adapted to improve keyboard prediction, chatbots, or voice assistants.

Why It Stands Out

  • Efficient: Memory-friendly and fast
  • Accurate: Uses real-world English sources
  • Scalable: Can expand to larger models or deep learning
  • Deployed: Fully functional on shinyapps.io
  • Ready to integrate into real-world applications

Would you hire me?

With this app, I’ve shown that I can analyze large text datasets, build predictive models, and deploy data products effectively.