2025-07-13

Project Overview

  • Built a smart next-word prediction app
  • Trained on:
    • Blogs
    • News
    • Twitter
  • Predicts the next word using n-gram frequency tables

Algorithm

  • Cleaned and tokenized data
  • Built:
    • Unigrams
    • Bigrams
    • Trigrams
  • Back-off strategy:
    • Trigram → Bigram → Unigram
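
The cleaning and n-gram counting steps above can be sketched in a few lines of base R. This is a minimal illustration, not the app's actual code: the toy corpus and the helper names tokenize and build_ngrams are assumptions made for the example, and the real pipeline used tidytext, dplyr, and stringr on the full blogs, news, and Twitter data.

```r
# Toy corpus standing in for the blogs, news, and Twitter text (hypothetical data).
corpus <- c(
  "I would like to go home.",
  "I would like to eat!",
  "We would like a table"
)

# Hypothetical helper: lowercase, replace everything except letters,
# apostrophes, and spaces with a space, then split on runs of spaces.
tokenize <- function(text) {
  text <- gsub("[^a-z' ]", " ", tolower(text))
  tokens <- strsplit(text, " +")[[1]]
  tokens[tokens != ""]
}

# Hypothetical helper: count every run of n consecutive tokens, one text
# at a time so n-grams never span two texts, sorted by descending frequency.
build_ngrams <- function(texts, n) {
  grams <- unlist(lapply(texts, function(text) {
    tokens <- tokenize(text)
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(starts,
           function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  counts <- sort(table(grams), decreasing = TRUE)
  data.frame(ngram = names(counts), freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

unigrams <- build_ngrams(corpus, 1)
bigrams  <- build_ngrams(corpus, 2)
trigrams <- build_ngrams(corpus, 3)

# Most frequent bigrams in the toy corpus
print(head(bigrams, 3))
```

Each table could then be written to disk with saveRDS() so the app does not have to recount the corpus at startup.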

Example: predict_next_word("I would like") → "to"
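
One way the back-off lookup behind predict_next_word could work is sketched below in base R. The tables are hand-made toy data with invented counts, and best_match is a hypothetical helper; only the function name predict_next_word and the trigram, bigram, unigram order come from the slides. The sketch picks the most frequent continuation at the first level that has a match and applies no smoothing or discounting.

```r
# Hand-made toy tables (hypothetical counts): each row maps a context
# to one candidate next word.
trigrams <- data.frame(context = c("would like", "would like", "are you"),
                       word    = c("to", "a", "sure"),
                       freq    = c(12, 3, 9),
                       stringsAsFactors = FALSE)
bigrams <- data.frame(context = c("like", "of", "of"),
                      word    = c("to", "the", "a"),
                      freq    = c(20, 40, 7),
                      stringsAsFactors = FALSE)
unigrams <- data.frame(word = c("the", "to", "and"),
                       freq = c(100, 80, 60),
                       stringsAsFactors = FALSE)

# Hypothetical helper: most frequent word following a context, or NULL.
best_match <- function(tbl, context) {
  hits <- tbl[tbl$context == context, ]
  if (nrow(hits) == 0) return(NULL)
  hits$word[which.max(hits$freq)]
}

predict_next_word <- function(phrase) {
  # Same cleaning as for training text: lowercase, keep letters and apostrophes.
  phrase <- gsub("[^a-z' ]", " ", tolower(phrase))
  tokens <- strsplit(trimws(phrase), " +")[[1]]
  n <- length(tokens)

  # 1. Trigram table: look up the last two words.
  if (n >= 2) {
    guess <- best_match(trigrams, paste(tokens[(n - 1):n], collapse = " "))
    if (!is.null(guess)) return(guess)
  }
  # 2. Bigram table: look up the last word only.
  if (n >= 1) {
    guess <- best_match(bigrams, tokens[n])
    if (!is.null(guess)) return(guess)
  }
  # 3. Unigram table: fall back to the most frequent word overall.
  unigrams$word[which.max(unigrams$freq)]
}

print(predict_next_word("I would like"))  # matched in the trigram table
print(predict_next_word("a piece of"))    # no trigram match, bigram table used
print(predict_next_word("zzz qqq"))       # nothing matches, unigram fallback
```

With real data the unigram fallback is almost always a stop word, which is why the longer contexts are tried first.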

The Shiny App

  • How it works:
    • Enter a phrase in the input box
    • Click “Predict”
    • App returns the most likely next word
  • Technologies Used:
    • shiny – for the web interface
    • tidytext, dplyr, stringr – for preprocessing and prediction
    • .rds files – for fast loading of precomputed n-gram tables
  • Try It:
    • “The president of” → ?
    • “We are going” → ?
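
The input box, Predict button, and .rds loading described above could be wired together roughly as follows. This is a sketch under assumptions, not the deployed app: the input IDs, the file name bigrams.rds, the toy table, and the helper lookup_next_word are all hypothetical, and the lookup uses only the last word instead of the full back-off.

```r
library(shiny)

# Toy bigram table (hypothetical data), saved and reloaded as .rds to
# mirror how a precomputed n-gram table would be loaded at startup.
model_path <- file.path(tempdir(), "bigrams.rds")
saveRDS(data.frame(context = c("like", "thank"),
                   word    = c("to", "you"),
                   stringsAsFactors = FALSE),
        model_path)
bigrams <- readRDS(model_path)

# Hypothetical helper: look up the last word of the phrase only.
lookup_next_word <- function(phrase) {
  tokens <- strsplit(trimws(tolower(phrase)), " +")[[1]]
  if (length(tokens) == 0) return("")
  hit <- bigrams$word[bigrams$context == tokens[length(tokens)]]
  if (length(hit) == 0) "(no prediction)" else hit[1]
}

ui <- fluidPage(
  titlePanel("Next-Word Predictor"),
  textInput("phrase", "Enter a phrase:"),
  actionButton("predict", "Predict"),
  h4("Predicted next word:"),
  textOutput("prediction")
)

server <- function(input, output, session) {
  # Recompute only when the button is clicked, not on every keystroke.
  prediction <- eventReactive(input$predict, lookup_next_word(input$phrase))
  output$prediction <- renderText(prediction())
}

# Build the app object; in an interactive session, start it with runApp(app).
app <- shinyApp(ui, server)
```

Reading the table once at the top of the script, outside server, means the model is loaded at startup and reused rather than reloaded on every click.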

User Experience

  • 🧠 Fast & Intelligent:
    • Predictions load in under a second
    • Handles incomplete or informal input by backing off to shorter n-grams
  • 🖥️ Interface:
    • Clean, distraction-free UI
    • Works on mobile and desktop
  • 🎯 Applications:
    • Predictive typing
    • Chatbots and AI Assistants
    • Language learning apps

Why It Matters

  • 🔍 Problem Solved:
    • Reduces typing effort
    • Suggests words that fit the preceding context
    • Improves UX in language-based apps
  • 🚀 Business Impact:
    • Easy to integrate into messaging apps, editors, and keyboards
    • Could be extended with deep learning models (e.g., GPT-based models)
  • Would I hire this data scientist?
    • Yes – built a full NLP pipeline
    • Deployed a real-time product
    • Showed clear user-centric thinking