2025-07-27
Introduction
- This project is the final capstone of the Coursera Data Science Specialization.
- Objective: Build a Shiny app that predicts the next word based on user input.
- Based on the SwiftKey NLP problem, using real-world text datasets (blogs, news, Twitter).
- The app is available at: https://9rks8u-shashank-r.shinyapps.io/finalproject/
Model Building
- Created N-grams: unigrams, bigrams, trigrams.
- Used tokenizers, tidytext, and data.table for speed.
- Stored the n-gram frequency tables for lookup at prediction time.
- Example:
  - Input: “I love”
  - Trigram match → “you”
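
The frequency tables described above can be sketched roughly as follows. This is a minimal illustration in base R, not the project's actual code (which used tokenizers, tidytext, and data.table). The toy corpus and the helper names `tokenize` and `ngram_counts` are assumptions made up for this example.

```r
# Toy corpus standing in for the blogs/news/Twitter samples (made-up text).
corpus <- c("I love you", "I love you so much", "I love coffee")

# Split each line into lowercase word tokens, dropping empty strings.
tokenize <- function(lines) {
  lapply(strsplit(tolower(lines), "[^a-z']+"), function(w) w[nzchar(w)])
}

# Collect every run of n consecutive words, then count how often each occurs.
# Returns a named integer vector sorted by decreasing count.
ngram_counts <- function(tokens, n) {
  grams <- unlist(lapply(tokens, function(w) {
    if (length(w) < n) return(character(0))
    starts <- seq_len(length(w) - n + 1)
    vapply(starts, function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  counts <- table(grams)
  sort(setNames(as.integer(counts), names(counts)), decreasing = TRUE)
}

tokens   <- tokenize(corpus)
unigrams <- ngram_counts(tokens, 1)
bigrams  <- ngram_counts(tokens, 2)
trigrams <- ngram_counts(tokens, 3)

trigrams[["i love you"]]  # 2: "i love you" appears in two of the three lines
```

On a real corpus the same counts would normally be kept in a keyed table rather than a named vector, so that lookups stay fast as the tables grow.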
Prediction Algorithm & Shiny App
- Backoff strategy:
  - Try a trigram match on the last two words.
  - If none is found, fall back to a bigram match on the last word.
  - If there is still no match, return the most frequent unigram.
- Deployed as a Shiny app on shinyapps.io.
- The app predicts the next word as soon as text is entered in the input box.
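
The backoff lookup can be sketched as below. This is a simplified base R illustration under stated assumptions, not the deployed app's code: the three frequency tables are tiny hypothetical examples, and `best_completion` and `predict_next` are names invented here. It picks the highest raw count at each level and does no smoothing or discounting.

```r
# Hypothetical toy frequency tables: names are n-grams, values are counts.
unigrams <- c("the" = 5L, "you" = 3L, "coffee" = 1L)
bigrams  <- c("love you" = 2L, "love coffee" = 1L, "of the" = 4L)
trigrams <- c("i love you" = 2L, "i love coffee" = 1L)

# Return the final word of the most frequent n-gram that starts with `prefix`,
# or NA if no n-gram in `counts` starts with that prefix.
best_completion <- function(counts, prefix) {
  hits <- counts[startsWith(names(counts), paste0(prefix, " "))]
  if (length(hits) == 0) return(NA_character_)
  top <- names(hits)[which.max(hits)]
  sub("^.* ", "", top)  # keep only the last word
}

predict_next <- function(text) {
  words <- strsplit(tolower(trimws(text)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  k <- length(words)

  # 1. Trigram: match on the last two words.
  if (k >= 2) {
    guess <- best_completion(trigrams, paste(words[(k - 1):k], collapse = " "))
    if (!is.na(guess)) return(guess)
  }
  # 2. Bigram: match on the last word.
  if (k >= 1) {
    guess <- best_completion(bigrams, words[k])
    if (!is.na(guess)) return(guess)
  }
  # 3. Unigram: fall back to the most frequent word overall.
  names(unigrams)[which.max(unigrams)]
}

predict_next("I love")       # "you" (trigram match)
predict_next("we love")      # "you" (no trigram, bigram match)
predict_next("hello world")  # "the" (no match, top unigram)
```

In a Shiny app, a function like `predict_next` would typically be called from a reactive output bound to the text input, so the prediction refreshes whenever the input changes.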
Conclusion