2026-07-03

Project Overview

Data Science Capstone Project

Objective

  • Build a Next Word Prediction Application
  • Predict the next word based on user input
  • Develop an interactive web application using Shiny

Dataset and Prediction Algorithm

Dataset

The prediction model was built using three English-language datasets:

  • Blogs
  • News
  • Twitter

Prediction Algorithm

  • Read and cleaned the text data
  • Converted text to lowercase
  • Removed punctuation and extra spaces
  • Generated a bigram model (frequency counts of adjacent word pairs)
  • Predicted the next word as the most frequent follower of the last word entered
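
The steps above can be sketched in a few lines of R. This is a minimal illustration, not the project's actual code: the toy corpus and the function names (clean_text, build_bigrams, predict_next) are assumptions, and it uses base R only so it runs without extra packages, although stringi and dplyr could handle the same steps.

```r
# Toy corpus standing in for the Blogs, News, and Twitter text (illustrative only).
corpus <- c(
  "Thank you for the follow!",
  "Thank you so much.",
  "Thank you for reading,  and thank you for sharing."
)

# Lowercase, remove punctuation, and collapse extra spaces.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

# Count adjacent word pairs within each line, most frequent first.
build_bigrams <- function(lines) {
  tokens <- strsplit(clean_text(lines), " ", fixed = TRUE)
  pairs <- lapply(tokens, function(w) {
    if (length(w) < 2) return(NULL)
    data.frame(first = head(w, -1), second = tail(w, -1),
               stringsAsFactors = FALSE)
  })
  pairs <- do.call(rbind, pairs)
  counts <- aggregate(n ~ first + second, data = cbind(pairs, n = 1L), FUN = sum)
  counts[order(-counts$n), ]
}

# Return the most frequent follower of the last word, or NA if it was never seen.
predict_next <- function(phrase, bigrams) {
  words <- strsplit(clean_text(phrase), " ", fixed = TRUE)[[1]]
  if (length(words) == 0) return(NA_character_)
  last_word <- words[length(words)]
  matches <- bigrams[bigrams$first == last_word, ]
  if (nrow(matches) == 0) return(NA_character_)
  matches$second[which.max(matches$n)]
}

bigrams <- build_bigrams(corpus)
predict_next("Thank you", bigrams)  # "for"
predict_next("zebra", bigrams)      # NA
```

In this toy corpus "you" is followed by "for" three times and by "so" once, so "for" wins. A word that never appears as the first half of a pair returns NA, which an app would need to handle.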

Shiny Application

Technologies Used

  • R
  • RStudio
  • Shiny
  • dplyr
  • stringi
  • shinyapps.io
  • RPubs

Future Improvements

  • Add trigram and four-gram models
  • Improve prediction accuracy
  • Optimize response time
  • Enhance the user interface
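
One possible way to combine higher-order models is a simple backoff: try the longest context first and fall back to shorter n-grams when there is no match. The sketch below assumes this approach. The corpus and the function names (tokenize, count_ngrams, predict_backoff) are illustrative, and it picks the raw most frequent match without the discounting used in Katz backoff.

```r
# Toy corpus (illustrative only).
corpus <- c(
  "thank you for the follow",
  "thank you for the support",
  "thank you so much",
  "see you soon"
)

# Lowercase, remove punctuation, collapse spaces, and split into words.
tokenize <- function(x) {
  x <- trimws(gsub("\\s+", " ", gsub("[[:punct:]]", "", tolower(x))))
  strsplit(x, " ", fixed = TRUE)
}

# Frequency table of n-grams keyed by "w1 w2 ... wn", most frequent first.
count_ngrams <- function(lines, n) {
  grams <- unlist(lapply(tokenize(lines), function(w) {
    if (length(w) < n) return(character(0))
    starts <- seq_len(length(w) - n + 1)
    vapply(starts, function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  sort(table(grams), decreasing = TRUE)
}

# Try the 4-gram table first, then trigrams, then bigrams.
predict_backoff <- function(phrase, tables) {
  words <- tokenize(phrase)[[1]]
  for (n in sort(as.integer(names(tables)), decreasing = TRUE)) {
    k <- n - 1  # number of context words this table needs
    hits <- tables[[as.character(n)]]
    if (length(words) < k || length(hits) == 0) next
    prefix <- paste0(paste(tail(words, k), collapse = " "), " ")
    hits <- hits[startsWith(names(hits), prefix)]
    # Tables are sorted by count, so the first hit is the most frequent.
    if (length(hits) > 0) return(sub(".* ", "", names(hits)[1]))
  }
  NA_character_
}

orders <- 2:4
tables <- setNames(lapply(orders, function(n) count_ngrams(corpus, n)), orders)

predict_backoff("Thank you for", tables)    # "the" (4-gram match)
predict_backoff("well, thank you", tables)  # "for" (backs off to trigrams)
predict_backoff("miss you", tables)         # "for" (backs off to bigrams)
```

Longer contexts tend to give more specific predictions but match less often, which is why the fallback to shorter n-grams matters.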

Thank You