Monalisa H V
1/8/2026
## Introduction
This project presents a next-word prediction application built with R and Shiny.
Given a phrase entered by the user, the app predicts the most likely next word.
## Data Used
The following datasets were used:
- US Blogs
- US News
- US Twitter
Together, these datasets contain millions of English sentences, which serve as the training corpus for the language model.
## Prediction Algorithm
- Text cleaning and tokenization
- N-gram models (unigram, bigram, trigram)
- Frequency-based probability estimation
- Backoff strategy when higher-order n-grams are unavailable
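
The steps above can be sketched in base R as follows. This is a minimal illustration, not the app's actual code: the function names (`tokenize`, `count_ngrams`, `predict_next`), the cleaning rules, and the four-sentence toy corpus are all assumptions made for the example. The backoff shown is the simplest possible form: try the trigram context, then the bigram context, then fall back to the most frequent unigram.

```r
# Hypothetical helpers for illustration only; base R, no packages.

# Text cleaning and tokenization: lowercase, keep letters and apostrophes,
# then split on whitespace.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]+", " ", text)
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens[nzchar(tokens)]
}

# Count all n-grams of order n in a character vector of sentences.
# Returns a named vector: names are space-joined n-grams, values are counts.
count_ngrams <- function(sentences, n) {
  grams <- unlist(lapply(sentences, function(s) {
    tok <- tokenize(s)
    if (length(tok) < n) return(character(0))
    starts <- seq_len(length(tok) - n + 1)
    vapply(starts,
           function(i) paste(tok[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  tab <- table(grams)
  counts <- as.vector(tab)
  names(counts) <- names(tab)
  counts
}

# Predict the next word with a simple backoff:
# trigram context (last 2 words) -> bigram context (last word) -> top unigram.
predict_next <- function(phrase, model) {
  tok <- tokenize(phrase)
  for (k in c(2, 1)) {
    counts <- model[[k + 1]]
    if (length(tok) < k || length(counts) == 0) next
    prefix <- paste0(paste(tail(tok, k), collapse = " "), " ")
    hits <- counts[startsWith(names(counts), prefix)]
    if (length(hits) > 0) {
      # Frequency-based estimate: P(word | context) = count / total for context.
      probs <- hits / sum(hits)
      best <- names(probs)[which.max(probs)]
      return(substring(best, nchar(prefix) + 1))
    }
  }
  # No usable context matched: return the most frequent single word.
  names(model[[1]])[which.max(model[[1]])]
}

# Toy corpus (made up for this example).
corpus <- c("I love data science",
            "I love data analysis",
            "I love data science projects",
            "you love R")

# model[[1]] = unigrams, model[[2]] = bigrams, model[[3]] = trigrams.
model <- lapply(1:3, function(n) count_ngrams(corpus, n))

predict_next("I love data", model)  # trigram match
predict_next("we love", model)      # backs off to the bigram model
predict_next("hello", model)        # backs off to the unigram model
```

On this toy corpus the three calls exercise the trigram, bigram, and unigram levels in turn. A production model would typically prune rare n-grams and store the tables in a faster lookup structure, which this sketch does not attempt.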
## Shiny Application
- User enters a phrase
- Application predicts the next word
- Simple and responsive interface
- Designed for real-time prediction
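
A rough sketch of how such an interface might be wired together in Shiny is shown below. The input and output IDs (`phrase`, `prediction`), the layout, and the `predict_next_stub` function are assumptions for illustration; the stub uses a tiny made-up lookup table so the example runs on its own, whereas the real app would call its n-gram model here.

```r
library(shiny)

# Placeholder predictor (hypothetical): stands in for the n-gram model.
predict_next_stub <- function(phrase) {
  lookup <- c("thank" = "you", "good" = "morning")  # illustrative data only
  words <- strsplit(tolower(trimws(phrase)), "\\s+")[[1]]
  if (length(words) == 0) return("")
  last <- tail(words, 1)
  if (last %in% names(lookup)) lookup[[last]] else "the"
}

# UI: one text box for the phrase and one text output for the prediction.
ui <- fluidPage(
  titlePanel("Next Word Prediction"),
  textInput("phrase", "Enter a phrase:"),
  strong("Predicted next word:"),
  textOutput("prediction")
)

# Server: renderText() re-runs whenever input$phrase changes,
# so the prediction updates as the user types.
server <- function(input, output, session) {
  output$prediction <- renderText({
    predict_next_stub(input$phrase)
  })
}

# Launch only in an interactive session.
if (interactive()) runApp(shinyApp(ui, server))
```

Because the output depends reactively on `input$phrase`, no submit button is needed; whether the real app uses a button or live updates is not stated in the slides.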