Next Word Prediction Project

Slide 1: Problem Statement

Predict the next word based on user input
Useful for autocomplete systems
Built using R and NLP techniques

Slide 2: Data Used

Blogs dataset
News dataset
Twitter dataset
Combined and sampled for training
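
Sketch: combining and sampling
The step above can be illustrated with a minimal sketch. The project itself was built in R; Python is used here only for illustration. The function name combine_and_sample, the 50% sampling fraction, the seed, and the toy sentences are all assumptions, not values from the project.

```python
import random

def combine_and_sample(sources, fraction, seed=42):
    """Pool lines from every source and keep a random fraction of them."""
    combined = [line for lines in sources.values() for line in lines]
    if not combined:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(len(combined), max(1, int(len(combined) * fraction)))
    return rng.sample(combined, k)

# Toy stand-ins for the three datasets (hypothetical content).
sources = {
    "blogs": ["I love writing about food", "Travel is the best teacher"],
    "news": ["The market closed higher today", "Officials announced a new plan"],
    "twitter": ["cant wait for the weekend", "thanks for the follow"],
}

# The 0.5 fraction is an assumed value chosen for the toy data.
training_lines = combine_and_sample(sources, fraction=0.5)
print(len(training_lines))  # 3
```

Sampling keeps the n-gram tables small enough to build and query quickly, at the cost of missing some rare phrases.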

Slide 3: Approach

Text preprocessing (lowercase, clean text)
Tokenization into unigrams, bigrams, trigrams
Frequency-based n-gram model
Backoff strategy used for prediction
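
Sketch: preprocessing and n-gram counts
A minimal sketch of the first three steps, assuming a simple cleaning rule (lowercase, keep only letters, apostrophes, and spaces). The project itself was built in R; Python is used here only for illustration, and the function names preprocess and ngram_counts are hypothetical. The project's actual cleaning rules are not given on the slide.

```python
import re
from collections import Counter

def preprocess(text):
    """Lowercase the text and split it into word tokens.

    Assumed cleaning rule: anything other than a-z, apostrophes,
    and spaces is replaced by a space.
    """
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return text.split()

def ngram_counts(lines, n):
    """Count every run of n consecutive tokens, line by line."""
    counts = Counter()
    for line in lines:
        tokens = preprocess(line)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

lines = ["I love New York!", "I love new ideas."]
unigrams = ngram_counts(lines, 1)
bigrams = ngram_counts(lines, 2)
trigrams = ngram_counts(lines, 3)

print(bigrams[("i", "love")])          # 2
print(trigrams[("i", "love", "new")])  # 2
```

Counting per line means n-grams never span two separate documents, which is one reasonable design choice among several.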

Slide 4: Model Logic

Trigram match → highest priority
Else bigram match
Else unigram fallback
Final output = most frequent next word
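
Sketch: backoff lookup
The priority order above can be sketched as follows. The project itself was built in R; Python is used here only for illustration. The function names, the toy corpus, and the tie-breaking behaviour (the first candidate encountered wins) are assumptions, not details from the project.

```python
from collections import Counter

def build_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def best_continuation(counts, context):
    """Most frequent last word among n-grams that start with `context`."""
    candidates = {gram[-1]: c for gram, c in counts.items() if gram[:-1] == context}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

def predict_next(text, unigrams, bigrams, trigrams):
    words = text.lower().split()
    # 1. Trigram match: use the last two words as context.
    if len(words) >= 2:
        guess = best_continuation(trigrams, tuple(words[-2:]))
        if guess is not None:
            return guess
    # 2. Else bigram match: use only the last word.
    if len(words) >= 1:
        guess = best_continuation(bigrams, tuple(words[-1:]))
        if guess is not None:
            return guess
    # 3. Else unigram fallback: the most frequent word overall.
    return best_continuation(unigrams, ())

# Toy corpus (hypothetical).
tokens = "the cat sat on the mat the cat ate the fish".split()
uni, bi, tri = (build_counts(tokens, n) for n in (1, 2, 3))

print(predict_next("sat on", uni, bi, tri))           # the  (trigram match)
print(predict_next("dog the", uni, bi, tri))          # cat  (bigram match)
print(predict_next("purple elephant", uni, bi, tri))  # the  (unigram fallback)
```

This is plain backoff with no probability discounting: a lower-order table is consulted only when the higher-order one has no match at all.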

Slide 5: Deployment

Built an interactive Shiny app
Hosted on shinyapps.io
Real-time prediction interface