Slide 1: Problem Statement
- Predict the next word based on user input
- Useful for autocomplete systems
- Built using R and NLP techniques
Slide 2: Data Used
- Blogs dataset
- News dataset
- Twitter dataset
- Combined and sampled for training
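
The combine-and-sample step could look roughly like the sketch below. The three small vectors are hypothetical stand-ins for the blogs, news, and Twitter files, and the sampling fraction and seed are assumptions, since the slides do not state them.

```r
# Hypothetical stand-ins for the three corpora; the real data would be
# read from files, for example with readLines().
blogs   <- c("I love writing about food", "My trip to the coast was great")
news    <- c("The council voted on the budget", "Markets closed higher today")
twitter <- c("cant wait for the weekend", "thanks for the follow")

# Draw a random fraction of lines from one source (at least one line).
sample_lines <- function(lines, fraction) {
  n <- max(1, floor(length(lines) * fraction))
  sample(lines, n)
}

set.seed(42)      # make the sample reproducible
fraction <- 0.5   # assumed sampling rate, not taken from the slides

# Sample each source separately, then combine into one training set.
training <- c(sample_lines(blogs, fraction),
              sample_lines(news, fraction),
              sample_lines(twitter, fraction))
length(training)
```

Sampling each source separately keeps all three text styles represented in the training set, whatever their relative sizes.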
Slide 3: Approach
- Text preprocessing (lowercasing, text cleaning)
- Tokenization into unigrams, bigrams, trigrams
- Frequency-based n-gram model
- Backoff strategy used for prediction
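
A minimal base R sketch of the preprocessing, tokenization, and frequency-counting steps above. The cleaning rules (keep only letters, apostrophes, and spaces) and all function names are assumptions for illustration; the slides do not specify them.

```r
# Lowercase the text and keep only letters, apostrophes, and spaces.
# These cleaning rules are an assumption.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z' ]", " ", x)
  x <- gsub(" +", " ", x)
  trimws(x)
}

# Split one cleaned line into word tokens.
tokenize <- function(line) {
  tokens <- strsplit(line, " ", fixed = TRUE)[[1]]
  tokens[nzchar(tokens)]
}

# Build all n-grams of order n from a token vector.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Count n-grams across all lines, most frequent first.
ngram_counts <- function(lines, n) {
  grams <- unlist(lapply(clean_text(lines),
                         function(l) make_ngrams(tokenize(l), n)))
  sort(table(grams), decreasing = TRUE)
}

corpus   <- c("I love R.", "I love data!", "You love R")
unigrams <- ngram_counts(corpus, 1)
bigrams  <- ngram_counts(corpus, 2)
trigrams <- ngram_counts(corpus, 3)
```

In this toy corpus "love" occurs three times and the bigram "i love" twice. N-grams are built per line, so they never span two documents.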
Slide 4: Model Logic
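- Trigram match on the last two input words → highest priority
- Else bigram match on the last input word
- Else unigram fallback (most frequent word overall)
- Final output = most frequent next word at the first level that matches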
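
The backoff order above can be sketched as follows. The count tables are toy values and the function names are hypothetical; this is a simple "first level that matches" backoff with no smoothing or discounting, which is an assumption about the model.

```r
# Toy frequency tables (hypothetical counts), named by n-gram.
trigram_counts <- c("i love r" = 3, "i love data" = 1, "thank you for" = 2)
bigram_counts  <- c("love r" = 3, "love data" = 4, "you for" = 2)
unigram_counts <- c("the" = 10, "love" = 7, "r" = 3)

# Most frequent final word among n-grams that start with `prefix`.
# Returns NA if no n-gram matches.
best_continuation <- function(counts, prefix) {
  key  <- paste0(prefix, " ")
  hits <- counts[startsWith(names(counts), key)]
  if (length(hits) == 0) return(NA_character_)
  top <- names(hits)[which.max(hits)]
  sub(".* ", "", top)  # keep only the last word
}

predict_next <- function(input) {
  words <- strsplit(tolower(trimws(input)), " +")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)

  # 1. Trigram level: match on the last two words.
  if (n >= 2) {
    guess <- best_continuation(trigram_counts, paste(words[n - 1], words[n]))
    if (!is.na(guess)) return(guess)
  }
  # 2. Bigram level: match on the last word.
  if (n >= 1) {
    guess <- best_continuation(bigram_counts, words[n])
    if (!is.na(guess)) return(guess)
  }
  # 3. Unigram fallback: most frequent word overall.
  names(unigram_counts)[which.max(unigram_counts)]
}

predict_next("I love")       # "r"    (trigram level)
predict_next("we love")      # "data" (bigram level)
predict_next("hello world")  # "the"  (unigram fallback)
```

The first two calls show why the order matters: the same last word "love" yields different predictions depending on whether a trigram context is available.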
Slide 5: Deployment
- Built interactive Shiny app
- Hosted on shinyapps.io
- Real-time prediction interface
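
A minimal sketch of what the Shiny interface might look like. The layout, the input and output IDs, and the placeholder `predict_next()` are assumptions; the real app would call the backoff model instead.

```r
# Placeholder predictor (hypothetical); a real app would call the
# backoff model here.
predict_next <- function(input) {
  if (!nzchar(trimws(input))) "" else "the"
}

# Only build the app if the shiny package is installed.
if (requireNamespace("shiny", quietly = TRUE)) {
  library(shiny)

  ui <- fluidPage(
    titlePanel("Next Word Prediction"),
    textInput("phrase", "Type a phrase:"),
    textOutput("prediction")
  )

  server <- function(input, output, session) {
    # Re-runs whenever the text box changes, so the prediction
    # updates as the user types.
    output$prediction <- renderText(predict_next(input$phrase))
  }

  app <- shinyApp(ui, server)
  if (interactive()) runApp(app)  # launch only in an interactive session
}
```

In a deployed `app.R` the file would normally end with `shinyApp(ui, server)` as its last expression. Publishing to shinyapps.io is typically done with `rsconnect::deployApp()`, which requires an account configured beforehand.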