This project builds a next-word prediction model using natural language processing techniques.
Goals:
- Uses n-gram modeling (trigrams)
- Predicts the next word based on the previous two words
- Built and deployed as a Shiny web application
2026-06-20
The model is trained on three datasets (Blogs, News, and Twitter), which are processed into trigrams. Only a sample of the data was used, to reduce computation time.
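As a rough illustration of the trigram step, the sketch below turns a few lines of text into a trigram frequency table using base R only. The function name `build_trigrams`, the `freq` column, and the cleaning rule (keep letters and apostrophes) are assumptions made for this example and are not taken from the project code; the three sample sentences are likewise made up.

```r
# Hypothetical helper (not from the project): build a trigram frequency table.
build_trigrams <- function(lines) {
  # Lowercase, then replace everything except letters, apostrophes and spaces
  clean <- gsub("[^a-z' ]", " ", tolower(lines))
  trigrams <- unlist(lapply(strsplit(trimws(clean), "\\s+"), function(words) {
    words <- words[nzchar(words)]
    n <- length(words)
    if (n < 3) return(character(0))  # too short to form a trigram
    # Slide a window of three words across the line
    paste(words[1:(n - 2)], words[2:(n - 1)], words[3:n])
  }))
  if (length(trigrams) == 0) {
    return(data.frame(text = character(0), freq = integer(0),
                      stringsAsFactors = FALSE))
  }
  # Count each distinct trigram, most frequent first
  counts <- sort(table(trigrams), decreasing = TRUE)
  data.frame(text = names(counts), freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Made-up sample lines standing in for the sampled corpus
sample_lines <- c("I love this song", "I love this city", "Thanks for the follow")
tri <- build_trigrams(sample_lines)
```

On a real corpus, one would typically draw a random subset first (for example with `sample()`) and pass only that subset to a function like this, which is where the saving in computation time comes from.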
Steps:
Example:
The prediction function:
- Cleans input text (lowercase, remove punctuation)
- Splits input into words
- Uses pattern matching to find matching trigrams
matches <- trigram_df[grepl(paste0("^", last_two, " "), trigram_df$text), ]
predicted <- most_freq_word(matches)
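The cleaning, splitting, and matching steps can be put together roughly as follows. This is a minimal sketch, not the app's actual code: the body of `most_freq_word` is not shown in the source, so the version here is a guess that assumes `trigram_df` has a `freq` column, and `predict_next` is a hypothetical wrapper name. The pattern includes a trailing space so that "of the" matches only trigrams whose first two words are exactly "of the", and not, say, "of these days".

```r
# Assumed implementation: third word of the most frequent matching trigram.
most_freq_word <- function(matches) {
  if (nrow(matches) == 0) return(NA_character_)  # no trigram matched
  best <- matches$text[which.max(matches$freq)]
  strsplit(best, " ", fixed = TRUE)[[1]][3]
}

# Hypothetical wrapper: clean the input, take the last two words, look them up.
predict_next <- function(input, trigram_df) {
  # Lowercase and drop punctuation (letters and apostrophes are kept)
  clean <- gsub("[^a-z' ]", " ", tolower(input))
  words <- strsplit(trimws(clean), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < 2) return(NA_character_)   # need two words of context
  last_two <- paste(tail(words, 2), collapse = " ")
  # Trailing space anchors the match on whole words
  matches <- trigram_df[grepl(paste0("^", last_two, " "), trigram_df$text), ]
  most_freq_word(matches)
}

# Made-up trigram table for illustration
trigram_df <- data.frame(
  text = c("of the day", "of the year", "of these days"),
  freq = c(5L, 3L, 9L),
  stringsAsFactors = FALSE
)
predicted <- predict_next("It was the best part of the", trigram_df)
```

Returning `NA` when nothing matches keeps the sketch short; a fuller version might fall back to bigram or unigram frequencies instead.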
The Shiny app allows users to:
The application includes:
Key achievements: