This project was developed as the final capstone for the Johns Hopkins University Data Science Specialization on Coursera. The objective was to design and implement a word prediction application using R and Shiny, applying natural language processing (NLP) techniques to build a functional and interactive tool.
As part of the project, we were provided with a large text corpus from HC Corpora containing blog posts, tweets, and news articles. Although the corpus includes text in several languages, we were required to work exclusively with the English-language datasets.
The final product is a Shiny web application in which users enter text and receive up to four suggestions for the next word. It is intended to speed up typing and to show statistical language modeling working in an interactive tool.
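
To make the idea of next-word prediction concrete, here is a minimal sketch in base R. It is not the model used in the app, which this section does not describe. It assumes the simplest possible approach: count word pairs (bigrams) and suggest the words that most often follow the last word typed. The toy corpus and the function names `tokenize`, `build_bigrams`, and `predict_next` are made up for illustration; only the cap of four suggestions comes from the description above.

```r
# Toy corpus standing in for the English blog, Twitter, and news text (made up).
corpus <- c(
  "i love new york",
  "i love new music",
  "i love you",
  "new york is big"
)

# Lowercase a string and split it into word tokens (hypothetical helper).
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words[nzchar(words)]
}

# Count every (previous word, next word) pair in the corpus.
# Assumes at least one line contains two or more words.
build_bigrams <- function(lines) {
  pairs <- lapply(lines, function(line) {
    w <- tokenize(line)
    if (length(w) < 2) return(NULL)
    data.frame(prev = head(w, -1), nxt = tail(w, -1), stringsAsFactors = FALSE)
  })
  pairs <- do.call(rbind, pairs)
  counts <- aggregate(n ~ prev + nxt, data = transform(pairs, n = 1L), FUN = sum)
  # Most frequent continuation first; ties broken alphabetically.
  counts[order(counts$prev, -counts$n, counts$nxt), ]
}

# Return up to k candidate next words for the last word of the input.
predict_next <- function(text, bigrams, k = 4) {
  w <- tokenize(text)
  if (length(w) == 0) return(character(0))
  hits <- bigrams[bigrams$prev == w[length(w)], ]
  as.character(head(hits$nxt, k))
}

bigrams <- build_bigrams(corpus)
print(predict_next("I love", bigrams))
```

A real predictor would likely use longer n-grams with some fallback for unseen contexts, and in a Shiny app a function like `predict_next` could be called from the server logic each time the text input changes.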