Data Science Capstone Assignment

06 December 2025

This project focuses on developing a Next Word Prediction App using Natural Language Processing (NLP) techniques in R.

The application is built using Shiny and the report is presented through R Markdown using a slide-based layout.

Application Overview

Click the link below for the application.

Click the link below for the compiled project files in the GitHub repository.

The data used for this project comes from the HC Corpora dataset, which includes text from:

The original dataset is very large (over 500MB). To optimize performance:

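library(quanteda)  # text-analysis package used to hold and inspect the corpus
# corpus_data is expected to be a quanteda corpus object created beforehand
summary(corpus_data)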
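
The call above assumes that corpus_data already exists. As a rough sketch of how such an object might be produced from a reduced dataset, the example below draws a random sample of lines and wraps it in a quanteda corpus. The report does not list its exact reduction steps, so random line sampling is an assumption here, and the names all_lines, sample_fraction, n_keep, and sampled are hypothetical, as are the sample sentences and the 0.6 fraction.

```r
library(quanteda)

set.seed(1234)  # make the random sample reproducible

# Hypothetical stand-in for lines read from the raw text files
# (for example with readLines()); the real file paths are not given here.
all_lines <- c(
  "The quick brown fox jumps over the lazy dog.",
  "Next word prediction relies on counts of word sequences.",
  "Shiny apps run R code behind a web interface.",
  "A smaller sample keeps the application responsive.",
  "Tokenization splits text into individual words."
)

# Assumed sampling fraction; the value used in the project is not stated.
sample_fraction <- 0.6

# Keep a random subset of lines to reduce memory use and processing time.
n_keep <- ceiling(length(all_lines) * sample_fraction)
sampled <- sample(all_lines, size = n_keep)

# Build a quanteda corpus with one document per sampled line, then inspect it.
corpus_data <- corpus(sampled)
summary(corpus_data)
```

With real data, all_lines would hold the full text read from disk. The fraction can then be tuned to trade prediction coverage against app responsiveness.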