Data Science Capstone Project: Predict Your Next Word!

Luo Zhouyang
2021.3.14

Introduction

The objective of the Coursera Data Science Capstone Project is to build an application to predict the next word from a short phrase.

The main objectives are:

  1. Analyze a large text dataset
  2. Build an algorithm to predict words
  3. Create a predictive web application (Shiny app)
  4. Prepare a five-page presentation

Algorithm

  • Dataset: blog, news, and Twitter sources.
  • Processing: sampling, cleaning, local storage.
  • Modeling: n-grams model.
  • Prediction: search the n-gram table and return the most frequent next word.
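
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the app: the function names (sample_lines, tokenize, build_ngram_table, predict_next_word), the sampling fraction, the cleaning rule, the maximum n-gram order of 3, and the fallback from a longer context to a shorter one are all hypothetical choices made for the example.

```python
import random
import re
from collections import Counter, defaultdict


def sample_lines(lines, fraction=0.1, seed=42):
    """Sampling: keep a random fraction of the lines (at least one line)."""
    k = max(1, int(len(lines) * fraction))
    return random.Random(seed).sample(lines, k)


def tokenize(text):
    """Cleaning (assumed rule): lowercase, keep only letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_ngram_table(lines, max_n=3):
    """Modeling: map each context of 1..max_n-1 words to counts of the next word."""
    table = defaultdict(Counter)
    for line in lines:
        tokens = tokenize(line)
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                table[context][tokens[i + n - 1]] += 1
    return table


def predict_next_word(table, phrase, max_n=3):
    """Prediction: look up the longest known context, return its most frequent next word.

    Falling back to a shorter context is an assumption of this sketch.
    Returns None when no context from the phrase is in the table.
    """
    tokens = tokenize(phrase)
    for k in range(max_n - 1, 0, -1):
        if len(tokens) < k:
            continue
        context = tuple(tokens[-k:])
        if context in table:
            return table[context].most_common(1)[0][0]
    return None


corpus = [
    "Thank you for watching",
    "Thank you for reading",
    "Thank you for watching the talk",
    "See you later",
]
table = build_ngram_table(corpus)
print(predict_next_word(table, "thank you for"))
```

Storing the counts in a lookup table keyed by context makes prediction a single dictionary lookup per context length, which matches the "search in the n-gram table" step; the cost is memory, which grows with the size of the sampled corpus.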

Usage

  • Allow some time for the application to launch (the database is fairly large).
  • After the application is launched, enter a phrase on the left.
  • The predicted word is shown immediately.

Thank you

  • Thank you for watching