Coursera Data Science Capstone Project

Utsav Prakash Srivastava
18th June 2020

Objective

The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others. For this project you must submit:

  • A Shiny app that takes as input a phrase (multiple words) in a text box input and outputs a prediction of the next word.
  • A slide deck consisting of no more than 5 slides created with R Studio Presenter (https://support.rstudio.com/hc/en-us/articles/200486468-Authoring-R-Presentations) pitching your algorithm and app as if you were presenting to your boss or an investor.

Review Criteria

  • Does the link lead to a Shiny app with a text input box that is running on shinyapps.io?
  • Does the app load to the point where it can accept input?
  • When you type a phrase in the input box, do you get a prediction of a single word after pressing submit and/or after a suitable delay for the model to compute the answer?
  • Enter five phrases drawn from Twitter or news articles in English, leaving out the last word. Did the app give a prediction for every one?

Methodology used

The app relies heavily on “tidy data” principles, applied through the following steps:

  • Input: raw text files
  • Clean the training data and split it into 2-, 3-, and 4-grams
  • Sort the n-grams by frequency and save them as repos
  • N-gram function: uses a “back-off” prediction model, trying the longest n-gram first and falling back to shorter ones when no match is found
  • Output: next-word prediction
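
The steps above can be sketched in a few lines. This is only an illustration, written in Python rather than the R code behind the app, and it assumes simple whitespace tokenization and a plain "longest match wins" back-off with no smoothing. The names build_ngram_tables and predict_next_word, and the tiny corpus, are hypothetical and do not come from the project.

```python
from collections import Counter, defaultdict


def build_ngram_tables(sentences, orders=(2, 3, 4)):
    """Count n-grams and map each (n-1)-word context to its candidate
    next words, sorted from most to least frequent."""
    counts = {n: defaultdict(Counter) for n in orders}
    for sentence in sentences:
        tokens = sentence.lower().split()  # assumption: whitespace tokenization
        for n in orders:
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                next_word = tokens[i + n - 1]
                counts[n][context][next_word] += 1
    # Keep only the frequency-sorted candidate lists for lookup.
    return {
        n: {ctx: [w for w, _ in c.most_common()] for ctx, c in table.items()}
        for n, table in counts.items()
    }


def predict_next_word(phrase, tables, default=None):
    """Back off from the longest n-gram to the shortest until a context matches."""
    tokens = phrase.lower().split()
    for n in sorted(tables, reverse=True):
        k = n - 1  # number of context words needed for this order
        if len(tokens) < k:
            continue
        candidates = tables[n].get(tuple(tokens[-k:]))
        if candidates:
            return candidates[0]
    return default  # no n-gram matched at any order


# Hypothetical toy corpus, for illustration only.
corpus = [
    "thanks for the follow",
    "thanks for the follow",
    "thanks for the support",
    "happy new year",
]
tables = build_ngram_tables(corpus)
print(predict_next_word("thanks for the", tables))    # matched by the 4-gram table
print(predict_next_word("a very happy new", tables))  # backs off to the 3-gram table
print(predict_next_word("brand new", tables))         # backs off to the 2-gram table
```

Sorting the candidates once, when the tables are built, keeps each lookup cheap, which matters for an interactive app that has to answer after a short delay.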

Using the Shiny App

  • Step 1: Open the App.
  • Step 2: Input a phrase.
  • Step 3: Press “Submit”.