Slide Deck: Natural Language Processing (NLP) Project

Dat Nguyen
23-Feb-2018

Summary Description

The Shiny app uses n-grams to predict the next word of a statement. The user types a statement into the page defined in ui.R and then presses the predict next word button. Pressing the button sends the statement (a string) to server.R, which determines the most likely next word using n-grams. The n-gram information comes from the Corpus and is used as part of the processing algorithm.

Algorithm for Processing

  • The application accepts an input statement from the user.
  • The application reads in a table of n-grams built from the Corpus. This dataset lets it establish the frequencies and likelihoods of what the next word might be for the input statement.
  • The application tidies the data by removing unnecessary characters and blank spaces.
  • The application then analyzes the data and groups it into categories, using combinatorics to determine how frequently each word is likely to follow.
  • The application uses the resulting n-grams to predict the most likely next word in the statement.
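
The steps above can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the app itself is a Shiny (R) app, so this is not the app's actual code. The function names (tidy, build_ngram_table, predict_next_word), the toy corpus, and the choice to fall back from longer to shorter contexts when a context has not been seen are all assumptions made for illustration; the slides do not specify them.

```python
import re
from collections import Counter, defaultdict


def tidy(text):
    """Lowercase the text, replace anything that is not a letter, apostrophe
    or space with a space, and split on whitespace."""
    cleaned = re.sub(r"[^a-z' ]+", " ", text.lower())
    return cleaned.split()


def build_ngram_table(sentences, max_n=3):
    """Map each context (a tuple of 0 to max_n - 1 words) to a Counter of
    the words that followed it in the sentences."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = tidy(sentence)
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                *context, next_word = words[i:i + n]
                table[tuple(context)][next_word] += 1
    return table


def predict_next_word(table, statement, max_n=3):
    """Return the most frequent follower of the longest known context,
    trying shorter contexts when a longer one was never seen
    (an assumed fallback strategy). Returns None for an empty table."""
    words = tidy(statement)
    for k in range(max_n - 1, -1, -1):
        if k > len(words):
            continue  # statement is too short for a context of k words
        context = tuple(words[len(words) - k:])
        followers = table.get(context)
        if followers:
            return followers.most_common(1)[0][0]
    return None


# Toy stand-in for the Corpus n-gram table (hypothetical data).
corpus = [
    "I want to go home",
    "I want to go to the store",
    "You want to eat lunch",
]
table = build_ngram_table(corpus)
print(predict_next_word(table, "I want to"))
print(predict_next_word(table, "Please eat"))
```

In this toy corpus "want to" is followed by "go" twice and "eat" once, so the first call prints "go". The second call has no match for the two-word context "please eat", so it falls back to the one-word context "eat" and prints "lunch".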

My Initial Exploratory Analysis

References