10/16/2020

App Description

This app was created as the capstone project for the Johns Hopkins Data Science Specialization on Coursera. The brief was to build a natural language processing application that takes a phrase typed into a text box and predicts the next word.

The data used for this app was provided by SwiftKey, and I would like to thank them for their sponsorship.

Here is a link to my app: https://mjoyce2017.shinyapps.io/Capstone/

App Instructions

The app I created is fairly simple and straightforward.

There is a single text input box. Type any number of words into it, and a prediction will appear below.

If the predictor needs more words to make a valid prediction, it will say so.

Note: the app takes a few seconds to load, so please be patient.

App Functionality

The first tab is a brief introduction to the project. The second tab contains the application.

The app functions by taking input from the text box and returning the most likely next word.

Below that, it also lists the 10 most likely next words.
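
As a rough illustration of that output, here is a minimal Python sketch. It is not the app's actual code: the function name `summarize`, the returned dictionary layout, and the wording of the "more words" message are all assumptions made for this example. It takes a table of candidate next words with their counts and returns the single best word plus the top 10.

```python
from collections import Counter

def summarize(candidates, top_k=10):
    """Turn candidate counts into the two outputs described above.

    `candidates` is a Counter mapping each possible next word to how
    often it was seen. The message text is hypothetical.
    """
    if not candidates:
        # Nothing matched, so ask for more input instead of guessing.
        return {"best": None, "top": [], "message": "Please enter more words."}
    ranked = [word for word, _ in candidates.most_common(top_k)]
    return {"best": ranked[0], "top": ranked, "message": ""}
```

For example, `summarize(Counter({"the": 5, "a": 3}))` gives `"the"` as the best word and `["the", "a"]` as the ranked list.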

Algorithm

The algorithm I used to make this prediction possible is as follows:

  • I built sets of bigrams, trigrams, and quadgrams (2-, 3-, and 4-word sequences) from a cleaned sample of the data.
  • I then counted how often each n-gram occurs within its set.
  • The n-gram frequency tables were then used as a matching mechanism for the incoming input.
  • The input was cleaned in the same manner as the original data and then passed through the matching mechanism.
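
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the app's implementation: the cleaning rule (lowercase, letters and apostrophes only), the function names `clean`, `build_tables`, and `predict`, and the choice to try the longest matching context first and fall back to shorter ones are all assumptions, since the original does not spell out these details.

```python
import re
from collections import Counter, defaultdict

def clean(text):
    """Lowercase and keep runs of letters/apostrophes (assumed cleaning rule)."""
    return re.findall(r"[a-z']+", text.lower())

def build_tables(lines, max_n=4):
    """Count n-grams for n = 2..max_n.

    Each n-gram is stored as: (first n-1 words) -> Counter of the last word.
    """
    tables = defaultdict(Counter)
    for line in lines:
        words = clean(line)
        for n in range(2, max_n + 1):
            for i in range(len(words) - n + 1):
                prefix = tuple(words[i:i + n - 1])
                tables[prefix][words[i + n - 1]] += 1
    return tables

def predict(tables, phrase, top_k=10, max_n=4):
    """Match the end of the cleaned input against stored n-gram prefixes.

    Tries the longest context first (quadgram, then trigram, then bigram),
    which is an assumed matching order. Returns up to top_k words, most
    frequent first, or an empty list if nothing matches.
    """
    words = clean(phrase)
    for size in range(min(max_n - 1, len(words)), 0, -1):
        candidates = tables.get(tuple(words[-size:]))
        if candidates:
            return [word for word, _ in candidates.most_common(top_k)]
    return []

# Tiny made-up corpus, for illustration only.
corpus = [
    "I love data science",
    "I love data analysis",
    "I love data science a lot",
]
tables = build_tables(corpus)
print(predict(tables, "I love data"))
print(predict(tables, "We LOVE"))
```

Storing each n-gram as a prefix paired with a count of following words means the same cleaning function serves both the training text and the user's input, which keeps the two consistent as the last bullet requires.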