DSCslides

DMbewe
29 Nov 2021

Project description

The project involves natural language processing (NLP). The task is to take a phrase entered by the user and output a predicted next word.

Description of app

The app provides a simple user interface to the next-word prediction model.

App model

The app follows the principles of “tidy data”.

Input: raw text files for model training
Clean training data; separate into 2-word, 3-word, and 4-word n-grams; save as tibbles
Sort n-gram tibbles by frequency; save as repos
N-grams function: uses a “back-off” type prediction model
    user supplies an input phrase
    model uses last 3, 2, or 1 words to predict the best 4th, 3rd, or 2nd match in the repos
Output: next word prediction
Benefits: easy to read code; uses “pipes”; fast processing of training data
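
The back-off steps listed above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the app's own code, which works with tibbles and pipes. The function names (tokenize, build_repos, predict_next), the cleaning rule, and the toy corpus are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Hypothetical cleaning step: lowercase, keep letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_repos(lines, orders=(2, 3, 4)):
    """Count n-grams of each order.

    Returns {n: {prefix_words: Counter(next_word)}}, where prefix_words
    is a tuple holding the first n-1 words of each n-gram.
    """
    repos = {n: defaultdict(Counter) for n in orders}
    for line in lines:
        words = tokenize(line)
        for n in orders:
            for i in range(len(words) - n + 1):
                prefix = tuple(words[i:i + n - 1])
                repos[n][prefix][words[i + n - 1]] += 1
    return repos


def predict_next(phrase, repos):
    """Try the longest context first, then back off to shorter ones."""
    words = tokenize(phrase)
    for n in sorted(repos, reverse=True):  # 4-grams, then 3-grams, then 2-grams
        k = n - 1                          # number of context words needed
        if len(words) < k:
            continue
        candidates = repos[n].get(tuple(words[-k:]))
        if candidates:
            # Most frequent continuation for this context
            return candidates.most_common(1)[0][0]
    return None                            # no match at any order


# Toy training data (assumed, for illustration only)
corpus = [
    "thanks for the follow",
    "thanks for the help",
    "thanks for the help today",
    "see you at the beach",
]
repos = build_repos(corpus)
print(predict_next("thanks for the", repos))
print(predict_next("go to the", repos))
```

In this sketch the first call finds a match using the last three words, while the second has no 4-gram or 3-gram match and falls back to the last word alone. Sorting the tables by frequency ahead of time, as the app does, serves the same purpose as the most_common lookup here.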

Outcome

Key Features:

Text box for user input
Predicted next word outputs dynamically below user input
Tabs with plots of most frequent n-grams in the dataset
Side panel with user instructions