Sandra Ezidiegwu
This presentation describes the functionality and usefulness of an application built for next-word prediction.
The application was built as part of the capstone project for the Coursera Data Science Specialization, taught by professors at Johns Hopkins University in cooperation with SwiftKey.
Build an algorithm that predicts the next word after a word or phrase entered by the application user. In building the prediction algorithm, the following steps were applied:
Create a Shiny app in RStudio to demonstrate the prediction algorithm
Data Sampling and Cleaning: The data was randomly sampled, then cleaned by converting it to lowercase and applying regular expressions to remove punctuation, special characters, etc.
Corpus and N-Grams: The cleaned sample was converted to a vector corpus and then tokenized with the tau package in R into unigrams, bigrams, trigrams, and quadgrams.
Frequency Dictionary and Prediction: A frequency matrix was created for each n-gram and transferred into frequency dictionaries. The resulting data frames were used to predict the next word.
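The steps above can be illustrated with a minimal sketch. It is written in Python rather than R and does not use the tau package, so it is not the application's actual code. All function names (`sample_lines`, `clean`, `build_tables`, `predict`) are hypothetical, the exact cleaning regex is an assumption, and so is the lookup order (longest context first, falling back to shorter ones), since the slides do not say how the frequency dictionaries are combined.

```python
import random
import re
from collections import Counter, defaultdict


def sample_lines(lines, fraction, seed=0):
    """Randomly sample a fraction of the input lines (seeded for repeatability)."""
    rng = random.Random(seed)
    k = min(len(lines), max(1, int(len(lines) * fraction)))
    return rng.sample(lines, k)


def clean(text):
    """Lowercase, then replace anything that is not a letter, apostrophe or space."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)    # assumed cleaning rule
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated spaces


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_tables(lines, max_n=4):
    """Count unigrams, and for n = 2..max_n map each (n-1)-word context
    to a Counter of the words that follow it."""
    unigrams = Counter()
    tables = {n: defaultdict(Counter) for n in range(2, max_n + 1)}
    for line in lines:
        tokens = clean(line).split()
        unigrams.update(tokens)
        for n in range(2, max_n + 1):
            for gram in ngrams(tokens, n):
                tables[n][gram[:-1]][gram[-1]] += 1
    return unigrams, tables


def predict(phrase, unigrams, tables, max_n=4):
    """Try the longest available context first, then shorter ones (assumed order).
    Falls back to the most frequent single word if no context matches."""
    tokens = clean(phrase).split()
    for n in range(max_n, 1, -1):
        context = tuple(tokens[-(n - 1):])
        if len(context) == n - 1 and context in tables[n]:
            return tables[n][context].most_common(1)[0][0]
    return unigrams.most_common(1)[0][0] if unigrams else None
```

For example, after building the tables from the three lines "I love data science", "I love data analysis" and "I love data science a lot", `predict("I love data", unigrams, tables)` returns `"science"`, because "science" follows that three-word context twice and "analysis" only once.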
To use the application, you simply enter a word or phrase in the text box and the application will then try to predict the next word. This result will be shown in blue.