Word Prediction App

Manoj Kundrapakam
10/20/2021

Data and Goal

The data used in this prediction app consists of Twitter and news text provided by Coursera.

The goal of this exercise is to build a product that showcases the prediction algorithm and provides an interface that others can access.

The product is a Shiny app that takes a phrase (one or more words) in a text box and outputs a prediction of the next word. In this app the user can enter one, two, or three words to get the next word.
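The slides do not show how the next word is looked up from the input phrase. The following is a minimal sketch, assuming a simple back-off over n-gram frequency tables: try the last three words, then the last two, then the last one. The function `predict_next` and the toy `ngram_table` are hypothetical and exist only for illustration; they are not taken from the app's source.

```r
# Toy frequency table (hypothetical): each row maps a context of one to
# three words to a candidate next word and the count of that n-gram.
ngram_table <- data.frame(
  context = c("thanks for the", "for the", "for the", "the", "the"),
  nxt     = c("follow", "follow", "first", "first", "same"),
  freq    = c(5, 7, 9, 20, 15),
  stringsAsFactors = FALSE
)

# Back-off lookup: start with the longest available context (up to three
# words) and shorten it until some n-gram matches.
predict_next <- function(phrase, table, max_context = 3) {
  words <- strsplit(tolower(trimws(phrase)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  n <- min(length(words), max_context)
  while (n >= 1) {
    context <- paste(tail(words, n), collapse = " ")
    hits <- table[table$context == context, ]
    if (nrow(hits) > 0) {
      # Return the most frequent continuation for this context
      return(hits$nxt[which.max(hits$freq)])
    }
    n <- n - 1
  }
  NA_character_  # no n-gram matched the input
}

predict_next("Thanks for the", ngram_table)   # "follow" (three-word match)
predict_next("waiting for the", ngram_table)  # "first"  (backs off to two words)
predict_next("unseen", ngram_table)           # NA       (nothing matched)
```

Returning `NA` when nothing matches mirrors the limitation of a model trained on sampled data: some inputs have no prediction.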

Code

Below is sample code for cleaning the text and creating the n-gram tokenizers.

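library(tm)     # corpus handling and text transformations
library(RWeka)  # NGramTokenizer and Weka_control

# Build a corpus from the sampled text held in df
corp <- VCorpus(VectorSource(df))

# Replace separators with spaces before punctuation is stripped,
# otherwise these characters are already gone when the pattern runs
changetospace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
corp <- tm_map(corp, changetospace, "/|@|\\|")

# Base functions such as tolower must be wrapped in content_transformer
corp <- tm_map(corp, content_transformer(tolower))
corp <- tm_map(corp, removePunctuation)
corp <- tm_map(corp, removeNumbers)
corp <- tm_map(corp, stripWhitespace)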

# Use tokenizers to break the text into n-grams (1 to 4 words) that the model can count
uniGramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 1))
biGramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
triGramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 3, max = 3))
quadGramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 4, max = 4))
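To illustrate what these tokenizers produce without requiring RWeka (which depends on Java), here is a base-R sketch that builds n-grams and counts them. The helper `make_ngrams` and the two sample lines are hypothetical and not part of tm or RWeka; the helper splits on whitespace only, so its output may differ from `NGramTokenizer` in edge cases.

```r
# Hypothetical helper: build all n-grams of size n from one line of text.
make_ngrams <- function(text, n) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Stand-in for cleaned text lines (made up for this example)
lines <- c("thanks for the follow", "thanks for the help")

# Count how often each bigram occurs across all lines, most frequent first
bigrams <- unlist(lapply(lines, make_ngrams, n = 2))
bigram_freq <- sort(table(bigrams), decreasing = TRUE)
bigram_freq
```

In a tm pipeline, counts like these are commonly obtained by passing a tokenizer to the matrix constructor, for example `TermDocumentMatrix(corp, control = list(tokenize = biGramTokenizer))`, and summing the rows.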

Prediction App and Source code

Link to the source code on GitHub:

source code

Link to the prediction app:

Prediction App

Note

The data used here is a sample of the provided source, because my machine cannot process the full data set, so not every word can be predicted. I am working on improving the model. Thanks for your patience.
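The sampling step is not shown in these slides. A minimal sketch of one way to do it, assuming a fixed random fraction of lines is kept: the helper `sample_lines`, the 5% fraction, and the seed are all hypothetical choices made for illustration, not values taken from the app.

```r
# Hypothetical helper: keep a random fraction of the input lines.
sample_lines <- function(lines, fraction, seed = 1234) {
  set.seed(seed)  # fixed seed so the sample is reproducible
  size <- max(1, round(length(lines) * fraction))
  keep <- sample(length(lines), size = size)
  lines[sort(keep)]  # preserve the original line order
}

all_lines <- paste("line", 1:1000)      # stand-in for the real text
small <- sample_lines(all_lines, 0.05)  # keep about 5% of the lines
length(small)
```

A larger fraction generally covers more word combinations at the cost of memory and processing time.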