Predict Next Word

A Predictive Model in R, Delivered as a Shiny Application

Sougata Biswas
Data Scientist

Predict Next Word

The end goal of the Data Science Specialization Capstone Project is to produce a predictive text algorithm in R that, based on a user’s text input, suggests the word most likely to be entered next.

How does it work?

Given a partial or incomplete sentence from the user, the algorithm predicts the immediate next word, choosing the one with the highest probability of occurring.

The steps are:
1. The user enters an incomplete or partial sentence.
2. The algorithm uses its trained Decision Tree to work out the next word.
3. The algorithm returns its output, i.e. the predicted "word".
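
As a rough illustration of step 1, the input can be normalised and reduced to its last few words before any lookup. This is only a sketch: the function names `clean_input` and `last_words` are hypothetical, and the cleaning rules are assumed to mirror those applied to the training data (lower case, no punctuation, numbers, extra white space, or non-printing characters).

```r
# Hypothetical helper: normalise raw user input.
# Assumption: the same cleaning rules were applied to the n-gram training data.
clean_input <- function(text) {
  text <- tolower(text)                      # convert to lower case
  text <- gsub("[^[:print:]]", " ", text)    # replace non-printing characters
  text <- gsub("[[:punct:]]", "", text)      # remove punctuation
  text <- gsub("[[:digit:]]", "", text)      # remove numbers
  text <- gsub("\\s+", " ", text)            # collapse repeated white space
  trimws(text)
}

# Hypothetical helper: keep only the last n words, which is all a
# 4-gram model needs as context (n = 3).
last_words <- function(text, n = 3) {
  words <- strsplit(clean_input(text), " ", fixed = TRUE)[[1]]
  words <- words[nzchar(words)]              # drop empty tokens
  tail(words, n)
}

last_words("I'd like 2 cups of")
```

With a 4-gram model, at most three context words are needed, so longer inputs are simply truncated to their tail.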

The Back Off Application

At the back end, the application loads data frames of 1-grams, 2-grams, 3-grams, and 4-grams. These data have already been cleaned and sorted by n-gram frequency in descending order. The text was converted to lower case, and punctuation, numbers, extra white space, and non-printing characters were removed. The algorithm then uses a Markov chain model for prediction.
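
The back-off idea can be sketched as follows: try the longest available context first (three words against the 4-grams), and fall back to shorter contexts when there is no match, ending with the most frequent 1-gram. This is a minimal sketch, not the application's actual code. The column names (`prefix`, `word`, `freq`), the function `predict_next`, and all counts in the toy tables are assumptions made up for illustration.

```r
# Toy n-gram tables with made-up counts (assumed layout: prefix, word, freq).
unigrams <- data.frame(word = c("the", "to", "and"),
                       freq = c(50, 40, 30), stringsAsFactors = FALSE)
bigrams <- data.frame(prefix = c("of", "of", "cups"),
                      word = c("the", "tea", "of"),
                      freq = c(9, 4, 3), stringsAsFactors = FALSE)
trigrams <- data.frame(prefix = c("cups of", "one of"),
                       word = c("tea", "the"),
                       freq = c(5, 4), stringsAsFactors = FALSE)
fourgrams <- data.frame(prefix = "two cups of", word = "coffee",
                        freq = 2, stringsAsFactors = FALSE)

# Hypothetical back-off lookup. `context` is a character vector of
# already-cleaned words; only its last three words are ever used, which
# is the Markov assumption behind the n-gram model.
predict_next <- function(context) {
  tables <- list(fourgrams, trigrams, bigrams)   # prefix lengths 3, 2, 1
  for (i in seq_along(tables)) {
    k <- 4 - i                                   # words of context to match
    if (length(context) < k) next                # not enough words: back off
    key <- paste(tail(context, k), collapse = " ")
    hits <- tables[[i]][tables[[i]]$prefix == key, ]
    if (nrow(hits) > 0) {
      return(hits$word[which.max(hits$freq)])    # most frequent continuation
    }
  }
  unigrams$word[which.max(unigrams$freq)]        # last resort: top 1-gram
}

predict_next(c("two", "cups", "of"))    # matched in the 4-gram table
predict_next(c("like", "cups", "of"))   # backs off to the 3-gram table
predict_next(c("unseen"))               # falls through to the 1-grams
```

Because the fallback always ends at the 1-gram table, the function returns a word for any input, which matches the behaviour of a model that predicts every time even when its accuracy is limited.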

Prediction Accuracy

The predictive model works: it returns a next-word prediction every time. However, its prediction accuracy is not yet good. I am working on improving it and may be able to present a more accurate model in the near future.