09/04/2022

Can I guess your word?

We live in a fast-paced world

Any option to save time and effort is appreciated by all

This product helps to predict the next word as a person types

Initial Preparation

The model was built from three data sets: blogs, Twitter and news

The lines were first cleaned so that only the required words remained

A random 10% sample of the lines from these three files was used to prepare the model

3-gram, 2-gram and 1-gram tables were constructed from this sample
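The preparation steps above can be sketched in Python as follows. This is a minimal illustration, not the application's actual code: the function names (clean_line, sample_lines, count_ngrams), the cleaning rule (lowercase, letters and inner apostrophes only) and the fixed random seed are all assumptions made for the example.

```python
import random
import re
from collections import Counter

def clean_line(line):
    """Lowercase a line and keep only alphabetic tokens (assumed cleaning rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", line.lower())

def sample_lines(lines, fraction=0.10, seed=42):
    """Return a random sample holding roughly `fraction` of the lines."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(len(lines), max(1, round(len(lines) * fraction)))
    return rng.sample(lines, k)

def count_ngrams(token_lists, n):
    """Count every run of n consecutive tokens within each line."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Toy corpus standing in for the blogs, Twitter and news files.
lines = ["I love New York", "I love you so much", "New York is big"]
tokens = [clean_line(line) for line in lines]

unigrams = count_ngrams(tokens, 1)
bigrams = count_ngrams(tokens, 2)
trigrams = count_ngrams(tokens, 3)
```

On the real data, sample_lines would be applied to each file before cleaning and counting; the toy corpus here is too small to sample.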

Methodology

The model uses sentence parsing and word matching techniques

It starts with the 3-gram table and works down to the 1-gram table

It uses simple probability to identify the best match

If no match can be identified, the model displays the most frequent word

The smallest input required is one word
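The lookup order described above can be sketched in Python as follows. This is a hedged illustration: the function name predict_next and the toy counts are hypothetical, and "simple probability" is assumed here to mean relative frequency within a matched context. Under that assumption, the candidate with the highest count is also the one with the highest probability, so the sketch compares counts directly.

```python
from collections import Counter

def predict_next(text, trigrams, bigrams, unigrams):
    """Predict the next word, falling back from 3-grams to 2-grams to 1-grams."""
    words = text.lower().split()

    # 3-gram step: match on the last two words typed.
    if len(words) >= 2:
        context = tuple(words[-2:])
        candidates = {g[2]: c for g, c in trigrams.items() if g[:2] == context}
        if candidates:
            return max(candidates, key=candidates.get)

    # 2-gram step: match on the last word typed.
    if words:
        candidates = {g[1]: c for g, c in bigrams.items() if g[0] == words[-1]}
        if candidates:
            return max(candidates, key=candidates.get)

    # No match: return the most frequent single word (assumes unigrams is non-empty).
    return unigrams.most_common(1)[0][0][0]

# Toy counts, invented for illustration only.
trigrams = Counter({("i", "love", "you"): 3, ("i", "love", "new"): 1})
bigrams = Counter({("love", "you"): 3, ("new", "york"): 2, ("love", "new"): 1})
unigrams = Counter({("the",): 5, ("love",): 4})

first = predict_next("I love", trigrams, bigrams, unigrams)      # 3-gram match
second = predict_next("brand new", trigrams, bigrams, unigrams)  # falls to 2-gram
third = predict_next("xyz", trigrams, bigrams, unigrams)         # falls to 1-gram
```

Scanning every n-gram on each call keeps the sketch short; indexing the tables by context would make lookups faster.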

Limitations

The model was scaled down as much as possible to fit within Shiny

Accuracy was severely compromised by this approach

Even this scaled-down model takes a very long time to predict the first word, because loading and setting up the objects takes time

More work is needed to optimize accuracy and performance

Sometimes Shiny shows the error message ‘disconnected from server’. Reloading the page solves the issue



The Application can be accessed here