Tan Woei Ming
11 October 2016
On modern hand-held devices, text input needs faster and more efficient methods.
To achieve this, we demonstrate an application that predicts the next word to speed up typing and that runs within a minimal memory footprint.
Our objectives are:
To predict the frequently used words in a sentence.
To optimize the prediction algorithm using n-grams, minimizing processing time and memory use.
First, we use the tm package to clean the data and build the corpus.
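The exact cleaning steps are not listed here, so the following is only a minimal sketch of what such a cleaning pass typically does. It is written in Python for illustration rather than with the tm package, and the function names `clean_text` and `build_corpus`, as well as the choice to keep apostrophes, are assumptions.

```python
import re

def clean_text(line):
    """Lowercase a line, drop digits and punctuation, collapse whitespace.

    Assumption: apostrophes are kept so that words like "don't" survive.
    """
    line = line.lower()
    # Replace every run of characters other than a-z, apostrophe, or space.
    line = re.sub(r"[^a-z' ]+", " ", line)
    # Collapse repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", line).strip()

def build_corpus(lines):
    """Return one token list per line that is non-empty after cleaning."""
    cleaned = (clean_text(line) for line in lines)
    return [text.split() for text in cleaned if text]
```

For example, `clean_text("Hello, World!! 123")` yields `"hello world"`, and lines that contain only digits or whitespace are dropped from the corpus.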
Then, we create unigram and bigram tables with their occurrence frequencies; these become the word prediction library used later.
From this library, we can tell which words are frequently used (infrequent ones are filtered out) and look up each unigram and bigram with its corresponding ‘predicted word’.
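As a rough illustration of the library described above, the sketch below counts unigrams and bigrams and drops the infrequent ones. It is a Python sketch under stated assumptions, not the application's own code: the name `build_ngram_tables` and the `min_count` threshold are hypothetical, and the actual cut-off used by the application is not given.

```python
from collections import Counter

def build_ngram_tables(sentences, min_count=2):
    """Count unigrams and bigrams, keeping only entries seen >= min_count times.

    `sentences` is a list of token lists. `min_count` is an assumed threshold.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        # Pair each token with its successor to form bigrams.
        bigrams.update(zip(tokens, tokens[1:]))
    # Filtering rare entries keeps the lookup tables small in memory.
    frequent_unigrams = {w: c for w, c in unigrams.items() if c >= min_count}
    frequent_bigrams = {b: c for b, c in bigrams.items() if c >= min_count}
    return frequent_unigrams, frequent_bigrams
```

Dropping rare n-grams is what trades a little coverage for a much smaller table, which matches the stated goal of running under a low memory budget.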
The application UI lets the user enter a sentence and shows the ‘predicted word’ and the ‘suggested words’.
It is fast and efficient, as it only predicts commonly used words.
The ‘predicted word’ is the word the application thinks you want to type next; the ‘suggested words’ give four options, including the predicted one.
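One way such a lookup could work is sketched below in Python. This is an assumption-laden illustration, not the application's confirmed logic: the function name `suggest_words`, the fallback from bigram followers to the most frequent unigrams, and the alphabetical tie-breaking are all hypothetical choices.

```python
def suggest_words(sentence, unigrams, bigrams, k=4):
    """Return (predicted_word, suggestions) for the next word of `sentence`.

    `unigrams` maps word -> count; `bigrams` maps (word1, word2) -> count.
    Assumption: if fewer than k bigram followers exist, fill up with the
    most frequent unigrams.
    """
    tokens = sentence.lower().split()
    last = tokens[-1] if tokens else None
    # Words observed after the last typed word, with their counts.
    followers = {w2: c for (w1, w2), c in bigrams.items() if w1 == last}
    ranked = sorted(followers, key=lambda w: (-followers[w], w))
    # Back off to overall word frequency to fill the remaining slots.
    for word in sorted(unigrams, key=lambda w: (-unigrams[w], w)):
        if len(ranked) >= k:
            break
        if word not in ranked:
            ranked.append(word)
    suggestions = ranked[:k]
    predicted = suggestions[0] if suggestions else None
    return predicted, suggestions
```

With this design the predicted word is always the first of the suggested words, so the suggestion list includes the prediction, as the description above requires.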
Link to word prediction application: https://twming.shinyapps.io/datascience-wordpredict/.