Rutger Lakin
December 9, 2016
This application tries to predict the next word in a sentence. It is written in R and uses a corpus provided by SwiftKey. It is fast: a predicted word is returned within 150 ms, so it could be used in a mobile keyboard application.
In total, 15k lines from blogs, 20k lines from news, and 15k tweets were used.
Katz's Backoff Model is used to predict the next word.
The backoff model has been simplified by not calculating the (discounted) probabilities, because the true probabilities are not needed to find the most likely word.
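The simplification can be illustrated with a minimal sketch. It assumes the n-grams are stored as count tables. The toy data, the table layout and the names `best_match` and `predict_next` are hypothetical and not taken from the application. For a fixed context, the word with the highest count is also the word with the highest relative frequency within that n-gram order, so the sketch compares raw counts and backs off from trigrams to bigrams only when the trigram context is unseen.

```r
# Toy n-gram count tables (hypothetical data, not the SwiftKey corpus).
trigrams <- data.frame(
  context = c("i love", "i love", "love you"),
  word    = c("you", "it", "too"),
  count   = c(5, 3, 2),
  stringsAsFactors = FALSE
)
bigrams <- data.frame(
  context = c("love", "love", "you", "thank"),
  word    = c("you", "it", "too", "you"),
  count   = c(9, 4, 6, 7),
  stringsAsFactors = FALSE
)

# Return the highest-count word following `ctx` in `table`, or NA if unseen.
best_match <- function(table, ctx) {
  hits <- table[table$context == ctx, ]
  if (nrow(hits) == 0) return(NA_character_)
  hits$word[which.max(hits$count)]
}

# Simplified backoff: try the trigram table first, then the bigram table.
predict_next <- function(sentence) {
  tokens <- strsplit(tolower(trimws(sentence)), "\\s+")[[1]]
  tokens <- tokens[tokens != ""]
  n <- length(tokens)
  if (n >= 2) {
    guess <- best_match(trigrams, paste(tokens[(n - 1):n], collapse = " "))
    if (!is.na(guess)) return(guess)
  }
  if (n >= 1) {
    return(best_match(bigrams, tokens[n]))
  }
  NA_character_
}
```

With this toy data, `predict_next("I love")` is answered by the trigram table, while `predict_next("please thank")` has no trigram match and falls back to the bigram table. An input with no match at either order yields `NA`.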
The application has been written for Shiny in R. The application is divided into a frontend and backend.
The frontend shows a text input field where the user can fill in a sentence. When the “Predict Next Word” button is pressed, the sentence is sent to the backend, which returns the predicted word. The frontend has been kept simple and is therefore very mobile-friendly.
The backend uses Katz's Backoff Model to search for the predicted word. Bigrams and trigrams (n-grams of order 2 and 3) are used. The matching word with the highest probability is returned in bold, appended to the sentence. If no match is found, the backend returns the sentence without a bold predicted word.
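The frontend/backend split described above might look roughly like the following Shiny sketch. It is not the application's actual code: the input and output ids, the helper `format_prediction` and the stub `predict_next` (which stands in for the real backoff lookup) are all assumptions made for illustration.

```r
library(shiny)

# Hypothetical stand-in for the backend lookup; returns NA when nothing matches.
predict_next <- function(sentence) {
  if (grepl("thank$", trimws(tolower(sentence)))) "you" else NA_character_
}

# Build the output: the sentence, followed by the predicted word in bold.
# Without a prediction, only the sentence is shown.
format_prediction <- function(sentence, word) {
  sentence <- trimws(sentence)
  if (is.na(word)) return(span(sentence))
  span(paste0(sentence, " "), tags$b(word))
}

# Frontend: one text field, one button, one output area.
ui <- fluidPage(
  textInput("sentence", "Sentence"),
  actionButton("predict", "Predict Next Word"),
  uiOutput("result")
)

# Backend: recompute the prediction only when the button is pressed.
server <- function(input, output) {
  result <- eventReactive(input$predict, {
    format_prediction(input$sentence, predict_next(input$sentence))
  })
  output$result <- renderUI(result())
}

app <- shinyApp(ui, server)
# Start the app with: runApp(app)
```

Building the result from tag objects rather than pasted HTML strings means Shiny escapes the user's sentence when it is rendered.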
An application has been made in Shiny and is accessible using this link.
Fill in a sentence in the text input field and press the “Predict Next Word” button to get the next word prediction. The predicted word is shown in bold. If no bold word is shown, no prediction is available.
Please note that the application may need a startup time of 1 minute.