by Austin Routt
August 23, 2015
Wordsley is a predictive text engine that learns as it is used and relies on a statistical model to anticipate the next word a user will type. More specifically, the algorithm relies on n-gram models combined through a simple back-off scheme.
Whenever you enter text into Wordsley, he checks how many words you have given him. Based on that count, he chooses the highest-order n-gram model available and searches it for all n-grams that match what has been typed; if no match exists, he backs off to the next lower-order model. From that list, Wordsley normalizes the frequency counts into probabilities and sorts the n-grams so that the most probable come first. The top five are displayed as buttons.
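The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Wordsley's actual code: the `MODELS` dictionary, its toy counts, and the `predict` function are all hypothetical, and the real n-grams live in a spreadsheet rather than in memory.

```python
from collections import Counter

# Hypothetical toy models: MODELS[n] maps an n-word tuple to its count.
MODELS = {
    1: Counter({("the",): 5, ("of",): 3, ("a",): 2}),
    2: Counter({("thank", "you"): 4, ("thank", "goodness"): 1}),
    3: Counter({("i", "love", "you"): 3, ("i", "love", "it"): 1}),
}
MAX_ORDER = max(MODELS)


def predict(text, models=MODELS, top_k=5):
    """Return up to top_k (word, probability) pairs for the next word."""
    words = text.lower().split()
    # An n-gram model needs n-1 words of context, so the number of words
    # typed limits the highest order that can be used.
    start = min(len(words) + 1, MAX_ORDER)
    for n in range(start, 0, -1):
        # Last n-1 typed words; empty tuple for the unigram model.
        context = tuple(words[len(words) - (n - 1):])
        matches = {
            gram[-1]: count
            for gram, count in models[n].items()
            if gram[:-1] == context
        }
        if matches:
            # Normalize counts into probabilities, most probable first.
            total = sum(matches.values())
            ranked = sorted(matches.items(), key=lambda kv: (-kv[1], kv[0]))
            return [(word, count / total) for word, count in ranked[:top_k]]
        # No match at this order: back off to the next lower-order model.
    return []
```

With these toy counts, `predict("i love")` is answered by the trigram model, while `predict("we love")` finds no trigram or bigram match and falls all the way back to the unigram model.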
All n-grams are stored online in a Google spreadsheet. Whenever Wordsley completes a set of rounds, he updates the spreadsheet by incrementing the frequency of the n-grams he already knew and storing the count of those he did not. In this way he can learn new n-grams, as well as refine his ability to predict the most appropriate one for you.
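The update step might look roughly like the sketch below. It swaps the Google spreadsheet for an in-memory `Counter` per model purely for illustration; the `ngrams` and `learn` helpers are hypothetical names, and nothing here reflects how the spreadsheet is actually read or written.

```python
from collections import Counter


def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def learn(models, typed_text):
    """Add the n-grams found in typed_text to each model's counts.

    Known n-grams have their frequency incremented; unseen ones are
    stored with their new count. An in-memory Counter stands in for
    the spreadsheet here (an assumption made for this sketch).
    """
    words = typed_text.lower().split()
    for n, counts in models.items():
        counts.update(ngrams(words, n))
    return models


# Usage: one known unigram, everything else new.
models = {1: Counter({("you",): 2}), 2: Counter()}
learn(models, "thank you")
```

After the call, the count for `("you",)` has grown from 2 to 3, and the previously unseen `("thank",)` and `("thank", "you")` are each stored with a count of 1.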
At its core, Wordsley is an entirely data-driven application: he is mainly a collection of n-gram models derived from a large corpus of English text. Nevertheless, Wordsley is special in that he keeps learning from what you type. Using a data set provided by SwiftKey, unigram, bigram, trigram, and quadrigram (4-gram) models were created, and these were then glued together by a very simple back-off smoothing algorithm.
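As a rough idea of how the four models could be derived from raw text, here is a minimal sketch. The tokenizer, the `build_models` function, and the two-line stand-in corpus are assumptions made for illustration; the original preprocessing of the SwiftKey data is not described here.

```python
import re
from collections import Counter


def tokenize(line):
    # Assumed, simplistic tokenizer: lowercase, keep letters and apostrophes.
    return re.findall(r"[a-z']+", line.lower())


def build_models(lines, max_order=4):
    """Count unigrams through max_order-grams, never crossing line boundaries."""
    models = {n: Counter() for n in range(1, max_order + 1)}
    for line in lines:
        tokens = tokenize(line)
        for n in range(1, max_order + 1):
            for i in range(len(tokens) - n + 1):
                models[n][tuple(tokens[i:i + n])] += 1
    return models


# Tiny stand-in corpus; the real models were built from a much larger one.
corpus = ["Thank you very much", "Thank you for coming"]
models = build_models(corpus)
```

Here `models[2][("thank", "you")]` is 2 because the bigram appears in both lines, while each 4-gram appears only once.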
Wordsley can be found at the following web address: