FD
20.06.2016
Don't want to type on tiny keyboards anymore? Use Stop typing!
Stop typing! is a prototype word prediction application based on n-gram statistics. It is optimized for a small memory and CPU footprint and requires less than 30 MB of hard disk space.
It suggests words that fit what you have already typed.
You can find the application here. Try it out!
The application consists of two tabs: Application and Background.
Start the application by clicking here and wait until the first proposals are displayed. Type your text in the input field. The proposal list is updated every time you stop typing.
The application uses Kneser-Ney smoothing and n-gram statistics generated from a SwiftKey corpus downloaded from here.
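To give a feel for what Kneser-Ney smoothing does, here is a minimal sketch restricted to bigrams with a single fixed discount. It is not the application's implementation: the function name `kneser_ney_bigram`, the discount of 0.75, and the toy corpus are assumptions made for illustration, and [1] describes more refined variants.

```python
from collections import Counter, defaultdict

def kneser_ney_bigram(tokens, discount=0.75):
    """Return prob(word, prev): interpolated Kneser-Ney bigram probability.

    Simplified sketch: bigrams only, one fixed discount (an assumption).
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    prev_totals = Counter()        # how often prev occurs as a bigram start
    followers = defaultdict(set)   # distinct words seen after prev
    preceders = defaultdict(set)   # distinct words seen before word
    for (prev, word), count in bigrams.items():
        prev_totals[prev] += count
        followers[prev].add(word)
        preceders[word].add(prev)
    n_bigram_types = len(bigrams)

    def prob(word, prev):
        # Continuation probability: in how many distinct contexts does word appear?
        p_cont = len(preceders.get(word, ())) / n_bigram_types
        total = prev_totals[prev]
        if total == 0:
            # Unknown context: fall back to the continuation probability alone.
            return p_cont
        discounted = max(bigrams[(prev, word)] - discount, 0) / total
        # Probability mass freed by discounting, redistributed via p_cont.
        backoff_weight = discount * len(followers[prev]) / total
        return discounted + backoff_weight * p_cont

    return prob

# Toy corpus, invented for this example.
corpus = "i am going to st. louis and i am going to st. paul".split()
p = kneser_ney_bigram(corpus)
print(p("louis", "st."), p("am", "st."))
```

The second call shows the point of smoothing: "am" never follows "st." in the toy corpus, yet it still receives a small non-zero probability from the continuation term.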
Basically, it takes the last n words (n=4 here) of what you typed and tries to match them against known tokens.
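The matching step can be sketched as a lookup that tries the longest context first and falls back to shorter ones. This is a sketch under assumptions, not the application's code: the `table` layout (context tuple mapped to scored candidate words), the fallback to shorter contexts, and the names `predict_next`, `max_context`, and `top_k` are all hypothetical.

```python
def predict_next(text, table, max_context=4, top_k=3):
    """Propose next words by matching the longest known suffix of the typed text.

    table maps a tuple of context words to a dict {candidate: score}
    (a hypothetical layout chosen for this sketch).
    """
    words = text.lower().split()
    # Try the last max_context words first, then progressively shorter contexts.
    for size in range(min(max_context, len(words)), -1, -1):
        context = tuple(words[len(words) - size:])
        candidates = table.get(context)
        if candidates:
            ranked = sorted(candidates, key=candidates.get, reverse=True)
            return ranked[:top_k]
    return []

# Tiny hand-made table with invented scores, for illustration only.
table = {
    ("going", "to", "st."): {"louis": 0.6, "paul": 0.3},
    ("to",): {"the": 0.5},
    (): {"the": 0.1, "and": 0.05},
}
print(predict_next("I am going to st.", table))
```

The empty tuple acts as a last resort, so the function still returns proposals when nothing typed so far matches a known context.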
The tokens were generated by scanning the corpus without filtering out stopwords and without stemming. This decision was made so that the application captures grammatical structure as well. The successful prediction of terms like “st. louis” based on “I am going to st.” supports this approach.
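As a rough illustration of counting n-grams while keeping stopwords and unstemmed word forms, consider the sketch below. The whitespace tokenization, the function name `count_ngrams`, and the sample lines are assumptions; the actual preprocessing is described in [3].

```python
from collections import Counter

def count_ngrams(lines, n):
    """Count n-grams per line, keeping stopwords and original word forms."""
    counts = Counter()
    for line in lines:
        # Lowercase and split on whitespace only: no stopword filter, no stemming.
        tokens = line.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Sample lines invented for this example.
lines = ["I am going to St. Louis", "I am going to the store"]
bigrams = count_ngrams(lines, 2)
print(bigrams.most_common(3))
```

Because words such as "to" and "the" stay in the counts, a context like "going to" remains available for matching, which would not be the case after stopword removal.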
For details, please see the background section of the application.
See [1] for details about smoothing techniques, including the Kneser-Ney approach used here.
See [2] for the word vector technique for analyzing sentences, which can capture context information.
See [3] for details about the preprocessing of the data.
[1]: An empirical study of smoothing techniques for language modeling, Stanley F. Chen and Joshua Goodman, Computer Speech and Language (1999) 13, 359-394
[2]: Efficient Estimation of Word Representations in Vector Space, Tomas Mikolov et al., arXiv preprint arXiv:1301.3781
[3]: Distribution of n-grams in English text corpus, http://rpubs.com/fdd/187848