Word Prediction Web App

by Herchu

Unmatched user experience!

Highly accurate and blazing-fast!

Sleek design. Shows or hides profanity.

Works in all modern (JavaScript-enabled) browsers. Responsiveness guaranteed: tested on an iPad and an iPhone.

Using the application

[Figure: screen capture of the app's main screen]

Internal Model

4-gram Model with Linear Interpolation Smoothing

\[ \begin{aligned} \hat{P}(w_n \mid w_{n-1},\dotsc,w_{n-3}) = {} & \lambda_0 P(w_n) + \lambda_1 P(w_n \mid w_{n-1}) + \dotsc + {} \\ & \lambda_3 P(w_n \mid w_{n-1},\dotsc,w_{n-3}) \end{aligned} \]
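As a minimal sketch of the interpolation above (not the app's actual code), the weighted sum for a single candidate word can be written in a few lines of R. The function name `interpolate`, the weights and the probabilities are all hypothetical values chosen for illustration.

```r
# Interpolated probability of one candidate word.
# p:      MLE estimates ordered unigram, bigram, trigram, 4-gram.
# lambda: weights in the same order (lambda_0 .. lambda_3), assumed to sum to 1.
interpolate <- function(p, lambda) {
  stopifnot(length(p) == length(lambda))
  sum(lambda * p)
}

lambda <- c(0.1, 0.2, 0.3, 0.4)  # hypothetical weights

# Word observed in every context order.
p_seen <- interpolate(c(0.05, 0.30, 0.40, 0.50), lambda)

# Word never observed after the two- and three-word contexts: the trigram
# and 4-gram estimates are 0, yet the lower orders keep the result above 0.
p_unseen <- interpolate(c(0.10, 0.20, 0.00, 0.00), lambda)
```

The second call shows why the smoothing matters: a word with no higher-order evidence still receives a non-zero score from its unigram and bigram estimates.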

30,000-word dictionary, unigram to tetragram tables. The 2-, 3- and 4-gram tables have 1 million entries each (MLE; frequencies >= 3).

n-gram tuples include begin-of-sentence tokens.

Achieves 16% accuracy for the top prediction and 26% within the top three. Independently scored by benchmark.R

Implementation

Web app hosted on shinyapps.io (developed in R)

Words in the n-gram tables are integer coded: storing fewer character strings results in a 50 MB total memory footprint.
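One way integer coding could look, as a sketch under assumptions: words are replaced by their position in the dictionary, and the n-gram tables hold only those integers. The tiny dictionary, the `<s>` begin-of-sentence marker and the column names `w1`, `w2`, `n` are hypothetical; the app's real table layout is not shown in these slides.

```r
# Hypothetical miniature dictionary; the app's has about 30,000 words.
# "<s>" stands in for a begin-of-sentence token (assumed spelling).
dictionary <- c("<s>", "cat", "sat", "the")

# Encode a token sequence as positions in the dictionary.
tokens <- c("<s>", "the", "cat", "sat")
codes <- match(tokens, dictionary)

# A bigram table stored as two integer columns plus a count.
bigrams <- data.frame(w1 = codes[-length(codes)], w2 = codes[-1], n = 1L)

# Decode back to text only when a prediction has to be displayed.
decoded <- dictionary[bigrams$w2]
```

Each distinct word is stored once in `dictionary`; every table row then costs a few integers rather than several strings.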

One line in R (fast!) gets the most probable next word:

head(order(rowSums(sweep(ngrams, 2, weights, `*`)), decreasing = TRUE), n = 1)
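To see what the one-liner does, here is a toy run. It assumes that `ngrams` is a numeric matrix with one row per candidate word and one column per n-gram order, and that `weights` holds the four lambdas; the values are made up, and the real tables may be shaped differently.

```r
# Toy table: one row per candidate word code, one column per n-gram order
# (unigram .. 4-gram). All values are hypothetical.
ngrams <- matrix(
  c(0.10, 0.20, 0.00, 0.00,
    0.05, 0.30, 0.40, 0.50,
    0.01, 0.10, 0.20, 0.00),
  nrow = 3, byrow = TRUE
)
weights <- c(0.1, 0.2, 0.3, 0.4)

# sweep() multiplies each column by its weight; rowSums() adds the
# weighted estimates per candidate, giving the interpolated scores.
scores <- rowSums(sweep(ngrams, 2, weights, `*`))

# order() ranks row indices by score; head() keeps the best one (or three).
best <- head(order(scores, decreasing = TRUE), n = 1)
top3 <- head(order(scores, decreasing = TRUE), n = 3)
```

Under these assumptions the result is a row index, which maps back to a word through the integer-coded dictionary. Changing `n = 1` to `n = 3` yields the three best candidates from the same ranking.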

Weights \( (\lambda_0,\lambda_1,\lambda_2,\lambda_3) \) were eyeballed. Optimization (COBYLA) didn't get better results.

Try it!

Includes two optional extra features:

  • Predicts the obligatory word plus two more words
  • PAT Phrase-Auto-Typing ™ ;-)

https://herchu.shinyapps.io/shinytextpredict