Next word, please!

John O. Bonsak
April 2016

Algorithm

  • Modified Kneser-Ney smoothing: an advanced form of interpolation across 1- to 4-grams
    [Image: KneserNey]

  • Full reproducibility through code sharing (for peer reviewers only, per the Coursera Honor Code)

  • Final report without code is available here
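
To make the interpolation idea concrete, here is a minimal Python sketch of interpolated Kneser-Ney for bigrams only, with a single fixed discount. It is a simplification for illustration, not the app's code: the app uses the modified variant up to 4-grams, and the names `train_bigram_kn`, `suggest` and the discount value 0.75 are assumptions.

```python
from collections import Counter, defaultdict

def train_bigram_kn(tokens, discount=0.75):
    """Return p_kn(word, prev): interpolated Kneser-Ney bigram probability."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_total = Counter()      # how often prev is followed by anything
    followers = defaultdict(set)   # distinct words seen after prev
    preceders = defaultdict(set)   # distinct words seen before word
    for (prev, word), count in bigrams.items():
        context_total[prev] += count
        followers[prev].add(word)
        preceders[word].add(prev)
    n_bigram_types = len(bigrams)

    def p_continuation(word):
        # Share of distinct bigram types that end in this word.
        return len(preceders[word]) / n_bigram_types

    def p_kn(word, prev):
        total = context_total[prev]
        if total == 0:             # unseen context: use the lower order only
            return p_continuation(word)
        discounted = max(bigrams[(prev, word)] - discount, 0.0) / total
        # Probability mass freed by discounting, handed to the lower order.
        backoff_weight = discount * len(followers[prev]) / total
        return discounted + backoff_weight * p_continuation(word)

    return p_kn

def suggest(p_kn, prev, vocabulary, k=3):
    """Rank the vocabulary by descending probability after prev."""
    return sorted(vocabulary, key=lambda w: p_kn(w, prev), reverse=True)[:k]

tokens = "the cat sat on the mat the cat ate".split()
p_kn = train_bigram_kn(tokens)
top = suggest(p_kn, "the", set(tokens), k=2)
```

The discounted mass is redistributed according to how many different contexts a word completes, not how often it occurs, which is what distinguishes Kneser-Ney from plain absolute discounting.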

Saving the...

[Screenshot: Nextword app]

The Nextword column lists the predicted words, ranked by descending \( P_{KN} \), the Kneser-Ney probability. The ngram column shows which n-gram order (0-4) each word was picked from.
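
For reference, the textbook interpolated form of modified Kneser-Ney smoothing (following Chen and Goodman) is given below. The notation is an assumption here and may differ from the symbols used in the final report.

\[
P_{KN}(w_i \mid w_{i-n+1}^{i-1}) =
\frac{\max\{c(w_{i-n+1}^{i}) - D(c(w_{i-n+1}^{i})),\, 0\}}{\sum_{w} c(w_{i-n+1}^{i-1} w)}
+ \gamma(w_{i-n+1}^{i-1})\, P_{KN}(w_i \mid w_{i-n+2}^{i-1})
\]

\[
D(c) =
\begin{cases}
0 & c = 0 \\
D_1 & c = 1 \\
D_2 & c = 2 \\
D_{3+} & c \ge 3
\end{cases}
\qquad
\gamma(w_{i-n+1}^{i-1}) =
\frac{D_1 N_1(w_{i-n+1}^{i-1}\,\bullet) + D_2 N_2(w_{i-n+1}^{i-1}\,\bullet) + D_{3+} N_{3+}(w_{i-n+1}^{i-1}\,\bullet)}{\sum_{w} c(w_{i-n+1}^{i-1} w)}
\]

Here \( c(\cdot) \) is an n-gram count (a continuation count at the lower orders), and \( N_k(w_{i-n+1}^{i-1}\,\bullet) \) is the number of distinct words that follow the context exactly \( k \) times (\( N_{3+} \): three times or more). The weight \( \gamma \) hands the discounted probability mass to the next shorter context.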

Usage

[Screenshots: App input, App detail, App detail 2, App about]

  • Simple: just type or paste your (unfinished) English sentence
  • Suggestions are rendered as a data table (using DT::datatable for enhanced appearance)
  • Stopword downvoting is on by default, but can easily be turned off using the checkbox
  • Read more in the app's About tab
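
One way stopword downvoting could work is sketched below in Python. This is an assumption, not the app's actual rule: stopwords are simply moved behind all other candidates and only fill the slots that remain. The stopword list and the name `rank_suggestions` are hypothetical.

```python
# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "to", "and", "in"}

def rank_suggestions(candidates, k=5, downvote_stopwords=True):
    """candidates: list of (word, probability) pairs; returns the top k words."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    if not downvote_stopwords:
        return [word for word, _ in ranked[:k]]
    content = [word for word, _ in ranked if word not in STOPWORDS]
    stop = [word for word, _ in ranked if word in STOPWORDS]
    # Content words first; stopwords refill whatever slots are left.
    return (content + stop)[:k]

candidates = [("the", 0.40), ("house", 0.20), ("of", 0.15), ("garden", 0.10)]
with_downvoting = rank_suggestions(candidates, k=3)
without_downvoting = rank_suggestions(candidates, k=3, downvote_stopwords=False)
```

With the checkbox on, the sketch returns house, garden, the; with it off, the plain probability order the, house, of.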

Noteworthy

  • Efficient storage: All words mapped to integer dictionary
  • Super soft lemmatization reduces perplexity: the plural s is removed where it's safe
  • Stopword downvoting provides more meaningful results; the list is refilled with stopwords when needed
  • Enormous and exciting potential for improvement, humbly acknowledged in the final report
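
The integer dictionary and the soft plural removal can be sketched together in Python. This is a guess at the idea, not the app's implementation: here "safe" is assumed to mean that the singular form is itself a known word, the word is longer than three letters and does not end in "ss". All function names are hypothetical.

```python
def soft_singular(word, vocabulary):
    """Strip a trailing plural s only when the singular is a known word."""
    if (len(word) > 3 and word.endswith("s") and not word.endswith("ss")
            and word[:-1] in vocabulary):
        return word[:-1]
    return word

def build_dictionary(tokens):
    """Map every (softly lemmatized) word to a small integer id."""
    vocabulary = set(tokens)
    word_to_id = {}
    for token in tokens:
        base = soft_singular(token, vocabulary)
        if base not in word_to_id:
            word_to_id[base] = len(word_to_id) + 1  # id 0 is kept for unknown words
    return word_to_id

def encode(tokens, word_to_id):
    """Turn words into ids; an n-gram then becomes a tuple of integers."""
    return [word_to_id.get(soft_singular(t, word_to_id), 0) for t in tokens]

word_to_id = build_dictionary("the cat saw two cats and the glass".split())
ids = encode(["cats", "glass", "dog"], word_to_id)
```

Storing n-grams as integer tuples instead of strings keeps the tables compact, and merging plurals with their singulars pools their counts.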

Thanks for the cooperation we had in the forums over the last year, dear peer!

Go to the app