Recent Developments in Word Suggestion Solutions

January 27, 2021

Presented by Raymond Bem, Junior Data Scientist, NextApp Corporation

hyperlink to product…NextApp

Introduction

  • The latest generation of NextApp word suggestion technology has arrived!

  • More accurate – relying on rank-ordered lists alone results in poor accuracy; maturing NLP (Natural Language Processing) techniques improve performance considerably

  • Faster – filtering logic means a smaller server footprint as well as a faster response time for a given language set

  • Simplified UI – our design has been market-tested and co-developed with users, culminating in a simple, easy-to-use interface

Algorithm

  • The highly respected Modified Kneser-Ney Smoothing (MKNS) [1] improves upon traditional back-off models, mainly by interpolating between the phrase levels (n-gram orders)

  • The process considers the distinct words that appear before and after a phrase, combines them with the search phrase counts, then applies weights – yielding a valid probability distribution that sums to one; high-count phrases have part of their probability removed and added systematically to the lower-probability phrases (smoothing)

  • The weights are calculated from the number of phrases seen exactly once and exactly twice, and they discount the influence of lower-level (shorter n-gram) search results – these discounts reduce errors by limiting the effect of popular words (e.g., “and”) and elevating infrequent data
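The interpolation and discounting described above can be sketched at the bigram level. This is a minimal illustration, not NextApp's production code: it uses plain interpolated Kneser-Ney with a single discount estimated from the one-count and two-count bigrams, whereas the full modified variant in [1] uses separate discounts for counts of one, two, and three or more. The function name `train_kn_bigram`, the toy corpus, and the 0.75 fallback discount are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_kn_bigram(tokens):
    """Return prob(word, prev): interpolated Kneser-Ney at the bigram level."""
    bigrams = Counter(zip(tokens, tokens[1:]))

    # Discount estimated from the number of bigrams seen exactly once and twice.
    n1 = sum(1 for c in bigrams.values() if c == 1)
    n2 = sum(1 for c in bigrams.values() if c == 2)
    # The 0.75 fallback for a corpus with no singletons is an assumption.
    discount = n1 / (n1 + 2 * n2) if n1 > 0 else 0.75

    context_total = Counter()      # how often each previous word starts a bigram
    followers = defaultdict(set)   # distinct words seen after a given word
    preceders = defaultdict(set)   # distinct words seen before a given word
    for (prev, word), count in bigrams.items():
        context_total[prev] += count
        followers[prev].add(word)
        preceders[word].add(prev)
    n_bigram_types = len(bigrams)

    def prob(word, prev):
        # Continuation probability: in how many distinct contexts does `word` appear?
        p_cont = len(preceders.get(word, ())) / n_bigram_types
        total = context_total.get(prev, 0)
        if total == 0:
            return p_cont  # unseen context: rely on the lower-order estimate
        # Take a fixed discount from every observed bigram count...
        discounted = max(bigrams.get((prev, word), 0) - discount, 0) / total
        # ...and hand the removed mass to the lower-order distribution.
        backoff_weight = discount * len(followers[prev]) / total
        return discounted + backoff_weight * p_cont

    return prob


tokens = "the cat sat on the mat and the cat ran".split()
prob = train_kn_bigram(tokens)
vocab = set(tokens[1:])  # every word that can follow another word
total_mass = sum(prob(w, "the") for w in vocab)
```

Here `total_mass` comes out as one (up to floating-point error), which is the "valid distribution" property from the slide: the mass discounted from seen bigrams is exactly what the back-off weight redistributes.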

1. Chen, Stanley F. and Joshua Goodman. 1998. An Empirical Study of Smoothing Techniques for Language Modeling. Harvard Computer Science Group Technical Report TR-10-98.

App

  • NextApp is quite simple: the user enters a word or phrase and clicks one button

  • Underneath, calibrated tables are built from samples of modern news, blog, and Twitter language

  • A final set of weights is applied to the MKNS model probabilities to better surface infrequent phrases

  • The user is returned a list of word selections presented in graphical format – the best choice is clearly called out in the results

  • In the user output, the y-axis distance between the chosen word and the non-chosen words provides an easy visual cue as to the strength of the selection
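The flow above (build tables from sample text, look up the entered phrase, return a ranked list) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the app's actual code: `build_tables`, `suggest`, the `min_count` filter, and the sample sentences are hypothetical, and raw relative frequencies stand in for the weighted MKNS probabilities.

```python
from collections import Counter, defaultdict


def build_tables(sentences, max_order=3, min_count=1):
    """Map each context (tuple of preceding words) to counts of the next word."""
    tables = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            for k in range(max_order):  # k = number of context words used
                if i - k < 0:
                    break
                tables[tuple(words[i - k:i])][word] += 1
    # Filtering step: drop rare continuations to keep the tables small.
    pruned = {}
    for context, next_words in tables.items():
        kept = Counter({w: c for w, c in next_words.items() if c >= min_count})
        if kept:
            pruned[context] = kept
    return pruned


def suggest(tables, phrase, k=3, max_order=3):
    """Return up to k (word, score) pairs, trying the longest context first."""
    words = phrase.lower().split()
    for n in range(min(max_order - 1, len(words)), -1, -1):
        context = tuple(words[len(words) - n:])
        candidates = tables.get(context)
        if candidates:
            total = sum(candidates.values())
            return [(w, c / total) for w, c in candidates.most_common(k)]
    return []


sentences = [
    "I want to go home",
    "I want to go out",
    "I want to go home now",
    "you want to stay",
]
tables = build_tables(sentences)
top = suggest(tables, "want to")
# Gap between the best and second-best score, analogous to the y-axis distance.
margin = top[0][1] - top[1][1]
```

The first entry of `top` is the word that would be called out as the best choice, and `margin` plays the role of the visual gap between the chosen word and the runner-up.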

Conclusion

  • NextApp Corporation is very excited about this latest work, and would like to thank you for the opportunity to present today

  • Future applications are wide-ranging, from assisted typing for people with disabilities to business inventory consolidation projects and plagiarism detection

  • Check out our product here…NextApp