6/15/2021

Background

  • Word prediction is a great tool for any writing application

  • Used in many digital applications

  • User friendly for customers

Principle

  • 50% of the three text documents (news, blogs and Twitter) used

  • Divided into train (75% of lines) and test (25% of lines) corpora

  • Text converted to lower case and stopwords removed

  • Model created using 80% of words in library

  • Algorithm used: Stupid Back-off (SBO) with a 3-gram model

  • Designed to minimize processing time and memory use
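
The sampling, splitting and cleaning steps above could be sketched roughly as follows. This is an illustrative Python sketch, not the code behind the app: the function names, the random seed and the tiny stopword list are all assumptions made for the example, and tokenization is a plain whitespace split.

```python
import random

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is"}


def sample_lines(lines, fraction=0.5, seed=42):
    """Keep a random fraction of the lines (50% in the slides)."""
    rng = random.Random(seed)
    return rng.sample(lines, int(len(lines) * fraction))


def split_train_test(lines, train_fraction=0.75, seed=42):
    """Shuffle a copy of the lines and split it into train and test sets."""
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def preprocess(line):
    """Lower-case a line, split on whitespace and drop stopwords."""
    return [tok for tok in line.lower().split() if tok not in STOPWORDS]
```

For example, sample_lines followed by split_train_test on 100 lines yields 37 training lines and 13 test lines, because the 75% cut on 50 lines is rounded down.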

Algorithm - Explained

  • 3-gram language model

  • Stupid Back-off scores and ranks candidate next words

  • Start with the 3-gram; if there are no suggestions, back off to the 2-gram

  • Back-off process continues until enough suggestions are found
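
A minimal Python sketch of this back-off procedure is shown below. It is an illustration under stated assumptions rather than the app's implementation: the names build_counts and predict are hypothetical, the back-off factor of 0.4 is the value commonly used with Stupid Back-off and is assumed here, and the n-gram lookup is a simple scan chosen for readability over speed.

```python
from collections import Counter

ALPHA = 0.4  # assumed back-off factor; the slides do not state the value used


def build_counts(sentences, max_n=3):
    """Count every 1-gram up to max_n-gram in a list of token lists."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def predict(counts, context, k=3, max_n=3):
    """Return up to k next-word suggestions, backing off to shorter contexts."""
    context = tuple(context)[-(max_n - 1):] if max_n > 1 else ()
    total_unigrams = sum(c for gram, c in counts.items() if len(gram) == 1)
    scores = {}
    penalty = 1.0
    while True:
        n = len(context) + 1
        denom = counts.get(context, 0) if context else total_unigrams
        if denom > 0:
            for gram, c in counts.items():
                if len(gram) == n and gram[:-1] == context:
                    # Keep the score from the longest matching context.
                    scores.setdefault(gram[-1], penalty * c / denom)
        if len(scores) >= k or not context:
            break
        context = context[1:]  # drop the oldest word: 3-gram -> 2-gram -> 1-gram
        penalty *= ALPHA       # each back-off step discounts the score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]
```

With a toy corpus where "i love you" appears twice and "i love cats" and "we love dogs" once each, predict(counts, ["i", "love"]) ranks "you" first, then "cats", and only reaches "dogs" after backing off to the 2-gram context "love".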

App description