2026-02-20

The model

  • Built on four orders of n-grams (unigrams, bigrams, trigrams and tetragrams)

  • Markov assumption: the probability of the next event depends only on the immediately preceding event(s), not on the entire event history

  • Backoff strategy: tetragrams → trigrams → bigrams → unigrams
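
For the tetragram case, the Markov assumption can be written as follows. This is a sketch; the notation w_i for the i-th word is an assumption introduced here, not taken from the app.

```latex
P(w_i \mid w_1, \dots, w_{i-1}) \approx P(w_i \mid w_{i-3}, w_{i-2}, w_{i-1})
```

Each backoff step drops the oldest word from the context on the right-hand side.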

How it works

A 4-gram backoff model predicts the next word, always trying the longest available context first.

1) Try the tetragram (4-gram): if no match is found, then

↓

2) back off to the trigram (3-gram): if no match is found, then

↓

3) back off to the bigram (2-gram): if no match is found, then

↓

4) back off to the unigram (1-gram): if that also yields nothing, then

↓

5) fall back to the most common words
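
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function names `build_tables` and `predict`, the toy corpus, and the `fallback` word list are all assumptions made for the example. It also assumes that, when one order yields fewer than the requested number of predictions, the remaining slots are filled from lower orders.

```python
from collections import Counter, defaultdict

MAX_ORDER = 4  # tetragrams down to unigrams


def build_tables(tokens, max_order=MAX_ORDER):
    """Map each order n to {context tuple: Counter of next words}."""
    tables = {n: defaultdict(Counter) for n in range(1, max_order + 1)}
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            *context, word = tokens[i:i + n]
            tables[n][tuple(context)][word] += 1
    return tables


def predict(tables, phrase, k=3, fallback=("the", "to", "and")):
    """Return up to k (word, order) pairs, longest matching context first.

    Order 0 marks a word taken from the hypothetical fallback list.
    """
    words = phrase.lower().split()
    results, seen = [], set()
    for n in range(max(tables), 0, -1):       # 4, 3, 2, 1
        if len(words) < n - 1:
            continue                           # phrase too short for this order
        context = tuple(words[len(words) - (n - 1):])
        candidates = tables[n].get(context, Counter())
        for word, _count in candidates.most_common():
            if word not in seen:
                seen.add(word)
                results.append((word, n))
            if len(results) == k:
                return results
    for word in fallback:                      # step 5: most common words
        if len(results) == k:
            break
        if word not in seen:
            seen.add(word)
            results.append((word, 0))
    return results


corpus = "the cat sat on the mat and the cat sat on the sofa".split()
tables = build_tables(corpus)
print(predict(tables, "on the", k=2))  # [('mat', 3), ('sofa', 3)]
```

The second element of each pair records which n-gram order produced the prediction, which is the kind of information the app can optionally display.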

The app

Link: https://fredericotonus.shinyapps.io/NWPredictor/

  1. Type a word or phrase

  2. Choose the number of predictions

  3. Optionally, choose to show which n-gram model was used for each prediction

  4. Click ‘Predict’ to see the predicted words

App interface