- The model uses the stupid backoff method to determine its prediction.
- It looks for matches of the input phrase starting at the 7-gram level and backs off one level at a time down to the 1-gram level. Predictions come from the largest n-gram size that has a match, ranked by number of occurrences.
- If the phrase has no matches in the data set, the model samples 3 of the 10 most frequently occurring words in the data set, based on their number of occurrences (this avoids producing the same three words in the same order for every unknown phrase).
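
The lookup described above can be sketched roughly as follows. This is a minimal illustration in Python, not the model's actual code: the names `build_tables` and `predict` are hypothetical, an n-gram is assumed to mean n-1 context words plus the predicted word, and the fallback is assumed to sample without replacement, weighted by word count.

```python
import random
from collections import Counter, defaultdict

MAX_N = 7  # largest n-gram size, taken from the description above


def build_tables(tokens, max_n=MAX_N):
    """Map each context of 0..max_n-1 words to a Counter of the words that follow it."""
    tables = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for k in range(max_n):
            if i - k < 0:
                break
            # k == 0 gives the empty context, i.e. plain word (1-gram) counts.
            tables[tuple(tokens[i - k:i])][word] += 1
    return tables


def predict(phrase, tables, k=3, rng=random):
    """Return up to k predictions, backing off from the longest context."""
    words = phrase.lower().split()
    # Try the 7-gram level first (6 context words), then shorter contexts.
    for n in range(min(MAX_N, len(words) + 1), 1, -1):
        followers = tables.get(tuple(words[-(n - 1):]))
        if followers:
            # Highest counts at the largest matching n-gram size.
            return [w for w, _ in followers.most_common(k)]
    # No match at any level: sample k of the 10 most frequent words.
    # Assumption: sampling is weighted by count and without replacement.
    pool = dict(tables[()].most_common(10))
    picks = []
    while pool and len(picks) < k:
        choice = rng.choices(list(pool), weights=list(pool.values()))[0]
        picks.append(choice)
        del pool[choice]
    return picks


tokens = "the cat sat on the mat the cat sat on the sofa the cat ran".split()
tables = build_tables(tokens)
print(predict("on the cat", tables))       # ['sat', 'ran']
print(predict("zebra xylophone", tables))  # three randomly sampled frequent words
```

In the first call, the three-word context "on the cat" never occurs in the toy corpus, so the lookup backs off to "the cat" and ranks its followers by count. The second call matches nothing at any level, so it falls through to the random sample, which can differ from run to run.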