- predict the next word a user is typing
- used in mobile keyboards
- goal: improve typing speed
2026-05-04
Preprocessing steps:
- Lowercasing text
- Removing numbers
- Removing punctuation
- Removing extra whitespace
- Sampling 1% of data for efficiency
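
The preprocessing steps listed above could be sketched roughly as follows. This is only an illustration, not the code used for the project: the function names `clean_line` and `sample_lines`, the fixed random seed, and the choice to sample by keeping each line with probability 0.01 are all assumptions.

```python
import random
import re
import string


def clean_line(line):
    """Apply the four cleaning steps to one line of text."""
    line = line.lower()                      # lowercase text
    line = re.sub(r"\d+", "", line)          # remove numbers
    # remove ASCII punctuation (assumption: non-ASCII marks are left alone)
    line = line.translate(str.maketrans("", "", string.punctuation))
    line = re.sub(r"\s+", " ", line).strip() # collapse extra whitespace
    return line


def sample_lines(lines, fraction=0.01, seed=42):
    """Keep each line with probability `fraction` (hypothetical sampling scheme)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [ln for ln in lines if rng.random() < fraction]


if __name__ == "__main__":
    raw = ["Hello, World!  123 times", "It's 5 o'clock   somewhere."]
    print([clean_line(ln) for ln in raw])
```

Sampling before cleaning keeps the cleaning cost proportional to the 1% sample rather than the full data set, which is one plausible reading of "for efficiency".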
Built using an n-gram language model:
- Unigrams (single words)
- Bigrams (2-word sequences)
- Trigrams (3-word sequences)
Prediction logic:
- Try trigram first
- If not found → fall back to bigram
- If still not found → default prediction
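
A minimal sketch of the trigram → bigram → default fallback, under stated assumptions. The names `build_model` and `predict_next` are hypothetical, and the notes do not say what the default prediction is: here it is assumed to be the most frequent unigram, with a fixed word (`"the"`) used only if the model is empty.

```python
from collections import Counter, defaultdict


def build_model(sentences):
    """Count unigrams, bigrams and trigrams from already-cleaned sentences."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)   # previous word -> counts of next word
    trigrams = defaultdict(Counter)  # previous two words -> counts of next word
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        for w1, w2 in zip(words, words[1:]):
            bigrams[w1][w2] += 1
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            trigrams[(w1, w2)][w3] += 1
    return unigrams, bigrams, trigrams


def predict_next(text, model, default="the"):
    """Predict the next word: trigram first, then bigram, then a default."""
    unigrams, bigrams, trigrams = model
    words = text.lower().split()
    # 1. Try the trigram table, keyed on the last two words typed.
    if len(words) >= 2:
        key = (words[-2], words[-1])
        if key in trigrams:
            return trigrams[key].most_common(1)[0][0]
    # 2. Fall back to the bigram table, keyed on the last word typed.
    if words and words[-1] in bigrams:
        return bigrams[words[-1]].most_common(1)[0][0]
    # 3. Default prediction (assumption: the most frequent unigram).
    if unigrams:
        return unigrams.most_common(1)[0][0]
    return default


if __name__ == "__main__":
    corpus = ["i love new york", "i love new york", "i love new shoes", "i like tea"]
    model = build_model(corpus)
    print(predict_next("love new", model))   # uses the trigram table
    print(predict_next("brand new", model))  # falls back to the bigram table
    print(predict_next("zzz", model))        # falls back to the default
```

Storing the tables as nested counters keeps lookup to a single dictionary access per level, which suits the keyboard use case where a prediction is needed after every word.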