This app takes a phrase from the user and outputs a list of the next 3 predicted words.
2026-05-03
Training process: The model was trained on a large collection of blogs, tweets and news text, from which it built unigram, bigram and trigram datasets.
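The training step can be illustrated with a minimal Python sketch. It is not the app's actual code: the function name `build_ngram_counts`, the three-sentence toy corpus, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter

def build_ngram_counts(sentences, n):
    """Count every run of n consecutive words across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercase + whitespace split stands in for real text cleaning.
        words = sentence.lower().split()
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

# Hypothetical toy corpus standing in for the blogs, tweets and news text.
corpus = [
    "how are you today",
    "how are you doing",
    "how are things",
]

unigrams = build_ngram_counts(corpus, 1)
bigrams = build_ngram_counts(corpus, 2)
trigrams = build_ngram_counts(corpus, 3)

# The most frequent trigram in the toy corpus.
print(trigrams.most_common(1))  # [(('how', 'are', 'you'), 2)]
```

Each table maps a tuple of words to the number of times it occurred, which is the information the prediction step needs to rank candidate next words.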
Prediction process: For each input phrase, the app takes the last 2 words and predicts the next word using the trigram dataset. If that lookup fails, it falls back to the last word alone and predicts using the bigram dataset. If that also fails, it uses the unigram dataset to predict the most common word it learned during training.
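The fallback order described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the app's implementation: the lookup tables, their counts, and the function name `predict_next` are hypothetical, and the sketch returns candidates from a single n-gram level without mixing levels.

```python
from collections import Counter

# Hypothetical toy tables: context words -> counts of the words that followed.
TRIGRAMS = {("how", "are"): Counter({"you": 5, "things": 2, "we": 1})}
BIGRAMS = {("the",): Counter({"same": 4, "first": 3, "best": 2})}
UNIGRAMS = Counter({"the": 10, "to": 8, "and": 7, "a": 6})

def predict_next(phrase, k=3):
    """Return up to k predicted next words, backing off trigram -> bigram -> unigram."""
    words = phrase.lower().split()
    # 1. Try the last two words against the trigram table.
    if len(words) >= 2:
        hit = TRIGRAMS.get(tuple(words[-2:]))
        if hit:
            return [word for word, _ in hit.most_common(k)]
    # 2. Fall back to the last word against the bigram table.
    if len(words) >= 1:
        hit = BIGRAMS.get(tuple(words[-1:]))
        if hit:
            return [word for word, _ in hit.most_common(k)]
    # 3. Last resort: the most common words overall.
    return [word for word, _ in UNIGRAMS.most_common(k)]

print(predict_next("how are"))      # trigram hit
print(predict_next("over the"))     # bigram fallback
print(predict_next("xkqzp zzzzz"))  # unigram fallback
```

With these toy tables, an unseen phrase such as "xkqzp zzzzz" misses both the trigram and bigram lookups and ends up with the overall most common words.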
The app loads a dataset of around 90 MB at startup and responds to each input with the following performance:
| Scenario | Input | Elapsed Time |
|---|---|---|
| Trigram hit | “how are” | 0.253s |
| Trigram hit | “in the” | 0.169s |
| Unigram fallback | “xkqzp zzzzz” | 0.148s |
The nextWordPredictor app: https://icedmcstuffin.shinyapps.io/nextWordPredictor/
RPubs article explaining the model: https://rpubs.com/IcedMcstuffin/1428801
The dataset used to train the model: https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip
The GitHub repository for this model: https://github.com/Icedmcstuffin/Text-Prediction-Model