Isaac Carey
22 July 2021
This app is a simple way to predict the next word of a typed phrase using n-grams drawn from the Twitter database. The app has three main parts.
The provided Twitter database served as the main data for this project.
From it, I created two files that contained the most popular bigrams and trigrams.
This was done using tidytext and its unnest_tokens function.
I then sorted the bigrams and trigrams by count, most frequent first.
These files were saved as bigrams.csv and trigrams.csv for use by the app.
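The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the three sample tweets and the `tweets` data frame are made up, and the real input would be the full Twitter file. Only the use of tidytext's unnest_tokens, the sort by count, and the two output file names come from the description above.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the Twitter data: one tweet per row.
tweets <- tibble(text = c(
  "Thanks for the follow",
  "Thanks for the love",
  "Thanks for reading"
))

# Split each tweet into overlapping two-word sequences, then count them.
# sort = TRUE puts the most frequent n-grams first.
bigrams <- tweets %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%   # tweets shorter than n words yield NA
  count(bigram, sort = TRUE)

# Same idea with three-word sequences.
trigrams <- tweets %>%
  unnest_tokens(trigram, text, token = "ngrams", n = 3) %>%
  filter(!is.na(trigram)) %>%
  count(trigram, sort = TRUE)

# Save the sorted tables for the app to load.
write.csv(bigrams, "bigrams.csv", row.names = FALSE)
write.csv(trigrams, "trigrams.csv", row.names = FALSE)
```

Note that unnest_tokens lowercases the text by default, so "Thanks for" and "thanks for" are counted as the same bigram.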
First, the algorithm checks to see how many “words” have been entered.
If there are more than two words, the last two words of the input are searched for at the start of the trigrams.
If a match is found, the third word of the trigram is returned.
If no match is found, the process is repeated with the bigrams: the last word of the input is looked up, and the second word of the matching bigram is returned.
If there is still no match, the app randomly returns one of the ten most frequent words in the data.
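The lookup order described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the app's actual code: the small `trigrams`, `bigrams`, and `top_words` tables are made up, the helper names `lookup` and `predict_next` are hypothetical, and the sketch tries the trigrams whenever at least two words are available, which may differ from the app's exact threshold.

```r
# Hypothetical n-gram tables, already sorted by count (most frequent first).
trigrams <- data.frame(
  trigram = c("thanks for the", "for the follow", "i love you"),
  n = c(5, 3, 2),
  stringsAsFactors = FALSE
)
bigrams <- data.frame(
  bigram = c("for the", "thanks for", "love you"),
  n = c(9, 7, 4),
  stringsAsFactors = FALSE
)
# Hypothetical list of the ten most frequent single words.
top_words <- c("the", "to", "i", "a", "you", "and", "for", "in", "of", "is")

# Return the last word of the first (most frequent) n-gram that starts
# with `prefix`, or NA if no n-gram matches.
lookup <- function(prefix, ngrams) {
  hits <- ngrams[startsWith(ngrams, paste0(prefix, " "))]
  if (length(hits) == 0) return(NA_character_)
  words <- strsplit(hits[1], " ", fixed = TRUE)[[1]]
  words[length(words)]
}

predict_next <- function(input) {
  # Normalise the input and split it into words.
  words <- strsplit(tolower(trimws(input)), "\\s+")[[1]]
  words <- words[nzchar(words)]

  guess <- NA_character_
  # Step 1: match the last two words against the trigrams.
  if (length(words) >= 2) {
    guess <- lookup(paste(tail(words, 2), collapse = " "), trigrams$trigram)
  }
  # Step 2: fall back to the last word and the bigrams.
  if (is.na(guess) && length(words) >= 1) {
    guess <- lookup(tail(words, 1), bigrams$bigram)
  }
  # Step 3: fall back to a random pick from the most frequent words.
  if (is.na(guess)) guess <- sample(top_words, 1)
  guess
}
```

With these sample tables, `predict_next("Thanks for")` is resolved by the trigrams, `predict_next("well, love")` falls through to the bigrams, and an unknown word such as `predict_next("zzz")` gets a random word from `top_words`. Because the tables are sorted by count, taking the first match is the same as taking the most frequent continuation.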
Thanks for reading through this!