The word prediction app summarizes the entire text data set and returns the top n-grams that match the user's input.
user interface
prediction
A text field accepts the user's input, and the app predicts the next n-gram of words. The output updates dynamically and lists the top predictions. The higher the value, the more probable the outcome.
The application initially displays the top n-grams by n-gram size, a histogram, and a word cloud for the entire data set.
The application runs several methods to prepare the data or to load previously saved class objects.
workflow
N-grams work as a probabilistic model:
An n-gram refers to a combination of the previous N words that can aid in predicting the terms that follow (N+1 onward). The simplest form is a probabilistic model based on maximum likelihood estimation.
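For reference, the maximum likelihood estimate is commonly written as a ratio of counts. The notation here is introduced for illustration and is not taken from the app: C(...) is the number of times a word sequence occurs in the training text, w_1 ... w_N are the previous N words, and w_{N+1} is the candidate next word.

```latex
P(w_{N+1} \mid w_1, \dots, w_N) = \frac{C(w_1 \, w_2 \cdots w_N \, w_{N+1})}{C(w_1 \, w_2 \cdots w_N)}
```

For example, if "the cat" occurs 3 times in the text and "the cat sat" occurs 2 times, the estimated probability of "sat" following "the cat" is 2/3.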
Depending on the extent of the n-grams computed, tokens are created for various lengths of preceding text to predict the word that would occur next.
In other words, the model's probability estimate is based on how frequently words occur together with other words. The tokens are the N-word combinations together with the frequency of their occurrence.
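The frequency-based idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the app's actual implementation: the function names `build_ngram_counts` and `predict_next` are hypothetical, tokenization is assumed to be a simple lowercase whitespace split, and falling back from a longer context to a shorter one is an assumed reading of "various lengths of preceding text".

```python
from collections import Counter, defaultdict


def build_ngram_counts(text, max_n=3):
    """Map each context of 1..max_n-1 words to a Counter of the words that follow it.

    Hypothetical helper: assumes lowercase, whitespace-separated tokens.
    """
    words = text.lower().split()
    counts = defaultdict(Counter)
    for n in range(2, max_n + 1):              # bigrams, trigrams, ...
        for i in range(len(words) - n + 1):
            context = tuple(words[i:i + n - 1])
            next_word = words[i + n - 1]
            counts[context][next_word] += 1
    return counts


def predict_next(counts, phrase, top_k=3, max_n=3):
    """Return up to top_k (word, probability) pairs for the word after `phrase`.

    Tries the longest available context first and falls back to shorter
    ones (an assumed strategy). Probability is the relative frequency
    of the word among all words seen after that context.
    """
    words = phrase.lower().split()
    for size in range(min(max_n - 1, len(words)), 0, -1):
        followers = counts.get(tuple(words[-size:]))
        if followers:
            total = sum(followers.values())
            return [(w, c / total) for w, c in followers.most_common(top_k)]
    return []  # nothing in the training text matches the input


corpus = "the cat sat on the mat the cat ate the fish the cat sat down"
counts = build_ngram_counts(corpus)
print(predict_next(counts, "the cat"))
print(predict_next(counts, "zebra"))
```

With this toy corpus, "sat" ranks first after "the cat" because it follows that pair twice while "ate" follows it once. An input that never appears in the training text returns an empty list.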
It is important to note that the algorithm is only as good as the data used for training or estimation.
[Fellows (2018); Gagolewski (2022); Wickham (2016); Feinerer and Hornik (2023); Neuwirth (2022); Wickham et al. (2023); Hornik, Meyer, and Buchta (2022); Dowle and Srinivasan (2023); Calin.Uioreanu:https://calin.shinyapps.io/predict_next_word; Hornik (2020)]