Sofian Hamiti
The aim of the project is to develop a Shiny app that predicts the next word as a user types a sentence.
This feature is particularly useful for keyboard applications such as SwiftKey.
The prediction model is an n-gram model built from three very large corpora: Twitter, News, and Blogs.
Because of computing resource limitations, the model is trained on a subset of the data.
The corpora were provided by HC Corpora (http://www.corpora.heliohost.org/aboutcorpus.html).
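To make the n-gram idea concrete, here is a minimal sketch in Python. It is not the app's code (the app is built with Shiny). The function names `sample_lines`, `build_ngram_table`, and `predict_next`, the whitespace tokenization, and the back-off from trigrams to bigrams are all assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict


def sample_lines(lines, fraction, seed=0):
    """Keep a random fraction of the lines so training stays affordable."""
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]


def build_ngram_table(lines, n):
    """Map each (n-1)-word context to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for line in lines:
        tokens = line.lower().split()  # assumed tokenizer: lowercase + whitespace
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            table[context][tokens[i + n - 1]] += 1
    return table


def predict_next(text, tables):
    """Try the longest available context first, then back off to shorter ones."""
    tokens = text.lower().split()
    for n in sorted(tables, reverse=True):
        if len(tokens) < n - 1:
            continue  # not enough typed words for this n-gram order
        context = tuple(tokens[len(tokens) - (n - 1):])
        followers = tables[n].get(context)
        if followers:
            return followers.most_common(1)[0][0]
    return None  # no known context matched


# Toy stand-in for the Twitter, News, and Blogs text.
corpus = [
    "i love new york",
    "i love new shoes",
    "i love new york city",
    "we love pizza",
]
tables = {n: build_ngram_table(corpus, n) for n in (2, 3)}
suggestion = predict_next("I love new", tables)
```

On this toy corpus, the trigram context "love new" is followed by "york" twice and "shoes" once, so `suggestion` is "york". A context that never appeared as a trigram falls back to the bigram table, and a completely unseen word yields `None`.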
The app implements the Naive Bayes model, which has the following advantages:
Reference: http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
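One way a Naive Bayes classifier can be applied to next-word prediction is to treat the next word as the class and the preceding words as independent features. The Python sketch below illustrates that formulation only; it is not taken from the app. The class name `NaiveBayesNextWord`, the two-word window, and the add-one (Laplace) smoothing are assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesNextWord:
    """Hypothetical sketch: next word = class, preceding words = features."""

    def __init__(self, window=2, alpha=1.0):
        self.window = window    # number of preceding words used (assumed >= 1)
        self.alpha = alpha      # Laplace smoothing constant (assumed)
        self.word_counts = Counter()                # times each word is the target
        self.context_counts = defaultdict(Counter)  # target -> preceding-word counts
        self.vocab = set()

    def fit(self, lines):
        for line in lines:
            tokens = line.lower().split()
            self.vocab.update(tokens)
            for i in range(1, len(tokens)):
                target = tokens[i]
                self.word_counts[target] += 1
                for c in tokens[max(0, i - self.window):i]:
                    self.context_counts[target][c] += 1
        return self

    def predict(self, text):
        context = text.lower().split()[-self.window:]
        total = sum(self.word_counts.values())
        vocab_size = len(self.vocab)
        best_word, best_score = None, -math.inf
        for word, count in self.word_counts.items():
            # log P(word) + sum over context words of log P(c | word)
            score = math.log(count / total)
            denom = sum(self.context_counts[word].values()) + self.alpha * vocab_size
            for c in context:
                score += math.log((self.context_counts[word][c] + self.alpha) / denom)
            if score > best_score:
                best_word, best_score = word, score
        return best_word


model = NaiveBayesNextWord().fit([
    "good morning everyone",
    "good morning everyone",
    "good night moon",
])
guess = model.predict("good morning")
```

Here `guess` is "everyone". Because the features are treated as independent, this formulation ignores the order of the preceding words, which keeps the model small and fast to train but can rank candidates differently from an n-gram lookup.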
Link: https://sofianhamiti.shinyapps.io/dscc
Future improvements: this app will include the following: