Hans P.
9-02-2020
The Data Science Capstone was the final course in the Data Science Specialization offered by Johns Hopkins University on Coursera. The purpose of this course was to create a predictive text model using R and deploy it with a Shiny app.
My App:
You can try the app yourself on shinyapps.io
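This app uses an n-gram approach to modeling that is similar to a Katz back-off model: it looks for the longest sequence of preceding words it has seen before and, if it finds no match, backs off to a shorter sequence.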
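To illustrate the back-off idea, here is a minimal sketch. It is not the app's actual code: the app was written in R, while this sketch uses Python, and the function names (`build_ngram_tables`, `predict_next`) and the toy corpus are hypothetical. It also skips the discounting and back-off weights that a true Katz model applies, and simply falls back to a shorter context whenever the longer one was never observed.

```python
from collections import Counter, defaultdict


def build_ngram_tables(tokens, max_n=3):
    """Count next-word frequencies for every context of length 0..max_n-1.

    The empty context () holds plain unigram counts.
    """
    tables = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for k in range(max_n):
            if i - k < 0:
                break  # not enough preceding words for a context this long
            context = tuple(tokens[i - k:i])
            tables[context][word] += 1
    return tables


def predict_next(tables, history, max_n=3, top_k=3):
    """Suggest words using the longest context that was seen in training.

    Simplified back-off: no discounting, no back-off weights.
    """
    for k in range(max_n - 1, -1, -1):
        if k > len(history):
            continue
        context = tuple(history[len(history) - k:])
        candidates = tables.get(context)
        if candidates:
            return [word for word, _ in candidates.most_common(top_k)]
    return []


# Toy corpus (hypothetical); the real model was trained on a far larger one.
corpus = "the cat sat on the mat the cat ate the fish".split()
tables = build_ngram_tables(corpus)

print(predict_next(tables, ["the", "cat"]))     # trigram context is found
print(predict_next(tables, ["purple", "the"]))  # backs off to the bigram context
print(predict_next(tables, ["zebra"]))          # backs off to unigram counts
```

Returning the most frequent continuations of the longest matching context keeps lookups fast, which fits an interactive app, but it also means the suggestions lean toward common words.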
Steps
The app runs quickly and returns reasonable results in most cases. However, there are some limitations that should be addressed before it would be commercially viable.
The corpus used for modeling was built from tweets, news articles, and blogs. The app may not work as well for texting or writing emails, since the style of language there is quite different.
My model is kind of “stupid”. It returns good candidate words that fit the grammar and general context, but it rarely returns anything too complex.
My model sometimes makes the error described in the discussion section of this Wikipedia article on Katz's back-off model.