Mykyta Zharov
12.03.2020
This presentation describes the next-word prediction app built for the final Capstone project of the Coursera JHU Data Science Specialisation. The following topics are briefly described:
App: https://mykytazharov.shinyapps.io/SmartKeyBoardApp/
Milestone report: https://rpubs.com/kitazharov/573608
Three text datasets were provided, containing tweets (167 MB), blogs (210 MB) and news posts (205 MB). All datasets were combined, and 1% of the data was randomly sampled to build the prediction model. The following starting steps were performed:
The prediction model used in the application is a 4-gram language model with the stupid backoff algorithm. More information can be found via the following links:
After the model was built, it was tested on the 4-grams from the test dataset. The resulting accuracy of the algorithm was around 20%.
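The model and the test described above can be illustrated with a minimal Python sketch. It is not the app's actual code (the app is a Shiny application). The function names `train`, `predict` and `top1_accuracy`, the back-off factor of 0.4, whitespace tokenisation, and scoring only the single best prediction are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

ALPHA = 0.4    # back-off factor often used with stupid backoff (assumed here)
MAX_ORDER = 4  # 4-gram model, as described in the text


def train(sentences, max_order=MAX_ORDER):
    """Count which words follow every context of 0..max_order-1 words."""
    counts = defaultdict(Counter)  # counts[context][word] -> frequency
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for n in range(max_order):
                if i - n < 0:
                    break
                counts[tuple(tokens[i - n:i])][word] += 1
    return counts


def predict(counts, text, k=3, max_order=MAX_ORDER, alpha=ALPHA):
    """Return the k best next words for `text` by stupid backoff score."""
    tokens = text.lower().split()
    scores = {}
    penalty = 1.0
    # Try the longest available context first, then back off to shorter ones.
    for n in range(min(max_order - 1, len(tokens)), -1, -1):
        context = tuple(tokens[len(tokens) - n:])
        followers = counts.get(context)
        if followers:
            # Context count approximated by the sum of its follower counts.
            total = sum(followers.values())
            for word, c in followers.items():
                # Keep the score from the longest context that saw the word.
                scores.setdefault(word, penalty * c / total)
        penalty *= alpha  # each back-off step multiplies the score by alpha
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def top1_accuracy(counts, test_sentences):
    """Share of test 4-grams whose last word is the top prediction."""
    hits = total = 0
    for sentence in test_sentences:
        tokens = sentence.lower().split()
        for i in range(3, len(tokens)):
            guess = predict(counts, " ".join(tokens[i - 3:i]), k=1)
            if guess and guess[0] == tokens[i]:
                hits += 1
            total += 1
    return hits / total if total else 0.0


corpus = ["i love new york", "i love new shoes",
          "i love new york city", "we love you"]
model = train(corpus)
print(predict(model, "i love new"))
print(top1_accuracy(model, ["i love new york"]))
```

Scores from stupid backoff are relative frequencies scaled by the back-off factor, not normalised probabilities, which keeps the lookup cheap enough for interactive use.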
The Shiny application consists of two pages:
On the Info page the user can find initial information about the model and the purpose of the app, as well as instructions on how to use the app.
The App page itself has a text input field, where the user can type text. The 3 best predictions are shown as buttons under the text input. The user can click a prediction button, and the predicted word is added to the end of the typed text.
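The button behaviour described above could look roughly like the following Python sketch. The helper name `append_prediction` is hypothetical, and the whitespace handling is an assumption; the app itself implements this in Shiny.

```python
def append_prediction(typed_text: str, prediction: str) -> str:
    """Return the typed text with the clicked prediction added as the next word."""
    if not typed_text.strip():
        # Nothing typed yet: the prediction becomes the whole text.
        return prediction
    # Drop trailing spaces so exactly one space separates the words.
    return typed_text.rstrip() + " " + prediction


print(append_prediction("i love new", "york"))
```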