MJ
17 June 2017
The Next word predictor application was built for the Capstone Project of the Coursera Data Science Specialization.
It provides a user-friendly web interface that saves typing by predicting the next word the user will write in an English text.
It was developed with Shiny, with Shinydashboard used to improve its appearance, and it is hosted on shinyapps.io.
The methodology followed to build the model consisted of the following steps:
The algorithm selected, with execution time and memory requirements also taken into account, is the Stupid back-off algorithm.
If the user has not written anything, the 3 most frequent words in English are shown. As soon as the user starts writing, each time the spacebar is pressed the previous words are used to predict the next word the user is likely to write, using n-gram data up to 5-grams.
Each candidate word will have a score that is computed using a back-off factor of 0.4.
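For reference, the scoring rule can be written out as follows. This is the usual textbook formulation of Stupid back-off, not necessarily the exact expression used inside the application. The symbols are assumptions introduced here: f(.) is an n-gram count in the database, N is the total number of words counted, and the context length k - 1 is at most 4 because the application uses up to 5-grams.

```latex
S(w_i \mid w_{i-k+1}^{i-1}) =
\begin{cases}
\dfrac{f(w_{i-k+1}^{i})}{f(w_{i-k+1}^{i-1})} & \text{if } f(w_{i-k+1}^{i}) > 0 \\[1ex]
\alpha \, S(w_i \mid w_{i-k+2}^{i-1}) & \text{otherwise}
\end{cases}
\qquad
S(w_i) = \frac{f(w_i)}{N}, \quad \alpha = 0.4
```

Each step down to a shorter context multiplies the score by 0.4, so a word found only at the unigram level after backing off from a 4-word context is weighted by 0.4^4.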
Finally, the user sees the three options with the highest scores and can click on one of them to add it to the text.
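The prediction flow described above can be sketched in a few lines. The application itself is built with Shiny, so this Python version is only an illustration, not the application's code. The function names (build_counts, predict), the toy corpus, and the alphabetical tie-breaking are all assumptions made for the example; only the back-off factor of 0.4, the 5-gram limit, and the three suggestions come from the description.

```python
from collections import Counter

ALPHA = 0.4      # back-off factor stated in the description
MAX_ORDER = 5    # highest n-gram order stated in the description


def build_counts(tokens, max_order=MAX_ORDER):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def predict(text, counts, total_tokens, k=3, alpha=ALPHA, max_order=MAX_ORDER):
    """Return up to k next-word candidates ranked by Stupid back-off score."""
    words = text.lower().split()
    max_ctx = min(max_order - 1, len(words))
    scores = {}
    # Try the longest available context first, then back off word by word.
    for ctx_len in range(max_ctx, -1, -1):
        context = tuple(words[len(words) - ctx_len:])
        # Each back-off step multiplies the score by alpha.
        penalty = alpha ** (max_ctx - ctx_len)
        denom = counts[context] if ctx_len else total_tokens
        if denom == 0:
            continue  # context never seen: back off further
        # Linear scan for clarity; a real database would index by context.
        for ngram, c in counts.items():
            if len(ngram) == ctx_len + 1 and ngram[:ctx_len] == context:
                # Keep only the score from the longest matching context.
                scores.setdefault(ngram[-1], penalty * c / denom)
    # Highest score first; ties broken alphabetically (an assumption).
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Toy corpus, purely illustrative.
tokens = "i like green tea i like green apples i like black tea".split()
counts = build_counts(tokens)
print(predict("", counts, len(tokens)))        # nothing typed yet: most frequent words
print(predict("i like", counts, len(tokens)))  # uses the two previous words
```

With an empty input only the unigram level is consulted, which corresponds to showing the most frequent words before the user has typed anything.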
Read the Documentation for details and enjoy Next word predictor! If you are curious, check the data in the N-gram database sections!