Mario R. Melchiori
22/04/2016
Before building the word prediction algorithm, the following steps were executed:
Katz-Backoff
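As an illustration of the back-off idea, here is a minimal Python sketch of a bigram model that backs off to unigrams. It is not the model behind the application: the function names (`train`, `katz_bigram_prob`, `predict_next`), the fixed absolute discount of 0.5 (used in place of Good-Turing discounting) and the toy sentence are all assumptions made for this example.

```python
from collections import Counter


def train(tokens):
    """Count unigrams and bigrams in a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def katz_bigram_prob(prev, word, unigrams, bigrams, discount=0.5):
    """P(word | prev) with back-off from bigrams to unigrams.

    Assumption: a fixed absolute discount replaces Good-Turing discounting.
    """
    total = sum(unigrams.values())
    # Words observed after `prev`, with their bigram counts.
    seen = {w2: c for (w1, w2), c in bigrams.items() if w1 == prev}
    history_count = sum(seen.values())
    if history_count == 0:
        # `prev` never appears as a history: use the plain unigram estimate.
        return unigrams[word] / total
    if word in seen:
        # Observed bigram: discounted relative frequency.
        return (seen[word] - discount) / history_count
    # Probability mass freed by discounting the observed bigrams.
    alpha = discount * len(seen) / history_count
    # Share that mass among unseen words in proportion to unigram counts.
    unseen_mass = sum(c for w, c in unigrams.items() if w not in seen)
    if unseen_mass == 0:
        return 0.0
    return alpha * unigrams[word] / unseen_mass


def predict_next(prev, unigrams, bigrams, k=3):
    """Return the k most probable next words after `prev`."""
    scores = {w: katz_bigram_prob(prev, w, unigrams, bigrams) for w in unigrams}
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


tokens = "the cat sat on the mat the cat ran".split()
unigrams, bigrams = train(tokens)
top_words = predict_next("the", unigrams, bigrams)
```

In this toy corpus, "cat" follows "the" twice and "mat" once, so `top_words` starts with "cat"; words never seen after "the" still receive a share of the discounted mass.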
A Shiny application was developed based on the next-word prediction model described previously. The main features of the application (available here) are:
In Natural Language Processing, more training data means better estimates, so as much data as possible should be used. Even so, with only a 10% sample of the total data drawn from the 3 text files, the application seems to do a decent job according to Jan Hagelauer's benchmark.R program (https://github.com/jan-san/dsci-benchmark):