Francisco Gonzalez Alonso
2016/12/30
With these slides I present my Data Science Capstone application for predicting the next word.
The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others.
For this project I submitted:
A Shiny text prediction application: application
A slide deck describing how to use the app: description
The data used to build the frequency dictionary and predict the next word, using unigrams, bigrams and trigrams, comes from a corpus called Corpus Download.
All text mining and natural language processing was done with the following R packages: NLP, tm, rJava, RWeka, SnowballC, and ggplot2.
The main concept behind my prediction application is the use of unigrams, bigrams and trigrams to find the next word: I generated these n-grams from the corpus, and the application searches them for the words typed in the input field.
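As a rough illustration of how such n-gram frequency tables can be built, here is a minimal sketch in Python. It is not the code of the application, which was written in R with the packages listed above; the function name `ngram_counts` and the toy corpus are assumptions made only for this example.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive words in the text."""
    tokens = text.lower().split()
    # zip over n shifted copies of the token list yields the n-grams as tuples
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)

# Toy corpus standing in for the real data (hypothetical, for illustration only)
corpus = "i love new york i love new shoes i love you"

unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)
trigrams = ngram_counts(corpus, 3)

print(trigrams[("i", "love", "new")])  # how often "i love new" occurs
```

Each table maps an n-gram to its frequency, so the most frequent continuation of a given word or pair of words can be looked up directly.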
The application returns the best trigram, bigram and unigram results if they are available; otherwise it shows: “WITHOUT RECOMMENDATIONS”
You only need to type one or more words in the “Input your text” field, and the application does the rest.
It's clear, easy and simple.
The “Predicted word” field shows the best result from the trigrams, bigrams and unigrams, in that order, if they are available.
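The trigram, bigram, unigram order can be sketched as a simple back-off lookup. The Python below is only an illustration under stated assumptions, not the application's R code: the names `build_model` and `predict`, the toy corpus, and the choice to return the single most frequent continuation are all hypothetical.

```python
from collections import Counter, defaultdict

def build_model(text, max_n=3):
    """Map each context (tuple of preceding words) to a Counter of next words."""
    tokens = text.lower().split()
    model = defaultdict(Counter)
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            *context, word = tokens[i:i + n]
            model[tuple(context)][word] += 1
    return model

def predict(model, text, max_n=3):
    """Try the trigram context first, then the bigram, then the unigram."""
    tokens = text.lower().split()
    for k in range(max_n - 1, -1, -1):  # context length 2, then 1, then 0
        if k > len(tokens):
            continue  # not enough input words for this context length
        context = tuple(tokens[len(tokens) - k:])
        candidates = model.get(context)
        if candidates:
            return candidates.most_common(1)[0][0]
    return "WITHOUT RECOMMENDATIONS"

# Toy corpus standing in for the real data (hypothetical, for illustration only)
model = build_model("i love new york i love new shoes i love you")

print(predict(model, "i love"))      # matched through the trigram table
print(predict(model, "hello love"))  # falls back to the bigram table
```

In this sketch the fallback message appears only when no table contains a match, for example when the model was built from an empty corpus.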