ngocbd
15 Mar 2016
This is a small project that predicts the next word of an incomplete sentence.
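library(tm)     # VCorpus, VectorSource, TermDocumentMatrix
library(RWeka)  # NGramTokenizer, Weka_control
# Build a corpus from the sampled text, then define unigram and bigram tokenizers
Corpus <- VCorpus(VectorSource(sample))
tokenOne <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 1))
tokenTwo <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))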
....
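# Unigram term-document matrix and term frequencies, most frequent first
# (fCorpus is presumably the cleaned corpus produced in the elided steps above)
tdmOne <- TermDocumentMatrix(fCorpus, control = list(tokenize = tokenOne))
freq1 <- sort(rowSums(as.matrix(tdmOne)), decreasing = TRUE)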
....
1. If strInput is a single word, the bigrams are searched for the most frequent word that follows it.
2. If strInput is two words, the trigrams are searched first, using both entered words, for the most frequent next word.
3. Otherwise (no matching trigram), the bigrams are searched using the last of the two entered words.
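The lookup steps above can be sketched in base R as follows. This is only an illustration, not the project's actual code: `lookupNext`, `predictNext`, `freq2` and `freq3` are hypothetical names, the frequency tables are assumed to be named numeric vectors shaped like `freq1`, and the toy counts are made up. Lower-casing the input and using only the last two words of longer input are also assumptions.

```r
# Hypothetical helper: find the most frequent n-gram in `freq` that starts
# with `prefix` and return its final word, or NA if nothing matches.
lookupNext <- function(prefix, freq) {
  hits <- freq[startsWith(names(freq), paste0(prefix, " "))]
  if (length(hits) == 0) return(NA_character_)
  best <- names(hits)[which.max(hits)]
  tail(strsplit(best, " ", fixed = TRUE)[[1]], 1)
}

# Hypothetical predictor: try trigrams when two or more words are given,
# then back off to bigrams using the last word only.
predictNext <- function(strInput, freq2, freq3) {
  words <- strsplit(tolower(trimws(strInput)), "\\s+")[[1]]
  n <- length(words)
  if (n == 0) return(NA_character_)
  if (n >= 2) {
    guess <- lookupNext(paste(words[n - 1], words[n]), freq3)
    if (!is.na(guess)) return(guess)
  }
  lookupNext(words[n], freq2)
}

# Toy frequency tables (invented for illustration only)
freq2 <- c("i am" = 5, "am a" = 3, "am the" = 1, "a student" = 2)
freq3 <- c("i am a" = 3, "i am the" = 1)

predictNext("am", freq2, freq3)       # "a" (bigram lookup)
predictNext("I am", freq2, freq3)     # "a" (trigram lookup)
predictNext("hello a", freq2, freq3)  # "student" (no trigram, bigram fallback)
```

Returning NA when nothing matches leaves it to the caller to decide on a default, for example the most frequent unigram.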