Next-Word Prediction Shiny App

ngocbd
15 Mar 2016

Introduction

This is a small project that predicts the next word of an incomplete sentence. It was built in the following steps:

  • Create Corpus
  • Tokenize Corpus
  • N-Gram Frequency
  • Subset the n-gram data frames based on frequency
  • Build Shiny App

Create Corpus - Tokenize Corpus

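 library(tm)     # VCorpus, VectorSource, TermDocumentMatrix
 library(RWeka)  # NGramTokenizer, Weka_control
 Corpus <- VCorpus(VectorSource(sample))  # sample holds the sampled text
 tokenOne <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 1))  # unigrams
 tokenTwo <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))  # bigrams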
 ....
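
To make the tokenization step concrete, here is a minimal base-R sketch of what a fixed-size n-gram tokenizer produces. It is not the RWeka implementation: the helper name ngrams and the simple lowercasing and splitting rule are assumptions made for illustration only.

```r
# Hypothetical helper: split text into words and return all n-word sequences.
ngrams <- function(text, n) {
  # Lowercase, then split on anything that is not a letter or apostrophe.
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  # Slide a window of n words across the sentence.
  vapply(seq_len(length(words) - n + 1),
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

unigrams <- ngrams("I love you too", 1)
bigramsOut <- ngrams("I love you too", 2)
```

With this toy input, bigramsOut holds "i love", "love you" and "you too", which is the kind of term a tokenizer with min = 2, max = 2 would feed into the term-document matrix.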

N-Gram Frequency - Build Shiny App

 # Count how often each unigram occurs in the corpus, most frequent first
 tdmOne <- TermDocumentMatrix(fCorpus, control = list(tokenize = tokenOne))
 freq1 <- sort(rowSums(as.matrix(tdmOne)), decreasing = TRUE)

 ....
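
The step of subsetting the data frames by frequency could look roughly like the sketch below. The values in freq2, the column names w1 and nextWord, and the threshold minFreq are all hypothetical; the slides do not show the actual ones.

```r
# Toy named frequency vector standing in for a sorted bigram count (made-up values).
freq2 <- c("i love" = 5, "love you" = 4, "i am" = 3, "you too" = 1)

# Split each bigram into its first word and the word that follows it.
parts <- strsplit(names(freq2), " ", fixed = TRUE)
bigramDf <- data.frame(
  w1 = vapply(parts, `[`, character(1), 1),
  nextWord = vapply(parts, `[`, character(1), 2),
  freq = as.numeric(freq2),
  stringsAsFactors = FALSE
)

# Keep only bigrams seen at least minFreq times (the threshold is an assumption).
minFreq <- 2
bigramDf <- bigramDf[bigramDf$freq >= minFreq, ]
```

Dropping rare n-grams this way keeps the lookup tables small, which matters when they are loaded by a Shiny app.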

1. If strInput contains only one word, a search is made in the bigrams for the most frequent word that follows the entered word.

2. If strInput contains two words, a search is first made in the trigrams for the most frequent word that follows both entered words.

3. Otherwise (no trigram matches), a search is made in the bigrams using the last of the two entered words.
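
The lookup rules above can be sketched in base R as follows. This is an illustration under assumptions, not the app's actual code: the tables bigrams and trigrams, their column names, and the functions topWord and predictNext are hypothetical, and the frequencies are made up.

```r
# Hypothetical toy frequency tables; the real app derives these from the corpus.
bigrams <- data.frame(
  w1 = c("i", "i", "love", "love"),
  nextWord = c("love", "am", "you", "it"),
  freq = c(5, 3, 4, 2),
  stringsAsFactors = FALSE
)
trigrams <- data.frame(
  w1 = c("i", "i"),
  w2 = c("love", "love"),
  nextWord = c("you", "it"),
  freq = c(3, 1),
  stringsAsFactors = FALSE
)

# Return the most frequent continuation in hits, or NA if there are none.
topWord <- function(hits) {
  if (nrow(hits) == 0) return(NA_character_)
  hits$nextWord[which.max(hits$freq)]
}

predictNext <- function(strInput) {
  words <- strsplit(tolower(trimws(strInput)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)
  if (n == 0) return(NA_character_)
  if (n >= 2) {
    # Two or more words: try the trigram table with the last two words first.
    hit <- topWord(trigrams[trigrams$w1 == words[n - 1] & trigrams$w2 == words[n], ])
    if (!is.na(hit)) return(hit)
  }
  # One word entered, or no trigram match: fall back to bigrams on the last word.
  topWord(bigrams[bigrams$w1 == words[n], ])
}
```

For example, predictNext("i love") is answered from the trigram table, while predictNext("am love") finds no trigram and falls back to the bigrams that start with "love".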

Shiny app
