finalshiny

GusEsq
9 October 2016

Prediction Algorithm

I tried to keep my algorithm simple:

  • word prediction based on 2-grams (word pairs)
  • returns the 3 most likely next words
  • the input must follow some rules to work (lowercase, English)
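As a rough illustration of the 2-gram idea, the sketch below builds a small frequency table in base R. The corpus sentences and the names `make_bigrams` and `freq_table` are made up for this example; they are not the data or the code behind the app.

```r
# Toy corpus (invented sentences, for illustration only)
corpus <- c("rock star on tour",
            "the rock star sings",
            "rock chair by the door")

# Lowercase each sentence and split it into words
tokens <- strsplit(tolower(corpus), "\\s+")

# Pair every word with its successor to get the 2-grams of one sentence
make_bigrams <- function(words) {
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}
bigrams <- unlist(lapply(tokens, make_bigrams))

# Count the 2-grams and order them from most to least frequent
counts <- sort(table(bigrams), decreasing = TRUE)
freq_table <- data.frame(term = names(counts),
                         freq = as.integer(counts),
                         stringsAsFactors = FALSE)

print(head(freq_table, 3))
```

A table like this, saved once as a CSV file, is all the prediction step needs at run time.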

Example of function running

Let's try the word “rock”:

> nextwordpred("rock")

[1] "star i chair"

Algorithm 1/2

# I upload to the Shiny server a CSV file with the corpus 2-grams ordered by frequency
# This is the main reason the algorithm is quick
freqtwoGramw <- read.csv("database.csv")

nextwordpred <- function(word) {
  # Split off the last word of the phrase
  part1 <- strsplit(word, " ") 
  part2 <- length(part1[[1]])
  lookfor <- part1[[1]][part2]
  comilla <- "^"
  # The trailing space makes the pattern match the whole word only ("rock ", not "rocket")
  check <- paste(comilla, lookfor, " ", sep = "")

Algorithm 2/2

  # Show the 3 most frequent next words
  options <- grep(check, freqtwoGramw$term, ignore.case = FALSE)
  option1 <- options[1]
  option2 <- options[2]
  option3 <- options[3]
  result1 <- as.character(freqtwoGramw$term[option1])
  result2 <- as.character(freqtwoGramw$term[option2])
  result3 <- as.character(freqtwoGramw$term[option3])
  word1 <- strsplit(result1, split = " ")[[1]][2]
  word2 <- strsplit(result2, split = " ")[[1]][2]
  word3 <- strsplit(result3, split = " ")[[1]][2]
  final <- rbind(word1, word2, word3)
  print(final)
}
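The same lookup can be tried without the real `database.csv`. The sketch below is a compact variant, not the function deployed in the app: `toy_freq` and `predict_next` are hypothetical names, the table rows are invented, and matching only the whole last word (by appending a space) is an assumption about the intended behaviour.

```r
# Hypothetical toy table, assumed to be already ordered by frequency
toy_freq <- data.frame(
  term = c("rock star", "rocket science", "rock i",
           "rocks are", "rock chair", "rock band"),
  stringsAsFactors = FALSE
)

predict_next <- function(phrase, freq_table, n = 3) {
  # Take the last word of the (lowercased) phrase
  words <- strsplit(tolower(trimws(phrase)), "\\s+")[[1]]
  last_word <- words[length(words)]
  # startsWith() compares literal text, so no regex escaping is needed;
  # the trailing space keeps "rock" from matching "rocket" or "rocks"
  hits <- freq_table$term[startsWith(freq_table$term, paste0(last_word, " "))]
  # Keep the second word of each of the first n matching 2-grams
  vapply(strsplit(head(hits, n), " ", fixed = TRUE),
         function(p) p[2], character(1))
}

predict_next("I love rock", toy_freq)
# [1] "star"  "i"     "chair"
```

Because the table is sorted once in advance, the first matches are also the most frequent ones, so no counting happens at prediction time.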

Conclusions

I always follow the rule that understanding the problem is the most important step when working out a solution. The 2-gram frequency lookup gives a basic but powerful approach to predicting the next word.

This course was really challenging. The coding was difficult for me because I'm more used to statistical work, but this kind of development ends in a finished product for an end user, and that is where the value is added.