Noelia Oses
December 19th, 2015
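# Read the play text and drop empty lines
booktext <<- readLines("pg1513.txt", encoding = "UTF-8")
booktext <<- booktext[booktext != ""]
# Indices of the lines that contain an act heading
act_indeces <<- grep("ACT ", booktext)
# Start with act 1
actnumber <- 1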
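The lines above locate where each act heading appears. Below is a minimal sketch of how those indices could be used to pull out the lines of one act. It assumes every match of "ACT " marks the start of an act. The helper `get_act` and the toy `booktext` vector are hypothetical and are not taken from the app.

```r
# Toy stand-in for the real book text (hypothetical data)
booktext <- c("ACT I", "Scene I. A public place.", "Two households",
              "ACT II", "Scene I. An open place.",
              "ACT III", "Scene I. A street.")

# Same approach as in the slides: find the lines with an act heading
act_indeces <- grep("ACT ", booktext)

# Hypothetical helper: return the lines of act n, from its heading up to
# the line before the next heading (or the end of the text for the last act)
get_act <- function(lines, starts, n) {
  from <- starts[n]
  to <- if (n < length(starts)) starts[n + 1] - 1 else length(lines)
  lines[from:to]
}

act1 <- get_act(booktext, act_indeces, 1)
act3 <- get_act(booktext, act_indeces, 3)
```

On the real file, any other line containing "ACT " (for example in a table of contents) would also match, so the indices may need filtering first.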
When the user selects an act, the app performs the following steps:
– Transform the text of the act into a corpus.
– Transform all letters to lower case, remove punctuation, remove numbers, and remove the words “thy”, “thou”, “thee”, “the”, “and”, and “but”.
– Apply the 'TermDocumentMatrix' function to the corpus to construct a term-document matrix.
– Use the 'wordcloud' function to render a word cloud of the words that occur at least 5 times.
(See the code on GitHub.)
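The steps above can be sketched with the 'tm' package roughly as follows. This is an illustration under assumptions, not the app's actual code: `act_lines` is a hypothetical two-line stand-in for the text of an act, and the choice of `VCorpus` with a single document is an assumption.

```r
library(tm)

# Hypothetical stand-in for the lines of the selected act
act_lines <- c("Juliet is the sun.",
               "Arise, fair sun, and kill the envious moon.")

# 1. Transform the text of the act into a corpus (one document)
corpus <- VCorpus(VectorSource(paste(act_lines, collapse = " ")))

# 2. Lower case, remove punctuation, numbers, and the listed words
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords,
                 c("thy", "thou", "thee", "the", "and", "but"))

# 3. Build the term-document matrix and sum the counts per term
tdm <- TermDocumentMatrix(corpus)
v <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
```

The named vector `v` has the shape that the `wordcloud(names(v), v, ...)` call expects. By default, `TermDocumentMatrix` also drops terms shorter than three characters, so words such as "is" do not appear in `v`.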
Word cloud for act 1 of Romeo and Juliet:
wordcloud(names(v), v, scale = c(5, 0.2), min.freq = 5, colors = brewer.pal(8, "Dark2"))