Chris Rucker
2015_5_19
A word cloud is a visual representation of text data, typically used to depict keyword metadata (tags) on websites or to visualize free-form text. Tags are usually single words, and the importance of each tag is shown by its font size or color. This format makes it easy to see the most prominent terms at a glance and, when the terms are listed alphabetically, to look up a term and judge its relative prominence. When a word cloud is used as a website navigation aid, each term is hyperlinked to the items associated with that tag.
Text mining, also called text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text. That information is typically obtained by identifying patterns and trends in the text, for example through statistical pattern learning.
# library(tm)
# myCorpus = Corpus(VectorSource(aFile2))  # aFile2: character vector of text, defined elsewhere
# myCorpus = tm_map(myCorpus, removePunctuation)
# myCorpus = tm_map(myCorpus, removeNumbers)
# myCorpus = tm_map(myCorpus, removeWords, stopwords("english"))
# myDTM = TermDocumentMatrix(myCorpus, control = list(wordLengths = c(1, Inf)))  # keep words of any length
# m = as.matrix(myDTM)
# v = sort(rowSums(m), decreasing = TRUE)  # total count of each term, most frequent first
# library(wordcloud)
# set.seed(1)  # make the word placement reproducible
# wordcloud(names(v), v, min.freq = 75, colors = brewer.pal(8, "Dark2"), rot.per = 0)  # rot.per = 0: no rotated words
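To make the counting step concrete, here is a minimal base R sketch of what the pipeline above computes before any drawing happens: clean the text, drop stop words, and tally term frequencies. It is an illustration under stated assumptions, not the original analysis: `sample_text`, the three-word `stop_words` list, and the lowercasing step are all hypothetical stand-ins, since the contents of `aFile2` are not shown.

```r
# Hypothetical sample text; the original aFile2 is not shown in the document.
sample_text <- c("The cat sat on the mat.",
                 "The dog sat on the log in 2015.")

# Lowercase (an assumption of this sketch), then strip punctuation and digits,
# roughly what removePunctuation and removeNumbers do in the tm pipeline.
clean <- tolower(sample_text)
clean <- gsub("[[:punct:][:digit:]]", "", clean)

# Split on whitespace and drop any empty tokens.
tokens <- unlist(strsplit(clean, "[[:space:]]+"))
tokens <- tokens[nzchar(tokens)]

# Tiny illustrative stop-word list; stopwords("english") in tm is much longer.
stop_words <- c("the", "on", "in")
tokens <- tokens[!tokens %in% stop_words]

# Count each term and sort so the most frequent comes first,
# analogous to sort(rowSums(m), decreasing = TRUE).
freq <- sort(table(tokens), decreasing = TRUE)
print(freq)
```

The resulting `freq` plays the same role as `v` above: its names are the terms and its values are the counts that a word cloud would map to font size.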