This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
library('tm')
## Loading required package: NLP
library('RColorBrewer')
library('wordcloud')
trump <- read.csv("Trump.csv", comment.char="#")
washington <- subset(trump, USER_CITY == "WASHINGTON")
newyorkcity <- subset(trump, USER_CITY == "NEW YORK CITY")
Wtweets <- washington$MESSAGE_BODY
NYCtweets <- newyorkcity$MESSAGE_BODY
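The subsets above rely on an exact, case-sensitive match on USER_CITY. A minimal sketch of how that behaves, using a made-up data frame in place of Trump.csv (the rows and the normalisation step are assumptions for illustration, not part of the original analysis):

```r
# Hypothetical stand-in for Trump.csv with the two columns used above
tweets <- data.frame(
  USER_CITY = c("WASHINGTON", "Washington ", "NEW YORK CITY", "BOSTON"),
  MESSAGE_BODY = c("tweet a", "tweet b", "tweet c", "tweet d"),
  stringsAsFactors = FALSE
)

# Exact matching only catches the first row
exact <- subset(tweets, USER_CITY == "WASHINGTON")

# Normalising case and surrounding whitespace first catches both spellings
normalised <- subset(tweets, toupper(trimws(USER_CITY)) == "WASHINGTON")

print(nrow(exact))       # 1
print(nrow(normalised))  # 2
```

Whether normalisation is needed depends on how consistently the city field was recorded; a quick `table(trump$USER_CITY)` would show the spellings that actually occur.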
clean.text <- function(x)
{
  # convert to lower case
  x <- tolower(x)
  # remove the retweet marker "rt" only where it stands alone as a word
  x <- gsub("\\brt\\b", "", x)
  # remove @mentions
  x <- gsub("@\\w+", "", x)
  # remove punctuation
  x <- gsub("[[:punct:]]", "", x)
  # remove digits
  x <- gsub("[[:digit:]]", "", x)
  # remove links (punctuation is already gone, so a URL is a single "http..." token)
  x <- gsub("http\\w+", "", x)
  # collapse runs of spaces and tabs into one space so adjacent words stay separate
  x <- gsub("[ \t]{2,}", " ", x)
  # remove leading whitespace
  x <- gsub("^\\s+", "", x)
  # remove trailing whitespace
  x <- gsub("\\s+$", "", x)
  return(x)
}
# clean tweets
Wtweets = clean.text(Wtweets)
NYCtweets = clean.text(NYCtweets)
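Two substitutions in a cleaning function of this kind are easy to get subtly wrong: a bare "rt" pattern also matches inside ordinary words, and replacing a run of whitespace with an empty string joins the words on either side of it. The sketch below checks both on made-up strings (hypothetical inputs, not rows from Trump.csv). Merged tokens such as "planelection" in the frequency listings may come from the second effect.

```r
# Hypothetical inputs, not taken from Trump.csv

# 1. A bare "rt" pattern also removes the letters inside ordinary words
rt_plain <- gsub("rt", "", "rt party")        # " pay"
# Word boundaries restrict the match to the standalone token
rt_word <- gsub("\\brt\\b", "", "rt party")   # " party"

# 2. Stripping an @mention leaves a double space behind
no_mention <- gsub("@\\w+", "", "the plan @someuser election")
# Replacing the run with "" glues the neighbouring words together
glued <- gsub("[ \t]{2,}", "", no_mention)    # "the planelection"
# Replacing it with a single space keeps them apart
spaced <- gsub("[ \t]{2,}", " ", no_mention)  # "the plan election"

print(c(rt_plain, rt_word, glued, spaced))
```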
Create word cloud of tweets of WASHINGTON
##          amp        point   republican       answer          can
##            3            3            3            2            2
##          new       theyre     michigan          win         beat
##            2            2            2            2            2
##     amiright        tough    nhprimary       trumps         good
##            2            2            2            2            2
##  donaldtrump planelection          tax     campaign    thousands
##            2            2            2            2            2
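The chunk that produced these counts is not shown in the rendered output. The following is a minimal sketch of how such a term-frequency listing and word cloud could be produced with the packages loaded at the top; the sample tweets, the cutoff of 20 terms, and the plotting parameters are assumptions for illustration, not the author's code.

```r
library(tm)
library(wordcloud)
library(RColorBrewer)

# Hypothetical cleaned tweets standing in for Wtweets
sample_tweets <- c("trump wins michigan",
                   "michigan looks tough",
                   "tough campaign for trump")

# Build a corpus and a term-document matrix (rows = terms, columns = tweets)
corpus <- Corpus(VectorSource(sample_tweets))
tdm <- TermDocumentMatrix(corpus)

# Total count per term, most frequent first
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
print(head(freq, 20))

# Draw the cloud; min.freq = 1 keeps every term in this tiny example
wordcloud(names(freq), freq, min.freq = 1, random.order = FALSE,
          colors = brewer.pal(8, "Dark2"))
```

With the real data, `sample_tweets` would be replaced by `Wtweets` or `NYCtweets`. Stopwords are left in here because words such as "can" and "will" appear in the listings.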
Create word cloud of tweets of NEW YORK CITY
##    donaldtrump    forelection        retweet         voting      wondering
##              2              2              2              2              2
## trumpflprimary           will         bratty          child    gopclowncar
##              2              2              1              1              1
##          never     noelection           told      interview           next
##              1              1              1              1              1
##      president       standing         states         united         witham
##              1              1              1              1              1