Executive Summary:

A word cloud is built from 3000 tweets collected about Fast and Furious 7 (#FastFurious7) on the release day, i.e., 04-03-2015. As expected, the late Mr. Paul Walker is well remembered, as seen from the words “paul”, “rip” and “walker” in the word cloud. The movie seems to have drawn a positive response from many people, who tweeted words like “awesome”, “amazing”, “great”, “best” and “better”. Sentiment analysis is then run on the cleaned text to classify the tweets as Negative, Neutral or Positive. Make sure the sentiment.R file and the text files of positive and negative words are in the current working directory. Based on the score table in the Sentiment Analysis section, tweets on the release day appear to be split into roughly 10.5% Negative, 41% Positive and 48% Neutral opinions.

Load Libraries:
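
The library calls themselves are not shown in this section. A minimal sketch, assuming the functions used later come from their usual CRAN packages (twitteR for setup_twitter_oauth and searchTwitter, stringr for str_replace_all, tm for removeWords and stopwords, wordcloud, RColorBrewer for brewer.pal, and ggplot2 for the plot). The package list is an assumption inferred from those function names, not taken from the original code.

```r
# Assumed package list (inferred from the functions used in this report).
pkgs <- c("twitteR", "stringr", "tm", "wordcloud", "RColorBrewer", "ggplot2")

# Report anything that is not installed instead of failing midway.
not.installed <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(not.installed) > 0) {
  message("Install first: ", paste(not.installed, collapse = ", "))
}

# Attach the packages that are available.
for (p in setdiff(pkgs, not.installed)) {
  library(p, character.only = TRUE)
}
```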

Read Tweets:

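Tweets are downloaded and stored as a text object. It is a good idea to replace every non-graphical character (anything outside the [:graph:] class, such as control characters and unusual whitespace) with a space in order to prevent errors in the later functions.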

setup_twitter_oauth(api_key,api_secret,access_token,access_secret)
## [1] "Using direct authentication"
tweets <- searchTwitter("#FastFurious7",n=3000,lang="en")
tweets.txt <- sapply(tweets, function(t)t$getText())
# Replace non-graphical characters with spaces to avoid input errors
tweets.txt <- str_replace_all(tweets.txt,"[^[:graph:]]", " ") 
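
To see what the [^[:graph:]] pattern does, here is a small sketch on a made-up string. It uses base R's gsub as a stand-in for str_replace_all so that it runs without extra packages. The two functions use different regex engines, so results may differ for non-ASCII input.

```r
# Made-up input containing a tab and a newline (both non-graphical).
raw <- "fast\tfurious\n7"

# Every character outside [:graph:] is replaced by a single space.
cleaned <- gsub("[^[:graph:]]", " ", raw)
print(cleaned)
```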

Process Text

The text object now has to be preprocessed to remove retweet markers, user mentions, punctuation, numbers, links, and routine English words such as pronouns. The cleaned text is then collapsed into a single character string in order to plot the word cloud. Any words to be dropped in addition to the default stopwords are listed in the remwords vector in the code.

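clean.text = function(x)
{
   # convert to lower case
   x = tolower(x)
   # remove the retweet marker "rt" only as a whole word (not inside e.g. "heart")
   x = gsub("\\brt\\b", "", x)
   # remove @mentions
   x = gsub("@\\w+", "", x)
   # remove punctuation (this also turns "http://t.co/abc" into "httptcoabc")
   x = gsub("[[:punct:]]", "", x)
   # remove numbers
   x = gsub("[[:digit:]]", "", x)
   # remove what is left of links (tokens starting with http)
   x = gsub("http\\w+", "", x)
   # collapse runs of spaces and tabs into one space so neighbouring words are not merged
   x = gsub("[ \t]+", " ", x)
   # remove blank spaces at the beginning
   x = gsub("^\\s+", "", x)
   # remove blank spaces at the end
   x = gsub("\\s+$", "", x)
   return(x)
}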

cleanText <- clean.text(tweets.txt)
vector <- paste(cleanText,collapse=" ")
remwords <- c("movie","fast","watching")
vector <- removeWords(vector,c(stopwords("english"),remwords))
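
How the retweet marker is matched matters for the words that survive cleaning. The sketch below uses a made-up tweet to compare an unanchored "rt" pattern with a word-boundary pattern. The sample text and variable names are illustrative only.

```r
# Made-up tweet, already lower-cased.
sample.tweet <- "rt @fan: best part of the movie"

# Unanchored pattern: also strips "rt" inside "part".
unanchored <- gsub("rt", "", sample.tweet)

# Word-boundary pattern: only the standalone marker is removed.
anchored <- gsub("\\brt\\b", "", sample.tweet)

print(unanchored)
print(anchored)
```

Because the word cloud is built from whole words, fragments such as "pa" from the unanchored version would show up as separate, meaningless tokens.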

Word Cloud

wordcloud(vector, scale=c(6,0.7), max.words=150, 
           random.order=FALSE, rot.per=0.35,colors=brewer.pal(8,"Dark2"))
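
The cloud sizes each word by how often it occurs. To check the largest words numerically, the frequencies can be tabulated directly. This is a minimal base-R sketch on a made-up string that stands in for the collapsed tweet text.

```r
# Made-up cleaned text standing in for the collapsed tweet string.
toy.text <- "paul walker rip paul awesome paul rip"

# Split on whitespace and count each word.
words <- strsplit(toy.text, "\\s+")[[1]]
freq <- sort(table(words), decreasing = TRUE)

# Most frequent words first.
print(head(freq, 3))
```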

Sentiment Analysis

pos <- scan("positive.txt",what="character",comment.char=";")
neg <- scan("negative.txt",what="character",comment.char=";")
source("sentiment.R")

analysis <- score.sentiment(cleanText,pos,neg)
table(analysis$score)
## 
##   -5   -3   -2   -1    0    1    2    3    4    5 
##    2    2   27  284 1448  985  184   57    9    2
neutral <- length(which(analysis$score == 0))
positive <- length(which(analysis$score > 0))
negative <- length(which(analysis$score < 0))
Sentiment <- c("Negative","Neutral","Positive")
Count <- c(negative,neutral,positive)
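# data.frame() keeps both columns; as.data.frame(Sentiment,Count) would treat Count as row names
output <- data.frame(Sentiment,Count)
# bar chart of the counts; stat="identity" plots Count as given instead of binning
ggplot(output, aes(x=Sentiment, y=Count, fill=Sentiment)) +
      geom_bar(stat="identity") + ggtitle("Fast&Furious7 Sentiment Analysis")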
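
The contents of sentiment.R are not shown here. Lexicon scorers of this kind commonly count positive-word matches minus negative-word matches per tweet, and the sketch below assumes that behaviour. score_sketch and the toy word lists are hypothetical stand-ins, not the actual score.sentiment function or the real positive.txt and negative.txt files. The second part recomputes the class shares from the score table reported above.

```r
# Hypothetical stand-in for score.sentiment(); the real sentiment.R may differ.
score_sketch <- function(sentences, pos.words, neg.words) {
  vapply(sentences, function(s) {
    words <- unlist(strsplit(tolower(s), "\\s+"))
    # score = number of positive matches minus number of negative matches
    sum(words %in% pos.words) - sum(words %in% neg.words)
  }, integer(1), USE.NAMES = FALSE)
}

# Toy lexicons and tweets, for illustration only.
toy.pos <- c("awesome", "great", "best")
toy.neg <- c("boring", "bad")
scores <- score_sketch(c("awesome movie best ever", "so boring", "saw it today"),
                       toy.pos, toy.neg)
print(scores)

# Class shares recomputed from the reported score table.
counts <- c(Negative = 2 + 2 + 27 + 284,
            Neutral  = 1448,
            Positive = 985 + 184 + 57 + 9 + 2)
shares <- round(100 * counts / sum(counts), 1)
print(shares)
```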

References