easySentimentAnalyseR

This is a demo for my side project, easySentimentAnalyseR.

After getting authenticated, modify the values of search_str and num_twts. You can also set actual start and end dates if you want to. Based on my own testing, the earliest date that can be set is 10 or 11 days before your current date.

For example, I set them as below.

search_str = "lakers"
num_twts = 500 # the number of tweets you want to search for
lang = "en" # check lang at https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes


# pay attention to the rate limiting 
# see --> (https://developer.twitter.com/en/docs/basics/rate-limiting)
# ?searchTwitteR to see other potential args
your_tweets <- searchTwitter(searchString = search_str, n = num_twts, lang = lang,
                             since = "2017-10-20", until = "2017-10-21")
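
Since the search window only reaches back about 10 days, hard-coded dates go stale quickly. Here is a minimal sketch that computes the window relative to today instead. The names days_back, since_date and until_date are my own placeholders, and 7 days is just an assumption that stays inside the limit mentioned above.

```r
# Hypothetical helper values, not part of the original demo.
# days_back = 7 is an assumption chosen to stay inside the ~10 day limit.
days_back  <- 7
since_date <- format(Sys.Date() - days_back)      # "YYYY-MM-DD" string
until_date <- format(Sys.Date() - days_back + 1)  # one day after since_date

# The search call would then look like this (needs an authenticated session):
# your_tweets <- searchTwitter(searchString = search_str, n = num_twts, lang = lang,
#                              since = since_date, until = until_date)
```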

After adjusting the cleaning process, I generated the word cloud below.

Based on the word cloud below, you can see that people still can’t get enough of talking about Lonzo Ball’s official debut and the “welcome to the NBA” he got from Patrick Beverley. Haha, I think that’s a good lesson for Lonzo, even if Patrick was too harsh. What’s more, it is also easy to see that many people mentioned Kobe Bryant (salute to the Black Mamba), and Ingram also attracts the spotlight.

However, the same word cloud can be read in other ways. For example, since Lonzo put up a nice performance (29 pts, 9 assists and 10 rebounds) in his 2nd game, just one day after his debut, maybe his fans simply treat it as a comeback against Patrick :-)

You may also apply hierarchical clustering to the top words to look for other insights.

In this way, you can indeed get some information, and a few conjectures, about the keyword you searched for.

However, it is still a little difficult to know exactly what people’s emotions or reactions toward the topic are.

Next, I will try a sentiment calculation to answer that question.
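
To give an idea of what clustering the top words looks like, here is a minimal sketch in base R. The sample tweets are made up, the "appears at least twice" cutoff and average linkage are my assumptions, and the real demo would start from the cleaned corpus instead.

```r
# Made-up, already-cleaned tweets standing in for the real corpus.
sample_tweets <- c(
  "lonzo ball debut lakers",
  "lonzo ball beverley defense",
  "kobe bryant mamba lakers",
  "kobe bryant legend",
  "ingram lakers scoring",
  "lonzo ball lakers ingram"
)

tokens <- strsplit(sample_tweets, " ", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))

# Term-document matrix: one row per term, one column per tweet.
tdm <- vapply(tokens,
              function(tok) tabulate(match(tok, vocab), nbins = length(vocab)),
              integer(length(vocab)))
rownames(tdm) <- vocab

# Keep the "top" words (assumption: terms that appear at least twice).
top_terms <- tdm[rowSums(tdm) >= 2, , drop = FALSE]

# Euclidean distance between term vectors, then average-linkage clustering.
fit    <- hclust(dist(top_terms), method = "average")
groups <- cutree(fit, k = 3)

plot(fit, main = "Top words", xlab = "", sub = "")
```

On this toy data, cutting the tree into three groups puts lonzo with ball, kobe with bryant, and ingram with lakers, which is the kind of grouping you would then try to interpret on real tweets.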

# calculate sentiment scores based on J.Breen approach (https://jeffreybreen.wordpress.com/2011/07/04/twitter-text-mining-r-slides/)
scores <- calclulate_score(tweets_corpus,pos,neg)

As you can see below, the sentiment score is 0.39 (lower than 0.5), which suggests that when people mentioned the word ‘lakers’ on Twitter, most of them expressed somewhat negative emotions.

##    search_str   Sys_date score
## 1:     lakers 2017-10-22  0.39
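
For readers who have not seen the J. Breen approach, here is a minimal sketch of the idea: each tweet is scored as the number of positive words minus the number of negative words. The tiny word lists, the sample tweets and the function score_tweet are all made up for illustration. The last step, turning per-tweet scores into one number between 0 and 1, is only my assumption about one possible aggregation and is not necessarily what calclulate_score does.

```r
# Tiny illustrative word lists; a real run would load a full opinion lexicon.
pos_words <- c("good", "great", "love", "win")
neg_words <- c("bad", "harsh", "lose", "hate")

# Made-up tweets for illustration.
sample_tweets <- c(
  "Great win for the Lakers, love it!",
  "Beverley was too harsh on Lonzo",
  "Bad defense, the Lakers lose again",
  "Lakers play tonight"
)

# Hypothetical helper: positive matches minus negative matches for one tweet.
score_tweet <- function(tweet, pos_words, neg_words) {
  cleaned <- tolower(gsub("[^[:alpha:][:space:]]", "", tweet))
  words   <- unlist(strsplit(cleaned, "\\s+"))
  sum(words %in% pos_words) - sum(words %in% neg_words)
}

tweet_scores <- vapply(sample_tweets, score_tweet, numeric(1),
                       pos_words = pos_words, neg_words = neg_words,
                       USE.NAMES = FALSE)
tweet_scores  # 3 -1 -2 0

# Assumed aggregation: share of positive tweets among the non-neutral ones.
positive_share <- sum(tweet_scores > 0) / sum(tweet_scores != 0)
```

With an aggregate like this, a value below 0.5 would mean that negative tweets outnumber positive ones.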

However, keep in mind that the scores depend on the dictionary you use. For example, you could apply the NRC dictionary here to see the sentiment scores.

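library(magrittr) # provides %>%
library(syuzhet)  # provides get_nrc_sentiment()

# collapse the content of all tweets into one single string
tweet_char_vec <- sapply(tweets_corpus, '[', "content") %>% unlist %>% as.character() %>% paste(collapse = ' ')

tweet_sent <- get_nrc_sentiment(tweet_char_vec)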

Then, you can make a plot like the one below to show the sentiment information for your search keyword.

You may also try other dictionaries such as “bing”, “afinn” and “syuzhet”, and calculate the sentiment scores with each of them to get a more comprehensive analysis of the sentiment of your corpus.

That’s about the end of this demo, and I hope you all enjoy it!!
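
A minimal sketch of such a plot in base R is shown here. The counts in tweet_sent_example are made up. Only the column names follow what get_nrc_sentiment() returns (eight emotions plus negative and positive), and with the real result you would use tweet_sent instead.

```r
# Made-up counts standing in for the one-row result of get_nrc_sentiment().
tweet_sent_example <- data.frame(
  anger = 40, anticipation = 75, disgust = 22, fear = 35, joy = 68,
  sadness = 30, surprise = 28, trust = 90, negative = 85, positive = 140
)

# Plot only the eight emotions; negative/positive are left out here.
emotions <- c("anger", "anticipation", "disgust", "fear",
              "joy", "sadness", "surprise", "trust")

emotion_counts <- unlist(tweet_sent_example[1, emotions])
emotion_share  <- sort(emotion_counts / sum(emotion_counts), decreasing = TRUE)

barplot(emotion_share, las = 2, cex.names = 0.8,
        ylab = "Share of emotion words",
        main = "NRC emotions (illustrative data)")
```

Plotting shares instead of raw counts makes it easier to compare searches that return different numbers of tweets.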

Twitter_BoW

Jack

October 21, 2017