Sentiment Analysis - Hacker Hour

Background

The R syuzhet package provides several lexicon-based sentiment analysis methods, including Syuzhet, NRC, AFINN, and Bing. Let’s compare them on the same set of political tweets!

# Load the package that provides get_sentiment()
library(syuzhet)

# Calculate sentiment scores with each lexicon
politicstweets$sentiment_s <- get_sentiment(politicstweets$text, method = "syuzhet")
politicstweets$sentiment_n <- get_sentiment(politicstweets$text, method = "nrc")
politicstweets$sentiment_a <- get_sentiment(politicstweets$text, method = "afinn")
politicstweets$sentiment_b <- get_sentiment(politicstweets$text, method = "bing")

# Check output
politicstweets$sentiment_s[1:10]

##  [1] -1.25  0.40  5.10  0.55 -4.00 -1.00  0.40  0.40 -1.90 -3.25

politicstweets$sentiment_n[1:10]

##  [1]  0 -1  7 -1 -5 -1 -1 -1 -2 -2

politicstweets$sentiment_a[1:10]

##  [1]  -4  -5   8  -1 -18  -2  -5  -5  -6 -11

politicstweets$sentiment_b[1:10]

##  [1] -2 -3  4  1 -7 -2 -3 -3 -4  0
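Each lexicon scores on its own scale (the AFINN values above span a much wider range than the NRC ones), so the raw numbers are not directly comparable. One rough way to compare the methods, sketched here on the ten values printed above rather than on the full data set, is to look at rank correlations and at how often two methods agree on direction. The `first10` data frame is just those printed values retyped, and `sign_agreement()` is a hypothetical helper; neither is part of the original analysis.

```r
# First ten scores per method, copied from the output above
first10 <- data.frame(
  syuzhet = c(-1.25, 0.40, 5.10, 0.55, -4.00, -1.00, 0.40, 0.40, -1.90, -3.25),
  nrc     = c(0, -1, 7, -1, -5, -1, -1, -1, -2, -2),
  afinn   = c(-4, -5, 8, -1, -18, -2, -5, -5, -6, -11),
  bing    = c(-2, -3, 4, 1, -7, -2, -3, -3, -4, 0)
)

# Spearman rank correlation ignores the different scales of the lexicons
rank_cor <- cor(first10, method = "spearman")
round(rank_cor, 2)

# Hypothetical helper: share of tweets where two methods agree on
# direction (negative, neutral, or positive)
sign_agreement <- function(a, b) mean(sign(a) == sign(b))
sign_agreement(first10$syuzhet, first10$bing)
```

Ten tweets are far too few to draw conclusions from; on the real data the same calls would take the four `sentiment_*` columns of `politicstweets`.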

Results

# Set par
par(mfrow = c(2, 2)) # 2 rows with 2 plots

# First plot
hist(politicstweets$sentiment_s, 
     xlab="Syuzhet Score",
     main="Syuzhet Sentiment Scores for Political Tweets",
     cex.main=.7, col="blue",
     ylim = c(0, 800),
     xlim = c(-10, 10),
     border = FALSE)
# Add line for the mean sentiment
abline(v=mean(politicstweets$sentiment_s), lwd=2)

# Second plot
hist(politicstweets$sentiment_n, 
     xlab="NRC Score",
     main="NRC Sentiment Scores for Political Tweets",
     cex.main = .7, col="purple",
     ylim = c(0, 800),
     xlim = c(-10, 10),
     border = FALSE)
abline(v=mean(politicstweets$sentiment_n), lwd=2)

# Third plot
hist(politicstweets$sentiment_a, 
     xlab="Afinn Score",
     main="Afinn Sentiment Scores for Political Tweets",
     cex.main = .7, col="red",
     ylim = c(0, 800),
     xlim = c(-10, 10),
     border = FALSE)
abline(v=mean(politicstweets$sentiment_a), lwd=2)

# Fourth plot
hist(politicstweets$sentiment_b, 
     xlab="Bing Score",
     main="Bing Sentiment Scores for Political Tweets",
     cex.main = .7, col="green",
     ylim = c(0, 800),
     xlim = c(-10, 10),
     border = FALSE)
abline(v=mean(politicstweets$sentiment_b), lwd=2)
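The four histogram calls above differ only in the score column, the label, and the colour, so they could be folded into a loop. The sketch below is one possible way to do that, not the original author's method: `plot_sentiment_hist()` is a hypothetical helper, and the small `scores` data frame reuses the first ten values printed earlier so the block runs on its own (the fixed `ylim` is left out because of that small sample). The helper also returns how many scores fall outside the shared x-axis range, since the AFINN sample already contains values below -10 that the plot would not show.

```r
# First ten scores per method, copied from the output printed earlier
scores <- data.frame(
  sentiment_s = c(-1.25, 0.40, 5.10, 0.55, -4.00, -1.00, 0.40, 0.40, -1.90, -3.25),
  sentiment_n = c(0, -1, 7, -1, -5, -1, -1, -1, -2, -2),
  sentiment_a = c(-4, -5, 8, -1, -18, -2, -5, -5, -6, -11),
  sentiment_b = c(-2, -3, 4, 1, -7, -2, -3, -3, -4, 0)
)

# Hypothetical helper: draws one histogram with a mean line and returns
# the number of scores that lie outside the visible x-axis range
plot_sentiment_hist <- function(x, label, colour, x_range = c(-10, 10)) {
  hist(x,
       xlab = paste(label, "Score"),
       main = paste(label, "Sentiment Scores for Political Tweets"),
       cex.main = 0.7, col = colour,
       xlim = x_range,
       border = FALSE)
  abline(v = mean(x), lwd = 2)
  sum(x < x_range[1] | x > x_range[2])
}

labels  <- c(sentiment_s = "Syuzhet", sentiment_n = "NRC",
             sentiment_a = "Afinn",   sentiment_b = "Bing")
colours <- c(sentiment_s = "blue",    sentiment_n = "purple",
             sentiment_a = "red",     sentiment_b = "green")

old_par <- par(mfrow = c(2, 2))  # 2 rows with 2 plots
hidden <- sapply(names(labels), function(col) {
  plot_sentiment_hist(scores[[col]], labels[[col]], colours[[col]])
})
par(old_par)                     # restore the previous layout

hidden  # count of out-of-range scores per method
```

Keeping one shared `x_range` makes the four panels easy to compare by eye; widening it, or checking `hidden` first, is a judgment call that depends on how many tweets sit in the tails.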

How can we compare sentiment to activity levels?

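# ggplot2 builds the plot; plotly makes it interactive
library(ggplot2)
library(plotly)

# Activity = favorites + retweets
politicstweets$activity_count <- politicstweets$favorite_count + politicstweets$retweet_count
p <- ggplot(data = politicstweets, mapping = aes(x = sentiment_b, y = activity_count, color = is_retweet)) +
  geom_point(na.rm = TRUE, size = 3, shape = 4, alpha = .5, position = "jitter") +
  ggtitle("Activity vs. Sentiment in Political Twitter Posts") +
  ylab("Sum of Favorites and Retweets") +
  xlab("Sentiment (Bing)") +
  xlim(-4, 4) # tweets with Bing scores outside -4..4 are dropped from the plot
ggplotly(p)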
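A scatter plot shows the shape of the relationship, but a number can help when the points overlap heavily. As a minimal sketch, assuming a rank correlation is an acceptable summary here, the block below computes Spearman correlations and per-direction medians. The `toy` data frame is entirely made up so the block runs on its own; it only borrows the column names used above, and its values say nothing about the real tweets.

```r
# Hypothetical toy data standing in for politicstweets (values are made up)
toy <- data.frame(
  sentiment_b    = c(-3, -2, -1, 0, 0, 1, 2, 3),
  favorite_count = c(40, 12, 5, 2, 0, 3, 9, 25),
  retweet_count  = c(10, 4, 1, 0, 1, 0, 2, 8)
)
toy$activity_count <- toy$favorite_count + toy$retweet_count

# Rank correlation between signed sentiment and activity
rho_signed <- cor(toy$sentiment_b, toy$activity_count, method = "spearman")

# Rank correlation between sentiment strength (either direction) and activity
rho_abs <- cor(abs(toy$sentiment_b), toy$activity_count, method = "spearman")

# Median activity for negative (-1), neutral (0), and positive (1) tweets
toy$direction <- sign(toy$sentiment_b)
by_direction <- aggregate(activity_count ~ direction, data = toy, FUN = median)

rho_signed
rho_abs
by_direction
```

Spearman is used rather than Pearson because favorite and retweet counts tend to be heavily skewed; the median is used for the same reason.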


Sam Lee

3/25/2022
