I wanted to take a look at the reactions of two of my preferred political commentators, Bill Maher and Charles Blow. My newly developed skills in Twitter sentiment analysis come in handy: I can get a sense of what these pundits have to say without thoroughly engaging with their material. It’s too soon for that.

Bill Maher’s Tweets

@BillMaher America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really #MAGA! #WereStillHere

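library(twitteR)
library(tidyverse)  # dplyr, stringr, ggplot2
library(tidytext)
library(knitr)
library(wordcloud)  # also attaches RColorBrewer
# load tweets and source (assumes setup_twitter_oauth() has already been called)
number_of_tweets <- 2000
RT <- userTimeline('@BillMaher', n = number_of_tweets)
RT_df <- twListToDF(RT)
RT_tweets <- RT_df %>% 
  select(id, statusSource, text)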

Bill Maher mainly tweets from these sources. I’ve selected the top 10 to make sure I include all the platforms that he uses.

The frequency breakdown of the origin of his tweets:

# trim tweet to cleanly reveal status source and percentage of tweets from that source
RT_df$statusSource = substr(RT_df$statusSource, 
                        regexpr('>', RT_df$statusSource) + 1, 
                        regexpr('</a>', RT_df$statusSource) - 1)
RT_platform <- RT_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(RT_platform %>% select(Origin_of_Tweet = statusSource, Number_of_tweets = n, Percent = percent) %>% top_n(10), digits = 2)
| Origin_of_Tweet | Number_of_tweets | Percent |
|:---|---:|---:|
| Twitter Web Client | 130 | 0.57 |
| Twitter for iPhone | 43 | 0.19 |
| Media Studio | 24 | 0.11 |
| WhoSay | 17 | 0.07 |
| Instagram | 7 | 0.03 |
| iOS | 4 | 0.02 |
| SnapStream TV Search | 3 | 0.01 |
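
For anyone curious how the trimming step works, here is a minimal sketch on two made-up statusSource strings. The example values are my assumption about what the raw field looks like (an HTML anchor tag wrapping the platform name); they are not pulled from the data above.

```r
# Illustrative raw values (assumed format: an HTML anchor tag around the platform name)
raw_source <- c(
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'
)

# Keep only the text between the first '>' and the closing '</a>'
start <- regexpr(">", raw_source) + 1
end <- regexpr("</a>", raw_source) - 1
platform <- substr(raw_source, start, end)

print(platform)
```

The first '>' closes the opening tag, so everything after it up to '</a>' is the platform name.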

In order to do a sentiment analysis of the tweets, the words in the sentence or phrase need to be isolated.

Some of Bill Maher’s common words that will be matched to sentiments include:

reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
RT_words <- RT_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
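
To see what the splitting pattern does, here is a rough sketch in base R on a made-up tweet. It only approximates the tokenizing step (lowercase, split on the pattern, drop empty pieces) and is not how unnest_tokens() is implemented internally.

```r
# Same pattern as above: split on anything that is not a letter, digit, #, @ or
# apostrophe, and on apostrophes that are not followed by one of those characters
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

# Made-up tweet text for illustration
tweet <- "I'd give a week's pay! #MAGA fans' @BillMaher"

# Lowercase, split on the pattern (perl = TRUE is needed for the lookahead),
# then drop the empty strings left between adjacent separators
pieces <- strsplit(tolower(tweet), reg, perl = TRUE)[[1]]
tokens <- pieces[pieces != ""]

print(tokens)
```

Contractions such as "i'd" and "week's" stay whole, hashtags and mentions keep their leading symbol, and the trailing apostrophe in "fans'" is treated as a separator.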

Most common words used in recent tweets

RT_words %>% count(word) %>% arrange(n) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

#list of most common words used in tweets
kable(head(RT_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% select(Word = word, Number_of_tweets = n) %>% top_n(4)))
| Word | Number_of_tweets |
|:---|---:|
| trump | 41 |
| tonight | 20 |
| hillary | 18 |
| live | 15 |

In order to do a sentiment analysis, we need to load the sentiment categories.

# NRC lexicon, one row per (word, sentiment) pair (get_sentiments() needs the textdata package)
nrc <- get_sentiments("nrc") %>% select(word, sentiment)

With the sentiments loaded, we can categorize the words with a table join.

Surprisingly, the most frequent sentiment expressed with Bill Maher’s words is:

RT_words_sentiments <- RT_words %>% inner_join(nrc, by = "word")

kable(RT_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% select(Sentiment = sentiment, Number_of_tweets = n) %>% top_n(1))
| Sentiment | Number_of_tweets |
|:---|---:|
| positive | 172 |
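
The join is easier to picture on a tiny example. Below is a sketch using base R's merge() in place of inner_join(), with made-up tweet ids and a toy lexicon that I typed in by hand (it is not the real NRC data). Note that a word listed under two sentiments produces two matched rows, so the counts are word-sentiment matches rather than tweets.

```r
# Toy word list: one row per (tweet id, word); the ids are made up
toy_words <- data.frame(
  id = c("t1", "t1", "t2", "t3"),
  word = c("pay", "sermon", "lord", "trump"),
  stringsAsFactors = FALSE
)

# Toy lexicon (not the real NRC data); "lord" carries two sentiments
toy_lexicon <- data.frame(
  word = c("pay", "sermon", "lord", "lord", "disaster"),
  sentiment = c("positive", "positive", "positive", "disgust", "disgust"),
  stringsAsFactors = FALSE
)

# merge() with the default all = FALSE behaves like an inner join on "word":
# words missing from the lexicon ("trump") are dropped
matched <- merge(toy_words, toy_lexicon, by = "word")

# Count matched rows per sentiment
sentiment_counts <- table(matched$sentiment)
print(matched)
print(sentiment_counts)
```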

Below is a slice of recent tweets, each shown with the word that the lexicon tags as “positive”, followed by a slice of tweets with words tagged “disgust”.

# identify tweets that align with the 'positive' sentiment
pos_tw_ids <- RT_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(RT_df %>% inner_join(pos_tw_ids, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))
| Date_Time | Tweet | Word |
|:---|:---|:---|
| 2016-11-20 00:59:35 | I’d give a week’s pay to hear that sermon! https://t.co/LEIb4gYGlv | pay |
| 2016-11-20 00:59:35 | I’d give a week’s pay to hear that sermon! https://t.co/LEIb4gYGlv | sermon |
| 2016-11-18 21:40:58 | Getting SO tired of hearing “the ppl voted for change”. Actually, she won. What has to change is “we win election, they get to be president” | president |
| 2016-11-17 19:54:18 | Doesn’t @Mike_Pence look like the guy the airlines hire to play the Captain in the pre-flight video?… https://t.co/oGQiMJiXrh | hire |

# identify tweets that align with the 'disgust' sentiment
neg_id_words <- RT_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(RT_df %>% inner_join(neg_id_words, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

| Date_Time | Tweet | Word |
|:---|:---|:---|
| 2016-11-17 20:21:44 | Since Trump got elected-slash-normalized, I’ve had weird dreams - anybody? A big orange skyscraper is chasing me - what does it mean???!! | weird |
| 2016-11-08 21:17:44 | Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI | shit |
| 2016-11-08 14:41:11 | Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent | lunatic |
| 2016-11-05 05:00:12 | Thank Trump for the one good thing he did. He exposed Evangelicals, who are his supporters as the shameless hippocr… https://t.co/v0Mq26BXQ8 | shameless |
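
Because the join keeps one row per tweet and word pair, a tweet with two matching words shows up twice (the “pay” and “sermon” rows in the “positive” output above are the same tweet). If one row per tweet is preferred, a possible sketch, using made-up ids and base R's aggregate(), is to collapse the matched words first:

```r
# Made-up (id, word) matches; "t1" has two matching words
matches <- data.frame(
  id = c("t1", "t1", "t2"),
  word = c("pay", "sermon", "president"),
  stringsAsFactors = FALSE
)

# Collapse the words of each tweet into a single comma-separated string
per_tweet <- aggregate(word ~ id, data = matches, FUN = paste, collapse = ", ")

print(per_tweet)
```

The collapsed table could then be joined back to the tweets by id in the same way as before.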

Charles Blow’s Tweets

@CharlesMBlow “I’m always surprised when a column resonates with ppl bc I struggle so much to write them. Always worry that they’ll be bad. #TheResistance”

Charles Blow writes a regular opinion column for the New York Times each Monday and Thursday.

His most recent column is “America Elects a Bigot”.

CB <- userTimeline('@CharlesMBlow', n = number_of_tweets)
CB_df <- twListToDF(CB)
CB_tweets <- CB_df %>% 
  select(created, id, statusSource, text)

Below are his most recent tweets.

kable(head(CB_tweets %>% select(Date_Time = created, Tweet = text)))
| Date_Time | Tweet |
|:---|:---|
| 2016-11-20 15:22:08 | Interesting… https://t.co/Eao3BQvG2B |
| 2016-11-20 15:16:07 | Ugh… https://t.co/o9FXORh4Ap |
| 2016-11-20 15:08:55 | Scheduled to be on @cnnreliable at 11 a.m. ET. Tune in if you can… https://t.co/1WbDNGwM5H |
| 2016-11-20 05:01:59 | I. Can’t. Even… https://t.co/gHg12YFCQ9 |
| 2016-11-20 01:32:57 | . @joehick58 I’m not scared Joe… https://t.co/TWE0TXOZXj |
| 2016-11-20 01:22:25 | President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me… #NotFoolingAnybody https://t.co/fzt8ESoUyd |

He mainly tweets from one source, his iPhone. But he occasionally uses other platforms.

CB_df$statusSource = substr(CB_df$statusSource, 
                        regexpr('>', CB_df$statusSource) + 1, 
                        regexpr('</a>', CB_df$statusSource) - 1)

CB_platform <- CB_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(CB_platform %>% select(Origin_of_Tweet = statusSource, Number_of_tweets = n, Percent = percent) %>% top_n(10), digits = 2)
| Origin_of_Tweet | Number_of_tweets | Percent |
|:---|---:|---:|
| Twitter for iPhone | 111 | 0.89 |
| Twitter Web Client | 11 | 0.09 |
| Instagram | 3 | 0.02 |

Most common words used in recent tweets

CB_words <- CB_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
CB_words %>% count(word) %>% arrange(desc(n)) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

kable(CB_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% select(Word = word, Number_of_tweets = n) %>% top_n(4))
| Word | Number_of_tweets |
|:---|---:|
| trump | 12 |
| #electionnight | 8 |
| #theresistance | 8 |
| column | 8 |

Charles Blow’s most commonly occurring sentiment, once his words are matched to the lexicon, is “negative”. I always see him as an optimist, but given the presidential election, his sentiments are probably dark.

CB_words_sentiments <- CB_words %>% inner_join(nrc, by = "word")
kable(CB_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% select(Sentiment = sentiment, Number_of_tweets = n)%>% top_n(1))
| Sentiment | Number_of_tweets |
|:---|---:|
| negative | 59 |

Below are examples of the tweet words that the lexicon tags with the “positive” and “disgust” sentiments.

pos_tw_ids <- CB_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(CB_df %>% inner_join(pos_tw_ids, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))
| Date_Time | Tweet | Word |
|:---|:---|:---|
| 2016-11-20 01:22:25 | President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me… #NotFoolingAnybody https://t.co/fzt8ESoUyd | president |
| 2016-11-20 01:22:25 | President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me… #NotFoolingAnybody https://t.co/fzt8ESoUyd | elect |
| 2016-11-19 15:40:11 | Why is this man still on Twitter whining? I mean seriously. Aren’t you the president? Don’t you have some more raci… https://t.co/2XQwz0cLHd | president |
| 2016-11-18 02:18:23 | Good lord, Armageddon is really near. Help us all… https://t.co/niV8mvGWyq | lord |

neg_id_words <- CB_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(CB_df %>% inner_join(neg_id_words, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

| Date_Time | Tweet | Word |
|:---|:---|:---|
| 2016-11-19 04:32:27 | This whole thing is just a disaster. Everything Trump accused Hillary of he will soon be guilty of… https://t.co/ZvggxHAHIq | disaster |
| 2016-11-18 02:18:23 | Good lord, Armageddon is really near. Help us all… https://t.co/niV8mvGWyq | lord |
| 2016-11-17 18:42:59 | Why aren’t more ppl aghast that Megan Kelly sat on all these accusations abt Team Trump until after voters couldn’t consider them?! #BadBiz | aghast |
| 2016-11-17 15:46:13 | And this is the man more Americans judged as “honest and trustworthy”?! Is this real life or am I in a dream sequen… https://t.co/HUBpMvWY5j | honest |

To compare Bill Maher’s and Charles Blow’s tweets, I plot, for each commentator, the share of his sentiment-matched words that falls under each sentiment.

# label each data frame with its commentator, then stack them
RT_platform$Commentator <- "Bill Maher"
CB_platform$Commentator <- "Charles Blow"
RT_words_sentiments$Commentator <- "Bill Maher"
CB_words_sentiments$Commentator <- "Charles Blow"
platform2 <- rbind(RT_platform, CB_platform)
words_sentiments2 <- rbind(RT_words_sentiments, CB_words_sentiments)
# after summarise() the data is still grouped by Commentator, so n/sum(n) is each sentiment's share within one commentator
joint_df <- words_sentiments2 %>% group_by(Commentator, sentiment) %>% summarise(n = n()) %>% mutate(frequency = n/sum(n))

ggplot(joint_df, aes(x = sentiment, y = frequency, fill = Commentator)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Share of matched words") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values = c("deeppink3", "darkturquoise")) +
  ggtitle("Frequency of Sentiment Expressed")
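
The reason for plotting shares rather than raw counts is that the two commentators have very different numbers of matched words (172 “positive” matches for Bill Maher versus 59 “negative” matches for Charles Blow in the top rows above), so counts are not directly comparable. Here is a small base R sketch of the same within-commentator normalization; the commentator labels "A" and "B" and the rows are made up.

```r
# Made-up matched rows: one row per (commentator, sentiment) hit
toy <- data.frame(
  Commentator = c("A", "A", "A", "B", "B"),
  sentiment = c("positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Cross-tabulate, then divide each row by its own total (margin = 1),
# so the shares for each commentator sum to 1
counts <- table(toy$Commentator, toy$sentiment)
shares <- prop.table(counts, margin = 1)

print(round(shares, 2))
```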

Source of Tweets

In the final visualization, I compare the source of each person’s tweets. It looks like Bill Maher writes from his computer, while Charles Blow composes his tweets on his phone. For some reason this surprised me, since Blow writes a column for a living. If I could explore this further, I would like to know whether Charles Blow writes more negative tweets because he is so comfortable sending them from the mobile platform, where he can compose them in the “heat of the moment” without taking an opportunity to defuse his “anger”.

pf <- c("Twitter Web Client", "Twitter for iPhone", "Media Studio", "Instagram")
pf_df <- platform2 %>% filter(statusSource %in% pf)
ggplot(pf_df, aes(x = statusSource, y = percent, fill = Commentator)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Platform") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))+ ggtitle("Source of Tweets")