I wanted to take a look at the reactions of two of my preferred political commentators, Bill Maher and Charles Blow. My newly developed skills in Twitter sentiment analysis come in handy; I can get a sense of what these pundits have to say without thoroughly engaging with their material…it’s too soon for that.

Bill Maher’s Tweets:

@BillMaher America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really #MAGA! #WereStillHere

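#load packages, then tweets and source
#userTimeline() assumes a session already authorized with setup_twitter_oauth()
library(twitteR); library(dplyr); library(knitr)
library(stringr); library(tidytext); library(ggplot2)
library(wordcloud); library(RColorBrewer)
number_of_tweets <- 2000
RT <- userTimeline('@BillMaher', n = number_of_tweets)
RT_df <- twListToDF(RT)
RT_tweets <- RT_df %>% 
  select(id, statusSource, text)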

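The first six tweets returned for Bill Maher came from these sources:

#most frequent tweeting sources
#note: head() keeps only the first six rows, so these counts cover six tweets
kable(head(RT_df %>% group_by(statusSource)) %>% 
  summarise(n = n()) %>%
  top_n(10))

| statusSource | n |
|---|---|
| Twitter Web Client | 1 |
| Twitter for iPhone | 1 |
| Media Studio | 4 |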

The breakdown of the origin of his tweets:

# trim tweet to cleanly reveal status source and percentage of tweets from that source
RT_df$statusSource = substr(RT_df$statusSource, 
                        regexpr('>', RT_df$statusSource) + 1, 
                        regexpr('</a>', RT_df$statusSource) - 1)
RT_platform <- RT_df %>%
  group_by(statusSource) %>%
  summarise(n = n()) %>%
  mutate(percent = n/sum(n)) %>%
  arrange(desc(n))
kable(RT_platform %>% top_n(10), digits = 2)

| statusSource | n | percent |
|---|---|---|
| Twitter Web Client | 156 | 0.70 |
| Twitter for iPhone | 26 | 0.12 |
| Media Studio | 16 | 0.07 |
| WhoSay | 13 | 0.06 |
| Instagram | 5 | 0.02 |
| SnapStream TV Search | 4 | 0.02 |
| iOS | 2 | 0.01 |
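
To make the trimming step concrete, here is a minimal base-R sketch run on two made-up source strings. I am assuming the raw statusSource field arrives as an HTML anchor tag, which is the shape the substr()/regexpr() call above is written for; the URLs and the names raw_source and clean_source are illustrative only.

```r
# Hypothetical raw values, assumed to look like HTML anchor tags
raw_source <- c(
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
)

# Keep the text between the first '>' (end of the opening tag)
# and the start of the closing '</a>'
clean_source <- substr(raw_source,
                       regexpr('>', raw_source) + 1,
                       regexpr('</a>', raw_source) - 1)

print(clean_source)
```

Because regexpr() returns the position of the first match in each string, the same call works on the whole column at once.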

In order to do a sentiment analysis of the tweets, the words in the sentence or phrase need to be isolated.

Some of Bill Maher’s common words that will be matched to sentiments include:

#trim
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
RT_words <- RT_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
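
As a rough illustration of what the tokenizing pattern does, the sketch below splits one invented tweet with base R instead of unnest_tokens(). The sample tweet and its shortened link are made up, and strsplit() is only my stand-in for the tidytext tokenizer, so treat it as an approximation of the step above rather than the same code path.

```r
# Same pattern as above: split on anything that is not a letter, digit,
# '#', '@' or an apostrophe inside a word
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

# Invented tweet with a placeholder link
tweet <- "Pls vote today! It's necessary. #UseYourVote https://t.co/abc123"

# Drop shortened links and HTML ampersands, then lowercase and split
cleaned <- gsub("https://t.co/[A-Za-z\\d]+|&amp;", "", tweet, perl = TRUE)
tokens <- strsplit(tolower(cleaned), reg, perl = TRUE)[[1]]

# Adjacent separators leave empty strings behind; remove them
tokens <- tokens[nzchar(tokens)]

print(tokens)
```

The point of the pattern is that hashtags, handles, and contractions survive as single tokens, which a plain split on whitespace and punctuation would break apart.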

Most common words used in recent tweets

#word cloud
RT_words %>% count(word) %>% arrange(n) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

#list of most common words used in tweets
kable(head(RT_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(4)))
| word | n |
|---|---|
| trump | 43 |
| hillary | 17 |
| live | 13 |
| tonight | 13 |

In order to do a sentiment analysis, we need to load the sentiment categories.

# in recent tidytext releases the NRC lexicon is loaded with get_sentiments()
nrc <- get_sentiments("nrc") %>% select(word, sentiment)

With the sentiments loaded, we can categorize the words with a table join.

Surprisingly, the most frequent sentiment expressed with Bill Maher’s words is:

#most commonly occurring sentiment
RT_words_sentiments <- RT_words %>% inner_join(nrc, by = "word")
kable(RT_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% top_n(1))

| sentiment | n |
|---|---|
| positive | 184 |
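
For anyone unfamiliar with the join, here is a small base-R sketch of the same idea. The toy_lexicon below is a hand-picked stand-in for the NRC table, using only word and sentiment pairs that show up in the examples in this post, and merge() is my substitute for inner_join(); both behave as inner joins by default.

```r
# Tokenized words, one row per (tweet id, word)
words <- data.frame(
  id = c(1, 1, 2, 2),
  word = c("moral", "outside", "vote", "lunatic"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the sentiment lexicon (not the real NRC table)
toy_lexicon <- data.frame(
  word = c("moral", "vote", "lunatic"),
  sentiment = c("positive", "positive", "disgust"),
  stringsAsFactors = FALSE
)

# Inner join: words without a lexicon entry ("outside") drop out
matched <- merge(words, toy_lexicon, by = "word")

# Count how many matched words fall under each sentiment
counts <- table(matched$sentiment)
print(counts)
```

Only words that appear in the lexicon survive the join, so the sentiment totals describe the matched vocabulary, not every word in every tweet.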

Below is a slice of recent tweets, each shown with the word that places it in the “positive” category.

#identify tweets that align with the 'positive' sentiment
pos_tw_ids <- RT_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(RT_df %>% inner_join(pos_tw_ids, by = "id") %>% select(created,text, word) %>% slice(1:4))
| created | text | word |
|---|---|---|
| 2016-11-13 23:31:04 | “This is a moral 9/11. Only 9/11 was done to us from the outside and we did this to ourselves.” (@tomfriedmanhttps://t.co/E9pQXotETY | moral |
| 2016-11-12 06:19:32 | America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really… https://t.co/KGBWkQz6XJ | rest |
| 2016-11-08 21:17:44 | Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI | real |
| 2016-11-08 14:41:11 | Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent | vote |

And below is a sample of tweets, each with the word that corresponds to a sentiment of “disgust.”

#identify tweets that align with the 'disgust' sentiment
neg_id_words <- RT_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(RT_df %>% inner_join(neg_id_words, by = "id") %>% select(created,text, word) %>% slice(1:4))
| created | text | word |
|---|---|---|
| 2016-11-08 21:17:44 | Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI | shit |
| 2016-11-08 14:41:11 | Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent | lunatic |
| 2016-11-05 05:00:12 | Thank Trump for the one good thing he did. He exposed Evangelicals, who are his supporters as the shameless hippocr… https://t.co/v0Mq26BXQ8 | shameless |
| 2016-10-20 02:40:35 | Final thought: Hillary won the debate, but Alec Baldwin did a great job intensifying Trump’s insanity. That was Alec Baldwin, right? | insanity |

Charles Blow’s Tweets:

@CharlesMBlow “I’m always surprised when a column resonates with ppl bc I struggle so much to write them. Always worry that they’ll be bad.#TheResistance”

Charles Blow writes a regular opinion column for the New York Times each Monday and Thursday.

His most recent column is “America Elects a Bigot.”

CB <- userTimeline('@CharlesMBlow', n = number_of_tweets)
CB_df <- twListToDF(CB)
CB_tweets <- CB_df %>% 
  select(id, statusSource, text)

Below are his most recent tweets.

kable(head(CB_tweets %>% select(text)))

| text |
|---|
| “NYT says subscriptions are up in response to Trump” https://t.co/SYLQMFOigt |
| Thanks Lisa! #TeamFire https://t.co/Xp498zks6o |
| File my Monday columns on Friday. Which I had time to write a diff column abt this #Bannon announcement. #TheResistance |
| Over the course of this nearly 2-year campaign I haven’t heard Trump make a literary ref. Don’t believe he reads books. Scary thing # 53,478 |
| So, this H. L. Mencken quote is again making the rounds. Of corse it was written pre-mass media, but still interest… https://t.co/tdHBhwLoVx |
| What the…? https://t.co/oVvft9edx3 |

He mainly tweets from one source, his iPhone. But he occasionally uses other platforms.

CB_df$statusSource = substr(CB_df$statusSource, 
                        regexpr('>', CB_df$statusSource) + 1, 
                        regexpr('</a>', CB_df$statusSource) - 1)

CB_platform <- CB_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(CB_platform %>% top_n(10), digits = 2)
| statusSource | n | percent |
|---|---|---|
| Twitter for iPhone | 112 | 0.85 |
| Twitter Web Client | 18 | 0.14 |
| Instagram | 2 | 0.02 |

Most common words used in recent tweets

CB_words <- CB_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
CB_words %>% count(word) %>% arrange(desc(n)) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

kable(CB_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(4))
| word | n |
|---|---|
| column | 11 |
| #electionnight | 10 |
| ppl | 9 |
| #theresistance | 8 |

Charles Blow’s most commonly occurring sentiment, once his words are matched to the lexicon, is “negative.” I always see him as an optimist, but given the presidential election, his sentiments are probably dark.

CB_words_sentiments <- CB_words %>% inner_join(nrc, by = "word")
kable(CB_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% top_n(1))
| sentiment | n |
|---|---|
| negative | 82 |

Below are examples of the tweet words that correlate with the “positive” and “disgust” sentiments.

pos_tw_ids <- CB_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(CB_df %>% inner_join(pos_tw_ids, by = "id") %>% select(created,text, word) %>% slice(1:4))
| created | text | word |
|---|---|---|
| 2016-11-13 19:25:21 | So, this H. L. Mencken quote is again making the rounds. Of corse it was written pre-mass media, but still interest… https://t.co/tdHBhwLoVx | quote |
| 2016-11-13 18:54:11 | Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about… https://t.co/OVT5MeSgCp | proper |
| 2016-11-13 18:54:11 | Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about… https://t.co/OVT5MeSgCp | question |
| 2016-11-13 18:19:33 | Your life isn’t only measured by what happens in it (sometimes you can’t control that) but how you DEAL with what happens… #TheResistance | measured |

neg_id_words <- CB_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(CB_df %>% inner_join(neg_id_words, by = "id") %>% select(created,text, word) %>% slice(1:4))

| created | text | word |
|---|---|---|
| 2016-11-13 18:54:11 | Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about… https://t.co/OVT5MeSgCp | idiot |
| 2016-11-13 18:23:01 | I am now going back to read Jim Crow history and concentrating on how ppl sustained themselves against state hostility… #TheResistance | hostility |
| 2016-11-12 05:18:39 | I think I’ve received as much response from this “America Elects a Bigot” column as any column I’ve ever written. Not sure how to process… | bigot |
| 2016-11-11 23:02:32 | Oh no. The lord is still working on me. You can’t put you face and hands in my car. Not NEVER… https://t.co/kbdtTfj8I1 | lord |

To compare Bill Maher’s and Charles Blow’s tweets, I plot, for each commentator, the share of sentiment-matched words that falls under each sentiment.

RT_platform$commentator <- "Bill Maher"
CB_platform$commentator <- "Charles Blow"
RT_words_sentiments$commentator <- "Bill Maher"
CB_words_sentiments$commentator <- "Charles Blow"
platform2 <- rbind(RT_platform, CB_platform)
words_sentiments2 <- rbind(RT_words_sentiments, CB_words_sentiments)
joint_df <- words_sentiments2 %>%
  group_by(commentator, sentiment) %>%
  summarise(n = n()) %>%
  mutate(frequency = n/sum(n))  # share within each commentator

ggplot(joint_df, aes(x = sentiment, y = frequency, fill = commentator)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Share of matched words") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))
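
The reason for plotting shares instead of raw counts is that the two timelines yield very different numbers of matched words (Maher’s top sentiment has 184 matches, Blow’s has 82). Below is a minimal base-R sketch of the per-commentator normalization; the counts and the commentator labels "A" and "B" are invented for illustration, and ave() is my stand-in for the grouped mutate() above.

```r
# Invented counts of sentiment-matched words for two hypothetical commentators
counts <- data.frame(
  commentator = c("A", "A", "B", "B"),
  sentiment = c("positive", "negative", "positive", "negative"),
  n = c(6, 2, 3, 9),
  stringsAsFactors = FALSE
)

# Divide each count by the total for that commentator,
# so the shares sum to 1 within each commentator
counts$frequency <- counts$n / ave(counts$n, counts$commentator, FUN = sum)

print(counts)
```

With this normalization, a commentator who simply tweets more does not look more emotional across the board.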

statusSource

In the final visualization, I compare the source of each person’s tweets. It looks like Bill Maher writes from his computer, while Charles Blow composes his on his phone. For some reason this surprised me, since Blow writes a column for a living. If I could explore this further, I would like to know whether Charles Blow writes more negative tweets because he is so comfortable sending them from a mobile platform, where he can compose them in the “heat of the moment.”

pf <- c("Twitter Web Client", "Twitter for iPhone", "Media Studio", "Instagram")
pf_df <- platform2 %>% filter(statusSource %in% pf)

ggplot(pf_df, aes(x = statusSource, y = percent, fill = commentator)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Platform") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))