I wanted to take a look at the reactions of two of my preferred political commentators, Bill Maher and Charles Blow. My newly developed skills in Twitter sentiment analysis come in handy here: I can get a sense of what these pundits have to say without thoroughly engaging with their material. It’s too soon for that.

Bill Maher’s Tweets:

@BillMaher America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really #MAGA! #WereStillHere

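#load packages, tweets and source
#assumes a Twitter API session has already been authorized with twitteR
library(twitteR)
library(dplyr)
library(stringr)
library(tidytext)
library(knitr)
number_of_tweets <- 2000
RT <- userTimeline('@BillMaher', n = number_of_tweets)
RT_df <- twListToDF(RT)
RT_tweets <- RT_df %>% 
  select(id, statusSource, text)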

A quick look at the sources of his six most recent tweets:

#sources of the six most recent tweets
#note: head() runs before summarise(), so these counts cover only the first six rows
kable(head(RT_df) %>% 
  group_by(statusSource) %>%
  summarise(n = n()))

statusSource	n
Twitter Web Client	1
Twitter for iPhone	1
Media Studio	4

The breakdown of the origin of his tweets:

# strip the HTML anchor tag from statusSource so only the client name remains,
# then compute the share of tweets that came from each source
RT_df$statusSource <- substr(RT_df$statusSource, 
                        regexpr('>', RT_df$statusSource) + 1, 
                        regexpr('</a>', RT_df$statusSource) - 1)
RT_platform <- RT_df %>%
  group_by(statusSource) %>%
  summarise(n = n()) %>%
  mutate(percent = n/sum(n)) %>%
  arrange(desc(n))
kable(RT_platform %>% top_n(10, n), digits = 2)

statusSource	n	percent
Twitter Web Client	156	0.70
Twitter for iPhone	26	0.12
Media Studio	16	0.07
WhoSay	13	0.06
Instagram	5	0.02
SnapStream TV Search	4	0.02
iOS	2	0.01
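
To see what the trimming step does on its own, here is a minimal base R sketch. The raw strings are hypothetical stand-ins for the HTML anchor form that statusSource arrives in, and the counts are made up for illustration, not taken from either timeline.

```r
# Hypothetical raw statusSource values (illustrative, not real data)
raw_source <- c(
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
)

# Keep only the text between the first '>' and the closing '</a>'
clean_source <- substr(raw_source,
                       regexpr('>', raw_source) + 1,
                       regexpr('</a>', raw_source) - 1)

# Count tweets per source and convert the counts to shares
platform <- as.data.frame(table(statusSource = clean_source), responseName = "n")
platform$percent <- platform$n / sum(platform$n)
platform <- platform[order(-platform$n), ]
print(platform)
```

With three of the four toy tweets coming from the web client, its share works out to 0.75, which is the same calculation behind the percent column in the table above.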

To run a sentiment analysis on the tweets, each tweet first has to be split into its individual words.

Some of Bill Maher’s common words that will be matched to sentiments include:

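#tokenize: split each tweet into words, keeping hashtags and @mentions intact
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
RT_words <- RT_df %>%
  filter(!str_detect(text, '^"')) %>%   #drop tweets that open with a quotation mark
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%   #remove links and &amp;
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,    #remove common stop words
         str_detect(word, "[a-z]"))     #keep tokens that contain at least one letter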
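
For a feel of what this pipeline produces, here is a rough base R approximation applied to a single tweet from this post. It is only a sketch: strsplit() stands in for unnest_tokens(), and toy_stop_words is a hypothetical mini list standing in for tidytext's much longer stop_words table.

```r
# One tweet from this post, used as sample input
tweet <- "Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI"

# Same splitting pattern as in the post
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

# Hypothetical mini stop-word list (stand-in for tidytext's stop_words)
toy_stop_words <- c("just", "got", "the", "a")

# Remove links and &amp;, lowercase, then split on the pattern
clean <- gsub("https://t.co/[A-Za-z\\d]+|&amp;", "", tweet, perl = TRUE)
tokens <- strsplit(tolower(clean), reg, perl = TRUE)[[1]]

# Drop empty strings, stop words, and tokens without any letter
tokens <- tokens[nzchar(tokens)]
tokens <- tokens[!tokens %in% toy_stop_words & grepl("[a-z]", tokens)]
print(tokens)
```

Because # and @ are excluded from the splitting pattern, hashtags such as #useyourvote survive as single tokens rather than being broken apart.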

Most common words used in recent tweets

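#word cloud (Dark2 offers at most 8 colors, so request 8 rather than 9)
library(wordcloud)   #also loads RColorBrewer, which provides brewer.pal()
RT_words %>% count(word) %>% arrange(n) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(8,"Dark2")))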

#list of most common words used in tweets
kable(head(RT_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(4)))

word	n
trump	43
hillary	17
live	13
tonight	13

In order to do a sentiment analysis, we need to load the sentiment categories.

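#older tidytext releases bundle the NRC lexicon in `sentiments`; newer ones provide it through get_sentiments("nrc")
nrc <- sentiments %>% filter(lexicon == "nrc") %>% select(word, sentiment)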

With the sentiments loaded, we can categorize the words with a table join.

Surprisingly, the most frequent sentiment expressed with Bill Maher’s words is:

#most commonly occurring sentiment
RT_words_sentiments <- RT_words %>% inner_join(nrc, by = "word")
kable(RT_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% top_n(1))

sentiment	n
positive	184
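
The join itself is easier to see on a toy example. This is a sketch in base R, using merge() as a stand-in for dplyr's inner_join(). Both toy_words and toy_lexicon are hypothetical, although the word and sentiment pairs are ones that appear in the tables in this post.

```r
# Hypothetical tokens: one row per word per tweet id
toy_words <- data.frame(
  id   = c(1, 1, 2, 2, 2),
  word = c("shit", "real", "vote", "lunatic", "tonight"),
  stringsAsFactors = FALSE
)

# Toy lexicon; a real lexicon can map one word to several sentiments
toy_lexicon <- data.frame(
  word      = c("real", "vote", "shit", "lunatic"),
  sentiment = c("positive", "positive", "disgust", "disgust"),
  stringsAsFactors = FALSE
)

# merge() keeps only words present in both tables, like an inner join
matched <- merge(toy_words, toy_lexicon, by = "word")

# Tally the matches per sentiment
sentiment_counts <- table(matched$sentiment)
print(matched)
print(sentiment_counts)
```

Note that "tonight" has no entry in the toy lexicon, so it drops out of the result. The same thing happens to every tweet word that the real lexicon does not cover, which is why the sentiment counts are smaller than the word counts.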

Below is a slice of recent tweets, each shown with the word that placed it in the “positive” category.

#identify tweets that align with the 'positive' sentiment
pos_tw_ids <- RT_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(RT_df %>% inner_join(pos_tw_ids, by = "id") %>% select(created,text, word) %>% slice(1:4))

created	text	word
2016-11-13 23:31:04	“This is a moral 9/11. Only 9/11 was done to us from the outside and we did this to ourselves.” (@tomfriedman) https://t.co/E9pQXotETY	moral
2016-11-12 06:19:32	America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really https://t.co/KGBWkQz6XJ	rest
2016-11-08 21:17:44	Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI	real
2016-11-08 14:41:11	Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent	vote

And below is a sample of tweets with the word that corresponds to a sentiment of “disgust.”

#identify tweets that align with the 'disgust' sentiment
neg_id_words <- RT_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(RT_df %>% inner_join(neg_id_words, by = "id") %>% select(created,text, word) %>% slice(1:4))

created	text	word
2016-11-08 21:17:44	Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI	shit
2016-11-08 14:41:11	Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent	lunatic
2016-11-05 05:00:12	Thank Trump for the one good thing he did. He exposed Evangelicals, who are his supporters as the shameless hippocr https://t.co/v0Mq26BXQ8	shameless
2016-10-20 02:40:35	Final thought: Hillary won the debate, but Alec Baldwin did a great job intensifying Trump’s insanity. That was Alec Baldwin, right?	insanity

Charles Blow’s Tweets

@CharlesMBlow “I’m always surprised when a column resonates with ppl bc I struggle so much to write them. Always worry that they’ll be bad.#TheResistance”

Charles Blow writes a regular opinion column for the New York Times each Monday and Thursday.

His most recent column is “America Elects a Bigot.”

CB <- userTimeline('@CharlesMBlow', n = number_of_tweets)
CB_df <- twListToDF(CB)
CB_tweets <- CB_df %>% 
  select(id, statusSource, text)

Below are his most recent tweets.

kable(head(CB_tweets %>% select(text)))

text
“NYT says subscriptions are up in response to Trump” https://t.co/SYLQMFOigt
Thanks Lisa! #TeamFire https://t.co/Xp498zks6o
File my Monday columns on Friday. Which I had time to write a diff column abt this #Bannon announcement. #TheResistance
Over the course of this nearly 2-year campaign I haven’t heard Trump make a literary ref. Don’t believe he reads books. Scary thing # 53,478
So, this H. L. Mencken quote is again making the rounds. Of corse it was written pre-mass media, but still interest https://t.co/tdHBhwLoVx
What the? https://t.co/oVvft9edx3

He mainly tweets from one source, his iPhone. But he occasionally uses other platforms.

CB_df$statusSource = substr(CB_df$statusSource, 
                        regexpr('>', CB_df$statusSource) + 1, 
                        regexpr('</a>', CB_df$statusSource) - 1)

CB_platform <- CB_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(CB_platform %>% top_n(10), digits = 2)

statusSource	n	percent
Twitter for iPhone	112	0.85
Twitter Web Client	18	0.14
Instagram	2	0.02

Most common words used in recent tweets

CB_words <- CB_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))

CB_words %>% count(word) %>% arrange(desc(n)) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(8,"Dark2")))

kable(CB_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(4))

word	n
column	11
#electionnight	10
ppl	9
#theresistance	8

Charles Blow’s most commonly occurring sentiment, once his words are matched to the lexicon, is “negative.” I have always seen him as an optimist, but given the presidential election, his sentiments are probably dark.

CB_words_sentiments <- CB_words %>% inner_join(nrc, by = "word")
kable(CB_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% top_n(1))

sentiment	n
negative	82

Below are examples of the tweet words that are matched to the “positive” and “disgust” sentiments.

pos_tw_ids <- CB_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(CB_df %>% inner_join(pos_tw_ids, by = "id") %>% select(created,text, word) %>% slice(1:4))

created	text	word
2016-11-13 19:25:21	So, this H. L. Mencken quote is again making the rounds. Of corse it was written pre-mass media, but still interest https://t.co/tdHBhwLoVx	quote
2016-11-13 18:54:11	Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about https://t.co/OVT5MeSgCp	proper
2016-11-13 18:54:11	Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about https://t.co/OVT5MeSgCp	question
2016-11-13 18:19:33	Your life isn’t only measured by what happens in it (sometimes you can’t control that) but how you DEAL with what happens #TheResistance	measured

neg_id_words <- CB_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(CB_df %>% inner_join(neg_id_words, by = "id") %>% select(created,text, word) %>% slice(1:4))

created	text	word
2016-11-13 18:54:11	Proper punctuation dictates a question mark if that’s a question. Idiot. Don’t you have other things to worry about https://t.co/OVT5MeSgCp	idiot
2016-11-13 18:23:01	I am now going back to read Jim Crow history and concentrating on how ppl sustained themselves against state hostility #TheResistance	hostility
2016-11-12 05:18:39	I think I’ve received as much response from this “America Elects a Bigot” column as any column I’ve ever written. Not sure how to process	bigot
2016-11-11 23:02:32	Oh no. The lord is still working on me. You can’t put you face and hands in my car. Not NEVER https://t.co/kbdtTfj8I1	lord

To compare Bill Maher’s and Charles Blow’s tweets, I plot each sentiment’s share of the sentiment-matched words for each commentator.

RT_platform$commentator <- "Bill Maher"
CB_platform$commentator <- "Charles Blow"
RT_words_sentiments$commentator <- "Bill Maher"
CB_words_sentiments$commentator <- "Charles Blow"
platform2 <- rbind(RT_platform, CB_platform)
words_sentiments2 <- rbind(RT_words_sentiments, CB_words_sentiments)

joint_df <- words_sentiments2 %>% group_by(commentator, sentiment) %>% summarise(n = n()) %>% mutate(frequency = n/sum(n))

library(ggplot2)
#frequency is each sentiment's share of a commentator's sentiment-matched words
ggplot(joint_df, aes(x = sentiment, y = frequency, fill = commentator)) + geom_bar(stat = "identity", position = "dodge") + xlab("Sentiment") + ylab("Share of matched words") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))
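
The frequency column above is computed within each commentator, so each person's bars sum to 1 and the two can be compared even though they have different numbers of matched words. Here is a small base R sketch of that idea, with prop.table() standing in for the group_by() and mutate() steps. The two commentators "A" and "B" and their rows are entirely made up.

```r
# Hypothetical sentiment matches for two made-up commentators (not real data)
toy <- data.frame(
  commentator = c("A", "A", "A", "A", "B", "B", "B", "B", "B"),
  sentiment   = c("positive", "positive", "negative", "disgust",
                  "negative", "negative", "negative", "positive", "disgust"),
  stringsAsFactors = FALSE
)

# Rows are commentators, columns are sentiments
counts <- table(toy$commentator, toy$sentiment)

# margin = 1 divides each row by its own total, so every commentator sums to 1
frequency <- prop.table(counts, margin = 1)
print(round(frequency, 2))
```

In this toy data, half of A's matches are positive while three fifths of B's are negative, even though B simply has more rows overall.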

statusSource

In the final visualization, I compare the source of each person’s tweets. It looks like Bill Maher writes mostly from his computer, while Charles Blow composes his tweets on his phone. For some reason this surprised me, since Blow writes a column for a living. If I could explore this further, I would like to know whether Charles Blow writes more negative tweets because he is so comfortable sending them from a mobile platform, where he can compose them in the “heat of the moment.”

pf <- c("Twitter Web Client", "Twitter for iPhone", "Media Studio", "Instagram")
pf_df <- platform2 %>% filter(statusSource %in% pf)

ggplot(pf_df, aes(x = statusSource, y = percent, fill = commentator)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Platform") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))

Who is angrier these days?

Christine Iyer

November 13, 2016
