Who is Angrier?

I wanted to take a look at the reactions of two of my preferred political commentators, Bill Maher and Charles Blow. My newly developed skills in Twitter sentiment analysis come in handy; I can get a sense of what these pundits have to say without thoroughly engaging with their material. It’s too soon for that.

Bill Maher’s Tweets:

@BillMaher America needs you more than ever, with me and all the rest of #TheResistance, until we can figure out how to really #MAGA! #WereStillHere

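# load packages (twitteR must already be authenticated via setup_twitter_oauth()), then tweets and source
library(twitteR)
library(dplyr)
library(stringr)
library(tidytext)
library(wordcloud)
library(knitr)
library(ggplot2)
number_of_tweets <- 2000
RT <- userTimeline('@BillMaher', n = number_of_tweets)
RT_df <- twListToDF(RT)
RT_tweets <- RT_df %>% select(id, statusSource, text)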

Bill Maher mainly tweets from these sources. I’ve selected the top 10 to make sure I include all the platforms that he uses.

The frequency breakdown of the origin of his tweets:

# trim tweet to cleanly reveal status source and percentage of tweets from that source
RT_df$statusSource = substr(RT_df$statusSource, 
                        regexpr('>', RT_df$statusSource) + 1, 
                        regexpr('</a>', RT_df$statusSource) - 1)
RT_platform <- RT_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(RT_platform %>% select(Origin_of_Tweet = statusSource, Number_of_tweets = n, Percent = percent) %>% top_n(10), digits = 2)

Origin_of_Tweet	Number_of_tweets	Percent
Twitter Web Client	130	0.57
Twitter for iPhone	43	0.19
Media Studio	24	0.11
WhoSay	17	0.07
Instagram	7	0.03
iOS	4	0.02
SnapStream TV Search	3	0.01
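
To make the trimming step above concrete, here is a minimal sketch on two hand-written statusSource strings. The sample strings are my assumption about what the raw field looks like (an HTML anchor wrapping the client name); only the substr()/regexpr() logic comes from the code above.

```r
# Assumed examples of the raw statusSource field: an HTML anchor around the client name
raw_source <- c(
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'
)

# Keep only the text between the first '>' and the closing '</a>'
clean_source <- substr(raw_source,
                       regexpr('>', raw_source) + 1,
                       regexpr('</a>', raw_source) - 1)

print(clean_source)
```

Both calls are vectorized, which is why the same expression can be applied to the whole statusSource column at once.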

In order to do a sentiment analysis of the tweets, the words in the sentence or phrase need to be isolated.

Some of Bill Maher’s common words that will be matched to sentiments include:

reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
RT_words <- RT_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
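
To show what this tokenizing step does to a single tweet, here is a rough base-R sketch. It swaps unnest_tokens() for strsplit() with the same reg pattern, uses a made-up tweet modeled on the ones in this post, and uses a tiny hypothetical stop list (toy_stop_words) in place of stop_words$word, so treat it as an approximation of the pipeline above rather than a drop-in replacement.

```r
# Same pattern as above: split on anything that is not a letter, digit, #, @
# or an apostrophe inside a word
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

# Made-up tweet (straight apostrophes, no link) modeled on the ones in this post
tweet <- "I'd give a week's pay to hear that sermon! #UseYourVote @BillMaher"

# perl = TRUE is needed for the (?!...) lookahead;
# tolower() mimics the default lowercasing done by unnest_tokens()
tokens <- strsplit(tolower(tweet), reg, perl = TRUE)[[1]]
tokens <- tokens[nzchar(tokens)]  # drop empty strings left between adjacent separators

# Hypothetical mini stop list standing in for tidytext's stop_words$word
toy_stop_words <- c("i'd", "a", "to", "that")
tokens <- tokens[!tokens %in% toy_stop_words & grepl("[a-z]", tokens)]

print(tokens)
```

Note that hashtags and @mentions survive as single tokens, and contractions such as "week's" are kept whole because the apostrophe is followed by a letter.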

Most common words used in recent tweets

RT_words %>% count(word) %>% arrange(n) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

#list of most common words used in tweets
kable(head(RT_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% select(Word = word, Number_of_tweets = n) %>% top_n(4)))

Word	Number_of_tweets
trump	41
tonight	20
hillary	18
live	15

In order to do a sentiment analysis, we need to load the sentiment categories.

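# recent tidytext releases no longer bundle NRC in `sentiments` (it is fetched through the textdata package);
# older releases used: sentiments %>% filter(lexicon == "nrc")
nrc <- get_sentiments("nrc") %>% select(word, sentiment)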

With the sentiments loaded, we can categorize the words with a table join.

Surprisingly, the most frequent sentiment expressed with Bill Maher’s words is:

RT_words_sentiments <- RT_words %>% inner_join(nrc, by = "word")

kable(RT_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% select(Sentiment = sentiment, Number_of_tweets = n) %>% top_n(1))

Sentiment	Number_of_tweets
positive	172
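
One thing worth spelling out about the join above is that a word can carry several sentiments, so it is counted once per sentiment. A minimal sketch, using base R merge() as a stand-in for inner_join() and a three-row toy lexicon whose word/sentiment pairs are copied from the example tables in this post (the real NRC lexicon is far larger, and the tweet ids here are made up):

```r
# Toy tokens: one row per (tweet id, word); ids are made up
toy_words <- data.frame(
  id   = c("t1", "t2", "t2"),
  word = c("lord", "disaster", "trump"),
  stringsAsFactors = FALSE
)

# Toy lexicon: "lord" is listed under two sentiments, "trump" is not listed at all
toy_nrc <- data.frame(
  word      = c("lord", "lord", "disaster"),
  sentiment = c("positive", "disgust", "disgust"),
  stringsAsFactors = FALSE
)

# merge() with the default all = FALSE keeps only matching rows, like an inner join
toy_matched <- merge(toy_words, toy_nrc, by = "word")

print(toy_matched)
print(table(toy_matched$sentiment))
```

So the counts in the sentiment tables are word-sentiment matches rather than tweets: unmatched words drop out, and words with more than one sentiment are counted more than once.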

Below is a slice of recent tweets, each shown with the word that places it in the “positive” sentiment.

# identify tweets that align with the 'positive' sentiment
pos_tw_ids <- RT_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(RT_df %>% inner_join(pos_tw_ids, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

Date_Time	Tweet	Word
2016-11-20 00:59:35	I’d give a week’s pay to hear that sermon! https://t.co/LEIb4gYGlv	pay
2016-11-20 00:59:35	I’d give a week’s pay to hear that sermon! https://t.co/LEIb4gYGlv	sermon
2016-11-18 21:40:58	Getting SO tired of hearing “the ppl voted for change”. Actually, she won. What has to change is “we win election, they get to be president”	president
2016-11-17 19:54:18	Doesn’t @Mike_Pence look like the guy the airlines hire to play the Captain in the pre-flight video? https://t.co/oGQiMJiXrh	hire

# identify tweets that align with the 'disgust' sentiment
neg_id_words <- RT_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(RT_df %>% inner_join(neg_id_words, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

Date_Time	Tweet	Word
2016-11-17 20:21:44	Since Trump got elected-slash-normalized, I’ve had weird dreams - anybody? A big orange skyscraper is chasing me - what does it mean???!!	weird
2016-11-08 21:17:44	Shit just got real. #UseYourVote #Millennials https://t.co/nbNQ09pwPI	shit
2016-11-08 14:41:11	Pls vote for Hillary today. Even if you don’t like her, its necessary to block a dangerous lunatic ultimate power. #ThisTimeIsDifferent	lunatic
2016-11-05 05:00:12	Thank Trump for the one good thing he did. He exposed Evangelicals, who are his supporters as the shameless hippocr https://t.co/v0Mq26BXQ8	shameless

Charles Blow’s Tweets

@CharlesMBlow “I’m always surprised when a column resonates with ppl bc I struggle so much to write them. Always worry that they’ll be bad.#TheResistance”

Charles Blow writes a regular opinion column for the New York Times each Monday and Thursday. His most recent is “America Elects a Bigot”.

CB <- userTimeline('@CharlesMBlow', n = number_of_tweets)
CB_df <- twListToDF(CB)
CB_tweets <- CB_df %>% 
  select(created, id, statusSource, text)

Below are his most recent tweets.

kable(head(CB_tweets %>% select(Date_Time = created, Tweet = text)))

Date_Time	Tweet
2016-11-20 15:22:08	Interesting https://t.co/Eao3BQvG2B
2016-11-20 15:16:07	Ugh https://t.co/o9FXORh4Ap
2016-11-20 15:08:55	Scheduled to be on @cnnreliable at 11 a.m. ET. Tune in if you can https://t.co/1WbDNGwM5H
2016-11-20 05:01:59	I. Can’t. Even https://t.co/gHg12YFCQ9
2016-11-20 01:32:57	. @joehick58 I’m not scared Joe https://t.co/TWE0TXOZXj
2016-11-20 01:22:25	President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me #NotFoolingAnybody https://t.co/fzt8ESoUyd

He mainly tweets from one source, his iPhone. But he occasionally uses other platforms.

CB_df$statusSource = substr(CB_df$statusSource, 
                        regexpr('>', CB_df$statusSource) + 1, 
                        regexpr('</a>', CB_df$statusSource) - 1)

CB_platform <- CB_df %>% group_by(statusSource) %>% summarise(n = n()) %>% mutate(percent = n/sum(n)) %>% arrange(desc(n))
kable(CB_platform %>% select(Origin_of_Tweet = statusSource, Number_of_tweets = n, Percent = percent) %>% top_n(10), digits = 2)

Origin_of_Tweet	Number_of_tweets	Percent
Twitter for iPhone	111	0.89
Twitter Web Client	11	0.09
Instagram	3	0.02

Most common words used in recent tweets

CB_words <- CB_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))

CB_words %>% count(word) %>% arrange(desc(n)) %>% with(wordcloud(word, n, max.words = 100, scale=c(5,.5),min.freq=5, random.order=FALSE, rot.per=.15, colors=brewer.pal(9,"Dark2")))

kable(CB_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% select(Word = word, Number_of_tweets = n) %>% top_n(4))

Word	Number_of_tweets
trump	12
#electionnight	8
#theresistance	8
column	8

Charles Blow’s most commonly occurring sentiment, once his words are matched to the lexicon, is “negative”. I always see him as an optimist, but given the presidential election, his sentiments are probably dark.

CB_words_sentiments <- CB_words %>% inner_join(nrc, by = "word")
kable(CB_words_sentiments %>% group_by(sentiment) %>% summarise(n = n()) %>% arrange(desc(n)) %>% select(Sentiment = sentiment, Number_of_tweets = n)%>% top_n(1))

Sentiment	Number_of_tweets
negative	59

Below are examples of the tweet words that match the “positive” and “disgust” sentiments.

pos_tw_ids <- CB_words_sentiments %>% filter(sentiment == "positive") %>% distinct(id, word)
kable(CB_df %>% inner_join(pos_tw_ids, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

Date_Time	Tweet	Word
2016-11-20 01:22:25	President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me #NotFoolingAnybody https://t.co/fzt8ESoUyd	president
2016-11-20 01:22:25	President-elect Trump, I’m just going to let the amazing Fannie Lou Hamer speak for me #NotFoolingAnybody https://t.co/fzt8ESoUyd	elect
2016-11-19 15:40:11	Why is this man still on Twitter whining? I mean seriously. Aren’t you the president? Don’t you have some more raci https://t.co/2XQwz0cLHd	president
2016-11-18 02:18:23	Good lord, Armageddon is really near. Help us all… https://t.co/niV8mvGWyq	lord

neg_id_words <- CB_words_sentiments %>% filter(sentiment == "disgust") %>% distinct(id, word)
kable(CB_df %>% inner_join(neg_id_words, by = "id") %>% select(Date_Time = created,Tweet = text, Word = word) %>% slice(1:4))

Date_Time	Tweet	Word
2016-11-19 04:32:27	This whole thing is just a disaster. Everything Trump accused Hillary of he will soon be guilty of https://t.co/ZvggxHAHIq	disaster
2016-11-18 02:18:23	Good lord, Armageddon is really near. Help us all… https://t.co/niV8mvGWyq	lord
2016-11-17 18:42:59	Why aren’t more ppl aghast that Megan Kelly sat on all these accusations abt Team Trump until after voters couldn’t consider them?! #BadBiz	aghast
2016-11-17 15:46:13	And this is the man more Americans judged as “honest and trustworthy”?! Is this real life or am I in a dream sequen https://t.co/HUBpMvWY5j	honest

To compare Bill Maher’s and Charles Blow’s tweets, I plot, for each commentator, the share of sentiment-matched words that falls under each sentiment.

RT_platform$Commentator <- "Bill Maher"
CB_platform$Commentator <- "Charles Blow"
RT_words_sentiments$Commentator <- "Bill Maher"
CB_words_sentiments$Commentator <- "Charles Blow"
platform2 <- rbind(RT_platform, CB_platform)
words_sentiments2 <- rbind(RT_words_sentiments, CB_words_sentiments)

joint_df <- words_sentiments2 %>% group_by(Commentator, sentiment) %>% summarise(n = n()) %>% mutate(frequency = n/sum(n))

ggplot(joint_df, aes(x = sentiment, y = frequency, fill = Commentator)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Share of matched words") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values = c("deeppink3", "darkturquoise")) +
  ggtitle("Frequency of Sentiment Expressed")
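
Because the two commentators have different numbers of matched words, the plot uses within-commentator shares rather than raw counts. Here is a small base-R sketch of that normalization, with made-up counts for two hypothetical commentators A and B; ave() plays the role of the grouped mutate() above.

```r
# Made-up counts for two hypothetical commentators
toy_counts <- data.frame(
  Commentator = c("A", "A", "B", "B"),
  sentiment   = c("positive", "negative", "positive", "negative"),
  n           = c(30, 10, 5, 15),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the total n of that row's commentator,
# so each count is divided by its own commentator's total
toy_counts$frequency <- toy_counts$n / ave(toy_counts$n, toy_counts$Commentator, FUN = sum)

print(toy_counts)
```

With this scaling the bars for each commentator sum to 1, so a prolific tweeter does not dominate the chart simply by volume.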

Source of Tweets

In the final visualization, I compare the source of each person’s tweets. It looks like Bill Maher writes mostly from his computer, while Charles Blow composes his on his phone. For some reason this surprised me, since Blow writes a column for a living. If I could explore this further, I would like to know whether Charles Blow writes more negative tweets because he is so comfortable sending them from the mobile platform. That way he can compose them in the “heat of the moment”, without taking an opportunity to defuse his “anger”.

pf <- c("Twitter Web Client", "Twitter for iPhone", "Media Studio", "Instagram")
pf_df <- platform2 %>% filter(statusSource %in% pf)
ggplot(pf_df, aes(x = statusSource, y = percent, fill = Commentator)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Platform") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  scale_fill_manual(values=c("deeppink3" , "darkturquoise"))+ ggtitle("Source of Tweets")

Christine Iyer’s Assignment 4

November 20, 2016
