With the results of the election in, I could not help but take a look at what people had been feeling on Twitter. I was especially interested in the positive and negative feelings coming from each side. Hashtags are an imperfect proxy for who someone actually supports, but as a rough cut I pulled the #hillaryforprison tweets from Twitter and labeled them as Trump supporters, and pulled the #lovetrumpshate tweets and labeled them as Hillary supporters.
## [1] "Using direct authentication"
# I pulled #lovetrumpshate tweets and took a look at them by platform, as we did in the lecture notes.
num_tweets <- 2000
lTh <- searchTwitter('#lovetrumpshate', n = num_tweets)
lTh_df <- twListToDF(lTh)
lTh_df %>% group_by(statusSource) %>%
  summarize(n = n()) %>%
  arrange(desc(n)) %>%
  top_n(10)
## # A tibble: 10 × 2
## statusSource
## <chr>
## 1 <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPh
## 2 <a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>
## 3 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for An
## 4 <a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iP
## 5 <a href="http://instagram.com" rel="nofollow">Instagram</a>
## 6 <a href="https://mobile.twitter.com" rel="nofollow">Mobile Web (M5)</a>
## 7 <a href="https://about.twitter.com/products/tweetdeck" rel="nofollow">Tweet
## 8 <a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>
## 9 <a href="http://www.hootsuite.com" rel="nofollow">Hootsuite</a>
## 10 <a href="https://mobile.twitter.com" rel="nofollow">Mobile Web (M2)</a>
## # ... with 1 more variables: n <int>
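# Strip the HTML anchor tag so only the platform name is kept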
lTh_df$statusSource = substr(lTh_df$statusSource,
                             regexpr('>', lTh_df$statusSource) + 1,
                             regexpr('</a>', lTh_df$statusSource) - 1)
lTh_platform <- lTh_df %>% group_by(statusSource) %>%
  summarize(n = n()) %>%
  mutate(percent_of_tweets = n/sum(n)) %>%
  arrange(desc(n))
lTh_platform %>% top_n(10)
## # A tibble: 10 × 3
## statusSource n percent_of_tweets
## <chr> <int> <dbl>
## 1 Twitter for iPhone 815 0.4075
## 2 Twitter Web Client 537 0.2685
## 3 Twitter for Android 349 0.1745
## 4 Twitter for iPad 79 0.0395
## 5 Instagram 52 0.0260
## 6 Mobile Web (M5) 30 0.0150
## 7 TweetDeck 28 0.0140
## 8 Facebook 18 0.0090
## 9 Hootsuite 14 0.0070
## 10 Mobile Web (M2) 11 0.0055
lTh_df %>%
  group_by(screenName) %>%
  summarize(n = n()) %>%
  mutate(percent_of_tweets = n/sum(n)) %>%
  arrange(desc(n)) %>%
  top_n(10)
## # A tibble: 11 × 3
## screenName n percent_of_tweets
## <chr> <int> <dbl>
## 1 Alisonnj 10 0.0050
## 2 RadicalRW 10 0.0050
## 3 ImDanielAddison 9 0.0045
## 4 kvpeckwriter 9 0.0045
## 5 HateFuckDestroy 6 0.0030
## 6 MzspellDena 6 0.0030
## 7 toma_media 6 0.0030
## 8 _Troll2 5 0.0025
## 9 AmyGrimesSuxx 5 0.0025
## 10 NATSllf 5 0.0025
## 11 VicariousChris 5 0.0025
# Isolate individual words as per the lecture notes, then join the sentiments to the words.
# This regex keeps #hashtags and @mentions together as single tokens.
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
lTh_words <- lTh_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
lTh_words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(20)
## # A tibble: 20 × 2
## word n
## <chr> <int>
## 1 #lovetrumpshate 1908
## 2 rt 964
## 3 #notmypresident 306
## 4 trump 166
## 5 love 128
## 6 vote 122
## 7 matters 114
## 8 popular 111
## 9 #trumpnation 110
## 10 @femtheologian 110
## 11 hate 110
## 12 https 91
## 13 march 88
## 14 day 78
## 15 proud 75
## 16 protest 67
## 17 @uhouston 65
## 18 @countermoonbat 60
## 19 people 57
## 20 #trump 51
nrc <- sentiments %>%
  filter(lexicon == "nrc") %>%
  select(word, sentiment)
head(nrc)
## # A tibble: 6 × 2
## word sentiment
## <chr> <chr>
## 1 abacus trust
## 2 abandon fear
## 3 abandon negative
## 4 abandon sadness
## 5 abandoned anger
## 6 abandoned fear
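One caveat: in newer releases of tidytext the bundled sentiments table only ships the Bing lexicon, so the filter above would come back empty. In that case the NRC lexicon can be pulled instead (this needs the textdata package installed):

# Equivalent NRC lookup table in current versions of tidytext
nrc <- get_sentiments("nrc")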
lTh_sentiments <- lTh_words %>% inner_join(nrc, by = "word")
lTh_sentiments %>% group_by(sentiment) %>% summarize(n = n()) %>% arrange(desc(n))
## # A tibble: 10 × 2
## sentiment n
## <chr> <int>
## 1 positive 1265
## 2 trust 771
## 3 negative 750
## 4 joy 699
## 5 anger 564
## 6 anticipation 553
## 7 sadness 438
## 8 surprise 436
## 9 fear 391
## 10 disgust 283
pos_lTh_ids <- lTh_sentiments %>% filter(sentiment == "positive") %>% distinct(id)
lTh_df %>% inner_join(pos_lTh_ids, by = "id") %>% select(text) %>% slice(1:10)
## text
## 1 RT @fIawlesssivan: THIS BEAUTIFUL FAMILY IS EVERYTHING I WANT MY FUTURE TO BE\n#LoveTrumpsHate #LoveIsLouder <U+2764><U+FE0F><U+2764><U+FE0F> https://t.co/TdBMR76o6p
## 2 Thank you, Mayor. #lovetrumpshate https://t.co/eGPromxaDJ
## 3 This is an pivotal moment for the children in our country. We must teach them that #LoveTrumpsHate. https://t.co/ck9gmytVKt
## 4 #dumptrump #LoveTrumpsHate #BoycottTrump #FuckTrump #notmypresident Those darned David Duke boys are repainting th<U+0085> https://t.co/UtfP8xdTgf
## 5 So reassured by the amount of love and compassion shown at Virginia Tech tonight. #LoveTrumpsHate #StrongerTogether<U+0085> https://t.co/nCbAOJZmYM
## 6 IDK about you guys, but I'm already tired of winning.\n\n#NotMyPresident #StillWithHer #LoveTrumpsHate #StopBannon #StrongerTogether #Protest
## 7 Aww...#Trump thinks the media didn't "portray" him fairly??? C'mon man...We saw you screaming crap LIVE!!!! #hardball\n#LoveTrumpsHate
## 8 This is very cool, she's against Trump but she's not just doing some meaningless gesture. #LoveTrumpsHate #fashion https://t.co/ABNptDYKmH
## 9  👏🏻👏🏻👏🏻👏🏻👏🏻👏🏻Finally someone with #cajones Where are the rest of you #resistance #solidarity #LoveTrumpsHate… https://t.co/1y3PpP8895
## 10 I may not have my reproductive organs anymore, but I will fight to the death for your reproductive rights. #lovetrumpshate #Obama
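The same trick shows a sample of tweets from the other end of the scale; swapping the filter to negative words:

neg_lTh_ids <- lTh_sentiments %>% filter(sentiment == "negative") %>% distinct(id)
lTh_df %>% inner_join(neg_lTh_ids, by = "id") %>% select(text) %>% slice(1:10)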
# I pulled what I labeled as Trump supporter tweets next and repeated the same steps.
Hfp <- searchTwitter('#hillaryforprison', n = num_tweets)
head(Hfp)
## [[1]]
## [1] "Judicious34: RT @infowars: Hillary Is The Swamp, Trump Must Take Her Down - https://t.co/7dAoQqznEl #HillaryForPrison #LockHerUp"
##
## [[2]]
## [1] "Judicious34: RT @infowars: A Pardon For Hillary Pardons The District Of Criminals - https://t.co/8gOflwB2Yn #Infowars #HillaryForPrison"
##
## [[3]]
## [1] "BetterLife365: They need an example made of them to assure the American public that no one is above the law and as a deterrent to<U+0085> https://t.co/KLFrGMCCJe"
##
## [[4]]
## [1] "kanona70: RT @infowars: Hillary Is The Swamp, Trump Must Take Her Down - https://t.co/7dAoQqznEl #HillaryForPrison #LockHerUp"
##
## [[5]]
## [1] "KitKatC103: RT @DirtyGuap7: #NotMyPresident #ImWithHer #StillWithHer #DrainTheSwamp #LockHerUp #SaltyTears #HillaryForPrison #MAGA #PraiseKek https://t<U+0085>"
##
## [[6]]
## [1] "ProphetPX: RT @AVoluntarist: #HillaryForPrison @RogerJStoneJr https://t.co/fxBlqAXSZE"
Hfp_df <- twListToDF(Hfp)
Hfp_df$statusSource = substr(Hfp_df$statusSource,
                             regexpr('>', Hfp_df$statusSource) + 1,
                             regexpr('</a>', Hfp_df$statusSource) - 1)
Hfp_platform <- Hfp_df %>% group_by(statusSource) %>%
  summarize(n = n()) %>%
  mutate(percent_of_tweets = n / sum(n)) %>%
  arrange(desc(n))
Hfp_words <- Hfp_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
Hfp_sentiments <- Hfp_words %>% inner_join(nrc, by = "word")
# I labeled the supporters and combined the data.
lTh_platform$supporter <- "Hillary"
Hfp_platform$supporter <- "Trump"
lTh_sentiments$supporter <- "Hillary"
Hfp_sentiments$supporter <- "Trump"
HTplatform <- rbind(Hfp_platform, lTh_platform)
HTwords_sentiments <- rbind(Hfp_sentiments, lTh_sentiments)
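HTplatform isn't plotted below, but the platform comparison could be charted the same way as the sentiments; a quick sketch (the cutoff of the top 5 platforms per group is arbitrary):

# Compare the most common platforms between the two groups
HTplatform %>%
  group_by(supporter) %>%
  top_n(5, n) %>%
  ggplot(aes(x = statusSource, y = percent_of_tweets, fill = supporter)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Platform") +
  ylab("Percent of Tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))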
Despite the fact that a single tweet can contain multiple sentiment words, and a single word can carry more than one sentiment, I found it very interesting that what I labeled as ‘Hillary supporters’ had a much higher percentage of positive tweets, as the chart below shows. Hillary supporters also had a lower percentage of negative or angry tweets. It would be interesting to know what demographic makes up each group of 2,000 tweets, since they are simply whatever was posted around the time the query is run. For the final project, it would be interesting to see where these tweets are coming from: whether they are mostly from red states or from people living in normally ‘blue’ areas on the coasts. One way to start on that is sketched below.
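searchTwitter takes a geocode argument of the form 'latitude,longitude,radius', so one rough approach would be to pull the same hashtags near particular cities and compare. A sketch, with Houston's coordinates as a stand-in example:

# Hypothetical example: #lovetrumpshate tweets within 100 miles of Houston
lTh_houston <- searchTwitter('#lovetrumpshate', n = 500, geocode = '29.76,-95.36,100mi')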
# I created a bar chart to compare the sentiment frequencies of the two groups.
sentTH_df <- HTwords_sentiments %>%
  group_by(supporter, sentiment) %>%
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n))
ggplot(sentTH_df, aes(x = sentiment, y = frequency, fill = supporter)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Percent of Tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
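Since frequency is a fraction rather than a percentage, the y axis labels could also be formatted with the scales package; the only change is one added line:

# Same chart, with the y axis shown as percentages (requires the scales package)
ggplot(sentTH_df, aes(x = sentiment, y = frequency, fill = supporter)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  xlab("Sentiment") +
  ylab("Percent of Tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))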