For this assignment, we were tasked with pulling data from Twitter and doing something cool with it. I decided to look at #turtle, because I love turtles and wanted to see what this assignment could tell me about the topic on Twitter.
First off, I set up the developer connection for Twitter as we learned, and then it was off to get data! I find it fascinating that Twitter has an integration built to talk with software like R, and I can tell there is so much one could do with Twitter data given the right tools and know-how. I felt slightly like a bull in a china shop with this assignment, though; personally, I feel the assignments tend to highlight all the things I don’t know about R and data management more than what I can do.
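library(rtweet)    # create_token(), search_tweets()
library(dplyr)     # %>%, group_by(), summarize(), joins
library(stringr)   # str_detect(), str_replace_all()
library(tidytext)  # unnest_tokens(), stop_words, get_sentiments()
library(ggplot2)   # plotting
# Credentials are read from environment variables instead of being hard-coded.
# The variable names below are placeholders; set them in .Renviron.
app <- "MBA676 Assignment4"
consumer_key <- Sys.getenv("TWITTER_CONSUMER_KEY")
consumer_secret <- Sys.getenv("TWITTER_CONSUMER_SECRET")
access_token <- Sys.getenv("TWITTER_ACCESS_TOKEN")
access_secret <- Sys.getenv("TWITTER_ACCESS_SECRET")
my_token <- create_token(app = app, consumer_key = consumer_key,
consumer_secret = consumer_secret,
access_token = access_token,
access_secret = access_secret)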
I first searched on just #turtle to see what data came back.
num_tweets <- 1000
tt <- search_tweets('#Turtle', n = num_tweets,
include_rts = FALSE)
head(tt)
## # A tibble: 6 x 90
## user_id status_id created_at screen_name text source
## <chr> <chr> <dttm> <chr> <chr> <chr>
## 1 105368~ 11995505~ 2019-11-27 04:48:36 Thedailyme~ Mega~ Twitt~
## 2 852037~ 11995469~ 2019-11-27 04:34:03 Orange2016~ #<U+81ED><U+6854> ~ Twitt~
## 3 852037~ 11970859~ 2019-11-20 09:35:06 Orange2016~ #<U+81ED><U+6854> ~ Twitt~
## 4 108381~ 11995399~ 2019-11-27 04:06:22 crochetand~ Had ~ Twitt~
## 5 764651~ 11995399~ 2019-11-27 04:06:22 CecilsJust~ This~ Faceb~
## 6 307109~ 11995395~ 2019-11-27 04:04:48 RealCoastal This~ Faceb~
## # ... with 84 more variables: display_text_width <dbl>,
## # reply_to_status_id <chr>, reply_to_user_id <chr>,
## # reply_to_screen_name <chr>, is_quote <lgl>, is_retweet <lgl>,
## # favorite_count <int>, retweet_count <int>, quote_count <int>,
## # reply_count <int>, hashtags <list>, symbols <list>, urls_url <list>,
## # urls_t.co <list>, urls_expanded_url <list>, media_url <list>,
## # media_t.co <list>, media_expanded_url <list>, media_type <list>,
## # ext_media_url <list>, ext_media_t.co <list>, ext_media_expanded_url <list>,
## # ext_media_type <chr>, mentions_user_id <list>, mentions_screen_name <list>,
## # lang <chr>, quoted_status_id <chr>, quoted_text <chr>,
## # quoted_created_at <dttm>, quoted_source <chr>, quoted_favorite_count <int>,
## # quoted_retweet_count <int>, quoted_user_id <chr>, quoted_screen_name <chr>,
## # quoted_name <chr>, quoted_followers_count <int>,
## # quoted_friends_count <int>, quoted_statuses_count <int>,
## # quoted_location <chr>, quoted_description <chr>, quoted_verified <lgl>,
## # retweet_status_id <chr>, retweet_text <chr>, retweet_created_at <dttm>,
## # retweet_source <chr>, retweet_favorite_count <int>,
## # retweet_retweet_count <int>, retweet_user_id <chr>,
## # retweet_screen_name <chr>, retweet_name <chr>,
## # retweet_followers_count <int>, retweet_friends_count <int>,
## # retweet_statuses_count <int>, retweet_location <chr>,
## # retweet_description <chr>, retweet_verified <lgl>, place_url <chr>,
## # place_name <chr>, place_full_name <chr>, place_type <chr>, country <chr>,
## # country_code <chr>, geo_coords <list>, coords_coords <list>,
## # bbox_coords <list>, status_url <chr>, name <chr>, location <chr>,
## # description <chr>, url <chr>, protected <lgl>, followers_count <int>,
## # friends_count <int>, listed_count <int>, statuses_count <int>,
## # favourites_count <int>, account_created_at <dttm>, verified <lgl>,
## # profile_url <chr>, profile_expanded_url <chr>, account_lang <lgl>,
## # profile_banner_url <chr>, profile_background_url <chr>,
## # profile_image_url <chr>
I tried a few code variations on source and screen name, but they did not give me any insight that I could follow up on.
turtle_platform <- tt %>% group_by(source) %>%
summarize(n = n()) %>%
mutate(percent_of_tweets = n/sum(n)) %>%
arrange(desc(n))
turtle_platform %>% slice(1:10)
## # A tibble: 10 x 3
## source n percent_of_tweets
## <chr> <int> <dbl>
## 1 Tweets for Turtles 185 0.214
## 2 Twitter for iPhone 147 0.170
## 3 Instagram 130 0.150
## 4 Twitter Web App 109 0.126
## 5 Twitter for Android 72 0.0832
## 6 IFTTT 42 0.0486
## 7 Hootsuite Inc. 31 0.0358
## 8 Twitter Web Client 30 0.0347
## 9 Buffer 17 0.0197
## 10 TweetDeck 16 0.0185
tt %>% group_by(screen_name) %>%
summarize(n = n()) %>%
mutate(percent_of_tweets = n/sum(n)) %>%
arrange(desc(n)) %>% slice(1:10)
## # A tibble: 10 x 3
## screen_name n percent_of_tweets
## <chr> <int> <dbl>
## 1 aTurtlebot 185 0.214
## 2 TMNT_Wiz 26 0.0301
## 3 kame_fuji 15 0.0173
## 4 GreenieTurtle 14 0.0162
## 5 kamepi24 14 0.0162
## 6 TurtleAloha 14 0.0162
## 7 NatureCutsTags 11 0.0127
## 8 StarCrystalDel 11 0.0127
## 9 donnietheturtle 10 0.0116
## 10 RedEaredSliderz 10 0.0116
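One possible reading of the two tables above: aTurtlebot and the "Tweets for Turtles" source both account for 185 tweets, so a single automated account may make up about a fifth of the sample. Below is a minimal sketch of how such an account could be set aside before further analysis. The tibble toy_tt is a made-up stand-in for tt, and the name tt_no_bot is only for illustration.

```r
library(dplyr)

# Toy stand-in for tt (the real tibble has 90 columns)
toy_tt <- tibble(
  screen_name = c("aTurtlebot", "aTurtlebot", "TMNT_Wiz", "kame_fuji"),
  source      = c("Tweets for Turtles", "Tweets for Turtles",
                  "Twitter for iPhone", "Instagram")
)

# count() is shorthand for group_by() + summarize(n = n())
by_account <- toy_tt %>% count(screen_name, sort = TRUE)

# Drop the dominant account before looking at words or sentiment
tt_no_bot <- toy_tt %>% filter(screen_name != "aTurtlebot")
```

Whether dropping the account is appropriate depends on the question being asked; for "what do people say about turtles", removing it seems reasonable.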
Next, I looked at the words used in the tweets to determine what is being discussed when #turtle is used. (Outside of just turtles, of course.)
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#A]))"
turtle_words <- tt %>% select(status_id, text) %>%
filter(!str_detect(text, '^"')) %>%
mutate(text = str_replace_all(text,
"https:t.co/[A-Za-z\\d]+|&",
"")) %>%
unnest_tokens(word, text, token = "regex",
pattern = reg) %>%
filter(!word %in% stop_words$word, str_detect(
word, "[a-z]"))
turtle_words <- turtle_words %>% group_by(word) %>%
summarize(n = n()) %>%
mutate(percent_of_tweets = n/sum(n)) %>%
arrange(desc(n)) %>% top_n(20)
## Selecting by percent_of_tweets
head(turtle_words)
## # A tibble: 6 x 3
## word n percent_of_tweets
## <chr> <int> <dbl>
## 1 https 971 0.0814
## 2 #turtle 860 0.0721
## 3 #plastic 291 0.0244
## 4 #cute 211 0.0177
## 5 turtle 194 0.0163
## 6 #turtlebot 187 0.0157
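A likely reason that https tops this list is that the link-removal pattern in the chunk above reads "https:t.co/" without the double slash, so it may never match a real link. The tokenizer then splits the link and leaves "https" behind as a word. Here is a small sketch on a made-up tweet; the corrected pattern is my assumption about what was intended.

```r
library(stringr)

# Made-up tweet text for illustration
toy_text <- "Save the #turtle https://t.co/abc123"

# Pattern as written above: it does not match, so the link stays in the text
unchanged <- str_replace_all(toy_text, "https:t.co/[A-Za-z\\d]+", "")

# Assumed intended pattern, with the double slash restored
cleaned <- str_replace_all(toy_text, "https://t.co/[A-Za-z\\d]+", "")
```

With the link removed before tokenizing, https should no longer show up among the top words.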
This gave me a bit more to work with. Apart from the leftover https fragment and the search term itself, the top result is #plastic, which leads me to believe there are many tweets about pollution and sea turtles that I could dig into.
Next, I tried to plot the count of words used with #turtle. Unfortunately, while I was able to generate the graph, I could not get the y-axis count to work correctly. It seemed to be setting every count to 1. I tried several different graphs and attempted scaling, but could not determine why the integer counts I can see in the tibble did not translate to the graph.
turtle_words %>% count(word, sort = TRUE) %>% top_n(15) %>%
mutate(word = reorder(word, n)) %>% ggplot(aes(x = word, y = n)) +
geom_col() + xlab(NULL) + coord_flip() + labs(x = "Top Turtle Word Use",
y = "Count",
title = "Top Twitter Searches on #Turtle")
turtle_words %>% count(word, sort = TRUE) %>% top_n(15) %>%
mutate(word = reorder(word, n)) %>% ggplot(aes(x = word, y = n)) +
geom_col() + xlab(NULL) + coord_flip() + labs(x = "Top Turtle Word Use",
y = "Count",
title = "Top Twitter Searches on #Turtle") +
ylim(0, 10)
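One probable cause of the flat bars is that turtle_words was already summarized to one row per word, with the totals stored in n. Calling count(word) on it counts rows again, and every word has exactly one row, so each count comes back as 1. Here is a minimal sketch using a small stand-in tibble with the top counts from the output above; toy_words is an assumption for illustration.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the already-summarized turtle_words tibble
toy_words <- tibble(
  word = c("#turtle", "#plastic", "#cute"),
  n    = c(860L, 291L, 211L)
)

# count() tallies rows per word; each word has one row, so n becomes 1
recounted <- toy_words %>% count(word, sort = TRUE)

# Plotting the existing n column keeps the real counts
p <- toy_words %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(x = word, y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Count", title = "Top words in #turtle tweets")
```

The ylim(0, 10) call would also need to go, since it would cut off any bar taller than 10 once the real counts are plotted.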
After looking at the words used in #turtle tweets, I wanted to see if more could be gleaned by joining in the NRC sentiment lexicon and looking at the tweets by emotion. It returned the range of common sentiments that you would expect to find.
turtle_words2 <- tt %>% select(status_id, text) %>%
filter(!str_detect(text, '^"')) %>%
mutate(text = str_replace_all(text,
"https:t.co/[A-Za-z\\d]+|&",
"")) %>%
unnest_tokens(word, text, token = "regex",
pattern = reg) %>%
filter(!word %in% stop_words$word, str_detect(
word, "[a-z]"))
nrc <- get_sentiments("nrc") %>%
select(word, sentiment)
head(nrc)
## # A tibble: 6 x 2
## word sentiment
## <chr> <chr>
## 1 abacus trust
## 2 abandon fear
## 3 abandon negative
## 4 abandon sadness
## 5 abandoned anger
## 6 abandoned fear
turtle_words2_sentiments <- turtle_words2 %>%
inner_join(nrc, by = "word")
turtle_words2_sentiments %>%
group_by(sentiment) %>% summarize(n = n()) %>%
arrange(desc(n))
## # A tibble: 10 x 2
## sentiment n
## <chr> <int>
## 1 positive 597
## 2 joy 304
## 3 anticipation 256
## 4 trust 234
## 5 negative 179
## 6 surprise 109
## 7 sadness 97
## 8 fear 96
## 9 anger 62
## 10 disgust 60
Next, I pulled the positive posts to look at them more closely. However, when I did that I found the selection seemed to consist of tweets about jewelry, which was not what I was looking for.
pos_tt_id <- turtle_words2_sentiments %>%
filter(sentiment == "positive") %>% distinct(status_id)
tt %>% inner_join(pos_tt_id, by = "status_id") %>%
select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 Megan Brittany of 32nd East Side said she found a turtle intruder in her apa~
## 2 This could be the biggest #turtle swarm ever filmed at sea - They were... ht~
## 3 This could be the biggest #turtle swarm ever filmed at sea - “This is the...~
## 4 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 5 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 6 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 7 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 8 "Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply ava~
## 9 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 10 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
As the positive sentiment was pulling tweets that were not about real turtles, I thought a sentiment on the negative end of the spectrum might show a different picture. However, when I looked at the sadness sentiment, the returned tweets were the same jewelry ones found in the positive results.
sad_tt_id <- turtle_words2_sentiments %>% filter(sentiment == "sadness") %>%
distinct(status_id)
tt %>% inner_join(sad_tt_id, by = "status_id") %>% select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 Megan Brittany of 32nd East Side said she found a turtle intruder in her apa~
## 2 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 3 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 4 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 5 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 6 "Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply ava~
## 7 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 8 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 9 "Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply ava~
## 10 Sea Turtle Painting Hawaii Art Sea Turtle Decor Sea Turtle Wall Art Kauai Po~
Not to be daunted, I tried anger instead, again hoping to find tweets about pollution like those suggested by #plastic in the top words. Unfortunately, my results still showed the same tweets as in my other returns.
anger_tt_id <- turtle_words2_sentiments %>% filter(sentiment == "anger") %>%
distinct(status_id)
tt %>% inner_join(anger_tt_id, by = "status_id") %>% select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 Megan Brittany of 32nd East Side said she found a turtle intruder in her apa~
## 2 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 3 ".Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 4 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 5 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 6 "Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply ava~
## 7 "0Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 8 "`Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply av~
## 9 "Green Aventurine wire wrapped #turtle earrings! So cute! Limited supply ava~
## 10 #turtle #plushtoy #stuffedanimals Baby Greenie and the Donuts: One time when~
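A possible explanation for the overlap is that the lexicon tags one word with several sentiments (abandon appears above under fear, negative, and sadness), so a tweet with a few tagged words can land in every list. Since slice(1:10) takes the first ten matches in the order of tt, the same early tweets keep showing up. Below is a sketch that counts how many sentiments each tweet matches. The toy_nrc rows are copied from the head(nrc) output above, while toy_words is made up.

```r
library(dplyr)

# Made-up tokens: tweet "1" has two tagged words, tweet "2" has one
toy_words <- tibble(
  status_id = c("1", "1", "2"),
  word      = c("abandon", "abandoned", "abacus")
)

# Lexicon rows taken from the head(nrc) output shown earlier
toy_nrc <- tibble(
  word      = c("abacus", "abandon", "abandon", "abandon",
                "abandoned", "abandoned"),
  sentiment = c("trust", "fear", "negative", "sadness", "anger", "fear")
)

# One row per tweet, listing every sentiment it matched
per_tweet <- toy_words %>%
  inner_join(toy_nrc, by = "word") %>%
  group_by(status_id) %>%
  summarize(sentiments   = paste(sort(unique(sentiment)), collapse = ", "),
            n_sentiments = n_distinct(sentiment))
```

Running the same summary on turtle_words2_sentiments would show whether the jewelry tweets really do match several sentiments at once.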
So, my data was not giving me something I felt I could work with. I could not figure out how to take the data I had already pulled and filter out tweets about generic turtle items such as jewelry. Instead, I ran a narrower search on #seaturtle and followed the same steps I used above on #turtle.
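For reference, one way to filter merchandise tweets out of data that has already been pulled is to drop rows whose text matches a list of shop-related words. This is only a sketch: the toy tweets and the keyword list in shop_words are assumptions, and a real list would need tuning after reading a sample of the tweets.

```r
library(dplyr)
library(stringr)

# Made-up tweets standing in for tt
toy_tt <- tibble(
  status_id = c("1", "2", "3"),
  text = c("Green Aventurine wire wrapped #turtle earrings! So cute!",
           "Plastic found on a nesting beach today #turtle",
           "Gemstone Sea Turtle Pendant #Etsy")
)

# Hypothetical keyword list, matched without regard to case
shop_words <- regex("earrings|pendant|etsy|jewelry", ignore_case = TRUE)

# Keep only tweets that mention none of the shop words
tt_filtered <- toy_tt %>% filter(!str_detect(text, shop_words))
```

A keyword filter like this is blunt and may also drop a few genuine tweets, so it is worth checking what gets removed.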
stt <- search_tweets('#seaturtle', n = num_tweets,
include_rts = FALSE)
head(stt)
## # A tibble: 6 x 90
## user_id status_id created_at screen_name text source
## <chr> <chr> <dttm> <chr> <chr> <chr>
## 1 315200~ 11995365~ 2019-11-27 03:53:04 AnthonyCat~ "Sea~ Twitt~
## 2 315200~ 11980452~ 2019-11-23 01:07:02 AnthonyCat~ "Woo~ Twitt~
## 3 315200~ 11990911~ 2019-11-25 22:23:05 AnthonyCat~ "WOW~ Twitt~
## 4 235010~ 11995225~ 2019-11-27 02:57:12 smarturban~ This~ Twitt~
## 5 235010~ 11977028~ 2019-11-22 02:26:20 smarturban~ Marc~ Twitt~
## 6 702145~ 11995179~ 2019-11-27 02:38:55 OfficialGa~ "10 ~ Twitt~
## # ... with 84 more variables: display_text_width <dbl>,
## # reply_to_status_id <chr>, reply_to_user_id <chr>,
## # reply_to_screen_name <chr>, is_quote <lgl>, is_retweet <lgl>,
## # favorite_count <int>, retweet_count <int>, quote_count <int>,
## # reply_count <int>, hashtags <list>, symbols <list>, urls_url <list>,
## # urls_t.co <list>, urls_expanded_url <list>, media_url <list>,
## # media_t.co <list>, media_expanded_url <list>, media_type <list>,
## # ext_media_url <list>, ext_media_t.co <list>, ext_media_expanded_url <list>,
## # ext_media_type <chr>, mentions_user_id <list>, mentions_screen_name <list>,
## # lang <chr>, quoted_status_id <chr>, quoted_text <chr>,
## # quoted_created_at <dttm>, quoted_source <chr>, quoted_favorite_count <int>,
## # quoted_retweet_count <int>, quoted_user_id <chr>, quoted_screen_name <chr>,
## # quoted_name <chr>, quoted_followers_count <int>,
## # quoted_friends_count <int>, quoted_statuses_count <int>,
## # quoted_location <chr>, quoted_description <chr>, quoted_verified <lgl>,
## # retweet_status_id <chr>, retweet_text <chr>, retweet_created_at <dttm>,
## # retweet_source <chr>, retweet_favorite_count <int>,
## # retweet_retweet_count <int>, retweet_user_id <chr>,
## # retweet_screen_name <chr>, retweet_name <chr>,
## # retweet_followers_count <int>, retweet_friends_count <int>,
## # retweet_statuses_count <int>, retweet_location <chr>,
## # retweet_description <chr>, retweet_verified <lgl>, place_url <chr>,
## # place_name <chr>, place_full_name <chr>, place_type <chr>, country <chr>,
## # country_code <chr>, geo_coords <list>, coords_coords <list>,
## # bbox_coords <list>, status_url <chr>, name <chr>, location <chr>,
## # description <chr>, url <chr>, protected <lgl>, followers_count <int>,
## # friends_count <int>, listed_count <int>, statuses_count <int>,
## # favourites_count <int>, account_created_at <dttm>, verified <lgl>,
## # profile_url <chr>, profile_expanded_url <chr>, account_lang <lgl>,
## # profile_banner_url <chr>, profile_background_url <chr>,
## # profile_image_url <chr>
stt %>% group_by(screen_name) %>%
summarize(n = n()) %>%
mutate(percent_of_tweets = n/sum(n)) %>%
arrange(desc(n)) %>% slice(1:10)
## # A tibble: 10 x 3
## screen_name n percent_of_tweets
## <chr> <int> <dbl>
## 1 Makalewakan2 15 0.0838
## 2 NomadicBrits 5 0.0279
## 3 cehart03 4 0.0223
## 4 RGDives 4 0.0223
## 5 AnthonyCatucci 3 0.0168
## 6 FallHolidaze 3 0.0168
## 7 KauaiMarionette 3 0.0168
## 8 NatureCutsTags 3 0.0168
## 9 sebphotog 3 0.0168
## 10 StylingTech 3 0.0168
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#A]))"
seaturtle_words <- stt %>% select(status_id, text) %>%
filter(!str_detect(text, '^"')) %>%
mutate(text = str_replace_all(text,
"https:t.co/[A-Za-z\\d]+|&",
"")) %>%
unnest_tokens(word, text, token = "regex",
pattern = reg) %>%
filter(!word %in% stop_words$word, str_detect(
word, "[a-z]"))
seaturtle_words_sentiments <- seaturtle_words %>%
inner_join(nrc, by = "word")
seaturtle_words_sentiments2 <- seaturtle_words_sentiments %>%
group_by(sentiment) %>% summarize(n = n()) %>%
arrange(desc(n))
This time I was able to see a variation in the three emotions, which gave hope for better data to work with.
pos_stt_id <- seaturtle_words_sentiments %>%
filter(sentiment == "positive") %>% distinct(status_id)
stt %>% inner_join(pos_stt_id, by = "status_id") %>%
select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 "Sea Turtle!!! This is big 40”x20” and it’s awesome!!! This is ready to go f~
## 2 "Woohoo!!! Just finished this awesome 40”x17” Sea Turtle!! I love the colors~
## 3 "WOW!!! This is an incredible 40”x15” front facing Sea Turtle!!! Ready to go~
## 4 This could be the biggest turtle swarm ever filmed at sea https://t.co/5NJMj~
## 5 Marco Island could have new sea turtle ordinance for 2020 nesting season htt~
## 6 Sea Turtle Painting Hawaii Art Sea Turtle Decor Sea Turtle Wall Art Kauai Po~
## 7 Turtle art prints, Hawaiian art, Kauai art prints, Hawaii painting, Hawaiian~
## 8 Gemstone Sea Turtle Pendant https://t.co/2ZcFAJyWon #FallHolidaze #Etsy #Sea~
## 9 Gemstone Sea Turtle Pendant https://t.co/2ZcFAJyWon #FallHolidaze #Etsy #Sea~
## 10 Blue Sea Sediment Stone Sea Turtle Pendant https://t.co/oX7obzsxaj #Etsy #Fa~
sad_stt_id <- seaturtle_words_sentiments %>% filter(sentiment == "sadness") %>%
distinct(status_id)
stt %>% inner_join(sad_stt_id, by = "status_id") %>% select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 Sea Turtle Painting Hawaii Art Sea Turtle Decor Sea Turtle Wall Art Kauai Po~
## 2 Turtle art prints, Hawaiian art, Kauai art prints, Hawaii painting, Hawaiian~
## 3 Blue Sea Sediment Stone Sea Turtle Pendant https://t.co/oX7obzsxaj #Etsy #Fa~
## 4 "Green Sea Turtle\n.\nToo cool for you or me, the green sea turtle always se~
## 5 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 6 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 7 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 8 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 9 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 10 Sea turtle key hanger, hand painted key hanger, beach hut key rack, seaside ~
Although the first six anger tweets below are repeats from the sadness results, tweets seven through ten are more in line with the sentiment.
anger_stt_id <- seaturtle_words_sentiments %>% filter(sentiment == "anger") %>%
distinct(status_id)
stt %>% inner_join(anger_stt_id, by = "status_id") %>% select(text) %>% slice(1:10)
## # A tibble: 10 x 1
## text
## <chr>
## 1 Blue Sea Sediment Stone Sea Turtle Pendant https://t.co/oX7obzsxaj #Etsy #Fa~
## 2 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 3 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 4 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 5 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 6 He's not really grumpy! https://t.co/TkDYbZ1utZ #seaturtle children #babies ~
## 7 "Grumpy sea turtle. Maybe she's unhappy with our treatment of the oceans and~
## 8 "Devastating news \U0001f422 #seaturtle\nhttps://t.co/dMXDuTKrvC"
## 9 Increase of #seaturtle death in Bengkulu. suspected caused by increase in pl~
## 10 "Our recent tweet is evidence of exactly this! Read more from @MongabayID ab~
I wanted to use the country code attached to each tweet to plot where the people discussing sea turtles were located. I tried multiple ways of pulling country into my sentiment tibbles. In the end, the code that did not return an error message was:
seaturtle_words_sentiments2 <- merge(seaturtle_words_sentiments2, stt, "status_id")
However, it still did not do what I wanted; as you can see, there is no country column in this tibble.
head(seaturtle_words_sentiments2)
## # A tibble: 6 x 2
## sentiment n
## <chr> <int>
## 1 positive 187
## 2 joy 70
## 3 anticipation 55
## 4 negative 48
## 5 trust 47
## 6 sadness 35
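One possible reason the country never arrives is that seaturtle_words_sentiments2 has already been summarized down to sentiment and n, so it has no status_id left to match against stt. Joining the country onto the word-level tibble first, and summarizing afterwards, may work better. Here is a minimal sketch with made-up stand-ins (toy_stt, toy_sent) for stt and seaturtle_words_sentiments.

```r
library(dplyr)

# Made-up stand-in for stt: one row per tweet
toy_stt <- tibble(
  status_id    = c("1", "2", "3"),
  country_code = c("US", NA, "ID")
)

# Made-up stand-in for seaturtle_words_sentiments: one row per matched word
toy_sent <- tibble(
  status_id = c("1", "1", "2", "3"),
  sentiment = c("positive", "joy", "positive", "sadness")
)

# Attach the country first, then count by country and sentiment
by_country <- toy_sent %>%
  left_join(toy_stt %>% select(status_id, country_code), by = "status_id") %>%
  count(country_code, sentiment, sort = TRUE)
```

In real data, country_code may be missing for many tweets, so the NA group could turn out to be the largest one.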
Therefore, when I tried to build a graph by country, it could not pull the data. I ended up plotting the sentiments without country data, although that was not the comparison I wanted.
ggplot(seaturtle_words_sentiments2, aes(x = sentiment, y = n)) +
geom_bar(stat = "identity", position = "dodge") + xlab("Sentiment") + ylab("Count") +
theme(axis.text.x = element_text(angle = 90,
hjust = 1))
I find R fascinating and can see how it could make analysis so much easier and more efficient than Excel. However, as we approach the end of the semester, I find this course has taught me some basics but has mostly highlighted how much I still don’t know, as I run into what I see as coding failure time and time again. I look forward to the future analytics courses and hope they add to the little box of R knowledge I have started.