Assignment 6

I was watching a replay of Saturday Night Live and trying to think of something interesting to look at. I searched for the hashtag #SNL on tweets up until 12/06/2020, as I wanted to gather relevant data on the most recent show while avoiding upcoming or past shows as much as possible. I selected 1,000 tweets because I didn’t want to overload the system but still wanted a good sample. I first looked at who was commenting on the show by analyzing the platform each tweet came from, and then I looked at the frequency of tweets over time. Finally, I reviewed the sentiment of the tweets as you did in your class example. Data was obtained through Twitter and used with their permission.

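# Packages used in this analysis
library(rtweet)
library(dplyr)
library(ggplot2)
library(stringr)
library(tidytext)

# Pull up to 1,000 original tweets (no retweets) tagged #SNL
num_tweets <- 1000
SNL <- search_tweets('#SNL', n = num_tweets, until = "2020-12-06", include_rts = FALSE)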

# Count tweets per posting platform and compute each platform's share
SNL_source <- SNL %>% group_by(source) %>%
  summarize(n = n()) %>%
  mutate(percent_of_tweets = n/sum(n)) %>%
  arrange(desc(n))

# Keep the four main Twitter clients and plot their share of all tweets
platforms <- c("Twitter Web Client", "Twitter for iPhone", "Twitter for Android", "Twitter Web App")
Platform_Data <- SNL_source %>% filter(source %in% platforms)
ggplot(Platform_Data, aes(x = source, y = percent_of_tweets, fill = source)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +  # show the 0-1 proportion as a percent
  xlab("Platform") +
  ylab("Percent of Tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("What Platform Do Saturday Night Live Viewers Use?")

# Plot the number of tweets in each 60-minute interval
Time_volume <- SNL %>%
  rtweet::ts_plot(., by = "60 minutes") +
  ggplot2::labs(
    x = NULL,
    y = NULL,
    title = "Hourly Tweet Volume for #SNL")
Time_volume

# Split tweets into words, keeping hashtags and mentions; drop links and stop words
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
SNL_Words <- SNL %>% select(status_id, text) %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))

nrc <- get_sentiments("nrc") %>%
  select(word, sentiment)

##SNL_Words %>% group_by(word) %>% summarize(n = n()) %>% arrange(desc(n)) %>% top_n(20)

SNL_sentiments <- SNL_Words %>% inner_join(nrc, by = "word")

# Count word matches per NRC sentiment category, most frequent first
SNL_sentiment_counts <- SNL_sentiments %>%
  group_by(sentiment) %>%
  summarize(n = n()) %>%
  arrange(desc(n)) %>%
  rename("Sentiment" = sentiment, "Total Mentions" = n)

library(knitr)
kable(SNL_sentiment_counts, align = "lc", caption = '__Top Sentiments of SNL Tweets__')

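**Top Sentiments of SNL Tweets**

| Sentiment    | Total Mentions |
|:-------------|:--------------:|
| positive     | 608            |
| negative     | 474            |
| anticipation | 451            |
| trust        | 440            |
| joy          | 298            |
| anger        | 219            |
| fear         | 216            |
| sadness      | 206            |
| surprise     | 198            |
| disgust      | 147            |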
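
To put the counts in the table in proportion, here is a minimal sketch that re-enters them by hand and compares the positive and negative totals. The names `sentiment_counts`, `polarity`, and `polarity_share` are my own and not part of the original analysis. One caveat: the NRC lexicon can tag a single word with more than one category, so these are word-category matches, not tweets.

```r
# Counts copied from the "Top Sentiments of SNL Tweets" table
sentiment_counts <- c(
  positive = 608, negative = 474, anticipation = 451, trust = 440,
  joy = 298, anger = 219, fear = 216, sadness = 206,
  surprise = 198, disgust = 147
)

# Positive and negative are the two polarity categories in NRC;
# the other eight are emotion categories
polarity <- sentiment_counts[c("positive", "negative")]

# Share of polarity-tagged word matches, in percent
polarity_share <- round(100 * polarity / sum(polarity), 1)
polarity_share
# positive negative
#     56.2     43.8
```

So positive matches still outnumber negative ones, but only by about 56 to 44.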

Overall, I was surprised by how many words in the tweets were tagged negative by the NRC sentiment lexicon. For a show that is supposed to bring joy, the Twitter world really struggles to find it. With respect to tweet volume, it seems volume was highest in the days leading up to the show, while tweets after the show generally leveled off. I did not research why that would be, but it could be that the news picked up something about SNL on the 3rd, which caused a small spike in tweets. Finally, the platform breakdown was surprising to me, specifically the number of people still using the Twitter Web App compared to native phone apps.

I must admit I did not put a ton of effort into following my initial vision on this assignment, as I was getting a little burned out. For the line graph, I tried to create one the way we learned in class but was unsuccessful after a few attempts. Rather than continuing to try, I went online and found an example I could work with (still modifying it some) that uses the `ts_plot` and `labs` functions. Other than that, I had some hiccups along the way but was generally successful in this project.
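
For reference, the hand-built version of the hourly line graph could look something like the sketch below. This is only one possible approach under my own assumptions, not the class method: the timestamps are made-up stand-ins (not real tweets), and the names `tweets`, `hourly_counts`, and `hourly_plot` are hypothetical. In the real analysis, the `created_at` column of `SNL` would presumably take the place of the toy data.

```r
library(dplyr)
library(ggplot2)

# Toy timestamps standing in for the tweet creation times (not real data)
tweets <- data.frame(
  created_at = as.POSIXct(
    c("2020-12-03 14:05:00", "2020-12-03 14:40:00", "2020-12-03 15:10:00",
      "2020-12-05 23:45:00", "2020-12-05 23:50:00", "2020-12-05 23:59:00"),
    tz = "UTC"
  )
)

# Round each timestamp down to the start of its hour, then count tweets per hour
hourly_counts <- tweets %>%
  mutate(hour = as.POSIXct(trunc(created_at, units = "hours"))) %>%
  count(hour, name = "n_tweets")

# Line graph of tweets per hour
hourly_plot <- ggplot(hourly_counts, aes(x = hour, y = n_tweets)) +
  geom_line() +
  geom_point() +
  labs(x = NULL, y = "Tweets per hour", title = "Tweet volume by hour (toy data)")
hourly_plot
```

Hours with no tweets do not appear in `hourly_counts`, so the line simply connects the hours that do have tweets.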

Evan Libby

12/6/2020