TWIN PEAKS SEASON 3: WAS IT WORTH THE WAIT?


Pre-cleaning

2017 was an important year for all Twin Peaks fans: 26 years after Season 2 ended, Showtime brought the show back for a third season. The hype around the revival was incredible. However, it is not clear whether the fans truly appreciated the opportunity to glance into the strange world of Twin Peaks one more time, or whether the Network overestimated the need for a revival based on the fans’ nostalgia.

The purpose of this analysis is to identify fans’ reaction to the Event (as dubbed by the Network).

I started this project by loading several libraries to help with Twitter scraping, data wrangling, manipulation, and visualisation of the data. I then reused some of the code chunks from Units 11 & 12.

library(dplyr)
library(twitteR)
library(tidytext)
library(stringr)
library(ggplot2)
library(knitr)
library(wordcloud)

Data Analysis

The goal of my analysis was to answer the following questions:

  • What are the most common sentiments used to describe Season 3 of Twin Peaks?
  • What are the most common words used to describe Season 3 of Twin Peaks?

I searched for #TwinPeaks to find viewers’ comments, primarily because #TwinPeaksTheReturn and #TwinPeaksTheEvent did not return as many hits as I wanted. I also assumed that tweets with this hashtag would mostly concern the third season, since the first two seasons aired in a pre-Twitter world.

In order for me to answer my first question, I had to create a data frame using 3,500 tweets I downloaded from Twitter:

num_tweets <- 3500
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
TwinPeaks <- searchTwitter('#TwinPeaks', n = num_tweets)
TwinPeaks_df <- twListToDF(TwinPeaks)
TwinPeaks_words <- TwinPeaks_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
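# NRC lexicon via get_sentiments(); the older `sentiments %>% filter(lexicon == "nrc")`
# idiom no longer works in recent tidytext releases (the lexicon now needs the textdata package)
nrc <- get_sentiments("nrc") %>%
  select(word, sentiment)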
TwinPeaks_Sentiments <- TwinPeaks_words %>% inner_join(nrc, by = "word")
TwinPeaks_Sentiments %>% group_by(sentiment) %>% summarize(n = n()) %>% arrange(desc(n))
## # A tibble: 10 x 2
##       sentiment     n
##           <chr> <int>
##  1     positive  1919
##  2          joy  1203
##  3     negative  1041
##  4        trust  1033
##  5 anticipation   967
##  6      sadness   760
##  7         fear   750
##  8        anger   703
##  9      disgust   623
## 10     surprise   381

As the table above shows, the reaction to Season 3 has been somewhat polarized: of the words tagged as positive or negative, roughly two thirds (1,919 of 2,960) were positive and one third negative. Among the emotion categories, joy (1,203), trust (1,033) and anticipation (967) were the most common.
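The two-thirds figure can be checked from the counts in the table. A minimal base-R sketch, with the counts copied by hand from the output above (the variable names are my own):

```r
# Counts copied from the #TwinPeaks sentiment table above
polarity <- c(positive = 1919, negative = 1041)

# Share of each polarity among the words tagged positive or negative
polarity_share <- polarity / sum(polarity)
print(round(polarity_share, 3))
```

Note that these are shares of lexicon-matched words, not of tweets or of people.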

I took my analysis a step further by creating a word cloud using the words that were used by the Twitter users to describe the show.

# The stray `rot.per=0.35` that was passed to with() had no effect and is dropped
TwinPeaks_Sentiments %>%
  count(word) %>%
  with(wordcloud(word, n, max.words = 100, scale = c(5, .5), min.freq = 3,
                 random.order = FALSE, rot.per = .15,
                 colors = brewer.pal(8, "RdBu")))

The word cloud gives an overwhelmingly positive picture (in contrast to the sentiment counts), with “Lynch” (the creator of the show), “beautiful” and “wait” being the most frequently used words.
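One caveat about “Lynch”: the cloud is built from words that matched the NRC lexicon, so the director’s surname is being scored as the ordinary word “lynch”, which likely skews some of the sentiment counts. A minimal sketch of how such tokens could be dropped before counting, assuming a hand-made list of names (`name_tokens` and the toy rows below are hypothetical, not part of the original analysis):

```r
library(dplyr)

# Toy stand-in for TwinPeaks_Sentiments (hypothetical rows, not real data)
toy_sentiments <- tibble(
  word      = c("lynch", "lynch", "beautiful", "wait"),
  sentiment = c("negative", "anger", "positive", "anticipation")
)

# Hypothetical list of tokens that are names in this context
name_tokens <- c("lynch")

# Drop the name tokens before counting words or sentiments
cleaned <- toy_sentiments %>%
  filter(!word %in% name_tokens)

cleaned %>% count(sentiment, sort = TRUE)
```

The same filter could be applied to TwinPeaks_words before the join with the lexicon.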

Repeating the steps above, I also ran the same analysis on the public’s reaction to the possibility of a Season 4. Fans of the show use the hashtag #OneOneNine (“119”) when tweeting about the continuation.

num_tweets <- 3500
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"
TheReturn <- searchTwitter('#OneOneNine', n = num_tweets)
TheReturn_df <- twListToDF(TheReturn)
TheReturn_words <- TheReturn_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word,
         str_detect(word, "[a-z]"))
TheReturn_Sentiments <- TheReturn_words %>% inner_join(nrc, by = "word")
TheReturn_Sentiments %>% group_by(sentiment) %>% summarize(n = n()) %>% arrange(desc(n))
## # A tibble: 10 x 2
##       sentiment     n
##           <chr> <int>
##  1     positive    19
##  2          joy    14
##  3      sadness    11
##  4 anticipation    10
##  5     negative     9
##  6        anger     7
##  7      disgust     6
##  8     surprise     5
##  9         fear     4
## 10        trust     4

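The sentiment ranking differs from the one for Season 3: sadness moves up to third place, while trust drops to joint last. Positive words still outnumber negative ones by about two to one (19 versus 9), but with only 89 lexicon matches in total the sample is too small to say much. The prominence of sadness may mean that the people tweeting do not welcome the idea of a Season 4 or, perhaps, that they were saddened by the Season 3 ending.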

# The stray `rot.per=0.35` that was passed to with() had no effect and is dropped
TheReturn_Sentiments %>%
  count(word) %>%
  with(wordcloud(word, n, max.words = 100, scale = c(5, .5), min.freq = 3,
                 random.order = FALSE, rot.per = .15,
                 colors = brewer.pal(8, "RdBu")))

Unfortunately, the word cloud does not appear too helpful either. Apart from the word “Lynch”, no other word stands out.

My next step in analysis was a visual representation of sentiments side by side.

TwinPeaks_Sentiments$Hashtag <- "#TwinPeaks"
TheReturn_Sentiments$Hashtag <- "#OneOneNine"
combined_sentiments <-rbind(TwinPeaks_Sentiments, TheReturn_Sentiments)
combined_df <-combined_sentiments %>% 
  group_by(Hashtag, sentiment) %>% 
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n)*100)
ggplot(combined_df, aes(x = sentiment, y = frequency, fill = Hashtag)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Sentiment Frequency within tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values=c("#0072B2", "#D55E00"))

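Read together with the counts in the two tables, the bar chart above shows that the sentiment shares are broadly similar for both hashtags. Anger and disgust take up almost the same share in each (about 7 to 8% and about 7%), and fear is actually less common for #OneOneNine (about 4.5% versus 8%). The clearest differences in tweets about a potential Season 4 are a larger share of sadness (about 12% versus 8%) and a smaller share of trust (about 4.5% versus 11%). This could mean that viewers are not looking forward to a Season 4, or perhaps that they are worried about it not happening. With only 89 lexicon matches for #OneOneNine, either reading is tentative.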
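The percentages behind the chart can also be tabulated directly from the counts printed in the two tables. A small base-R sketch, with the counts copied by hand and helper names of my own choosing:

```r
# Counts copied from the two sentiment tables above
twin_peaks <- c(positive = 1919, joy = 1203, negative = 1041, trust = 1033,
                anticipation = 967, sadness = 760, fear = 750, anger = 703,
                disgust = 623, surprise = 381)
one_one_nine <- c(positive = 19, joy = 14, sadness = 11, anticipation = 10,
                  negative = 9, anger = 7, disgust = 6, surprise = 5,
                  fear = 4, trust = 4)

# Percentage share of each sentiment within its own hashtag
to_percent <- function(counts) round(100 * counts / sum(counts), 1)

# Align both hashtags on the same sentiment order before comparing
comparison <- data.frame(
  sentiment  = names(twin_peaks),
  TwinPeaks  = unname(to_percent(twin_peaks)),
  OneOneNine = unname(to_percent(one_one_nine)[names(twin_peaks)])
)
print(comparison)
```

Comparing shares rather than raw counts matters here because the two samples differ in size by roughly a factor of one hundred.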

Finally, I decided to look at the user names behind the analyzed tweets, because #OneOneNine could be used for a different cause that I am not aware of.

TW_TopUsers<-TwinPeaks_df %>% 
        group_by(screenName) %>% 
        summarize(n = n()) %>%
        arrange(desc(n)) %>%
        filter(n > 2)

TR_TopUsers<-TheReturn_df %>% 
        group_by(screenName) %>% 
        summarize(n = n()) %>%
        arrange(desc(n)) %>%
        filter(n > 2)
common_users<-inner_join(TR_TopUsers, TW_TopUsers, by = "screenName")
# inner_join() puts the left table's count (TR_TopUsers, i.e. #OneOneNine) first
colnames(common_users)<-c("Screen Name", "#OneOneNine", "#TwinPeaks")
kable(common_users)
Screen Name       #OneOneNine  #TwinPeaks
BluedRoses                  5           7
Theresabanks101             3          10
TwinPeaksBotNew             3         151

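As the list above shows, three accounts tweeted at least three times under each hashtag, which suggests that #OneOneNine is indeed used by fans of Twin Peaks. One of them, TwinPeaksBotNew, appears from its name to be a bot.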
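One thing to watch in the join above: inner_join() returns the left table’s columns first, so the first count column belongs to TR_TopUsers (#OneOneNine). A sketch, with made-up users, of letting the `suffix` argument label the clashing columns instead of renaming them by hand (the suffix strings are my own choice):

```r
library(dplyr)

# Made-up counts standing in for TR_TopUsers and TW_TopUsers
tr_top <- tibble(screenName = c("user_a", "user_b"), n = c(5L, 3L))
tw_top <- tibble(screenName = c("user_b", "user_a"), n = c(10L, 7L))

# suffix labels the clashing `n` columns: left table first, right table second
common <- inner_join(tr_top, tw_top, by = "screenName",
                     suffix = c("_OneOneNine", "_TwinPeaks"))
print(common)
```

Because the labels are attached at join time, swapping the order of the two tables cannot silently swap the column headers.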

Conclusion

Unfortunately, the results of my study proved to be inconclusive. On the one hand, based on the tweets shared, people are excited about the show: words associated with positive sentiment, joy and trust dominate. On the other hand, the far smaller set of tweets about the possibility of a Season 4 carries a larger share of sadness. Will there be a continuation, or is Season 3 it? Time will tell.