
Instructions

In this assignment, you will download, analyze, and visualize Reddit threads based on a keyword of your choice. Specifically, you will be performing the following steps:

Section 0. Packages

# Package names
packages <- c("RedditExtractoR", "anytime", "magrittr", "httr", "tidytext", "tidyverse", "igraph", "ggraph", "wordcloud2", "textdata", "here", "jsonlite", "syuzhet", "dplyr", "sentimentr", "ggplot2", "stringr", "devtools", "htmltools")

# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages])
}

# Load packages
invisible(lapply(packages, library, character.only = TRUE))

I aim to examine public reactions to the latest Marvel movie on Reddit by applying sentiment analysis to user-generated text.

The keyword I chose for the thread search is "Guardians of the Galaxy Vol. 3".

# using keyword
threads_1_f <- find_thread_urls(keywords = 'Guardians of the Galaxy Vol. 3', 
                              sort_by = 'relevance', 
                              period = 'all') %>% 
  drop_na()

rownames(threads_1_f) <- NULL

# Sanitize text
threads_1_f %<>% 
  mutate(across(
    where(is.character),
    ~ .x %>%
        str_replace_all("\\|", "/") %>%   # replace vertical bars
        str_replace_all("\\n", " ") %>%   # replace newlines
        str_squish()                      # clean up extra spaces
  ))

colnames(threads_1_f)
head(threads_1_f, 3) %>% knitr::kable()
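
The same three clean-up steps are applied to every data frame in this report, so they could be wrapped in one function. The sketch below is only an illustration: `sanitize_text()` and the `toy` data frame are hypothetical names that are not part of the original workflow, and the toy rows stand in for the output of find_thread_urls().

```r
library(dplyr)
library(stringr)

# Hypothetical helper: applies the clean-up steps used in this report
# to every character column of a data frame.
sanitize_text <- function(df) {
  df %>%
    mutate(across(
      where(is.character),
      ~ .x %>%
        str_replace_all("\\|", "/") %>%  # replace vertical bars
        str_replace_all("\\n", " ") %>%  # replace newlines
        str_squish()                     # trim and collapse extra spaces
    ))
}

# Toy data standing in for scraped threads (made up for illustration)
toy <- data.frame(
  title    = c("A | B", "  spaced   out  "),
  text     = c("line one\nline two", "plain"),
  comments = c(3, 5),
  stringsAsFactors = FALSE
)

toy_clean <- sanitize_text(toy)
toy_clean
```

Numeric columns such as `comments` pass through unchanged because `where(is.character)` selects only character columns.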

Searching by subreddit: find_subreddits() returns a list of subreddits related to the keyword.

# search for subreddits
subreddit_list <- RedditExtractoR::find_subreddits('Guardians of the Galaxy Vol. 3')
subreddit_list %>% 
  arrange(desc(subscribers)) %>% 
  .[1:25,c('subreddit','title','subscribers')] %>% 
  knitr::kable()
threads_1_f$subreddit %>% table() %>% sort(decreasing = TRUE) %>% head(20)

Threads within subreddit for Guardians of the Galaxy Vol. 3.

# using subreddit
threads_2_f <- find_thread_urls(subreddit = c('marvelstudios', 'boxoffice', 'MarvelStudiosSpoilers', 'movies', 'shittymoviedetails', 'Marvel', 'comicbooks', 'DC_Cinematic', 'marvelmemes', 'MarvelStudios_Rumours'), 
                              sort_by = 'top', 
                              period = 'year') %>% 
  drop_na()

rownames(threads_2_f) <- NULL

# Sanitize text
threads_2_f %<>% 
  mutate(across(
    where(is.character),
    ~ .x %>%
        str_replace_all("\\|", "/") %>% 
        str_replace_all("\\n", " ") %>%
        str_squish()
  ))

head(threads_2_f, 3) %>% knitr::kable()

Searching by both the keyword and the subreddits.

# using both subreddit and keyword
threads_3_f <- find_thread_urls(keywords= 'Guardians of the Galaxy Vol. 3', 
                              subreddit = c('marvelstudios', 'boxoffice', 'MarvelStudiosSpoilers', 'movies', 'shittymoviedetails', 'Marvel', 'comicbooks', 'DC_Cinematic', 'marvelmemes', 'MarvelStudios_Rumours'), 
                              sort_by = 'relevance', 
                              period = 'all') %>% 
  drop_na()

rownames(threads_3_f) <- NULL

# Sanitize text
threads_3_f %<>% 
  mutate(across(
    where(is.character),
    ~ .x %>%
        str_replace_all("\\|", "/") %>% 
        str_replace_all("\\n", " ") %>% 
        str_squish() 
  ))

head(threads_3_f, 3) %>% knitr::kable()
# get individual comments
threads_1_content <- get_thread_content(threads_1_f$url[1:4])
threads_2_content <- get_thread_content(threads_2_f$url[1:4])
threads_3_content <- get_thread_content(threads_3_f$url[1:4])

For the purpose of this assignment, I will use the threads from the subreddit search (threads_2_f).

names(threads_2_content)

# check upvotes and downvotes
print(threads_2_content$threads[,c('upvotes','downvotes','up_ratio')])
load("threads_2_f.RData")
load("threads_2_content.RData")

Cleaning the text data for threads_2_content and saving all the thread objects as .RData files.

# Sanitize text
threads_2_content$comments %<>% 
  mutate(across(
    where(is.character),
    ~ .x %>%
        str_replace_all("\\|", "/") %>% 
        str_replace_all("\\n", " ") %>% 
        str_squish() 
  ))

head(threads_2_content$comments, 3) %>% knitr::kable()
# Save each object to its own .RData file (the path must be passed as `file =`)
save(threads_1_f, file = "threads_1_f.RData")
save(threads_1_content, file = "threads_1_content.RData")
save(threads_2_f, file = "threads_2_f.RData")
save(threads_2_content, file = "threads_2_content.RData")
save(threads_3_f, file = "threads_3_f.RData")
save(threads_3_content, file = "threads_3_content.RData")
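
The load() and save() calls in this section suggest that the scraped objects are cached between runs so Reddit is not queried every time the document is knitted. A minimal sketch of that pattern, under that assumption, is shown below. `load_or_build()`, `build_toy()` and the temporary file are hypothetical and only stand in for the real scraping calls.

```r
# Hypothetical helper: load an object named `name` from `path` if the file
# exists; otherwise create it with `build()` and save it for the next run.
load_or_build <- function(name, path, build) {
  if (file.exists(path)) {
    load(path, envir = environment())  # restores `name` into this function's environment
  } else {
    assign(name, build())              # create the object locally
    save(list = name, file = path)     # `file =` must be named
  }
  get(name)
}

# Toy stand-in for a slow scraping call; counts how often it really runs
calls <- 0
build_toy <- function() {
  calls <<- calls + 1
  data.frame(url = c("a", "b"), stringsAsFactors = FALSE)
}

path <- file.path(tempdir(), "toy_threads.RData")
if (file.exists(path)) file.remove(path)

first  <- load_or_build("toy_threads", path, build_toy)  # builds and saves
second <- load_or_build("toy_threads", path, build_toy)  # loads from disk
calls
```

Here the builder runs only on the first call; the second call reads the saved file instead.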

2. Tokenization and stop words

2-1. Tokenization

# Word tokenization
words <- threads_2_f %>% 
  unnest_tokens(output = word, input = text, token = "words") # run `?tidytext::unnest_tokens` on the console

words %>%
  count(word, sort = TRUE) %>%
  top_n(20) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(x = word, y = n)) +
  geom_col() +
  xlab(NULL) +
  coord_flip() +
  labs(x = "words",
       y = "counts",
       title = "Unique wordcounts")
## Selecting by n

2-2. Stop words

Removing stop words using a built-in dataset from the tidytext package.

# load list of stop words - from the tidytext package
data("stop_words")
# view 100 random stop words
print(stop_words$word[sample(1:nrow(stop_words), 100)])
##   [1] "generally"     "see"           "etc"           "and"          
##   [5] "we"            "yours"         "gave"          "can"          
##   [9] "face"          "make"          "weren't"       "which"        
##  [13] "like"          "except"        "those"         "become"       
##  [17] "new"           "yourself"      "formerly"      "he'd"         
##  [21] "sensible"      "latter"        "consequently"  "more"         
##  [25] "made"          "looking"       "young"         "uses"         
##  [29] "want"          "go"            "you'll"        "away"         
##  [33] "used"          "would"         "did"           "inner"        
##  [37] "we've"         "goods"         "keeps"         "don't"        
##  [41] "point"         "him"           "whom"          "yourself"     
##  [45] "even"          "we've"         "everywhere"    "themselves"   
##  [49] "others"        "various"       "does"          "everyone"     
##  [53] "certain"       "almost"        "man"           "corresponding"
##  [57] "through"       "same"          "as"            "theirs"       
##  [61] "co"            "used"          "high"          "seeming"      
##  [65] "doesn't"       "perhaps"       "until"         "six"          
##  [69] "seeing"        "every"         "currently"     "well"         
##  [73] "few"           "thanks"        "really"        "been"         
##  [77] "little"        "shouldn't"     "turned"        "allows"       
##  [81] "both"          "anyone"        "wherein"       "present"      
##  [85] "already"       "sub"           "a"             "neither"      
##  [89] "theres"        "p"             "per"           "nd"           
##  [93] "needing"       "later"         "being"         "apart"        
##  [97] "how"           "more"          "back"          "right"

The anti_join() function was used to remove the stop words from the text, which left us with a cleaned set of words.
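
As a toy illustration of how anti_join() drops stop words, the sketch below tokenizes two made-up sentences and removes every token that appears in a small stop list. `toy_posts` and `toy_stop` are hypothetical; `toy_stop` is a five-word stand-in for the much larger `stop_words` table.

```r
library(dplyr)
library(tidytext)

# Two made-up posts
toy_posts <- tibble(
  id   = 1:2,
  text = c("The movie was really good", "I did not like the ending")
)

# Tiny stand-in for tidytext's stop_words (an assumption for this example)
toy_stop <- tibble(word = c("the", "was", "i", "did", "not"))

# One lowercase token per row
toy_words <- toy_posts %>%
  unnest_tokens(word, text)

# Keep only rows of toy_words whose `word` has no match in toy_stop
toy_kept <- toy_words %>%
  anti_join(toy_stop, by = "word")

toy_kept
```

Note that "not" is removed along with the other stop words, so a phrase like "did not like" is reduced to "like" before any word-level scoring.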

# Regex that matches URL-type string
replace_reg <- "http[s]?://[A-Za-z\\d/\\.]+|&amp;|&lt;|&gt;"

words_clean <- threads_2_f %>% 
  # drop URLs
  mutate(text = str_replace_all(text, replace_reg, "")) %>%
  # Tokenization (word tokens)
  unnest_tokens(word, text, token = "words") %>% 
  # drop stop words
  anti_join(stop_words, by = "word") %>% 
  # keep only tokens that contain at least one letter
  filter(str_detect(word, "[a-z]"))

# Check the number of rows after removal of the stop words. There should be fewer words now
print(
  glue::glue("Before: {nrow(words)}, After: {nrow(words_clean)}")
)
## Before: 7240, After: 2585
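
To see what `replace_reg` actually removes, the sketch below applies it to two made-up strings. The character class in the pattern allows only letters, digits, slashes and dots, so a URL containing other characters (such as `?`, `=`, `-` or `_`) is removed only up to the first such character. The broader pattern `replace_reg_broad` is an assumption offered for comparison, not part of the original analysis.

```r
library(stringr)

# Pattern used in this report
replace_reg <- "http[s]?://[A-Za-z\\d/\\.]+|&amp;|&lt;|&gt;"

# Hypothetical alternative: strip everything up to the next whitespace
replace_reg_broad <- "https?://\\S+|&amp;|&lt;|&gt;"

# Made-up example strings
toy_text <- c(
  "Trailer: https://example.com/watch.html &amp; more",
  "Thread https://example.com/r/movies?id=1 here"
)

cleaned       <- str_squish(str_replace_all(toy_text, replace_reg, ""))
cleaned_broad <- str_squish(str_replace_all(toy_text, replace_reg_broad, ""))

cleaned        # "Trailer: more"  "Thread ?id=1 here"
cleaned_broad  # "Trailer: more"  "Thread here"
```

With the original pattern, the leftover `?id=1` would still be tokenized into `id` and `1`; the later letter filter removes `1` but keeps `id`.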

A new plot is created after removing the stop words. It makes it easier to see which meaningful words are used most frequently.

words_clean %>%
  count(word, sort = TRUE) %>%
  top_n(20, n) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(x = word, y = n)) +
  geom_col() +
  xlab(NULL) +
  coord_flip() +
  labs(x = "words",
       y = "counts",
       title = "Unique wordcounts")

Generating a word cloud that illustrates the frequency of words excluding the keyword.

The two word clouds below compare word frequencies before and after removing stop words.

knitr::opts_chunk$set(widgetframe=FALSE)
# words %>%
#   count(word, sort = TRUE) %>%
#   wordcloud2()

wc1 <- words %>%
count(word, sort = TRUE) %>%
wordcloud2()

htmltools::tagList(wc1)
words_clean %>%
  count(word, sort = TRUE) %>%
  wordcloud2()
knitr::include_graphics("C:/Users/akaamah3/Documents/SCaRP Course Materials_Fall2025/Into_to_Urban_Analytics/CP8883_working_with_R/1.png")

The word clouds generated above look nice, but their color schemes can be a bit overwhelming. The following block of code therefore creates a custom color palette that highlights a selected number of words and grays out the rest. The random colors are generated using the HSV (Hue, Saturation, Value) color model.

n <- 20 # number of words with color
h <- runif(n, 0, 1) # any color
s <- runif(n, 0.6, 1) # vivid
v <- runif(n, 0.3, 0.7) # neither too dark nor too bright

df_hsv <- data.frame(h = h, s = s, v = v)
pal <- apply(df_hsv, 1, function(x) hsv(x['h'], x['s'], x['v']))
pal <- c(pal, rep("grey", 10000))
words_clean %>%
  count(word, sort = TRUE) %>%
  wordcloud2(color = pal,
             minRotation = 0,
             maxRotation = 0,
             ellipticity = 0.8)
knitr::include_graphics("C:/Users/akaamah3/Documents/SCaRP Course Materials_Fall2025/Into_to_Urban_Analytics/CP8883_working_with_R/2.png")
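
The palette above changes on every run because runif() draws new values each time. A minimal variant is sketched below under two assumptions that are not in the original code: a fixed seed (42 is arbitrary) so the colors are repeatable, and a hypothetical `n_words` standing in for the number of rows in the word-count table. hsv() is vectorized, so it can be called once on the three vectors without apply().

```r
set.seed(42)  # assumption: any fixed seed makes the palette repeatable

n_colored <- 20  # number of words that get a color

# hsv() accepts vectors, so one call builds all 20 colors
pal_top <- hsv(
  h = runif(n_colored, 0, 1),      # any hue
  s = runif(n_colored, 0.6, 1),    # vivid
  v = runif(n_colored, 0.3, 0.7)   # neither too dark nor too bright
)

# Hypothetical table size; in the report this would be the number of
# distinct words passed to wordcloud2()
n_words <- 150

# Colored entries first, grey for every remaining word
pal_sketch <- c(pal_top, rep("grey", max(0, n_words - n_colored)))

head(pal_sketch, 3)
length(pal_sketch)
```

Because wordcloud2() receives the words sorted by count, the first 20 palette entries go to the 20 most frequent words.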

Conducting a tri-gram analysis.

# Get trigrams. 
words_trigram <- threads_2_f %>%
  mutate(text = str_replace_all(text, replace_reg, "")) %>%
  select(text) %>%
  unnest_tokens(output = trigram,
                input = text,
                token = "ngrams",
                n = 3)
# Show trigram with sorted values
words_trigram %>%
  count(trigram, sort = TRUE) %>% 
  head(20) %>% 
  knitr::kable()

| trigram | n |
|:--|--:|
| NA | 177 |
| did not like | 11 |
| a lot of | 9 |
| one of the | 8 |
| 10 liked it | 6 |
| brave new world | 6 |
| america brave new | 5 |
| was able to | 5 |
| 10 loved it | 4 |
| 7 10 liked | 4 |
| captain america brave | 4 |
| do you think | 4 |
| he didn t | 4 |
| in this movie | 4 |
| loved it liked | 4 |
| of the best | 4 |
| pov shown in | 4 |
| 10 did not | 3 |
| 9 10 loved | 3 |
| able to build | 3 |

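
The large NA count at the top of the table most likely comes from posts whose text has fewer than three words (for example link or image posts), for which no trigram can be formed. A minimal sketch, using made-up `toy_posts`, of dropping those rows right after tokenization:

```r
library(dplyr)
library(tidytext)

# Made-up posts: one long enough for trigrams, one too short
toy_posts <- tibble(text = c("captain america brave new world", "ok"))

toy_trigrams <- toy_posts %>%
  unnest_tokens(output = trigram, input = text, token = "ngrams", n = 3) %>%
  filter(!is.na(trigram))  # drop rows where no trigram could be formed

toy_trigrams
```

In the report itself the NA rows are removed later anyway, because the str_detect() filters discard them; filtering early only keeps the first frequency table cleaner.
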
# Separate each trigram into three columns
words_trigram_sep <- words_trigram %>%
  separate(trigram, into = c("word1", "word2", "word3"), sep = " ")

library(stringi)
words_trigram_filtered <- words_trigram_sep %>%
  filter(!word1 %in% stop_words$word &
         !word2 %in% stop_words$word &
         !word3 %in% stop_words$word) %>%
  filter(str_detect(word1, "[a-z]") &
         str_detect(word2, "[a-z]") &
         str_detect(word3, "[a-z]")) %>%
  filter(stri_enc_isascii(word1) &
         stri_enc_isascii(word2) &
         stri_enc_isascii(word3))

# Sort the new trigram (n=3) counts:
trigram_counts <- words_trigram_filtered %>%
  count(word1, word2, word3) %>%
  arrange(desc(n))

head(trigram_counts, 20) %>% 
  knitr::kable()

| word1 | word2 | word3 | n |
|:--|:--|:--|--:|
| captain | america | brave | 4 |
| america | civil | war | 2 |
| avengers | infinity | war | 2 |
| captain | america | civil | 2 |
| disney | marvel | captain | 2 |
| marvel | captain | america | 2 |
| news | disney | marvel | 2 |
| random | unknown | actor | 2 |
| rob | zombie | movie | 2 |
| absurd | energy | source | 1 |
| abusive | family | basically | 1 |
| acknowledge | killing | battlestar | 1 |
| actual | people | speak | 1 |
| ad | revenue | fake | 1 |
| aged | white | dude | 1 |
| aging | scientist | father | 1 |
| ago | spoiling | plot | 1 |
| ahem | red | hulk | 1 |
| america | anthony | mackie | 1 |
| anthony | mackie | captain | 1 |

Discussing noteworthy tri-grams

The most frequent tri-grams (captain america brave, america civil war, captain america civil, marvel captain america) point to a strong thematic focus on Captain America. Several are fragments of film titles: captain america brave is the start of Captain America: Brave New World (brave new world and america brave new also appear in the unfiltered list), while america civil war and avengers infinity war come from Captain America: Civil War and Avengers: Infinity War, which suggests that users are naming these films or reacting to events in them. Tri-grams such as disney marvel captain and news disney marvel highlight discussions involving Marvel Studios and Disney, possibly referencing announcements, news updates, or movie promotions.

Conducting a bi-gram network visualization.

# Get ngrams. You may try playing around with the value of n, n=3, n=4
words_ngram <- threads_2_f %>%
  mutate(text = str_replace_all(text, replace_reg, "")) %>%
  select(text) %>%
  unnest_tokens(output = paired_words,
                input = text,
                token = "ngrams",
                n = 2)
# Showing bi-grams with sorted values
words_ngram %>%
  count(paired_words, sort = TRUE) %>% 
  head(20) %>% 
  knitr::kable()
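
| paired_words | n |
|:--|--:|
| NA | 175 |
| of the | 32 |
| did not | 20 |
| and i | 19 |
| it s | 19 |
| in the | 18 |
| to the | 14 |
| captain america | 13 |
| didn t | 13 |
| it was | 13 |
| not like | 12 |
| the best | 12 |
| the mcu | 12 |
| a lot | 11 |
| i m | 11 |
| one of | 11 |
| this movie | 11 |
| and the | 10 |
| i think | 10 |
| really liked | 10 |
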
# Separate the paired words into two columns
words_ngram_pair <- words_ngram %>%
  separate(paired_words, c("word1", "word2"), sep = " ")

# filter rows where there are stop words under word 1 column and word 2 column
words_ngram_pair_filtered <- words_ngram_pair %>%
  # drop stop words
  filter(!word1 %in% stop_words$word & !word2 %in% stop_words$word) %>% 
  # keep only pairs in which both words contain at least one letter
  filter(str_detect(word1, "[a-z]") & str_detect(word2, "[a-z]"))

# Filter out words that are not encoded in ASCII
# To see what's ASCII, google 'ASCII table'
library(stringi)
words_ngram_pair_filtered %<>% 
  filter(stri_enc_isascii(word1) & stri_enc_isascii(word2))

# Sort the new bi-gram (n=2) counts:
words_counts <- words_ngram_pair_filtered %>%
  count(word1, word2) %>%
  arrange(desc(n))

head(words_counts, 20) %>% 
  knitr::kable()
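
| word1 | word2 | n |
|:--|:--|--:|
| captain | america | 13 |
| tony | stark | 8 |
| infinity | war | 6 |
| america | brave | 5 |
| civil | war | 5 |
| avengers | endgame | 4 |
| pov | shown | 4 |
| action | scenes | 3 |
| black | panther | 3 |
| comic | book | 3 |
| gonna | ruin | 3 |
| marvel | movies | 3 |
| america | civil | 2 |
| anthony | mackie | 2 |
| avengers | infinity | 2 |
| blade | movie | 2 |
| captain | marvel | 2 |
| chris | evans | 2 |
| christian | bale | 2 |
| comic | accurate | 2 |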

Visualisation of words occurring in pairs

# plot word network
words_counts %>%
  filter(n >= 3) %>%
  graph_from_data_frame() %>% # convert to graph
  ggraph(layout = "fr") +
  geom_edge_link(aes(edge_alpha = .6, edge_width = n)) +
  geom_node_point(color = "darkslategray4", size = 3) +
  geom_node_text(aes(label = name), vjust = 1.8) +
  labs(title = "Word Networks",
       x = "", y = "")
## Warning: The `trans` argument of `continuous_scale()` is deprecated as of ggplot2 3.5.0.
## ℹ Please use the `transform` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

Performing a sentiment analysis on the text data.

# syuzhet package
get_sentiment(threads_2_f$text, method='syuzhet')
##   [1]  0.00  3.85  2.95 -8.00  0.00  0.00  0.00  8.15  0.00  0.00  0.00  0.00
##  [13]  0.00  0.00  0.00  0.00  0.00  2.40  5.20  1.75  0.00  0.00  0.00  1.00
##  [25]  0.00  0.00  0.00  3.40  0.00  2.05  0.00  0.00  2.40  0.00  0.00  0.15
##  [37]  0.00 -0.75  0.00  0.00  0.00 -0.60  7.40  0.00  0.00  0.00  0.00  0.50
##  [49]  0.00  0.00  2.30  0.00  0.00  0.00  0.00  0.00  0.00  7.80  0.00  0.00
##  [61]  0.00  0.00  0.00  0.00  0.00  0.00  2.90  2.05  0.00  1.30  0.00 -1.65
##  [73]  0.00  0.00  0.00  4.00  0.05  0.00  0.00  0.00  1.05  0.00  0.00  0.00
##  [85]  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
##  [97]  0.00  0.00  2.70  1.25  0.00  0.00  3.35  2.85  0.00  0.00  0.00  0.00
## [109]  0.00  2.35  0.00  0.00  1.30  0.00  0.00  2.30  0.00  4.00  0.00  0.00
## [121]  2.75  0.00  0.00  0.00  2.55 -2.70  1.15  0.00 -0.25  0.00  0.00  0.00
## [133]  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  2.75  0.00  6.20  0.00
## [145]  0.00  0.00  0.00  0.00  0.00  0.00  0.00  1.35  0.00  0.00  0.00  0.00
## [157]  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
## [169]  0.00  0.00 -0.50  1.55  0.00  0.60  0.00  0.00  0.00  0.00  0.00  0.00
## [181]  0.00  1.30 12.10  0.00 -5.30  0.00 -1.10  0.00  0.00  0.00 -1.40  0.00
## [193]  0.00  0.10  0.00 -1.25 -0.35  0.00  0.00  0.00  0.00 -0.55  0.00  0.00
## [205]  0.00  0.00  1.65  0.00  0.00  0.00  0.80  0.00  0.00  0.80  0.00  0.00
## [217]  0.00  0.00  0.00 -0.45  0.00  0.00  0.00  0.25  4.00  0.00  0.00  0.00
## [229]  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
## [241]  0.00  1.55  0.00  0.00  0.90  0.00  0.00 -3.35  0.00
get_sentiment(threads_2_f$text, method='bing')
##   [1]  0  1  3 -7  0  0  0  8  0  0  0  0  0  0  0  0  0  0  1  4  0  0  0  2  0
##  [26]  0  0  5  0  1  0  0  0  0  0  0  0 -1  0  0  0 -1  2  0  0  0  0  1  0  0
##  [51]  5  0  0  0  0  0  0  8  0  0  0  0  0  0  0  0  3 -1  0  1  0  0  0  0  0
##  [76]  3  2  0  0  0  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  0
## [101]  0  0 -2  1  0  0  0  0  0  3  0  0  1  0  0  4  0  4  0  0  2  0  0  0  3
## [126] -6  0  0 -1  0  0  0  0  0  0  0  0  0  0  0  2  0  2  0  0  0  0  0  0  0
## [151]  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  4  0  0  0
## [176]  0  0  0  0  0  0  1 12  0 -5  0 -1  0  0  0 -1  0  0  1  0  4 -1  0  0  0
## [201]  0 -1  0  0  0  0  0  0  0  0  1  0  0  1  0  0  0  0  0 -6  0  0  0  0  3
## [226]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  1  0  0 -2  0
get_sentiment(threads_2_f$text, method='afinn')
##   [1]   0   5   8 -20   0   0   0  20   0   0   0   0   0   0  -2   0   0  -4
##  [19]   5   8   0   0   0  12   0   0   0  14   0   7   0   0   6   0   0   5
##  [37]   0   0   0   0   0   0  13   0   0   0   0   2   0   0   7   0   0   0
##  [55]   0   0   0  21   0   0   0   0   0   0   0   0   5  -4   0   2   0   1
##  [73]   0   0   0  13  -1   0   0   0   2   0   0   0   0   0   0   0   0   0
##  [91]   0   0   0   3   0   0   0   0   4   2   0   0  10   3   0   0   0   0
## [109]   1  12   0   0   1   0   0  13   0   6   0   0  10   0   0   0   5  -3
## [127]   0   0  -1   0   0   0   0   0   0   0   0   0   0   0   3   0  15   0
## [145]   0   0   0   0   0   0   0   3   0   0   0   0   0   0   0   0   0   0
## [163]   0   0   0   0   0   0   0   0  -1   7   0   2   0   0   0   0   0   0
## [181]   0   4  14   0 -18   0   0   0   0   0  -1   0   0   0   0 -14   0   0
## [199]   0   0   0  -3   0   0   0   0  -2   0   0   0   0   0   0   1   0   0
## [217]   0   0   0   5   0   0   0   2  14   0   0   0   0   0   0   0   0   0
## [235]   0   0   0   0   0   0   0   3   0   0   4   0   0  -3   0
get_sentiment(threads_2_f$text, method='nrc')
##   [1]  0  3  2 -8  0  0  0  8  0  0  0  0  0  0  0  0  0  3  8  2  0  0  0 -3  0
##  [26]  0  0  8  0  2  0  0  2  0  1  1  0 -1  0  0  0 -1 13  0  0  0  0 -1  0  0
##  [51]  1  0  0  0  0  0  0  2  0  0  0  0  0  0  0  0  2  2  0  0  0  1  0  0  0
##  [76]  9  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  3
## [101]  0  0  8 -1  0  0  0  0  0  7  0  0  0  0  0  3  0  3  0  0  3  0  0  0  3
## [126]  1  4  0 -1  0  0  0  0  0  0  0  0  0  0  1  0  0  5  0  0  0  0  0  0  0
## [151]  0  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  1  0  1  0
## [176]  0  0  0  0  0  0  1  8  0 -6  0 -1  0  0  0  0  0  0  1  0 -4 -1  0  0  0
## [201]  0  0  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  4
## [226]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0 -5  0
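
The four score vectors above are hard to compare when printed one after another, and the lexicons use different scales (for example, AFINN assigns integer word scores from -5 to 5, while bing only marks words as positive or negative). As a rough sketch, the scores can be collected into one matrix and summarized per method. The `toy_text` sentences are made up for illustration; in the report the input would be `threads_2_f$text`.

```r
library(syuzhet)

# Made-up sentences standing in for post texts
toy_text <- c(
  "I loved this movie, it was wonderful.",
  "Terrible plot and awful acting.",
  "The runtime is two hours."
)

methods <- c("syuzhet", "bing", "afinn", "nrc")

# One column per lexicon, one row per text
scores <- sapply(methods, function(m) get_sentiment(toy_text, method = m))

# Compare direction rather than magnitude, since the scales differ
signs <- sign(scores)

scores
colMeans(scores)         # average score per lexicon
colMeans(scores == 0)    # share of texts each lexicon scores as exactly zero
```

Looking at the share of zeros per method is useful here because many posts in the output above score exactly 0, which usually means no lexicon word was found in the text.
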
get_nrc_sentiment(threads_2_f$text)
##     anger anticipation disgust fear joy sadness surprise trust negative
## 1       0            0       0    0   0       0        0     0        0
## 2       7           11       3    9   6       5        4    14       16
## 3       0            1       0    0   1       0        0     2        1
## 4      11            3       5    7   1       5        2     6       14
## 5       0            0       0    0   0       0        0     0        0
## 6       0            0       0    0   0       0        0     0        0
## 7       0            0       0    0   0       0        0     0        0
## 8       1            6       2    3   8       4        4     8        4
## 9       0            0       0    0   0       0        0     0        0
## 10      0            0       0    0   0       0        0     1        0
## 11      0            0       0    0   0       0        0     0        0
## 12      0            0       0    0   0       0        0     0        0
## 13      0            0       0    0   0       0        0     0        0
## 14      0            0       0    0   0       0        0     0        0
## 15      0            0       0    0   0       0        0     0        0
## 16      0            0       0    0   0       0        0     0        0
## 17      0            0       0    0   0       0        0     0        0
## 18      3            2       2    4   3       4        0     4        3
## 19      3            5       1    0   3       2        3     7        5
## 20      1            2       0    1   2       0        3     2        1
## 21      0            0       0    0   0       0        0     0        0
## 22      0            0       0    0   0       0        0     0        0
## 23      0            0       0    0   0       0        0     0        0
## 24      1            5       1    2   1       4        2     2        5
## 25      0            0       0    0   0       0        0     0        0
## 26      0            0       0    0   0       0        0     0        0
## 27      0            0       0    0   0       0        0     0        0
## 28      2           11       2    4   7       1        4    10        3
## 29      0            0       0    0   0       0        0     0        0
## 30      1            3       1    1   2       1        1     3        2
## 31      0            0       0    0   0       0        0     0        0
## 32      0            0       0    0   0       0        0     0        0
## 33      0            0       0    0   2       0        1     1        0
## 34      0            0       0    0   0       0        0     0        0
## 35      0            0       0    0   0       0        0     0        0
## 36      0            1       1    0   0       0        0     0        1
## 37      0            0       0    0   0       0        0     0        0
## 38      1            1       1    1   0       1        1     0        1
## 39      0            0       0    0   0       0        0     0        0
## 40      0            0       0    0   0       0        0     0        0
## 41      0            0       0    0   0       0        0     0        0
## 42      0            0       0    0   0       0        0     0        1
## 43      1            7       1    4   6       1        3     9        6
## 44      0            0       0    0   0       0        0     0        0
## 45      0            0       0    0   0       0        0     0        0
## 46      0            0       0    0   0       0        0     0        0
## 47      0            0       0    0   0       0        0     0        0
## 48      1            1       1    1   0       1        0     0        1
## 49      0            0       0    0   0       0        0     0        0
## 50      0            0       0    0   0       0        0     0        0
## 51      1            2       1    1   1       1        0     1        1
## 52      0            0       0    0   0       0        0     0        0
## 53      0            0       0    0   0       0        0     0        0
## 54      0            0       0    0   0       0        0     0        0
## 55      0            0       0    0   0       0        0     0        0
## 56      0            0       0    0   0       0        0     0        0
## 57      0            0       0    0   0       0        0     0        0
## 58      4            7       2    4   2       3        4     2        5
## 59      0            0       0    0   0       0        0     0        0
## 60      0            0       0    0   0       0        0     0        0
## 61      0            0       0    0   0       0        0     0        0
## 62      0            0       0    0   0       0        0     0        0
## 63      0            0       0    0   0       0        0     0        0
## 64      0            0       0    0   0       0        0     0        0
## 65      0            0       0    0   0       0        0     0        0
## 66      0            0       0    0   0       0        0     0        0
## 67      0            0       1    0   0       0        0     0        0
## 68      0            1       1    2   1       0        1     1        1
## 69      0            0       0    0   0       0        0     0        0
## 70      0            0       0    0   0       0        0     1        0
## 71      0            0       0    0   0       0        0     0        0
## 72      2            0       1    1   0       1        1     1        4
## 73      0            0       0    0   0       0        0     0        0
## 74      0            0       0    0   0       0        0     0        0
## 75      0            0       0    0   0       0        0     0        0
## 76      1            5       3    2   4       1        6     5        2
## 77      2            2       1    3   3       1        1     3        3
## 78      0            0       0    0   0       0        0     0        0
## 79      0            0       0    0   0       0        0     0        0
## 80      0            0       0    0   0       0        0     0        0
## 81      0            0       0    0   0       1        0     0        0
## 82      0            0       0    0   0       0        0     0        0
## 83      0            0       0    0   0       0        0     0        0
## 84      0            0       0    0   0       0        0     0        0
## 85      0            0       0    0   0       0        0     0        0
## 86      0            0       0    0   0       0        0     0        0
## 87      0            0       0    0   0       0        0     0        0
## 88      0            0       0    0   0       0        0     0        0
## 89      0            0       0    0   0       0        0     0        0
## 90      0            0       0    0   0       0        0     0        0
## 91      0            0       0    0   0       0        0     0        0
## 92      0            0       0    0   0       0        0     0        0
## 93      0            0       0    0   0       0        0     0        0
## 94      0            0       0    0   0       0        0     0        0
## 95      0            0       0    0   0       0        0     0        0
## 96      0            0       0    0   0       0        0     0        0
## 97      0            0       0    0   0       0        0     0        0
## 98      0            0       0    0   0       0        0     0        0
## 99      0            1       0    1   1       0        2     2        1
## 100     0            1       1    2   1       1        1     2        0
## 101     0            0       0    0   0       0        0     0        0
## 102     0            0       0    0   0       0        0     0        0
## 103    12           11       9   13  10      14        8    12       20
## 104     2            0       2    2   1       1        1     2        4
## 105     0            0       0    0   0       0        0     0        0
## 106     0            0       0    0   0       0        0     0        0
## 107     0            0       0    0   0       0        0     0        0
## 108     0            0       0    0   0       0        0     0        0
## 109     0            0       0    0   0       0        0     0        0
## 110     3            4       5    5   6       5        3     5        6
## 111     0            0       0    0   0       0        0     0        0
## 112     0            0       0    0   0       0        0     0        0
## 113     0            0       1    0   0       0        0     0        0
## 114     0            0       0    0   0       0        0     0        0
## 115     0            0       0    0   0       0        0     0        0
## 116     2            5       0    2   3       2        2     3        6
## 117     0            0       0    0   0       0        0     0        0
## 118     0            0       0    0   0       0        1     0        0
## 119     0            0       0    0   0       0        0     0        0
## 120     0            0       0    0   0       0        0     0        0
## 121     1            0       0    0   0       0        0     1        1
## 122     0            0       0    0   0       0        0     0        0
## 123     0            0       0    0   0       0        0     0        0
## 124     0            0       0    0   0       0        0     0        0
## 125     0            0       0    0   0       0        0     2        0
## 126     2            2       1    1   0       2        0     1        4
## 127     1            1       0    0   2       0        1     2        0
## 128     0            0       0    0   0       0        0     0        0
## 129     0            0       0    0   0       0        1     0        1
## 130     0            0       0    0   0       0        0     0        0
## 131     0            0       0    0   0       0        0     0        0
## 132     0            0       0    0   0       0        0     0        0
## 133     0            0       0    0   0       0        0     0        0
## 134     0            0       0    0   0       0        0     0        0
## 135     0            0       0    0   0       0        0     0        0
## 136     0            0       0    0   0       0        0     0        0
## 137     0            0       0    0   0       0        0     0        0
## 138     0            0       0    0   0       0        0     0        0
## 139     0            0       0    0   0       0        0     0        0
## 140     0            0       0    0   0       0        0     0        0
## 141     4            3       0    3   2       3        2     3        7
## 142     0            0       0    0   0       0        0     0        0
## 143     0            8       0    2   6       0        3     6        6
## 144     0            0       0    0   0       0        0     0        0
## 145     0            0       0    0   0       0        0     0        0
## 146     0            0       0    0   0       0        0     0        0
## 147     0            0       0    0   0       0        0     0        0
## 148     0            0       0    0   0       0        0     0        0
## 149     0            0       0    0   0       0        0     0        0
## 150     0            0       0    0   0       0        0     0        0
## 151     0            0       0    0   0       0        0     0        0
## 152     0            0       0    0   0       0        1     0        0
## 153     0            0       0    0   0       0        0     0        0
## 154     0            0       0    0   0       0        0     0        0
## 155     0            0       0    0   0       0        0     0        0
## 156     0            0       0    0   0       0        0     0        0
## 157     0            0       0    0   0       0        0     0        0
## 158     0            1       0    0   0       0        0     0        0
## 159     0            0       0    0   0       0        0     0        0
## 160     0            0       0    0   0       0        0     0        0
## 161     0            0       0    0   0       0        0     0        0
## 162     0            0       0    0   0       0        0     0        0
## 163     0            0       0    0   0       0        0     0        0
## 164     0            0       0    0   0       0        0     0        0
## 165     0            0       0    0   0       0        0     0        0
## 166     0            0       0    0   0       0        0     0        0
## 167     0            0       0    0   0       0        0     0        0
## 168     0            0       0    0   0       0        0     0        0
## 169     0            0       0    0   0       0        0     0        0
## 170     0            0       0    0   0       0        0     0        0
## 171     0            0       0    0   0       0        0     0        1
## 172     0            1       2    0   2       0        2     3        3
## 173     0            0       0    0   0       0        0     0        0
## 174     0            0       0    0   0       0        0     0        0
## 175     0            0       0    0   0       0        0     0        0
## 176     0            0       0    0   0       0        0     0        0
## 177     0            0       0    0   0       0        0     0        0
## 178     0            0       0    0   0       0        0     0        0
## 179     0            0       0    0   0       0        0     0        0
## 180     0            0       0    0   0       0        0     0        0
## 181     0            0       0    0   0       0        0     0        0
## 182     0            0       0    0   1       0        0     1        0
## 183     4            7       4    6   7       6        4    14        9
## 184     0            0       0    0   0       0        0     0        0
## 185     6            0       3    7   0       4        0     3        8
## 186     0            0       0    0   0       0        0     0        0
## 187     1            1       1    1   0       1        0     0        1
## 188     0            0       0    0   0       0        0     0        0
## 189     0            0       0    0   0       0        0     0        0
## 190     0            0       0    0   0       0        0     0        0
## 191     0            0       1    1   1       1        0     1        2
## 192     0            0       0    0   0       0        0     0        0
## 193     0            0       0    0   0       0        0     0        0
## 194     0            0       0    0   0       0        0     1        0
## 195     0            0       0    0   0       0        0     0        0
## 196     6            4       3    4   3       3        2     3        9
## 197     1            1       1    2   0       2        0     2        4
## 198     0            0       0    0   0       0        0     0        0
## 199     0            0       0    0   0       0        0     0        0
## 200     0            0       0    0   0       0        0     0        0
## 201     0            0       0    0   0       0        0     0        0
## 202     1            1       2    4   1       2        1     2        3
## 203     0            0       0    0   0       0        0     0        0
## 204     0            0       0    0   0       0        0     0        0
## 205     0            0       0    0   0       0        0     0        0
## 206     0            0       0    0   0       0        0     0        0
## 207     4            7       4    6   3       5        6     5        8
## 208     0            0       0    0   0       0        0     0        0
## 209     0            0       0    0   0       0        0     0        0
## 210     0            0       0    0   0       0        0     0        0
## 211     0            1       0    1   0       0        0     0        0
## 212     0            0       0    0   0       0        0     0        0
## 213     0            0       0    0   0       0        0     0        0
## 214     1            1       0    1   0       2        0     1        2
## 215     0            0       0    0   0       0        0     0        0
## 216     0            0       0    0   0       0        0     0        0
## 217     0            1       0    1   0       0        0     0        0
## 218     0            0       0    0   0       0        0     0        0
## 219     0            0       0    0   0       0        0     0        0
## 220     2            2       1    4   3       3        1     2        4
## 221     0            0       0    0   0       0        0     0        0
## 222     0            0       0    0   0       0        0     0        0
## 223     0            0       0    0   0       0        0     0        0
## 224     1            0       1    0   1       0        1     0        1
## 225     0            2       1    0   2       0        2     3        0
## 226     0            0       0    0   0       0        0     0        0
## 227     0            0       0    0   0       0        0     0        0
## 228     0            0       0    0   0       0        0     0        0
## 229     0            0       0    0   0       0        0     0        0
## 230     0            0       0    0   0       0        0     0        0
## 231     0            0       0    0   0       0        0     0        0
## 232     0            0       0    0   0       0        0     0        0
## 233     0            0       0    0   0       0        0     0        0
## 234     0            0       0    0   0       0        0     0        0
## 235     0            0       0    0   0       0        0     0        0
## 236     0            0       0    0   0       0        0     0        0
## 237     0            0       0    0   0       0        0     0        0
## 238     0            0       0    0   0       0        0     0        0
## 239     0            0       0    0   0       0        0     0        0
## 240     0            0       0    0   0       0        0     0        0
## 241     0            0       0    0   0       0        0     0        0
## 242     0            0       0    0   0       0        1     0        0
## 243     0            0       0    0   0       0        0     0        0
## 244     0            0       0    0   0       0        0     0        0
## 245     0            0       0    0   0       0        0     0        0
## 246     0            0       0    0   0       0        0     0        0
## 247     0            0       0    0   0       0        0     0        0
## 248     3            1       3    2   0       2        1     3        6
## 249     0            0       0    0   0       0        0     0        0
##     positive
## 1          0
## 2         18
## 3          3
## 4          6
## 5          0
## 6          0
## 7          0
## 8         12
## 9          0
## 10         0
## 11         0
## 12         0
## 13         0
## 14         0
## 15         0
## 16         0
## 17         0
## 18         6
## 19        12
## 20         3
## 21         0
## 22         0
## 23         0
## 24         2
## 25         0
## 26         0
## 27         0
## 28        11
## 29         0
## 30         4
## 31         0
## 32         0
## 33         2
## 34         0
## 35         1
## 36         2
## 37         0
## 38         0
## 39         0
## 40         0
## 41         0
## 42         0
## 43        19
## 44         0
## 45         0
## 46         0
## 47         0
## 48         0
## 49         0
## 50         0
## 51         2
## 52         0
## 53         0
## 54         0
## 55         0
## 56         0
## 57         0
## 58         7
## 59         0
## 60         0
## 61         0
## 62         0
## 63         0
## 64         0
## 65         0
## 66         0
## 67         2
## 68         3
## 69         0
## 70         0
## 71         0
## 72         5
## 73         0
## 74         0
## 75         0
## 76        11
## 77         4
## 78         0
## 79         0
## 80         0
## 81         0
## 82         0
## 83         0
## 84         0
## 85         0
## 86         0
## 87         0
## 88         0
## 89         0
## 90         0
## 91         0
## 92         0
## 93         0
## 94         0
## 95         0
## 96         0
## 97         0
## 98         0
## 99         4
## 100        3
## 101        0
## 102        0
## 103       28
## 104        3
## 105        0
## 106        0
## 107        0
## 108        0
## 109        0
## 110       13
## 111        0
## 112        0
## 113        0
## 114        0
## 115        0
## 116        9
## 117        0
## 118        3
## 119        0
## 120        0
## 121        4
## 122        0
## 123        0
## 124        0
## 125        3
## 126        5
## 127        4
## 128        0
## 129        0
## 130        0
## 131        0
## 132        0
## 133        0
## 134        0
## 135        0
## 136        0
## 137        0
## 138        0
## 139        0
## 140        1
## 141        7
## 142        0
## 143       11
## 144        0
## 145        0
## 146        0
## 147        0
## 148        0
## 149        0
## 150        0
## 151        0
## 152        2
## 153        0
## 154        0
## 155        0
## 156        0
## 157        0
## 158        0
## 159        0
## 160        0
## 161        0
## 162        0
## 163        0
## 164        0
## 165        0
## 166        0
## 167        0
## 168        0
## 169        0
## 170        0
## 171        0
## 172        4
## 173        0
## 174        1
## 175        0
## 176        0
## 177        0
## 178        0
## 179        0
## 180        0
## 181        0
## 182        1
## 183       17
## 184        0
## 185        2
## 186        0
## 187        0
## 188        0
## 189        0
## 190        0
## 191        2
## 192        0
## 193        0
## 194        1
## 195        0
## 196        5
## 197        3
## 198        0
## 199        0
## 200        0
## 201        0
## 202        2
## 203        0
## 204        0
## 205        0
## 206        0
## 207       12
## 208        0
## 209        0
## 210        0
## 211        0
## 212        0
## 213        0
## 214        2
## 215        0
## 216        0
## 217        0
## 218        0
## 219        0
## 220        4
## 221        0
## 222        0
## 223        0
## 224        2
## 225        4
## 226        0
## 227        0
## 228        0
## 229        0
## 230        0
## 231        0
## 232        0
## 233        0
## 234        0
## 235        0
## 236        0
## 237        0
## 238        0
## 239        0
## 240        0
## 241        0
## 242        1
## 243        0
## 244        0
## 245        0
## 246        0
## 247        0
## 248        1
## 249        0
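Printing all 249 rows makes the emotion profile hard to read. A minimal sketch of one way to condense it, assuming the result of `get_nrc_sentiment()` is held in a data frame with one count column per emotion. The `nrc_demo` object and its values are made-up toy numbers for illustration, not the real output above, and only five of the ten NRC columns are included to keep it short:

```r
# Toy stand-in for the data frame returned by get_nrc_sentiment();
# the values are hypothetical and chosen only for illustration.
nrc_demo <- data.frame(
  anger    = c(0, 4, 0, 6),
  joy      = c(0, 3, 0, 4),
  trust    = c(0, 3, 1, 3),
  negative = c(0, 3, 0, 3),
  positive = c(0, 7, 2, 8)
)

# Total count per emotion across all posts, largest first
emotion_totals <- sort(colSums(nrc_demo), decreasing = TRUE)
print(emotion_totals)

# Share of posts in which no emotion word was matched at all
share_all_zero <- mean(rowSums(nrc_demo) == 0)
print(share_all_zero)
```

On the real data, the same two summaries would show which emotions dominate and how many posts contribute nothing because their text is empty.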
# by string
library(sentimentr)

sentiment_by(threads_2_f$text)
## Key: <element_id>
##      element_id word_count        sd ave_sentiment
##           <int>      <int>     <num>         <num>
##   1:          1          0        NA    0.00000000
##   2:          2        616 0.4149229    0.08268525
##   3:          3         84 0.2554850    0.19380210
##   4:          4        166 0.9055534   -0.89195202
##   5:          5          0        NA    0.00000000
##  ---                                              
## 245:        245         16        NA    0.22500000
## 246:        246          0        NA    0.00000000
## 247:        247          0        NA    0.00000000
## 248:        248         78 0.5713983   -0.34350844
## 249:        249          0        NA    0.00000000
# by sentence
sentiment(threads_2_f$text)
## Key: <element_id, sentence_id>
##      element_id sentence_id word_count  sentiment
##           <int>       <int>      <int>      <num>
##   1:          1           1         NA  0.0000000
##   2:          2           1         15  0.0000000
##   3:          2           2         14 -0.5478855
##   4:          2           3         16 -0.3500000
##   5:          2           4         17 -0.1819017
##  ---                                             
## 664:        248           1         28 -0.4889915
## 665:        248           2          9  0.0000000
## 666:        248           3         29 -1.0584634
## 667:        248           4         12  0.2309401
## 668:        249           1         NA  0.0000000
# --- Preview a slice of your dataset ---
threads_2_f[20:30, ]
##      date_utc  timestamp
## 20 2025-06-24 1750783563
## 21 2024-12-19 1734569871
## 22 2025-09-16 1758045179
## 23 2025-04-03 1743710395
## 24 2025-02-13 1739462512
## 25 2025-06-29 1751218833
## 26 2025-08-27 1756299442
## 27 2025-01-18 1737160215
## 28 2025-06-29 1751219647
## 29 2024-12-12 1734031921
## 30 2025-06-14 1749912568
##                                                                                                                                                                                                                                     title
## 20                                                                                                                                                                            Don't Care What Nobody Says, This Hyped Me Up Back in 2023.
## 21 Charlie Cox says the upcoming Disney+ Daredevil series will go darker than the Netflix series: "We really pushed for the show to remain geared towards an older audience and not dumbed down to kind of capture a wider net of people"
## 22                                                                                                                                                   What do you think of the mcu version of lady death/ Rio Vidal played by audrey plaza
## 23                                 Chris Pratt Confirms Star-Lord Will Return, Jokes About Being Absent from 'Doomsday' Reveal: "They must have cut away from it. I don't know what happened. My chair was there. I'm sure it was there.”
## 24                                                                                                                        Michael B. Jordan Says Marvel Will Get Its Success Back, but He Tells the Studio: ‘I Want to See a Blade Movie’
## 25                                                                                                                                                                  Scarlett Johansson: ‘I was cast for my desirability - that’s shifted’
## 26                                                                                                                                                      Jake Schreier shares new BTS pics to celebrate Thunderbolts* streaming on Disney+
## 27                                                                                                                                                                              People think Daredevil isn't funny, but Matt is hilarious
## 28                                                                                                                                                                                       I am clearly not Ironheart’s target demographic.
## 29                                                                                                                                                 Denzel Washington Called Ryan Coogler to Apologize for Spilling ‘Black Panther 3’ News
## 30                                                                                                                                                                                                           Is someone erased from shot?
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             text
## 20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Kang was literally the best part about Quantumania. Just thinking about The Kang Dynasty and The Avengers and Co having to fight many many different versions of Kang was enough for me to get excited for Loki Season 2, Kang Dynsasty, everything else involving Kang. I truly hope that Marvel comes to their senses and bring Kang back for Phase 7.
## 21                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 22                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 23                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 24                                                                                                                                                                                                                                                              > “[Marvel’s] doing great,” said Jordan, who is one of the MCU’s all-time great villains after playing Erik Killmonger in “Black Panther” and its sequel. “They’ll get it back.” > One comic book tentpole Jordan hopes Marvel gets off the ground is its long-in-the-works “Blade” movie. First announced in 2019 with Mahershala Ali tapped to play the eponymous vampire hunter, “Blade” has been through various writers and directors. Marvel officially took the movie off its release calendar last fall. > “Launching any franchise, it’s tough,” Jordan said. “I hope it gets together. I want to see a ‘Blade’ movie, you know what I’m saying? The ‘Blade’ franchise was everything.”
## 25                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 27                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 28                            Nearly middle-aged white dude. Have had some qualms about some projects since Endgame. And here is this show about a teenage girl that seems like it is trying to fill the Iron Man void. But damn if this show isn’t actually good. I am really enjoying the acting, the storytelling, and the way the show is going. It’s really fun to watch and I am really getting in to the characters- especially NATALIE. And Joe. Riri is having a pretty great arc here, and I get the feeling I am going to be way more invested in her as a character as more episodes come out. I wasn’t planning on watching this. It just so happened that my wife had a girl’s night and I put my kid to bed and had nothing else to do after finishing Andor. So I said “fuck it, let’s see.” And I’m glad I did. I highly suggest checking it out. There are some great action sequences, some mysterious intrigue, and ya know, it’s just cool.
## 29                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 30                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Might be a very rogue theory but i think in typical MCU fashion, someone important to the story is erased from this shot. Maybe Victor? Steve (Another rogue theory that Cap is somehow in F4 because his time travel was the reason for this F4 timeline), BABY FRANKLIN???. There's too much unused space there that it gives me NWH trailer vibes. Like how Toby and Andrew were erased from the swinging shot.
##        subreddit comments
## 20 marvelstudios      457
## 21 marvelstudios      202
## 22 marvelstudios      412
## 23 marvelstudios      276
## 24 marvelstudios      259
## 25 marvelstudios      390
## 26 marvelstudios      140
## 27 marvelstudios      152
## 28 marvelstudios      793
## 29 marvelstudios      146
## 30 marvelstudios      580
##                                                                                                           url
## 20  https://www.reddit.com/r/marvelstudios/comments/1ljg6p7/dont_care_what_nobody_says_this_hyped_me_up_back/
## 21    https://www.reddit.com/r/marvelstudios/comments/1hhgtx9/charlie_cox_says_the_upcoming_disney_daredevil/
## 22      https://www.reddit.com/r/marvelstudios/comments/1niokze/what_do_you_think_of_the_mcu_version_of_lady/
## 23   https://www.reddit.com/r/marvelstudios/comments/1jqsj5z/chris_pratt_confirms_starlord_will_return_jokes/
## 24 https://www.reddit.com/r/marvelstudios/comments/1iomc2i/michael_b_jordan_says_marvel_will_get_its_success/
## 25 https://www.reddit.com/r/marvelstudios/comments/1lnkmkg/scarlett_johansson_i_was_cast_for_my_desirability/
## 26    https://www.reddit.com/r/marvelstudios/comments/1n1gb1n/jake_schreier_shares_new_bts_pics_to_celebrate/
## 27     https://www.reddit.com/r/marvelstudios/comments/1i3v74j/people_think_daredevil_isnt_funny_but_matt_is/
## 28    https://www.reddit.com/r/marvelstudios/comments/1lnkyne/i_am_clearly_not_ironhearts_target_demographic/
## 29          https://www.reddit.com/r/marvelstudios/comments/1hct77d/denzel_washington_called_ryan_coogler_to/
## 30                       https://www.reddit.com/r/marvelstudios/comments/1lbagts/is_someone_erased_from_shot/
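The `timestamp` column appears to hold Unix epoch seconds. A quick base-R check, using two values copied from rows 20 and 21 of the preview above, converts them to UTC dates so they can be compared with `date_utc` (the variable names `ts` and `dt` are only for this sketch):

```r
# Two timestamp values copied from rows 20 and 21 of the preview
ts <- c(1750783563, 1734569871)

# Interpret them as seconds since 1970-01-01 00:00:00 UTC
dt <- as.POSIXct(ts, origin = "1970-01-01", tz = "UTC")

# Keep only the calendar date for comparison with date_utc
format(dt, "%Y-%m-%d")
```

Both conversions give the dates listed in `date_utc` for those rows (2025-06-24 and 2024-12-19), which supports reading the column as epoch seconds.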
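# --- 1. Clean text and handle negations ---
# Join a negator to the word that follows it, e.g. "not good" -> "not_good".
# The patterns are case-sensitive and match lowercase letters only,
# so a sentence-initial "Not good" is left unchanged.
handle_negations <- function(text){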
  text %>%
    str_replace_all("\\bnot ([a-z]+)", "not_\\1") %>%
    str_replace_all("\\bnever ([a-z]+)", "never_\\1") %>%
    str_replace_all("\\bno ([a-z]+)", "no_\\1") %>%
    str_squish()
}

threads_2_f <- threads_2_f %>%
  mutate(text_clean = handle_negations(text))

# --- 2. Dictionary-based sentiment (syuzhet) ---
threads_2_f <- threads_2_f %>%
  mutate(
    syuzhet_score = get_sentiment(text_clean, method = "syuzhet"),
    bing_score    = get_sentiment(text_clean, method = "bing"),
    afinn_score   = get_sentiment(text_clean, method = "afinn"),
    nrc_score     = get_sentiment(text_clean, method = "nrc")
  )

# Optional: NRC emotions
nrc_emotions <- get_nrc_sentiment(threads_2_f$text_clean)

# --- 3. Negation-aware sentiment using sentimentr ---
threads_2_f <- threads_2_f %>%
  mutate(
    text_split = get_sentences(text_clean)
  )

reddit_sentiment <- sentiment_by(threads_2_f$text_split)

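# --- 4. Merge sentiment results for analysis ---
# bind_cols() matches rows by position, so this assumes reddit_sentiment
# has one row per post, in the same order as threads_2_f
threads_2_f <- threads_2_f %>%
  bind_cols(reddit_sentiment %>% select(ave_sentiment, sd, word_count))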

# --- 5. Summarize dictionary sentiment ---
dict_summary <- threads_2_f %>%
  summarize(
    syuzhet_avg = mean(syuzhet_score),
    bing_avg    = mean(bing_score),
    afinn_avg   = mean(afinn_score),
    nrc_avg     = mean(nrc_score)
  )
dict_summary
##   syuzhet_avg  bing_avg afinn_avg   nrc_avg
## 1   0.3891566 0.2851406 0.9076305 0.4216867
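Many posts in this data set have an empty `text` field (presumably link or image posts), and each of them scores exactly 0, which pulls every average toward zero. A hedged sketch of comparing the mean over all rows with the mean over non-empty posts only. The `demo` data frame and its scores are hypothetical stand-ins for `threads_2_f`:

```r
# Hypothetical stand-in for threads_2_f with a single score column
demo <- data.frame(
  text_clean    = c("", "great movie", "", "boring and bad", ""),
  syuzhet_score = c(0, 1.5, 0, -1.0, 0),
  stringsAsFactors = FALSE
)

# Mean over every row, including posts with no text
mean_all <- mean(demo$syuzhet_score)

# Mean over posts that actually contain text
has_text <- nzchar(trimws(demo$text_clean))
mean_nonempty <- mean(demo$syuzhet_score[has_text])

print(c(all = mean_all, nonempty = mean_nonempty))
```

The four lexicons also use different scales (AFINN word scores range from -5 to 5, while Bing assigns -1 or 1), so the four averages should not be compared by magnitude alone.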
# --- 6. Plot negation-aware sentiment (sentimentr) ---
ggplot(threads_2_f, aes(x = ave_sentiment)) +
  geom_histogram(binwidth = 0.1, fill = "steelblue", color = "black") +
  labs(
    title = "Distribution of Reddit Comment Sentiment (Negation-Aware)",
    x = "Average Sentiment per Comment",
    y = "Frequency"
  )

# --- 7. Optional: probe negation and sarcasm handling (demonstration) ---
text_test <- c(
  "I loved this movie! Not good at all.",    # positive sentence, then a negated one
  "Marvel really nailed it. Never boring!",  # negated negative word
  "Worst movie ever, bless your heart."      # sarcasm
)
sentiment_by(text_test)
## Key: <element_id>
##    element_id word_count         sd ave_sentiment
##         <int>      <int>      <num>         <num>
## 1:          1          8 0.44194174    -0.0625000
## 2:          2          6 0.02270292     0.6910534
## 3:          3          6         NA     0.2041241
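To see what the negation-joining step from part 1 would do to sentences like these, here is a small base-R re-implementation of the same three patterns. `join_negations` is a hypothetical name, and it uses `gsub()` rather than stringr so that it runs without extra packages. It is a sketch of the idea, not the function used in the analysis. Because the patterns are case-sensitive and match lowercase letters only, a sentence-initial "Not" is left untouched:

```r
# Base-R sketch of the negation-joining patterns (hypothetical helper)
join_negations <- function(text) {
  text <- gsub("\\bnot ([a-z]+)",   "not_\\1",   text, perl = TRUE)
  text <- gsub("\\bnever ([a-z]+)", "never_\\1", text, perl = TRUE)
  text <- gsub("\\bno ([a-z]+)",    "no_\\1",    text, perl = TRUE)
  # Collapse repeated whitespace and trim the ends
  trimws(gsub("\\s+", " ", text))
}

examples <- c(
  "It was not good and never boring.",
  "I loved this movie! Not good at all.",
  "There is no way."
)

result <- join_negations(examples)
print(result)
```

The first and third strings come back with `not_good`, `never_boring`, and `no_way`, while the second is returned unchanged because "Not" is capitalized.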

10 sample comments with sentiment scores

# Select 10 random samples
set.seed(123)  # for reproducibility
sample_comments <- threads_2_f %>% 
  sample_n(10) %>% 
  select(text_clean, syuzhet_score, bing_score, afinn_score, nrc_score, ave_sentiment)

# Display as a table
library(knitr)
kable(sample_comments, caption = "Sample Reddit Comments with Sentiment Scores")
Sample Reddit Comments with Sentiment Scores
text_clean syuzhet_score bing_score afinn_score nrc_score ave_sentiment
0.00 0 0 0 0.0000000
I think we’re all aware that it’s nearly impossible for Marvel to ever beat Infinity War or Endgame, but the way the events of those movies (especially the Blip) are still playing part in most of the Marvel movies and TV shows to this day is so cool. We got four different point of views of the blip: - Normal POV, shown in almost all movies and TV series since Infinity War. People slowly turning into dust one by one - Third person POV, shown in Far From Home as the students suddenly disappeared without anyone having idea of what’s happening - Monica POV, shown in Wandavision: everyone returning from the dust, as the world slowly becomes pure chaos because of the amount of people coming back - Yelena POV, shown in Hawkeye: the point of view of someone who was dusted. They were deleted from the existence for about less than 10 seconds until Hulk snapped his fingers and everyone went back. The whole background around them changes as well since they were out for 5 years Even if it’s small mentions, they keep finding a way to bring the blip consequences back, after all it was the biggest disaster in the whole universe, so it’ll obviously play a part on the plot forever Not only the blip, but basically everything that happened in those two movies. Thanos killing Vision resulted on the legendary Wandavision series and improved Wanda’s character so much (we don’t talk about MoM tho) Some characters’ deaths had huges consequences for other characters too, like how Iron Man’s death impacted Spider Man’s story I really hope Marvel finds a way to do a movie as good as Endgame and Infinity War, those two movies affected how their whole cinematographic universe worked and even though some movies like Quantummania or Far From Home were hated by a big part of the public it’s still cool to see how they are also affected by the snap events some way. 1.65 0 -2 4 0.1543963
0.00 0 0 0 0.0000000
0.00 0 0 0 0.0000000
0.00 0 0 0 0.0000000
0.00 0 0 0 0.0000000
0.00 0 0 0 0.0000000
Post-Credits Scene: A new character shows up. Actual Payoff: …never_addressed again, but hey, cool orb, right? At this point, Marvel post-credits scenes are like checking your phone for a message that never_comes. 4.00 4 6 3 0.6666667
I keep seeing people on other social platforms talk about the decline of Marvel or DC movies to superhero fatigue, and honestly, I think that explanation misses a lot. Its become a catch-all phrase that ignores other issues with how these movies are being made and released. First, Disney put a lot of pressure on Marvel. Disney pushed for more and more content, especially on Disney+, which led to a bunch of shows and movies coming out back to back. This is the explanation weve heard from studio execs like Kevin Feige. What I feel largely does not_get discussed is the ramifications from the pandemic. It changed how people go to the movies. Some people still havent gone back to theaters regularly, and streaming is now a bigger part of how we watch things. Plus, Disney+ drops Marvel movies just a few months after their theatrical release so for a lot of people, why rush to see it in theaters when you can wait and watch at home? For example, even if its anecdotal, when I asked my brother what movie he would see in July, he said Superman because Fantastic Four will drop in a few months. I also think going to the movies has become expensive, especially for families which is part of the core general audience of these films. Imagine you have a family, you probably already have Disney+, why go to a theatre, spend about $60 on tickets, pay for higher marked food items, etc. Also, international audiences have shifted. Marvel used to crush globally, but those numbers have softened a lot. Not every market is still hyped on the superhero genre the way they used to be. This can be due to a variety of things. There are still places around the globe that havent recovered economically or some other places that have implemented policies to promote their country movies as opposed to American movies. I agree with James Gunns sentiments that the U.S is not_on good terms with other countries. So yeah, superhero fatigue might sound like an easy answer, but it lets studios off the hook. 
I also think general audiences just love nostalgia. Its human nature too. People gravitate towards what they know after becoming familiar with something for so many years. Its the reason why I think Spider-Man NWH and Deadpool & Wolverine did well. Its the same reason why I think Nintendo can remake the same game with updated graphics and sell it for a higher price. As much as people say they want new stories, people in overwhelming numbers flock to what they already know. 7.40 2 13 13 0.0850477
0.00 0 0 0 0.0000000

Plots that show intriguing insights derived from the sentiment analysis

Plot 1: Distribution of Average Sentiment per Comment (Negation-Aware)

library(ggplot2)

ggplot(threads_2_f, aes(x = ave_sentiment)) +
  geom_histogram(binwidth = 0.1, fill = "steelblue", color = "black") +
  labs(
    title = "Distribution of Reddit Comment Sentiment (Negation-Aware)",
    x = "Average Sentiment per Comment",
    y = "Frequency"
  )

Plot 1 shows the distribution of average sentiment per comment under the negation-aware method. Scores cluster near zero but are more spread out than the dictionary-based scores. Part of the spike at zero comes from rows with an empty text field, which score exactly 0. The small negative tail indicates some meaningful negativity that the lexicons may undercount. The small but visible positive tail, on the other hand, suggests some genuinely enthusiastic comments, though not as many as one might expect for a fan-driven Reddit topic.
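Because rows with no text score exactly 0, it may help to check how much of the spike at zero they account for before reading too much into the shape of the histogram. The following is a minimal sketch under stated assumptions: `demo` is a hypothetical toy stand-in for `threads_2_f`, with the `ave_sentiment` values copied from the sample table and the text abbreviated, and `has_text`, `share_empty`, and `with_text` are illustrative names rather than objects from the original analysis.

```r
library(ggplot2)

# Toy stand-in for threads_2_f (hypothetical): ave_sentiment values are copied
# from the sample table; the text is abbreviated.
demo <- data.frame(
  text_clean = c("", "I think we're all aware ...", "", "", "", "", "",
                 "Post-Credits Scene: ...", "I keep seeing people ...", ""),
  ave_sentiment = c(0, 0.1543963, 0, 0, 0, 0, 0, 0.6666667, 0.0850477, 0),
  stringsAsFactors = FALSE
)

# Flag rows whose text is non-empty after trimming whitespace
has_text <- nzchar(trimws(demo$text_clean))

# Share of rows that contribute an automatic zero to the histogram
share_empty <- mean(!has_text)

# Redraw the histogram using only rows that actually contain text
with_text <- demo[has_text, ]
p <- ggplot(with_text, aes(x = ave_sentiment)) +
  geom_histogram(binwidth = 0.1, fill = "steelblue", color = "black") +
  labs(
    title = "Distribution of Reddit Comment Sentiment (Non-Empty Text Only)",
    x = "Average Sentiment per Comment",
    y = "Frequency"
  )
p
```

Swapping `demo` for `threads_2_f` would apply the same filter to the full data. Whether to drop the empty rows or keep them is a judgment call that depends on whether a post without body text should count as neutral.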

Plot 2

threads_2_f[20:30, ] 
##      date_utc  timestamp
## 20 2025-06-24 1750783563
## 21 2024-12-19 1734569871
## 22 2025-09-16 1758045179
## 23 2025-04-03 1743710395
## 24 2025-02-13 1739462512
## 25 2025-06-29 1751218833
## 26 2025-08-27 1756299442
## 27 2025-01-18 1737160215
## 28 2025-06-29 1751219647
## 29 2024-12-12 1734031921
## 30 2025-06-14 1749912568
##                                                                                                                                                                                                                                     title
## 20                                                                                                                                                                            Don't Care What Nobody Says, This Hyped Me Up Back in 2023.
## 21 Charlie Cox says the upcoming Disney+ Daredevil series will go darker than the Netflix series: "We really pushed for the show to remain geared towards an older audience and not dumbed down to kind of capture a wider net of people"
## 22                                                                                                                                                   What do you think of the mcu version of lady death/ Rio Vidal played by audrey plaza
## 23                                 Chris Pratt Confirms Star-Lord Will Return, Jokes About Being Absent from 'Doomsday' Reveal: "They must have cut away from it. I don't know what happened. My chair was there. I'm sure it was there.”
## 24                                                                                                                        Michael B. Jordan Says Marvel Will Get Its Success Back, but He Tells the Studio: ‘I Want to See a Blade Movie’
## 25                                                                                                                                                                  Scarlett Johansson: ‘I was cast for my desirability — that’s shifted’
## 26                                                                                                                                                      Jake Schreier shares new BTS pics to celebrate Thunderbolts* streaming on Disney+
## 27                                                                                                                                                                              People think Daredevil isn't funny, but Matt is hilarious
## 28                                                                                                                                                                                       I am clearly not Ironheart’s target demographic.
## 29                                                                                                                                                Denzel Washington Called Ryan Coogler to Apologize for Spilling ‘Black Panther 3’ News
## 30                                                                                                                                                                                                           Is someone erased from shot?
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             text
## 20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Kang was literally the best part about Quantumania. Just thinking about The Kang Dynasty and The Avengers and Co having to fight many many different versions of Kang was enough for me to get excited for Loki Season 2, Kang Dynsasty, everything else involving Kang. I truly hope that Marvel comes to their senses and bring Kang back for Phase 7.
## 21                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 22                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 23                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 24                                                                                                                                                                                                                                                              > “[Marvel’s] doing great,” said Jordan, who is one of the MCU’s all-time great villains after playing Erik Killmonger in “Black Panther” and its sequel. “They’ll get it back.” > One comic book tentpole Jordan hopes Marvel gets off the ground is its long-in-the-works “Blade” movie. First announced in 2019 with Mahershala Ali tapped to play the eponymous vampire hunter, “Blade” has been through various writers and directors. Marvel officially took the movie off its release calendar last fall. > “Launching any franchise, it’s tough,” Jordan said. “I hope it gets together. I want to see a ‘Blade’ movie, you know what I’m saying? The ‘Blade’ franchise was everything.”
## 25                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 27                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 28                            Nearly middle-aged white dude. Have had some qualms about some projects since Endgame. And here is this show about a teenage girl that seems like it is trying to fill the Iron Man void. But damn if this show isn’t actually good. I am really enjoying the acting, the storytelling, and the way the show is going. It’s really fun to watch and I am really getting in to the characters- especially NATALIE. And Joe. Riri is having a pretty great arc here, and I get the feeling I am going to be way more invested in her as a character as more episodes come out. I wasn’t planning on watching this. It just so happened that my wife had a girl’s night and I put my kid to bed and had nothing else to do after finishing Andor. So I said “fuck it, let’s see.” And I’m glad I did. I highly suggest checking it out. There are some great action sequences, some mysterious intrigue, and ya know, it’s just cool.
## 29                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 30                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Might be a very rogue theory but i think in typical MCU fashion, someone important to the story is erased from this shot. Maybe Victor? Steve (Another rogue theory that Cap is somehow in F4 because his time travel was the reason for this F4 timeline), BABY FRANKLIN???. There's too much unused space there that it gives me NWH trailer vibes. Like how Toby and Andrew were erased from the swinging shot.
##        subreddit comments
## 20 marvelstudios      457
## 21 marvelstudios      202
## 22 marvelstudios      412
## 23 marvelstudios      276
## 24 marvelstudios      259
## 25 marvelstudios      390
## 26 marvelstudios      140
## 27 marvelstudios      152
## 28 marvelstudios      793
## 29 marvelstudios      146
## 30 marvelstudios      580
##                                                                                                           url
## 20  https://www.reddit.com/r/marvelstudios/comments/1ljg6p7/dont_care_what_nobody_says_this_hyped_me_up_back/
## 21    https://www.reddit.com/r/marvelstudios/comments/1hhgtx9/charlie_cox_says_the_upcoming_disney_daredevil/
## 22      https://www.reddit.com/r/marvelstudios/comments/1niokze/what_do_you_think_of_the_mcu_version_of_lady/
## 23   https://www.reddit.com/r/marvelstudios/comments/1jqsj5z/chris_pratt_confirms_starlord_will_return_jokes/
## 24 https://www.reddit.com/r/marvelstudios/comments/1iomc2i/michael_b_jordan_says_marvel_will_get_its_success/
## 25 https://www.reddit.com/r/marvelstudios/comments/1lnkmkg/scarlett_johansson_i_was_cast_for_my_desirability/
## 26    https://www.reddit.com/r/marvelstudios/comments/1n1gb1n/jake_schreier_shares_new_bts_pics_to_celebrate/
## 27     https://www.reddit.com/r/marvelstudios/comments/1i3v74j/people_think_daredevil_isnt_funny_but_matt_is/
## 28    https://www.reddit.com/r/marvelstudios/comments/1lnkyne/i_am_clearly_not_ironhearts_target_demographic/
## 29          https://www.reddit.com/r/marvelstudios/comments/1hct77d/denzel_washington_called_ryan_coogler_to/
## 30                       https://www.reddit.com/r/marvelstudios/comments/1lbagts/is_someone_erased_from_shot/
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       text_clean
## 20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Kang was literally the best part about Quantumania. Just thinking about The Kang Dynasty and The Avengers and Co having to fight many many different versions of Kang was enough for me to get excited for Loki Season 2, Kang Dynsasty, everything else involving Kang. I truly hope that Marvel comes to their senses and bring Kang back for Phase 7.
## 21                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 22                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 23                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 24                                                                                                                                                                                                                                                              > “[Marvel’s] doing great,” said Jordan, who is one of the MCU’s all-time great villains after playing Erik Killmonger in “Black Panther” and its sequel. “They’ll get it back.” > One comic book tentpole Jordan hopes Marvel gets off the ground is its long-in-the-works “Blade” movie. First announced in 2019 with Mahershala Ali tapped to play the eponymous vampire hunter, “Blade” has been through various writers and directors. Marvel officially took the movie off its release calendar last fall. > “Launching any franchise, it’s tough,” Jordan said. “I hope it gets together. I want to see a ‘Blade’ movie, you know what I’m saying? The ‘Blade’ franchise was everything.”
## 25                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 27                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 28                            Nearly middle-aged white dude. Have had some qualms about some projects since Endgame. And here is this show about a teenage girl that seems like it is trying to fill the Iron Man void. But damn if this show isn’t actually good. I am really enjoying the acting, the storytelling, and the way the show is going. It’s really fun to watch and I am really getting in to the characters- especially NATALIE. And Joe. Riri is having a pretty great arc here, and I get the feeling I am going to be way more invested in her as a character as more episodes come out. I wasn’t planning on watching this. It just so happened that my wife had a girl’s night and I put my kid to bed and had nothing else to do after finishing Andor. So I said “fuck it, let’s see.” And I’m glad I did. I highly suggest checking it out. There are some great action sequences, some mysterious intrigue, and ya know, it’s just cool.
## 29                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 30                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Might be a very rogue theory but i think in typical MCU fashion, someone important to the story is erased from this shot. Maybe Victor? Steve (Another rogue theory that Cap is somehow in F4 because his time travel was the reason for this F4 timeline), BABY FRANKLIN???. There's too much unused space there that it gives me NWH trailer vibes. Like how Toby and Andrew were erased from the swinging shot.
##    syuzhet_score bing_score afinn_score nrc_score
## 20          1.75          4           8         2
## 21          0.00          0           0         0
## 22          0.00          0           0         0
## 23          0.00          0           0         0
## 24          1.00          2          12        -3
## 25          0.00          0           0         0
## 26          0.00          0           0         0
## 27          0.00          0           0         0
## 28          3.40          5          14         8
## 29          0.00          0           0         0
## 30          2.05          1           7         2
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   text_split
## 20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Kang was literally the best part about Quantumania., Just thinking about The Kang Dynasty and The Avengers and Co having to fight many many different versions of Kang was enough for me to get excited for Loki Season 2, Kang Dynsasty, everything else involving Kang., I truly hope that Marvel comes to their senses and bring Kang back for Phase 7.
## 21                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 22                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 23                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 24                                                                                                                                                                                     &gt; \034[Marvel\031s] doing great,\035 said Jordan, who is one of the MCU\031s all-time great villains after playing Erik Killmonger in \034Black Panther\035 and its sequel., \034They\031ll get it back.\035 &gt; One comic book tentpole Jordan hopes Marvel gets off the ground is its long-in-the-works \034Blade\035 movie., First announced in 2019 with Mahershala Ali tapped to play the eponymous vampire hunter, \034Blade\035 has been through various writers and directors., Marvel officially took the movie off its release calendar last fall., &gt; \034Launching any franchise, it\031s tough,\035 Jordan said., \034I hope it gets together., I want to see a \030Blade\031 movie, you know what I\031m saying?, The \030Blade\031 franchise was everything.\035
## 25                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 27                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 28 Nearly middle-aged white dude., Have had some qualms about some projects since Endgame., And here is this show about a teenage girl that seems like it is trying to fill the Iron Man void., But damn if this show isn\031t actually good., I am really enjoying the acting, the storytelling, and the way the show is going., It\031s really fun to watch and I am really getting in to the characters- especially NATALIE., And Joe., Riri is having a pretty great arc here, and I get the feeling I am going to be way more invested in her as a character as more episodes come out., I wasn\031t planning on watching this., It just so happened that my wife had a girl\031s night and I put my kid to bed and had nothing else to do after finishing Andor., So I said \034fuck it, let\031s see.\035 And I\031m glad I did., I highly suggest checking it out., There are some great action sequences, some mysterious intrigue, and ya know, it\031s just cool.
## 29                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## 30                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Might be a very rogue theory but i think in typical MCU fashion, someone important to the story is erased from this shot., Maybe Victor?, Steve (Another rogue theory that Cap is somehow in F4 because his time travel was the reason for this F4 timeline), BABY FRANKLIN???., There's too much unused space there that it gives me NWH trailer vibes., Like how Toby and Andrew were erased from the swinging shot.
##    ave_sentiment        sd word_count
## 20   0.267505592 0.2977986         59
## 21   0.000000000        NA          0
## 22   0.000000000        NA          0
## 23   0.000000000        NA          0
## 24   0.120650878 0.1201703        117
## 25   0.000000000        NA          0
## 26   0.000000000        NA          0
## 27   0.000000000        NA          0
## 28   0.073634652 0.2187022        180
## 29   0.000000000        NA          0
## 30  -0.007113112 0.3371808         72
library(sentimentr)

reddit_sentiment <- threads_2_f %>%
  mutate(text_split = get_sentences(text)) %$%
  sentiment_by(text_split)

reddit_sentiment %>% arrange(desc(ave_sentiment))
## Key: <element_id>
##      element_id word_count         sd ave_sentiment
##           <int>      <int>      <num>         <num>
##   1:        225         37 0.34127094     0.6078778
##   2:         33         18 0.27639320     0.5116673
##   3:        152          7         NA     0.5102520
##   4:        118         36 0.11996198     0.4642992
##   5:        125         42         NA     0.4073608
##  ---                                               
## 245:        126        146 0.33275290    -0.2289097
## 246:        187         19 0.02335499    -0.2432932
## 247:        248         78 0.57139830    -0.3435084
## 248:        185         94 0.35494365    -0.4299785
## 249:          4        166 0.90555344    -0.8919520
plot(reddit_sentiment)

Plot 2 shows how emotional valence changes across the full corpus, treating all of the combined comments as a single narrative. The trajectory begins stable, with comments at a moderate, near-neutral emotional valence. A strong rise in the middle suggests that users express the most positivity and excitement partway through the discussions, possibly in response to specific trailers, casting news, or fan theories. Valence then declines toward the end, which might result from arguments or disagreements surfacing.
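
The idea of smoothing a sequence of sentiment values into a trajectory can be illustrated by hand. The sketch below is a minimal illustration and is not the smoothing that `plot()` applies to the sentimentr object: it runs a centered three-point moving average over the `ave_sentiment` values for rows 20 to 30 printed earlier in this section. The window width of 3 and the names `ave_sentiment_sample` and `smoothed` are assumptions chosen for illustration.

```r
# ave_sentiment for rows 20 to 30, copied from the printed output earlier in this section
ave_sentiment_sample <- c(0.267505592, 0, 0, 0, 0.120650878, 0, 0, 0,
                          0.073634652, 0, -0.007113112)

# Centered moving average; a window width of 3 is an illustrative choice
window <- 3
smoothed <- stats::filter(ave_sentiment_sample, rep(1 / window, window), sides = 2)

# The first and last positions are NA because the window is incomplete there
data.frame(
  position = seq_along(ave_sentiment_sample),
  raw      = ave_sentiment_sample,
  smoothed = as.numeric(smoothed)
)
```

A wider window would give a flatter curve, so the apparent rise and decline depend partly on how aggressively the values are smoothed.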

Plot 3: Comparison of Dictionary-Based Scores

library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
# Reshape for plotting
dict_scores <- threads_2_f %>%
  select(syuzhet_score, bing_score, afinn_score) %>%
  melt(variable.name = "method", value.name = "score")
## No id variables; using all as measure variables
ggplot(dict_scores, aes(x = score, fill = method)) +
  geom_histogram(alpha = 0.5, position = "identity", bins = 30) +
  labs(
    title = "Comparison of Dictionary-Based Sentiment Scores",
    x = "Sentiment Score",
    y = "Frequency"
  ) +
  theme_minimal()

The histogram in Plot 3 overlays the score distributions from the three lexicon-based sentiment methods. All three show a very heavy concentration around zero, meaning that the majority of Reddit comments are either neutral or only mildly emotional. The narrow clustering around 0 suggests that conversations about the Marvel topic are more discussion-oriented than emotional: Reddit users are not extremely positive or negative most of the time. The few extreme values in either tail are rare, highly positive or negative comments, which likely represent excited fans reacting strongly, criticism or frustration about specific movie decisions, or sarcastic remarks that the lexicons incorrectly mark as extremely negative. Note that the lexicons use different scales (AFINN assigns integer word scores from -5 to 5, while bing assigns -1 or 1), so the widths of the three distributions are not directly comparable.
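
To put a number on the concentration around zero, one option is to compute the share of comments that each method scores as exactly zero. The sketch below is a minimal example under stated assumptions: it uses only rows 20 to 30 from the output printed earlier in this section rather than the full `threads_2_f` data frame, and the names `scores`, `zero_share`, and `zero_share_non_empty` are hypothetical. In this small sample every zero score belongs to a row whose `word_count` is 0, so it may be worth checking how much of the spike at zero in the full data comes from comments with no text.

```r
# Scores for rows 20 to 30, copied from the printed output earlier in this section
scores <- data.frame(
  syuzhet_score = c(1.75, 0, 0, 0, 1.00, 0, 0, 0, 3.40, 0, 2.05),
  bing_score    = c(4, 0, 0, 0, 2, 0, 0, 0, 5, 0, 1),
  afinn_score   = c(8, 0, 0, 0, 12, 0, 0, 0, 14, 0, 7),
  word_count    = c(59, 0, 0, 0, 117, 0, 0, 0, 180, 0, 72)
)
score_cols <- c("syuzhet_score", "bing_score", "afinn_score")

# Share of comments that each method scores as exactly zero
zero_share <- colMeans(scores[score_cols] == 0)

# The same share after dropping comments that contain no words
non_empty <- scores[scores$word_count > 0, score_cols]
zero_share_non_empty <- colMeans(non_empty == 0)

round(rbind(all_rows = zero_share, non_empty_rows = zero_share_non_empty), 2)
```

Applied to the full data, the same two lines would replace `scores` with the relevant columns of `threads_2_f`.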

Plot 4: NRC Emotion Distribution

The NRC lexicon tags words with eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) as well as positive and negative sentiment.

nrc_emotions <- get_nrc_sentiment(threads_2_f$text_clean)

# Sum each emotion
nrc_summary <- colSums(nrc_emotions)

# Convert to a dataframe for plotting
nrc_df <- data.frame(
  emotion = names(nrc_summary),
  count = as.numeric(nrc_summary)
)

ggplot(nrc_df, aes(x = reorder(emotion, -count), y = count, fill = emotion)) +
  geom_col() +
  labs(
    title = "NRC Emotion Distribution Across Reddit Comments",
    x = "Emotion",
    y = "Total Count"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
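
Because `get_nrc_sentiment()` returns `negative` and `positive` columns alongside the eight emotion columns, summing every column places the two polarity totals in the same chart as the emotions. A minimal sketch of separating them and converting the emotion counts to shares is given below. It uses a small toy data frame with made-up counts so that it runs without the lexicon; `nrc_emotions_toy` and the other names are hypothetical, and in the analysis above the input would be `nrc_emotions`.

```r
# Toy counts with the ten column names that get_nrc_sentiment() returns;
# the numbers are made up for illustration only
nrc_emotions_toy <- data.frame(
  anger = c(1, 0, 2), anticipation = c(2, 1, 0), disgust = c(0, 0, 1),
  fear = c(1, 0, 1), joy = c(3, 1, 0), sadness = c(0, 0, 2),
  surprise = c(1, 1, 0), trust = c(2, 2, 1),
  negative = c(1, 0, 3), positive = c(4, 2, 1)
)

# Split the polarity columns from the eight emotion columns
polarity_cols <- c("negative", "positive")
emotion_cols  <- setdiff(names(nrc_emotions_toy), polarity_cols)

# Total count per emotion, then each emotion's share of all emotion words
emotion_totals  <- colSums(nrc_emotions_toy[emotion_cols])
emotion_share   <- emotion_totals / sum(emotion_totals)
polarity_totals <- colSums(nrc_emotions_toy[polarity_cols])

sort(round(emotion_share, 3), decreasing = TRUE)
polarity_totals
```

Shares are easier to compare across corpora of different sizes than raw counts, while the polarity totals can be reported separately.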

Overall Insight

Across all sentiment approaches, the discussion surrounding the Marvel topic on Reddit appears overwhelmingly neutral, with relatively few extreme emotional reactions. Dictionary-based methods (syuzhet, bing, afinn) all show heavy clustering around a sentiment score of zero, while the negation-aware method (sentimentr) reveals slightly more emotional variation but the same overall pattern. The sentiment trajectory plot indicates that the conversation becomes increasingly positive halfway through before turning more negative toward the end, suggesting moments of enthusiasm followed by criticism or debate. Taken together, the four plots demonstrate that Reddit discourse is nuanced and context-dependent: while fans express excitement, the overall tone remains cautious, mixed, or analytical rather than strongly emotional.
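
One way to check how closely the methods agree is to correlate their per-comment scores. The sketch below is illustrative only: it uses rows 20 to 30 from the output printed earlier in this section, which is far too small a sample for conclusions, and the names `sample_scores` and `method_agreement` are hypothetical. Spearman rank correlation is used here as an assumption, since the lexicons score on different scales.

```r
# Per-comment scores for rows 20 to 30, copied from the printed output earlier in this section
sample_scores <- data.frame(
  syuzhet_score = c(1.75, 0, 0, 0, 1.00, 0, 0, 0, 3.40, 0, 2.05),
  bing_score    = c(4, 0, 0, 0, 2, 0, 0, 0, 5, 0, 1),
  afinn_score   = c(8, 0, 0, 0, 12, 0, 0, 0, 14, 0, 7),
  nrc_score     = c(2, 0, 0, 0, -3, 0, 0, 0, 8, 0, 2),
  ave_sentiment = c(0.267505592, 0, 0, 0, 0.120650878, 0, 0, 0,
                    0.073634652, 0, -0.007113112)
)

# Rank-based correlation between every pair of methods
method_agreement <- cor(sample_scores, method = "spearman")

round(method_agreement, 2)
```

With the full data frame, low correlations between a pair of methods would point to comments where the lexicons disagree, which are natural candidates for manual reading.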