Introduction
League of Legends has a massive and vocal online community, which makes it an ideal space for sentiment analysis. For this project, I compare two bot-lane ADC champions — Smolder and Sivir — using unstructured text from Reddit and YouTube discussions.
These champions were chosen because:
Both fill the same role (ranged ADCs)
They generate very different emotional reactions
Smolder is newer and more controversial
Sivir is older, stable, and considered “quietly strong”
They represent different playstyle philosophies: scaling burst (Smolder) vs. macro wave control (Sivir)
Research Questions:
Q1: How does overall community sentiment differ between Smolder and Sivir?
Q2: What themes or topics appear most frequently in discussions of each champion?
Q3: How has sentiment around Smolder changed over time?
Data Collection
Sources include:
Reddit champion discussion threads
Reddit patch note reactions
YouTube comments on champion guides and gameplay
Community tier list discussions
Two datasets were created:
smolder_sivir_combined.csv – 75 comments labeled by champion
smolder_timeline_.csv – 31 Smolder-only comments with manually assigned dates
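library(tidyverse)  # read_csv, dplyr verbs, tidyr, ggplot2
library(tidytext)   # unnest_tokens, stop_words, get_sentiments
all_data <- read_csv("smolder_sivir_combined.csv")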
## Rows: 75 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): champion, text
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
smolder_time <- read_csv("smolder_timeline_.csv")
## Rows: 31 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): date, text, champion
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
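Before comparing the champions, it helps to check how many comments each one contributes, since raw sentiment counts depend on the amount of text per champion. The sketch below shows one way to do that; `toy_data` is a hypothetical three-row stand-in for `all_data`, and the comment texts are invented for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for all_data (same columns: champion, text)
toy_data <- tribble(
  ~champion, ~text,
  "Smolder", "Smolder scaling feels amazing late game",
  "Smolder", "Hate laning against Smolder",
  "Sivir",   "Sivir wave clear is reliable"
)

# Comments per champion, plus each champion's share of the data set
champion_counts <- toy_data %>%
  count(champion, name = "n_comments") %>%
  mutate(share = n_comments / sum(n_comments))

print(champion_counts)
```

Running the same `count()` on the real `all_data` would show whether one champion dominates the sample.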
Q1: Smolder vs. Sivir
# NRC lexicon; get_sentiments("nrc") needs the textdata package
# and downloads the lexicon on first use
nrc <- get_sentiments("nrc")

# One row per word, with stop words removed
tidy_tokens <- all_data %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# A word can carry several NRC sentiments, hence the many-to-many join
sentiment_counts <- tidy_tokens %>%
  inner_join(nrc, by = "word", relationship = "many-to-many") %>%
  count(champion, sentiment)

ggplot(sentiment_counts, aes(sentiment, n, fill = champion)) +
  geom_col(position = "dodge") +
  labs(title = "NRC Sentiment Comparison: Smolder vs. Sivir",
       x = "Sentiment", y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
The plot shows that Sivir drew slightly more negative sentiment than Smolder, while Smolder drew slightly more positive sentiment. Sivir also showed high levels of trust, which makes sense because she is known as a very reliable champion.
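Because the comparison above uses raw word counts, a champion with more text will show taller bars across every sentiment. A minimal sketch of one way to control for that is to convert counts to within-champion shares. The numbers in `toy_counts` are hypothetical and only mimic the shape of `sentiment_counts`; they are not results from this project.

```r
library(dplyr)
library(tibble)

# Hypothetical counts shaped like sentiment_counts (champion, sentiment, n)
toy_counts <- tribble(
  ~champion, ~sentiment, ~n,
  "Smolder", "positive", 30,
  "Smolder", "negative", 20,
  "Sivir",   "positive", 12,
  "Sivir",   "negative", 12,
  "Sivir",   "trust",    16
)

# Share of each sentiment within a champion's own sentiment words
sentiment_share <- toy_counts %>%
  group_by(champion) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(sentiment_share)
```

Plotting `share` instead of `n` would make the two champions comparable even if they have different numbers of comments.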
Q2: Keyword Themes with Bigrams
# Split each comment into bigrams and drop any bigram containing a stop word
bigrams <- all_data %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%  # comments with fewer than two words yield NA
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  filter(!word1 %in% stop_words$word,
         !word2 %in% stop_words$word) %>%
  unite(bigram, word1, word2, sep = " ") %>%
  count(champion, bigram, sort = TRUE)

# Top 10 bigrams per champion; with_ties = FALSE keeps exactly 10 rows each
bigrams %>%
  group_by(champion) %>%
  slice_max(n, n = 10, with_ties = FALSE) %>%
  ungroup() %>%
  ggplot(aes(reorder(bigram, n), n, fill = champion)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  facet_wrap(~champion, scales = "free") +
  labs(title = "Top 10 Bigrams by Champion",
       x = "Bigram", y = "Count") +
  theme_minimal()
The bigrams suggest that the community has a love/hate relationship with Smolder. Sivir's bigrams center on team fighting: she does very little damage alone, so you have to play with the team to win on Sivir.
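Raw bigram counts can be dominated by phrases both champions share. One possible follow-up, which was not part of the original analysis, is to weight bigrams by tf-idf so that phrases distinctive to one champion rise to the top. The sketch below uses `bind_tf_idf()` from tidytext and treats each champion as one "document"; the bigrams and counts in `toy_bigrams` are hypothetical.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical bigram counts shaped like the bigrams table (champion, bigram, n)
toy_bigrams <- tribble(
  ~champion, ~bigram,        ~n,
  "Smolder", "late game",     4,
  "Smolder", "love smolder",  3,
  "Sivir",   "late game",     2,
  "Sivir",   "team fight",    5
)

# Treat each champion as a document; bigrams used for both champions get idf = 0
bigram_tf_idf <- toy_bigrams %>%
  bind_tf_idf(bigram, champion, n) %>%
  arrange(desc(tf_idf))

print(bigram_tf_idf)
```

With only two champions, any bigram that appears for both gets a tf-idf of zero, so this weighting works best as a filter for champion-specific phrases.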
Q3: Smolder Sentiment Over Time
# Parse the manually assigned dates (month/day/year) once, up front
smolder_time$date <- as.Date(smolder_time$date, format = "%m/%d/%Y")

# NRC sentiment counts per date
smolder_time_tokens <- smolder_time %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  inner_join(nrc, by = "word", relationship = "many-to-many") %>%
  count(date, sentiment)
# Plot against the actual dates so every sentiment line shares one x-axis.
# A per-sentiment row index would point at different dates for different
# sentiments whenever a sentiment is missing on some date.
ggplot(smolder_time_tokens, aes(date, n, color = sentiment, group = sentiment)) +
  geom_line(linewidth = 1) +
  scale_x_date(date_labels = "%m/%d") +
  labs(title = "Smolder Sentiment Over Time",
       x = "Date",
       y = "Sentiment Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
The timeline shows one large spike in positive sentiment for Smolder. Outside that spike, the counts vary little, and I could not identify any other meaningful trend.
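With ten NRC categories on one chart, a single trend can be hard to read. A minimal sketch of one alternative is to collapse the timeline to a net score (positive minus negative) per date, filling in zero for dates where a sentiment never appears. The dates and counts in `toy_timeline` are hypothetical and only mimic the shape of `smolder_time_tokens`.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical per-date counts shaped like smolder_time_tokens (date, sentiment, n)
toy_timeline <- tibble(
  date = as.Date(c("2024-02-01", "2024-02-01", "2024-02-15", "2024-03-01")),
  sentiment = c("positive", "negative", "positive", "negative"),
  n = c(2, 3, 6, 1)
)

net_by_date <- toy_timeline %>%
  # Add explicit zero rows for date/sentiment pairs that never occurred
  complete(date, sentiment, fill = list(n = 0)) %>%
  # One row per date, one column per sentiment
  pivot_wider(names_from = sentiment, values_from = n) %>%
  mutate(net = positive - negative)

print(net_by_date)
```

On the real data, `smolder_time_tokens` would first need to be filtered to the `positive` and `negative` rows before this step.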
Conclusion
Overall, the analysis showed differences in how players discuss Smolder and Sivir. Sivir displayed slightly more negative sentiment, but also high levels of trust, reflecting her reputation as a dependable, team-focused marksman. Smolder generated slightly more positive sentiment, fitting his newer and more polarizing design. The bigrams pointed to a love/hate relationship between Smolder and the community, while Sivir’s themes emphasized her reliance on teamfighting and coordinated play. The sentiment timeline for Smolder showed one strong spike in positivity, with little variation outside that event. Taken together, these results suggest that Smolder drives stronger emotional reactions, while Sivir remains steady and reliable but less exciting to the community.