Introduction

League of Legends has a massive and vocal online community, which makes it a natural subject for sentiment analysis. For this project, I compare two bot-lane ADC (attack damage carry) champions, Smolder and Sivir, using unstructured text from Reddit and YouTube discussions.

These champions were chosen because:

Both fill the same role (ranged ADCs)

They generate very different emotional reactions

Smolder is newer and more controversial

Sivir is older, stable, and considered “quietly strong”

They represent different philosophies: scaling burst (Smolder) vs. macro wave-control (Sivir)

Research Questions:

Q1: How does overall community sentiment differ between Smolder and Sivir?

Q2: What themes or topics appear most frequently in discussions of each champion?

Q3: How has sentiment around Smolder changed over time?

Data Collection

Sources include:

Reddit champion discussion threads

Reddit patch note reactions

YouTube comments on champion guides and gameplay

Community tier list discussions

Two datasets were created, totaling just over 100 comments:

smolder_sivir_combined.csv – 75 comments labeled by champion

smolder_timeline_.csv – 31 Smolder-only comments with manually assigned chronological dates

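library(tidyverse)  # readr, dplyr, tidyr, ggplot2
library(tidytext)   # unnest_tokens(), get_sentiments(), stop_words

all_data <- read_csv("smolder_sivir_combined.csv")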
## Rows: 75 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): champion, text
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
smolder_time <- read_csv("smolder_timeline_.csv")
## Rows: 31 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): date, text, champion
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Q1: Smolder vs. Sivir

# The NRC lexicon is downloaded through the textdata package on first use
nrc <- get_sentiments("nrc")

# One row per word, with common stop words removed
tidy_tokens <- all_data %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# A word can carry several NRC sentiments, hence the many-to-many join
sentiment_counts <- tidy_tokens %>%
  inner_join(nrc, by = "word", relationship = "many-to-many") %>%
  count(champion, sentiment)

ggplot(sentiment_counts, aes(sentiment, n, fill = champion)) +
  geom_col(position = "dodge") +
  labs(title = "NRC Sentiment Comparison: Smolder vs. Sivir",
       x = "Sentiment", y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

From the plot, I found that Sivir had slightly more negative sentiment than Smolder, while Smolder had slightly more positive sentiment. Sivir also showed high levels of trust, which makes sense because she is known as a very reliable champion.
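
Because the comparison above rests on raw counts, it can be skewed if one champion simply has more comments. One way to check is to convert the counts to per-champion shares. The sketch below is only an illustration: `toy_counts` and its numbers are made up and are not the project's results, and `sentiment_share` is a name introduced here.

```r
library(dplyr)
library(tibble)

# Toy counts for illustration only; these are NOT the project's results.
toy_counts <- tribble(
  ~champion, ~sentiment, ~n,
  "Smolder", "positive", 30,
  "Smolder", "negative", 20,
  "Sivir",   "positive", 12,
  "Sivir",   "negative", 12
)

# Share of each champion's sentiment-tagged words that fall in each category
sentiment_share <- toy_counts %>%
  group_by(champion) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(sentiment_share)
```

In the report itself, the same `group_by()` and `mutate()` steps could be applied to `sentiment_counts`, with `share` plotted on the y-axis in place of `n`.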

Q2: Keyword Themes with Bigrams

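bigrams <- all_data %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  # Comments shorter than two words produce an NA bigram
  filter(!is.na(bigram)) %>%
  # Split each bigram so both words can be checked against the stop-word list
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  filter(!word1 %in% stop_words$word,
         !word2 %in% stop_words$word) %>%
  # Rejoin with a space; unite() uses "_" as its default separator
  unite(bigram, word1, word2, sep = " ") %>%
  count(champion, bigram, sort = TRUE)

bigrams %>%
  group_by(champion) %>%
  # with_ties = FALSE keeps exactly 10 rows per champion even when counts tie
  slice_max(n, n = 10, with_ties = FALSE) %>%
  ungroup() %>%
  # reorder_within() sorts the bars separately inside each facet
  ggplot(aes(reorder_within(bigram, n, champion), n, fill = champion)) +
  geom_col(show.legend = FALSE) +
  scale_x_reordered() +
  coord_flip() +
  facet_wrap(~champion, scales = "free") +
  labs(title = "Top 10 Bigrams by Champion",
       x = "Bigram", y = "Count") +
  theme_minimal()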

From the bigrams, I found that the community seems to have a love/hate relationship with Smolder. For Sivir, the recurring theme is teamfighting: she does very little damage alone, so you have to play with the team to win on her.

Q3: Smolder Sentiment Over Time

# Parse the manually assigned dates (stored as month/day/year text)
smolder_time_tokens <- smolder_time %>%
  mutate(date = as.Date(date, format = "%m/%d/%Y")) %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  inner_join(nrc, by = "word", relationship = "many-to-many") %>%
  count(date, sentiment)

# Give each distinct date one evenly spaced x position shared by all sentiments
timeline2 <- smolder_time_tokens %>%
  arrange(date, sentiment) %>%
  mutate(point = dense_rank(date))

# One axis label per date
date_labels <- distinct(timeline2, point, date)

ggplot(timeline2, aes(point, n, color = sentiment, group = sentiment)) +
  geom_line(linewidth = 1) +
  scale_x_continuous(
    breaks = date_labels$point,
    labels = format(date_labels$date, "%m/%d")
  ) +
  labs(title = "Smolder Sentiment Over Time",
       x = "Date",
       y = "Sentiment Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

In the timeline, I found one big spike in positive sentiment for Smolder. Other than that, no clear trend stood out.
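
A spike like this may be easier to spot if the positive and negative counts are collapsed into a single net score per date. The sketch below shows one possible way to do that. The dates and counts in `toy_timeline` are hypothetical and are not taken from the project's data, and `net_sentiment` is a name introduced here.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical per-date counts for illustration only; not the project's data.
toy_timeline <- tribble(
  ~date,        ~sentiment, ~n,
  "2024-02-01", "positive", 4,
  "2024-02-01", "negative", 3,
  "2024-02-15", "positive", 11,
  "2024-02-15", "negative", 2,
  "2024-03-01", "positive", 5
) %>%
  mutate(date = as.Date(date))

# One row per date; a sentiment missing on a date is treated as a count of 0
net_sentiment <- toy_timeline %>%
  filter(sentiment %in% c("positive", "negative")) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative)

print(net_sentiment)
```

Applied to `smolder_time_tokens`, the same steps would give one net value per date, which could then be plotted as a single line.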

Conclusion

Ultimately, the analysis showed clear differences between how players discuss Smolder and Sivir. Sivir displayed slightly more negative sentiment overall, but also showed high levels of trust, reflecting her reputation as a dependable, team-focused marksman. Smolder, on the other hand, generated more positive sentiment and a wider emotional range, fitting his newer and more polarizing design. Bigrams revealed that the community has a clear love-hate relationship with Smolder, while Sivir’s themes emphasized her reliance on teamfighting and coordinated play. The sentiment timeline for Smolder showed one strong spike in positivity, with little variation outside that event. Taken together, these results suggest that Smolder drives stronger emotional reactions, while Sivir remains steady and reliable but less exciting to the community.