Changes in sentiment towards Drake, March 2024 to March 2025
For your reference: The “Drake vs. Kendrick Lamar” Feud
The Context: Drake and Kendrick Lamar are two of the most commercially successful and critically acclaimed figures in modern hip-hop. For over a decade, they have maintained a subtle rivalry regarding who claims the title of the “greatest rapper” of their generation.
The Incident (May 2024): The tension escalated into a direct conflict in early 2024. In May, Kendrick Lamar released a series of “diss tracks” (songs intended to insult a rival), most notably the global hit “Not Like Us.”
The Climax (February 2025): The rivalry reached its definitive peak when Kendrick Lamar headlined the Super Bowl LIX Halftime Show. This performance was widely interpreted by the public and media as his ultimate “victory lap,” effectively cementing his dominance over Drake in the cultural narrative.
packages <- c("RedditExtractoR", "anytime", "magrittr", "httr", "tidytext", "tidyverse", "igraph", "ggraph", "wordcloud2", "textdata", "here", "ggdark", "syuzhet", "sentimentr", "lubridate")
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(!installed_packages)) {
  install.packages(packages[!installed_packages])
}
# Load packages
invisible(lapply(packages, library, character.only = TRUE))
# # using keyword
# drake_1 <- find_thread_urls(keywords = "drake",
# sort_by = 'relevance',
# period = 'all') %>%
# drop_na()
#
# rownames(drake_1) <- NULL
#
# drake_1 <- drake_1 %>%
# mutate(
# date_utc = as.Date(date_utc) # Convert PostDate to Date value
# ) %>%
# filter(
# date_utc >= as.Date("2024-03-01"), # Filter
# date_utc <= as.Date("2025-03-31")
# )
# # using subreddit
# #hiphopheads: biggest hiphop subreddit
# drake_2 <- find_thread_urls(keywords= "drake",
# subreddit = "hiphopheads",
# sort_by = 'relevance',
# period = 'all') %>%
# drop_na()
#
# rownames(drake_2) <- NULL
#
# drake_2 <- drake_2 %>%
# mutate(
# date_utc = as.Date(date_utc) # Convert PostDate to Date value
# ) %>%
# filter(
# date_utc >= as.Date("2024-03-01"), # Filter
# date_utc <= as.Date("2025-03-31")
# )
# # using subreddit
# #rap: a second hip-hop subreddit
# drake_3 <- find_thread_urls(keywords= "drake",
# subreddit = "rap",
# sort_by = 'relevance',
# period = 'all') %>%
# drop_na()
#
# rownames(drake_3) <- NULL
#
# drake_3 <- drake_3 %>%
# mutate(
# date_utc = as.Date(date_utc) # Convert PostDate to Date value
# ) %>%
# filter(
# date_utc >= as.Date("2024-03-01"), # Filter
# date_utc <= as.Date("2025-03-31")
# )
# #Merge Data
# drake_total <- bind_rows(drake_1, drake_2, drake_3)
#
# #Drop Duplicates
# drake_total <- drake_total %>% distinct()
#
# # Sanitize text
# drake_total %<>%
# mutate(across(
# where(is.character),
# ~ .x %>%
# str_replace_all("\\|", "/") %>% # replace vertical bars
# str_replace_all("\\n", " ") %>% # replace newlines
# str_squish() # clean up extra spaces
# ))
# Save merged data as CSV
# write.csv(drake_total, "drake_total.csv", row.names = FALSE)
# Reload the saved CSV
drake_total <- read.csv("drake_total.csv", stringsAsFactors = FALSE)
# Word tokenization
words <- drake_total %>%
unnest_tokens(output = word, input = text, token = "words")
# Word tokenization again, this time dropping the search keyword "drake" itself
words <- drake_total %>%
  unnest_tokens(output = word, input = text, token = "words") %>%
  filter(word != "drake")
words %>%
count(word, sort = TRUE) %>%
wordcloud2()
The word cloud is dominated by stop words, so I will remove them and draw the word cloud again.
# Regex that matches URLs and stray "&", "<", ">" characters
replace_reg <- "http[s]?://[A-Za-z\\d/\\.]+|&|<|>"
words_clean <- words %>%
  # drop stop words
  anti_join(stop_words, by = "word") %>%
  # keep only tokens that contain at least one letter
  filter(str_detect(word, "[a-z]")) %>%
  # drop tokens that match the URL/character regex
  filter(!str_detect(word, replace_reg))
# Check the number of rows after removal of the stop words. There should be fewer words now
print(
glue::glue("Before: {nrow(words)}, After: {nrow(words_clean)}")
)
## Before: 17559, After: 6642
words_clean %>%
count(word, sort = TRUE) %>%
wordcloud2(size = 0.8, shuffle = FALSE)
It is striking that “Kendrick” is the most frequently mentioned word in data collected with the keyword “Drake”.
# Get tri-grams.
drake_trigram <- drake_total %>%
mutate(text = str_replace_all(text, replace_reg, "")) %>%
select(text) %>%
unnest_tokens(output = paired_words,
input = text,
token = "ngrams",
n = 3)
#separate the paired words into three columns
words_ngram_pair <- drake_trigram %>%
separate(paired_words, c("word1", "word2", "word3"), sep = " ")
# drop tri-grams in which word1, word2, or word3 is a stop word
words_ngram_pair_filtered <- words_ngram_pair %>%
  # drop stop words
  filter(!word1 %in% stop_words$word & !word2 %in% stop_words$word & !word3 %in% stop_words$word) %>%
  # keep only tri-grams in which every word contains at least one letter
  filter(str_detect(word1, "[a-z]") & str_detect(word2, "[a-z]") & str_detect(word3, "[a-z]"))
# Filter out words that are not encoded in ASCII
# To see what's ASCII, google 'ASCII table'
library(stringi)
words_ngram_pair_filtered %<>%
  # check all three words, not only the first two
  filter(stri_enc_isascii(word1) & stri_enc_isascii(word2) & stri_enc_isascii(word3))
# Sort the new tri-gram (n=3) counts:
words_counts <- words_ngram_pair_filtered %>%
count(word1, word2, word3) %>%
arrange(desc(n))
head(words_counts, 20) %>%
knitr::kable()
| word1 | word2 | word3 | n |
|---|---|---|---|
| drake | partynextdoor | prod | 11 |
| prod | dj | lewis | 5 |
| prod | kid | masterpiece | 5 |
| certified | lover | boy | 3 |
| drake | partynextdoor | feat | 3 |
| partynextdoor | prod | dj | 3 |
| partynextdoor | prod | kid | 3 |
| wah | gwan | delilah | 3 |
| 10m | sales | us.html | 2 |
| 46t | y9orgxgtsfhfmnh0bdarlw | edit | 2 |
| dark | lane | demo | 2 |
| die | hard | drake | 2 |
| dj | lewis | noel | 2 |
| dr | dre | ft | 2 |
| drake | prod | kid | 2 |
| drake’s | front | door | 2 |
| dre | ft | kendrick | 2 |
| express.com | entertainment | music | 2 |
| ft | future | drake | 2 |
| future | ft | kendrick | 2 |
I had to look up the context for these tri-grams, and several of them turn out to relate directly to events in the feud.
future ft kendrick (Frequency: 2) Context: This phrase likely refers to Kendrick Lamar’s feature on Future and Metro Boomin’s track “Like That,” released in March 2024. Discussion: This trigram seems to mark the catalyst of the feud. In his verse, Kendrick Lamar rejected the idea of a “Big 3” (Drake, J. Cole, Kendrick) and famously claimed, “It’s just big me,” which openly ignited the conflict between the two artists.
drake’s front door (Frequency: 2) Context: This phrase likely refers to the shooting incident involving a security guard outside Drake’s Toronto residence (The Embassy) in May 2024. Discussion: This keyword highlights the escalation of the feud from music to real-world violence. It reflects public concern that the lyrical battle had crossed a line into dangerous territory, shifting the online discourse from entertainment to safety concerns.
wah gwan delilah (Frequency: 3) Context: This likely refers to “Wah Gwan Delilah,” a parody remix of “Hey There Delilah” by Snowd4y featuring Drake, performed in a heavy Jamaican Patois accent. Discussion: Released in the aftermath of the main diss battle, this track caused confusion and ridicule among listeners. The appearance of this keyword suggests that the public questioned Drake’s seriousness or state of mind following his perceived loss in the feud, viewing the release as either bizarre behavior or an attempt to “troll” the audience.
drake partynextdoor prod (Frequency: 11), drake partynextdoor feat (Frequency: 3) Context: The most frequent trigram overall was drake partynextdoor prod, and drake partynextdoor feat also appeared three times. This prominence is likely due to the release of their collaborative album in February 2025. Discussion: PartyNextDoor is recognized as one of the few associates who remained loyal to Drake throughout the controversy. Thus, their continued partnership and the timing of this release served as a significant talking point, contrasting with the departure of other collaborators.
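Because interpreting each trigram required looking up the posts behind it, that lookup can also be done directly in R. The following is a minimal sketch, assuming a data frame with `title` and `text` columns like `drake_total`; the helper `find_posts_with_phrase()` and the toy posts are hypothetical and exist only to illustrate the idea.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical helper (not part of the original analysis): return posts whose
# title or text contains the words of a phrase in order, ignoring case and
# allowing punctuation or spaces between the words.
find_posts_with_phrase <- function(posts, phrase) {
  # Turn "future ft kendrick" into the regex "future\\W+ft\\W+kendrick"
  phrase_words <- str_split(str_to_lower(phrase), "\\s+")[[1]]
  pattern <- str_c(phrase_words, collapse = "\\W+")
  posts %>%
    mutate(combined = str_to_lower(
      str_c(coalesce(title, ""), coalesce(text, ""), sep = " ")
    )) %>%
    filter(str_detect(combined, pattern)) %>%
    select(-combined)
}

# Toy posts, made up for illustration only
toy_posts <- tibble(
  title = c("[FRESH] Future ft. Kendrick Lamar - Like That",
            "Drake Maye highlight",
            "Shooting outside Drake's front door"),
  text = c("", "first career touchdown", "Security guard injured")
)

hits <- find_posts_with_phrase(toy_posts, "future ft kendrick")
hits
```

On the real data, the same call with `drake_total` in place of `toy_posts` would list the posts behind a given trigram, which makes it easier to confirm or reject the interpretations above.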
#Merge Title and Text
drake_sentiment <- drake_total %>%
mutate(
title = replace_na(title, ""),
text = replace_na(text, ""),
title_text = str_c(title, text, sep = ". ")
)
#Separate Sentences
drake_sentiment_sts <- drake_sentiment %>%
  mutate(title_text_split = get_sentences(title_text))
#Sentiment Score
drake_sentiment_scores <- drake_sentiment_sts %$%
sentiment_by(title_text_split)
drake_sentiment$sentiment_dict <- drake_sentiment_scores %>% pull(ave_sentiment)
drake_sentiment$word_count <- drake_sentiment_scores %>% pull(word_count)
drake_sentiment %>%
select(title_text, sentiment_dict, word_count) %>%
head()
examples_sentiment <- bind_rows(
drake_sentiment %>%
arrange(sentiment_dict) %>%
slice_head(n = 5) %>%
mutate(sentiment_type = "most_negative"),
drake_sentiment %>%
arrange(desc(sentiment_dict)) %>%
slice_head(n = 5) %>%
mutate(sentiment_type = "most_positive")
)
examples_sentiment %>%
select(sentiment_type, sentiment_dict, title_text) %>%
knitr::kable()
| sentiment_type | sentiment_dict | title_text |
|---|---|---|
| most_negative | -0.7100469 | Universal Music Group Responds to Drake Legal Filing Over Not Like Us: Offensive & Untrue. |
| most_negative | -0.6250000 | Tupac Shakurs Estate Threatens to Sue Drake Over Diss Track Featuring AI-Generated Tupac Voice. |
| most_negative | -0.6147009 | Drake claims label should have refused to release Kendrick Lamars Not Like Us. |
| most_negative | -0.5809475 | Yuno Miles Says He Regrets Kendrick & Drake Diss: The Past Two Days Felt Terrible. |
| most_negative | -0.5773503 | Drake officially loses. |
| most_positive | 0.5964787 | What’s your favorite Kendrick song and your favorite Drake song?. With this beef going on a lot of people are picking sides, but they’re both great artist in their own regard and lane so what’s your favorite song from each? Mine for Kendrick is Poe Man’s Dreams and from Drake It’s Redemption. |
| most_positive | 0.5964787 | What’s your favorite Kendrick song and your favorite Drake song?. With this beef going on a lot of people are picking sides, but they’re both great artist in their own regard and lane so what’s your favorite song from each? Mine for Kendrick is Poe Man’s Dreams and from Drake It’s Redemption. |
| most_positive | 0.5659970 | Kendrick Lamars music streams increase by almost 50% while Drakes drops amid beef. https://www.the-express.com/entertainment/music/137028/kendrick-lamar-music-streams-increase-drake-beef |
| most_positive | 0.4157609 | [FRESH VIDEO] DRAKE - NOKIA (Official Music Video). |
| most_positive | 0.3897114 | [Highlight] Drake Maye delivers a perfect throw for first career touchdown pass. |
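I do not think this result is fully credible. For example, “Universal Music Group Responds to Drake Legal Filing Over Not Like Us: Offensive & Untrue.” is ranked as the most negative text, but this seems to be driven by the song title “Not Like Us” rather than by the tone of the post. In my view, this is a neutral headline rather than a negative one. Similarly, one of the most positive texts is about Drake Maye, an NFL quarterback, so the keyword search also picks up posts unrelated to the rapper.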
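One possible way to reduce this artifact is to mask known song titles before scoring, so that words inside a title are not counted as sentiment. The sketch below is only an illustration: the list `song_titles`, the helper `mask_song_titles()`, and the placeholder token are assumptions, not part of the original pipeline.

```r
library(stringr)

# Assumed list of song titles whose words could be mistaken for sentiment
song_titles <- c("Not Like Us", "Like That")

# Hypothetical helper: replace each title with a neutral placeholder token.
# fixed() treats the title as a literal string; ignore_case covers "not like us".
mask_song_titles <- function(x, titles, placeholder = "SONGTITLE") {
  for (title in titles) {
    x <- str_replace_all(x, fixed(title, ignore_case = TRUE), placeholder)
  }
  x
}

example <- "Universal Music Group Responds to Drake Legal Filing Over Not Like Us: Offensive & Untrue."
masked <- mask_song_titles(example, song_titles)
masked

# The masked text could then be passed to get_sentences() and sentiment_by()
# in place of title_text.
```

Whether the score actually moves toward neutral would need to be checked on the real data, since words such as “Offensive” and “Untrue” carry negative weight on their own. Masking a short title like “Like That” can also hit ordinary uses of the phrase, so the title list should be kept small.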
drake_sentiment_clean <- drake_sentiment %>%
filter(!is.na(sentiment_dict),
!is.na(date_utc)) %>%
mutate(
date_utc = as.Date(date_utc)
)
# Monthly Average
sentiment_monthly <- drake_sentiment_clean %>%
group_by(month = floor_date(date_utc, "month")) %>%
summarise(
mean_sentiment = mean(sentiment_dict),
n_posts = n(),
.groups = "drop"
)
# Significant Date
event_dates <- tibble(
event_date = as.Date(c("2024-03-22", "2024-05-04", "2025-02-09")),
event_label = c("Like That release",
"Not Like Us release",
"Super Bowl LIX (Kendrick)")
)
ggplot(sentiment_monthly, aes(x = month, y = mean_sentiment)) +
geom_line() +
geom_point() +
geom_vline(data = event_dates,
aes(xintercept = as.numeric(event_date)),
linetype = "dashed") +
geom_text(data = event_dates,
aes(x = event_date,
y = max(sentiment_monthly$mean_sentiment, na.rm = TRUE),
label = event_label),
angle = 90, vjust = -0.5, hjust = 1, size = 3) +
labs(
title = "Monthly Average Sentiment of Drake-related Reddit Posts",
x = "Month",
y = "Average sentiment score"
) +
theme_minimal()
Correlation with Key Events: The sentiment timeline appears to track real-world events in the feud. The sharp decline in April 2024 comes right after the release of “Like That” (March 22), suggesting a quick negative reaction from the community as the feud ignited.
Volatility and “Diss Fatigue”: The graph displays significant volatility, particularly the erratic ups and downs between May and October 2024. This suggests that public sentiment was highly reactive to individual news cycles and track releases rather than following a steady trend.
The Pre-Super Bowl Low: Interestingly, the lowest sentiment point occurs in January 2025, just before the Super Bowl. This drastic dip may reflect the community’s negative anticipation of Kendrick Lamar’s “victory lap” performance or dissatisfaction with Drake’s activities during that period (such as the build-up to the release of his collaborative projects).
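Monthly averages based on only a few posts can swing strongly, so it may help to look at the post count and a standard error next to each monthly mean before reading too much into a single dip. This is a minimal sketch on made-up data; `toy_scores` stands in for `drake_sentiment_clean`, and the `se` column is an addition that is not in the original summary.

```r
library(dplyr)
library(lubridate)
library(tibble)

# Toy stand-in for drake_sentiment_clean (values are made up for illustration)
toy_scores <- tibble(
  date_utc = as.Date(c("2024-03-05", "2024-03-20", "2024-03-28",
                       "2024-04-02", "2024-04-15", "2025-01-10")),
  sentiment_dict = c(0.10, 0.00, 0.20, -0.30, -0.10, -0.50)
)

monthly_uncertainty <- toy_scores %>%
  group_by(month = floor_date(date_utc, "month")) %>%
  summarise(
    mean_sentiment = mean(sentiment_dict),
    n_posts = n(),
    # Standard error of the mean; NA when a month has a single post
    se = sd(sentiment_dict) / sqrt(n_posts),
    .groups = "drop"
  )

monthly_uncertainty
```

A month with very few posts, or with a large standard error, is weak evidence for a real shift in sentiment, which matters for the volatile stretch and the January 2025 low discussed above.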
drake_phase <- drake_sentiment_clean %>%
  mutate(
    phase = case_when(
      date_utc >= as.Date("2024-03-01") & date_utc <= as.Date("2024-04-30") ~ "Phase 1: Early feud",
      date_utc >= as.Date("2024-05-01") & date_utc <= as.Date("2025-01-31") ~ "Phase 2: Diss war",
      # Note: Phase 3 starts on Feb 1, a few days before the Super Bowl (Feb 9)
      date_utc >= as.Date("2025-02-01") & date_utc <= as.Date("2025-03-31") ~ "Phase 3: Post Super Bowl",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(phase))
ggplot(drake_phase, aes(x = phase, y = sentiment_dict)) +
geom_violin(trim = FALSE, alpha = 0.5) +
geom_boxplot(width = 0.15, outlier.size = 0.7) +
labs(
title = "Sentiment Distribution by Feud Phase",
x = "Phase",
y = "Sentiment score"
) +
theme_minimal()
Consistency of the Median: Despite the intensity of the rivalry, the median sentiment (the bold horizontal line within each box plot) stays near zero across all three phases. This indicates that while the “loudest” posts may have been extreme, the typical post on Reddit remained relatively balanced or neutral.
Polarization in Phase 1: The “Early Feud” phase exhibits a slightly stretched distribution compared to the others. This suggests a period of higher polarization where fans were fiercely debating the start of the conflict, resulting in a mix of both defensive praise for Drake and sharp criticism.
Stabilization in Phase 3: By the “Post Super Bowl” phase, the shape of the violin plot appears slightly more condensed. This implies that as the narrative of the feud settled—with the public largely viewing Kendrick as the victor—the emotional intensity of the Reddit threads cooled down, leading to fewer extreme sentiment outliers.
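The readings of “stretched” and “condensed” distributions are visual, so a numeric summary of spread per phase could back them up. The sketch below uses made-up scores in a toy data frame that stands in for `drake_phase`; the summary columns are illustrative choices, not part of the original analysis.

```r
library(dplyr)
library(tibble)

# Toy stand-in for drake_phase (values are made up for illustration)
toy_phase <- tibble(
  phase = rep(c("Phase 1: Early feud",
                "Phase 2: Diss war",
                "Phase 3: Post Super Bowl"), each = 4),
  sentiment_dict = c(-0.4, -0.1, 0.1, 0.4,
                     -0.2, -0.1, 0.1, 0.2,
                     -0.1,  0.0, 0.0, 0.1)
)

phase_spread <- toy_phase %>%
  group_by(phase) %>%
  summarise(
    n_posts = n(),
    median_sentiment = median(sentiment_dict),
    # Interquartile range and standard deviation as two measures of spread
    iqr_sentiment = IQR(sentiment_dict),
    sd_sentiment = sd(sentiment_dict),
    .groups = "drop"
  )

phase_spread
```

If the same summary on the real data showed a clearly larger IQR in Phase 1 and a smaller one in Phase 3, that would support the polarization and stabilization readings; similar values would suggest the visual differences are minor.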
drake_kendrick <- drake_sentiment_clean %>%
mutate(
has_kendrick = if_else(
str_detect(str_to_lower(title_text), "kendrick"),
"Kendrick mentioned",
"Kendrick not mentioned"
)
)
ggplot(drake_kendrick, aes(x = has_kendrick, y = sentiment_dict)) +
geom_violin(trim = FALSE, alpha = 0.5) +
geom_boxplot(width = 0.15, outlier.size = 0.7) +
labs(
title = "Sentiment When Kendrick is Mentioned vs Not Mentioned",
x = "",
y = "Sentiment score"
) +
theme_minimal()
Unexpected Similarity: Visually, the distributions for
posts mentioning Kendrick Lamar versus those that do not are remarkably
similar. This is a counter-intuitive finding; one might expect threads
discussing a bitter rival to be significantly more negative.
The “Spectacle” Factor: The fact that sentiment does not drop significantly when Kendrick is mentioned suggests that the subreddit users may view the feud as entertainment. Words associated with high-profile rap battles (e.g., “legendary,” “classic,” “winner”) are often coded as positive in sentiment dictionaries, which might be counterbalancing the negative context of the conflict.
Drake’s Baseline Sentiment: This comparison implies that Drake’s sentiment on Reddit is not solely defined by his rivalry with Kendrick. The “Not Mentioned” category still contains a wide range of negative scores, indicating that listeners have critiques of Drake’s music or behavior independent of the feud itself.
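Since the similarity between the two groups is judged by eye, a simple two-sample test could complement the plot. The sketch below applies a Wilcoxon rank-sum test from base R to simulated scores; `toy_kendrick` stands in for `drake_kendrick`, and the choice of this particular test is an assumption, not something the original analysis used.

```r
# Toy stand-in for drake_kendrick; both groups are simulated from the same
# distribution purely for illustration
set.seed(42)
toy_kendrick <- data.frame(
  has_kendrick = rep(c("Kendrick mentioned", "Kendrick not mentioned"), each = 30),
  sentiment_dict = c(rnorm(30, mean = 0, sd = 0.2),
                     rnorm(30, mean = 0, sd = 0.2))
)

# Median sentiment per group
group_medians <- tapply(toy_kendrick$sentiment_dict,
                        toy_kendrick$has_kendrick,
                        median)
group_medians

# Wilcoxon rank-sum test: do the two groups differ in location?
test_result <- wilcox.test(sentiment_dict ~ has_kendrick, data = toy_kendrick)
test_result$p.value
```

A large p-value on the real data would be consistent with the visual impression of similar distributions, although it would not prove that the groups are identical.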