This project analyzes the text and sentiment of four of Kendrick Lamar’s most popular albums: “Untitled Unmastered”, “good kid, m.A.A.d city”, “To Pimp a Butterfly”, and “DAMN”. I hypothesize that “To Pimp a Butterfly” will have the most negative sentiment, as it addresses and comments on systemic racism in American society more than any of his other albums. I expect the other albums to be negative as well, mainly because of the amount of cursing in Kendrick Lamar’s lyrics.
The data for these albums was acquired via the Genius API.
library(tidyverse)
library(tidytext)
library(jsonlite)
library(ggthemes)
library(wordcloud2)
I read in the JSON files and converted them into usable data frames, one for each album.
untitledunmastered <- fromJSON("Lyrics_untitledunmastered_.json")
goodkidmadcity <- fromJSON("Lyrics_goodkidm.A.A.dcity.json")
topimpabutterfly <- fromJSON("Lyrics_ToPimpaButterfly.json")
damn <- fromJSON("Lyrics_DAMN.json")
as.data.frame(untitledunmastered$tracks$song) -> um_df
um_df %>%
  mutate(album = "Untitled Unmastered") -> um_df
as.data.frame(goodkidmadcity$tracks$song) -> gkmc_df
gkmc_df %>%
  mutate(album = "good kid, m.A.A.d city") -> gkmc_df
as.data.frame(topimpabutterfly$tracks$song) -> tpab_df
tpab_df %>%
  mutate(album = "To Pimp a Butterfly") -> tpab_df
as.data.frame(damn$tracks$song) -> damn_df
damn_df %>%
  mutate(album = "DAMN") -> damn_df
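The four load-and-label steps above follow the same pattern, so they could be wrapped in one function. The following is a minimal sketch, not part of the original analysis: `load_album` is a hypothetical helper, and the toy JSON only mimics the structure I assume the Genius files have (a `tracks` array whose `song` objects carry `title` and `lyrics`).

```r
library(jsonlite)
library(dplyr)

# Hypothetical helper (not in the original analysis): parse one album's JSON
# and tag every track with the album name.
load_album <- function(json, album_name) {
  parsed <- fromJSON(json)
  as.data.frame(parsed$tracks$song) %>%
    mutate(album = album_name)
}

# Toy JSON mimicking the assumed structure: tracks -> song -> title / lyrics
toy_json <- '{"tracks": [
  {"song": {"title": "Track A", "lyrics": "first line of lyrics"}},
  {"song": {"title": "Track B", "lyrics": "second line of lyrics"}}
]}'

toy_df <- load_album(toy_json, "Toy Album")
toy_df
```

Because `fromJSON()` also accepts a file path, the same helper could be called on the real files, for example `load_album("Lyrics_DAMN.json", "DAMN")`.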
Here, I unnested the tokens for each album to separate each word into its own row. I also created a data frame with all the albums together.
um_df %>%
  unnest_tokens(word, lyrics) -> um_words
gkmc_df %>%
  unnest_tokens(word, lyrics) -> gkmc_words
tpab_df %>%
  unnest_tokens(word, lyrics) -> tpab_words
damn_df %>%
  unnest_tokens(word, lyrics) -> damn_words
rbind(
  subset(um_df, select = c("title", "lyrics", "album")),
  subset(gkmc_df, select = c("title", "lyrics", "album")),
  subset(tpab_df, select = c("title", "lyrics", "album")),
  subset(damn_df, select = c("title", "lyrics", "album"))
) -> combined_df
combined_df %>%
  unnest_tokens(word, lyrics) %>%
  # Join explicitly on "word" so dplyr does not have to guess the key
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) -> combined_words
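To make the tokenizing step concrete, here is a small sketch on a made-up lyric (the `toy` data frame is an assumption for illustration, not real album data). It shows that `unnest_tokens()` lowercases the text, strips punctuation, and produces one row per word, after which `anti_join()` drops the stop words.

```r
library(dplyr)
library(tidytext)

# Made-up one-track data frame with the same columns used in the analysis
toy <- tibble(
  title  = "Toy Track",
  lyrics = "I love the city, and the city loves me"
)

toy_words <- toy %>%
  unnest_tokens(word, lyrics) %>%        # one lowercase word per row
  anti_join(stop_words, by = "word")     # drop common words such as "the"

toy_words
```

The `title` column is carried along on every row, which is what later makes it possible to group words by track or album.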
For these four albums, the top words overall are represented in this wordcloud. It’s interesting to see that, despite an abundance of negative lyrics, the word ‘love’ is the second most popular word.
combined_words %>%
  count(word, sort = TRUE) %>%
  head(100) %>%
  filter(!word %in% c("lamar", "kendrick")) %>%
  wordcloud2()
For each album, I plotted the word frequency as well as the accompanying AFINN sentiment score for each word. Additionally, these plots are paired with wordclouds specific to the albums.
Untitled Unmastered appears to score the highest of all four albums in negative sentiment. Apart from ‘love’, which ranks as the 7th most popular word, the most popular words all seem to carry a negative sentiment. However, in reality many of these words aren’t necessarily meant to be negative. For example, Kendrick Lamar uses the word ‘bitch’ more than any other word, but he mostly uses it as a slang term for a woman.
um_words %>%
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
  count(word, sort = TRUE) %>%
  filter(n > 2) %>%
  # Keep only words that have an AFINN score (value runs from -5 to 5)
  inner_join(get_sentiments('afinn'), by = "word") %>%
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() +
  coord_flip() +
  theme_economist() +
  xlab("Word") +
  ylab("Count") +
  ggtitle("Untitled Unmastered | Word Frequency with Sentiment")
Wordcloud for Untitled Unmastered album
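Since slang like this can dominate a lexicon-based score, one way to gauge its effect is to score the same word counts with and without the ambiguous words. This is only a sketch under stated assumptions: the lexicon values and counts below are invented for illustration (they are not the real AFINN values or the album’s counts), and `score_counts` and `ambiguous` are hypothetical names.

```r
library(dplyr)

# Toy lexicon with illustrative scores (NOT the real AFINN values)
toy_lexicon <- tibble(
  word  = c("love", "hate", "bitch"),
  value = c(3, -3, -5)
)

# Invented word counts standing in for one album
toy_counts <- tibble(
  word = c("bitch", "love", "hate"),
  n    = c(10, 6, 2)
)

# Words whose lexicon score may not match how they are used in the lyrics
ambiguous <- c("bitch")

# Hypothetical helper: count-weighted average sentiment of the scored words
score_counts <- function(counts, lexicon) {
  counts %>%
    inner_join(lexicon, by = "word") %>%
    summarize(sentiment = weighted.mean(value, n)) %>%
    pull(sentiment)
}

raw_score      <- score_counts(toy_counts, toy_lexicon)
adjusted_score <- score_counts(toy_counts, filter(toy_lexicon, !word %in% ambiguous))

c(raw = raw_score, adjusted = adjusted_score)
```

In this toy case a single frequent word is enough to flip the average from positive to negative, which is the kind of effect to keep in mind when reading the album scores.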
Kendrick Lamar’s second studio album, “good kid, m.A.A.d city”, illustrates the life of a young black man growing up in a lower-income area. It’s fascinating how ironically the word ‘love’ is used in this album. Although it is the most common word in the album, Lamar uses it to contrast with his surroundings in Compton, California. Other notable popular words in the album are promise, kill, tired, hope, and justice.
gkmc_words %>%
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
  count(word, sort = TRUE) %>%
  filter(n > 10) %>%
  inner_join(get_sentiments('afinn'), by = "word") %>%
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() +
  coord_flip() +
  theme_economist() +
  xlab("Word") +
  ylab("Count") +
  ggtitle("good kid, m.A.A.d city | Word Frequency with Sentiment")
Wordcloud for good kid, m.A.A.d city album
To Pimp a Butterfly is Kendrick Lamar’s masterpiece and gift to black people of the future. It addresses a materialistic society and the right way to use fame, turning to black history for guidance. Because of its strong commentary on the systemic racism of America, I assumed this would be Kendrick Lamar’s most negative album. However, it is actually the most positive of the four. This is most likely due to Kendrick Lamar’s lyricism: he typically uses sarcasm and rhetorical devices to make a point, and a word-level lexicon such as AFINN scores each word on its own, without that context.
tpab_words %>%
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
  count(word, sort = TRUE) %>%
  filter(n > 10) %>%
  inner_join(get_sentiments('afinn'), by = "word") %>%
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() +
  coord_flip() +
  theme_economist() +
  xlab("Word") +
  ylab("Count") +
  ggtitle("To Pimp a Butterfly | Word Frequency with Sentiment")
Wordcloud for To Pimp a Butterfly album
“DAMN” is Kendrick Lamar’s fourth studio album and another commentary on societal norms, this time focusing on morality and religion rather than specifically on black oppression. This album showed a fair amount of contrast, which was fairly predictable given what I already knew about it. While songs like XXX ft. U2 offer a typical Kendrick Lamar performance with negativity, tracks like Loyalty ft. Rihanna and LOVE ft. Zacari offer antidotes to Kendrick’s negativity. This is clearly evident in the word frequency sentiment graph below: while “bitch” takes the top spot, “love” and “loyalty” counter it with strongly positive sentiment.
damn_words %>%
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
  count(word, sort = TRUE) %>%
  filter(n > 10) %>%
  inner_join(get_sentiments('afinn'), by = "word") %>%
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() +
  coord_flip() +
  theme_economist() +
  xlab("Word") +
  ylab("Count") +
  ggtitle("DAMN | Word Frequency with Sentiment")
Wordcloud for DAMN album
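The contrast between individual tracks could also be checked directly by averaging sentiment per track instead of per album. The sketch below shows the idea on invented data: the two toy tracks and the lexicon values are assumptions for illustration, not real lyrics or real AFINN scores.

```r
library(dplyr)
library(tidytext)

# Toy lexicon with illustrative scores (NOT the real AFINN values)
toy_lexicon <- tibble(
  word  = c("love", "loyal", "kill", "pain"),
  value = c(3, 3, -3, -2)
)

# Two invented tracks with the same columns as the album data frames
toy_tracks <- tibble(
  album  = "Toy Album",
  title  = c("Track A", "Track B"),
  lyrics = c("love love loyal", "kill pain love")
)

track_sentiment <- toy_tracks %>%
  unnest_tokens(word, lyrics) %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(album, title) %>%
  summarize(track_sentiment = mean(value), .groups = "drop")

track_sentiment
```

On the real data, the same grouping applied to `damn_words` with the AFINN lexicon would give one score per song, which would show whether the positive and negative tracks really pull in opposite directions.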
Here I plotted the aggregated average sentiment for each of the four albums. The results show that “Untitled Unmastered” and “good kid, m.A.A.d city” are substantially more negative than “To Pimp a Butterfly” and “DAMN”.
combined_words %>%
  inner_join(get_sentiments('afinn'), by = "word") %>%
  group_by(album) %>%
  # Each row is one word occurrence, so frequent words weigh more in the mean
  summarize(album_sentiment = mean(value)) %>%
  ggplot(aes(y = reorder(album, -album_sentiment), x = album_sentiment)) +
  geom_col() +
  theme_economist() +
  xlab("Album Sentiment (afinn)") +
  ylab("Album") +
  ggtitle("Average Sentiment by Album")
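As a worked example of how this album average behaves, here is the same join, group, and mean pattern on a tiny invented data set (the words, albums, and lexicon values are assumptions for illustration, not the real data or the real AFINN scores). Every word occurrence is its own row, so a word that appears twice counts twice.

```r
library(dplyr)

# Toy lexicon with illustrative scores (NOT the real AFINN values)
toy_lexicon <- tibble(word = c("love", "hate"), value = c(3, -3))

# One row per word occurrence, as produced by unnest_tokens()
toy_words <- tibble(
  album = c("Album A", "Album A", "Album A", "Album B", "Album B", "Album B"),
  word  = c("love", "love", "hate", "hate", "hate", "love")
)

toy_summary <- toy_words %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(album) %>%
  summarize(
    album_sentiment = mean(value),  # Album A: (3 + 3 - 3) / 3
    n_scored        = n()           # number of scored word occurrences
  )

toy_summary
```

Words that are missing from the lexicon are dropped by the inner join, so the average only reflects the scored words and not the full lyrics.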
In conclusion, the results did not match my initial hypothesis. I thought that “To Pimp a Butterfly,” the album that focuses primarily on the systemic oppression of the black male, would be the most negative of the four albums, but it was actually the least negative. With an average AFINN sentiment of approximately -0.8, “To Pimp a Butterfly” received a more positive score than the other three hit albums. This is likely due to a few reasons, the main one being that heavy profanity pulls every album’s score into negative territory, so albums with more curse words receive lower scores. It is no surprise that all four albums fall well into the negative range, though. Kendrick Lamar is known as one of the most influential artists of our time, and he raps from his own point of view, growing up on the streets of Compton, California.