This project analyzes the lyrics and sentiment of four of Kendrick Lamar’s most popular albums: “Untitled Unmastered,” “good kid, m.A.A.d city,” “To Pimp a Butterfly,” and “DAMN.” I hypothesize that “To Pimp a Butterfly” will have the most negative sentiment, since it addresses systemic racism in American society more directly than any of his other albums. I expect the other albums to score negative as well, mainly because of the amount of profanity in Kendrick Lamar’s lyrics.

The data for these albums was acquired via the Genius API.

Library

library(tidyverse)
library(tidytext)
library(jsonlite)
library(ggthemes)
library(wordcloud2)

Data Processing

I read in the jsons and converted them into usable data frames for each album.

untitledunmastered <- fromJSON("Lyrics_untitledunmastered_.json")
goodkidmadcity <- fromJSON("Lyrics_goodkidm.A.A.dcity.json")
topimpabutterfly <- fromJSON("Lyrics_ToPimpaButterfly.json")
damn <- fromJSON("Lyrics_DAMN.json")

# Pull the per-track song table out of each album and tag it with the album name
um_df <- as.data.frame(untitledunmastered$tracks$song) %>% 
  mutate(album = "Untitled Unmastered")

gkmc_df <- as.data.frame(goodkidmadcity$tracks$song) %>% 
  mutate(album = "good kid, m.A.A.d city")

tpab_df <- as.data.frame(topimpabutterfly$tracks$song) %>% 
  mutate(album = "To Pimp a Butterfly")

damn_df <- as.data.frame(damn$tracks$song) %>% 
  mutate(album = "DAMN")
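
The four read-and-label steps above repeat the same pattern, so they could be wrapped in a small helper. The sketch below is one possible way to do that, not the code used for this analysis. `load_album()` is a hypothetical name, and the demo file only mimics the `tracks` / `song` / `title` and `lyrics` layout that the code above relies on; the real Genius export presumably contains more fields.

```r
library(jsonlite)

# Hypothetical helper: read one album JSON and tag each track with the album name.
# Assumes the file has a `tracks` array whose items each hold a `song` object.
load_album <- function(path, album_name) {
  raw <- fromJSON(path)
  df <- as.data.frame(raw$tracks$song)
  df$album <- album_name
  df
}

# Tiny stand-in file with made-up content, only to show the expected layout
demo_path <- tempfile(fileext = ".json")
writeLines(
  '{"tracks": [
     {"song": {"title": "Track One", "lyrics": "first placeholder lyric"}},
     {"song": {"title": "Track Two", "lyrics": "second placeholder lyric"}}
   ]}',
  demo_path
)

demo_df <- load_album(demo_path, "Demo Album")
print(demo_df[, c("title", "album")])
```

With the real files this would read, for example, `um_df <- load_album("Lyrics_untitledunmastered_.json", "Untitled Unmastered")`.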

Here, I unnested the tokens for each album to separate each word into its own row. I also created a data frame that combines all four albums.

# Tokenize each album's lyrics: one row per word
um_words <- um_df %>% 
  unnest_tokens(word, lyrics)

gkmc_words <- gkmc_df %>% 
  unnest_tokens(word, lyrics)

tpab_words <- tpab_df %>% 
  unnest_tokens(word, lyrics)

damn_words <- damn_df %>% 
  unnest_tokens(word, lyrics)

# Stack the four albums, keeping only the columns needed downstream
combined_df <- rbind(
  subset(um_df, select = c("title", "lyrics", "album")),
  subset(gkmc_df, select = c("title", "lyrics", "album")),
  subset(tpab_df, select = c("title", "lyrics", "album")),
  subset(damn_df, select = c("title", "lyrics", "album"))
)

# Tokenize the combined data, then drop stop words and lyric-sheet filler
combined_words <- combined_df %>% 
  unnest_tokens(word, lyrics) %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah'))
## Joining, by = "word"
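
As a rough illustration of what the tokenizing and filtering steps above do, the sketch below approximates them in base R on made-up lyrics. It is not tidytext's actual tokenizer. `tokenize_lyrics()`, `toy_df`, and `toy_stop` are hypothetical names, and the three-word stop list merely stands in for `stop_words`.

```r
# Toy stand-in for one album's data frame (made-up lyrics, not real data)
toy_df <- data.frame(
  title  = c("Song A", "Song B"),
  lyrics = c("Love me, love me not", "I hope. I hope!"),
  stringsAsFactors = FALSE
)

# Rough approximation of unnest_tokens(word, lyrics):
# lowercase, replace punctuation with spaces, split on whitespace, one row per word
tokenize_lyrics <- function(df) {
  cleaned <- gsub("[^a-z' ]", " ", tolower(df$lyrics))
  tokens  <- strsplit(trimws(cleaned), "\\s+")
  data.frame(
    title = rep(df$title, lengths(tokens)),
    word  = unlist(tokens),
    stringsAsFactors = FALSE
  )
}

toy_words <- tokenize_lyrics(toy_df)

# Stand-in for anti_join(stop_words): drop rows whose word is in the stop list
toy_stop <- c("me", "not", "i")
toy_filtered <- toy_words[!toy_words$word %in% toy_stop, ]
print(toy_filtered)
```

Each row of the result is a single word that still carries its song title, which is what makes the later per-album counts and joins possible.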

Wordcloud

The top words across all four albums are shown in this wordcloud. It’s interesting that, despite an abundance of negative lyrics, ‘love’ is the second most frequent word.

combined_words %>%
  count(word, sort = TRUE) %>%
  head(100) %>% 
  filter(!word %in% c("lamar", "kendrick")) %>% 
  wordcloud2()

Plot Word Frequency for Each Album

For each album, I plotted the word frequency as well as the accompanying AFINN sentiment score for each word. Additionally, these plots are paired with wordclouds specific to the albums.
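
One detail worth keeping in mind when reading these plots: an inner join against the AFINN lexicon keeps only words that have a score, so frequent words without one never appear. The base R sketch below shows that effect on made-up counts. The scores in `toy_lexicon` are illustrative placeholders on AFINN's -5 to 5 scale, not values looked up from the real lexicon.

```r
# Made-up word counts for one album
toy_counts <- data.frame(
  word = c("love", "money", "kill", "hope"),
  n    = c(12, 9, 5, 3),
  stringsAsFactors = FALSE
)

# Placeholder lexicon standing in for get_sentiments('afinn')
toy_lexicon <- data.frame(
  word  = c("love", "kill", "hope", "sad"),
  value = c(3, -3, 2, -2),
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default: unmatched words are dropped
scored <- merge(toy_counts, toy_lexicon, by = "word")
scored <- scored[order(-scored$n), ]
print(scored)
```

Here "money" is the second most frequent word but disappears after the join because the placeholder lexicon has no entry for it.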

Untitled Unmastered appears to have the most negative sentiment of the four albums. Apart from ‘love’, which ranks as the 7th most frequent word, the most frequent words all seem to carry a negative sentiment. However, many of these words are not necessarily meant negatively in context. For example, Kendrick Lamar uses the word ‘bitch’ more than any other word, but he mostly uses it as a slang term for a woman.

um_words %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
  count(word, sort = TRUE) %>% 
  filter(n > 2) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() + coord_flip() + theme_economist() +
  xlab("Word") + ylab("Count") +
  ggtitle("Untitled Unmastered | Word Frequency with Sentiment")
## Joining, by = "word"
## Joining, by = "word"

Wordcloud for Untitled Unmastered album

Kendrick Lamar’s second studio album, “good kid, m.A.A.d city,” illustrates the life of a young black man growing up in a lower-income area. It’s fascinating how ironically the word ‘love’ is used in this album. Although it is the most common word on the album, Lamar uses it as a contrast to his surroundings in Compton, California. Other notable frequent words on the album include promise, kill, tired, hope, and justice.

gkmc_words %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>% 
  count(word, sort = TRUE) %>% 
  filter(n > 10) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() + coord_flip() + theme_economist() +
  xlab("Word") + ylab("Count") +
  ggtitle("good kid, m.A.A.d city | Word Frequency with Sentiment")
## Joining, by = "word"
## Joining, by = "word"

Wordcloud for good kid, m.A.A.d city album

To Pimp a Butterfly is Kendrick Lamar’s masterpiece and gift to black people of the future. It addresses a materialistic society and the responsible use of fame, turning to black history for guidance. Given its strong commentary on systemic racism in America, I assumed this would be Kendrick Lamar’s most negative album. However, it is actually the most positive of the four. This is most likely due to Kendrick Lamar’s lyricism: he often relies on sarcasm and other rhetorical devices to make a point, which a word-level sentiment score cannot detect.

tpab_words %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>% 
  count(word, sort = TRUE) %>% 
  filter(n > 10) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() + coord_flip() + theme_economist() +
  xlab("Word") + ylab("Count") +
  ggtitle("To Pimp a Butterfly | Word Frequency with Sentiment")
## Joining, by = "word"
## Joining, by = "word"

Wordcloud for To Pimp a Butterfly album

“DAMN” is Kendrick Lamar’s fourth studio album and another commentary on societal norms, this time focusing on morality and religion rather than specifically on black oppression. The album shows a fair amount of contrast, which was predictable given prior knowledge of it. While songs like XXX ft. U2 offer a typically negative Kendrick Lamar performance, tracks like Loyalty ft. Rihanna and LOVE ft. Zacari offer antidotes to Kendrick’s negativity. This is clearly evident in the word frequency sentiment graph below: while “bitch” takes the top spot, “love” and “loyalty” counter it with strongly positive sentiment.

damn_words %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>% 
  count(word, sort = TRUE) %>% 
  filter(n > 10) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  ggplot(aes(reorder(word, n), n, fill = value)) +
  geom_col() + coord_flip() + theme_economist() +
  xlab("Word") + ylab("Count") +
  ggtitle("DAMN | Word Frequency with Sentiment")
## Joining, by = "word"
## Joining, by = "word"

Wordcloud for DAMN album

Compare Average Sentiments of Each Album

Here I plotted the average sentiment for each of the four albums. The results show that “Untitled Unmastered” and “good kid, m.A.A.d city” are noticeably more negative than “To Pimp a Butterfly” and “DAMN”.

combined_words %>% 
  inner_join(get_sentiments('afinn')) %>% 
  group_by(album) %>% 
  summarize(album_sentiment = mean(value)) %>% 
  ggplot(aes(y = reorder(album, -album_sentiment), x = album_sentiment)) +
  geom_col() + theme_economist() +
  xlab("Album Sentiment (afinn)") + ylab("Album") +
  ggtitle("Average Sentiment by Album")
## Joining, by = "word"
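
Because the join happens before `group_by()`, the mean is taken over every matched word occurrence, so a word that appears often pulls the album average more than a rare one. A minimal base R sketch of that calculation, using made-up album names and placeholder scores rather than real AFINN values:

```r
# One row per scored word occurrence (made-up data)
toy_tokens <- data.frame(
  album = c("Album X", "Album X", "Album X", "Album Y", "Album Y"),
  word  = c("love", "kill", "kill", "love", "hope"),
  value = c(3, -3, -3, 3, 2),
  stringsAsFactors = FALSE
)

# Mean score per album, equivalent in spirit to group_by(album) + summarize(mean(value))
album_sentiment <- aggregate(value ~ album, data = toy_tokens, FUN = mean)
print(album_sentiment)
```

In this toy data "kill" occurs twice on Album X, so it counts twice in that album's average.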

Conclusion

In conclusion, the results did not match my initial hypothesis. I expected “To Pimp a Butterfly,” the album that focuses primarily on the systemic oppression of black men, to be the most negative of the four albums, but it was actually the least negative. With an average AFINN sentiment of approximately -0.8, “To Pimp a Butterfly” scored more positively than the other three albums. There are likely a few reasons for this, the main one being that all four albums score negative because of heavy profanity, so albums with more curse words receive more negative scores. It is no surprise, though, that all the albums sit well into the negative range: Kendrick Lamar is known as one of the most influential artists of our time, and he raps from his own point of view about growing up on the streets of Compton, California.
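
One way to probe the profanity explanation would be to recompute the album averages with a chosen set of curse words removed and see how much the scores move. The sketch below shows the idea on made-up data. `exclude`, `mean_by_album()`, the album names, and the scores are all placeholders, not values from the analysis above.

```r
# One row per scored word occurrence (made-up data and placeholder scores)
toy_scored <- data.frame(
  album = c("Album X", "Album X", "Album X", "Album Y", "Album Y", "Album Y"),
  word  = c("damn", "damn", "love", "damn", "love", "hope"),
  value = c(-4, -4, 3, -4, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical list of words to leave out of the second pass
exclude <- c("damn")

mean_by_album <- function(df) aggregate(value ~ album, data = df, FUN = mean)

with_profanity    <- mean_by_album(toy_scored)
without_profanity <- mean_by_album(toy_scored[!toy_scored$word %in% exclude, ])

# Side-by-side comparison: value_all vs. value_filtered
comparison <- merge(with_profanity, without_profanity,
                    by = "album", suffixes = c("_all", "_filtered"))
print(comparison)
```

If the real albums behaved like this toy example, with averages shifting sharply once the excluded words are gone, that would support the idea that profanity drives most of the negative score.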