Introduction

This project examines the sentiment expressed in the lyrics and user reviews of two prominent artists from different music genres: Ed Sheeran, a singer-songwriter known for his romantic pop music, and 21 Savage, a rapper known for his raw and intense lyrics about sex, drugs, and murder. By analyzing both their lyrics and the sentiment in album reviews, this project aims to uncover patterns in emotional expression and public perception.

Research Questions

  1. What emotions are most prevalent in the lyrics of Ed Sheeran compared to those of 21 Savage?
  2. Which artist exhibits a wider range of emotional expression in their lyrics?

After looking at the emotions expressed in the lyrics, I want to perform a similar sentiment analysis on the words used in the reviews for both artists. I am hoping to find some correlation between the types of language used by the artists and the language used by their listeners.

Collecting the Data

Before I could start collecting lyrics, I needed access to the Genius API. This required me to register on the Genius website and get an API key. With my API key in hand, I went to work in R. Here is some background on the functions I wrote to get the text data I wanted.

Get Songs Function: I wrote a function called get_genius_songs that would take an artist’s name and my API key, and it would ask the Genius API for a list of songs by that artist. The function would send a request to the API and receive data about the songs.

Get Lyrics Function: I wrote a second function called get_lyrics, which would take a URL to a song’s page on Genius and pull the lyrics from that page.

Once I had my functions ready, I used them to fetch a list of songs for both Ed Sheeran and 21 Savage. For each song, I used the song’s URL provided by the first function to fetch the actual lyrics with the second function.

After getting the lyrics, I had to clean them up. Lyrics on Genius include things like [Chorus] or [Verse], which I didn’t need for my analysis. So, I wrote some code to remove these parts, leaving just the words of the songs.
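A minimal sketch of that cleanup step in base R is below. The helper name strip_section_tags and the placeholder text are assumptions for illustration, not the code from the original project; the idea is simply to delete anything inside square brackets and tidy the leftover whitespace.

```r
# Remove Genius section markers such as [Chorus] or [Verse 1]
# and tidy the whitespace they leave behind.
# strip_section_tags is a hypothetical helper name.
strip_section_tags <- function(lyrics) {
  no_tags <- gsub("\\[[^]]*\\]", "", lyrics)  # drop anything in square brackets
  no_tags <- gsub("[ \t]+", " ", no_tags)     # collapse runs of spaces and tabs
  trimws(no_tags)                             # trim leading/trailing whitespace
}

# Placeholder text standing in for scraped lyrics
raw <- "[Verse 1] First line of the song [Chorus] Second line of the song"
cleaned <- strip_section_tags(raw)
print(cleaned)
```

Note that this pattern removes every bracketed span, so it would also delete any bracketed text that is part of the lyrics themselves.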

For some context, here are the top 10 words used by Sheeran and Savage in their top 10 most streamed songs: (please excuse the naughty language used by our guy 21)

Question 1

To find out which emotions are most common in the songs of Ed Sheeran and 21 Savage, I used sentiment analysis with the NRC Word-Emotion Association Lexicon to identify different emotions in the words of the lyrics. I then counted how many times each emotion appeared in their songs. This process helped me see which emotions are expressed more frequently by each artist.
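The counting step can be sketched roughly as follows. This is an illustration in base R, not the original code: toy_lexicon is a tiny made-up stand-in for the NRC lexicon (the real one is far larger), and count_emotions and the column names are hypothetical.

```r
# Toy stand-in for the NRC lexicon: one row per (word, emotion) pair.
# These rows are illustrative only.
toy_lexicon <- data.frame(
  word    = c("love", "love", "happy", "gun", "gun", "cry"),
  emotion = c("joy", "positive", "joy", "fear", "anger", "sadness"),
  stringsAsFactors = FALSE
)

count_emotions <- function(lyrics, lexicon) {
  # Lowercase, then split on anything that is not a letter or apostrophe
  words <- unlist(strsplit(tolower(lyrics), "[^a-z']+"))
  words <- words[words != ""]
  # Inner join: a word that appears twice in the lyrics is counted twice,
  # and a word tagged with two emotions contributes to both.
  matched <- merge(
    data.frame(word = words, stringsAsFactors = FALSE),
    lexicon,
    by = "word"
  )
  sort(table(matched$emotion), decreasing = TRUE)
}

counts <- count_emotions("Love, love, happy days. No gun, no cry.", toy_lexicon)
print(counts)
```

Words that are not in the lexicon (here "days" and "no") simply drop out of the join, so the totals reflect only the words the lexicon recognizes.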

Question 1 Analysis

From the analysis, the results are pretty much what you might expect. Ed Sheeran’s songs often talk about love and positive feelings, which shows up in the lyrics analysis as lots of positive words. On the other hand, 21 Savage’s songs frequently deal with tougher subjects like struggles and conflicts, which is reflected in his lyrics having more words that express anger and negativity. This difference in their music can clearly be seen in the charts, where Ed Sheeran’s graph has higher counts of joyful and trusting words, while 21 Savage’s chart is filled with words that show sadness and anger. This contrast helps us understand not just their music styles, but also the emotions they choose to express through their songs.

Question 2

To determine which artist shows a greater variety of emotions in their lyrics, I calculated a measure called “entropy” for the lyrics of both Ed Sheeran and 21 Savage. Entropy is a way of measuring how spread out the emotional expressions are across different emotions. I analyzed all the words in their lyrics to see how many different emotions each word could be linked to and then used these counts to compute the entropy. A higher entropy value means that the artist’s lyrics display a wider range of emotions. This calculation gave me a clear comparison of the emotional diversity in the lyrics of both artists.
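For readers who want to see the calculation, here is a small sketch of Shannon entropy over emotion counts in base R. The function name emotion_entropy, the example counts, and the choice of a base-2 logarithm are all assumptions; the original analysis may have used a different log base or a different set of categories.

```r
# Shannon entropy (in bits) of a vector of emotion counts.
# emotion_entropy is a hypothetical name; the base-2 log is an assumption.
emotion_entropy <- function(counts) {
  p <- counts / sum(counts)  # turn counts into proportions
  p <- p[p > 0]              # treat 0 * log(0) as 0
  -sum(p * log2(p))
}

# Made-up counts for illustration only
even_counts   <- c(joy = 10, trust = 10, anger = 10, sadness = 10)
skewed_counts <- c(joy = 37, trust = 1, anger = 1, sadness = 1)

print(emotion_entropy(even_counts))    # four equally common emotions
print(emotion_entropy(skewed_counts))  # one emotion dominates
```

With a base-2 logarithm, four equally common emotions give exactly 2 bits and eight equally common emotions give 3 bits, while lyrics dominated by a single emotion score close to 0. The scale of any entropy score therefore depends on both the log base and the number of emotion categories counted.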

Question 2 Analysis

In the entropy chart, both Ed Sheeran and 21 Savage scored around 2.0, which tells us something interesting about their music. Even though their styles and themes are quite different, this score indicates that they both use a similar range of emotions in their lyrics. A score of 2.0 in entropy suggests that neither artist sticks to just one or two emotions; instead, they both express a variety of feelings through their songs.

This result might be surprising given their different musical genres. Ed Sheeran’s pop and acoustic ballads often focus on love and personal stories, while 21 Savage’s rap songs frequently tackle themes like street life. Despite these differences, the entropy value shows that both artists explore a broad emotional spectrum, from joy to sadness, from trust to fear. This analysis suggests that diversity in emotional expression is not tied to the type of music an artist produces; it may instead reflect their versatility and depth as a songwriter.

Using Another Data Source

To understand how people feel about Ed Sheeran’s and 21 Savage’s music, I needed to gather opinions from fans and listeners. These opinions are available in the form of online reviews, which people write after listening to their albums. To get these reviews, I turned to a popular music website called “Album of the Year,” where music fans frequently post their thoughts and ratings on different albums. I tried to scrape from multiple other sites including X and MusicBoard, but I couldn’t find the right elements using the inspect tool, so I stuck with this website.

Once I found the right pages, I saw that the reviews were listed, but they were spread across multiple pages. This meant I had to collect data from each page separately, so I wrote a script in R that could go to each review page and pull the text of the reviews. I only had to scrape from 2 pages for each artist, since there were 25 reviews on each page and I wanted 50 reviews.

Once the reviews were collected and cleaned, they were ready for analysis. I pretty much did the same thing with these reviews as I did with the lyrics. I used the NRC lexicon to get sentiment analysis for the words in the reviews.

Analyzing the Review Sentiments

I knew that the review data was going to go one of two ways:

  1. The language of the review would reflect only how good the listeners thought the music was (this is what I assumed the case probably was)

  2. The language that Ed Sheeran’s listeners used would reflect Sheeran’s positive, upbeat lyrics and the language Savage’s listeners used would reflect his more angry, negative language. Basically I was hoping that Sheeran’s listeners would be happy-go-lucky reviewers who used positive language, and Savage’s listeners would curse a lot.

Unfortunately, the former ended up being the case. Both Sheeran and Savage had overall ratings of around 8, which is a relatively positive score. They are both popular artists who have dozens of popular songs. As a result, positive language ended up being the most prevalent in reviews for both artists. The sentiment analysis for both artists honestly looks nearly identical, which is somewhat disappointing. I was hoping to see a reflection of their lyric styles within their reviewers’ tones.

Conclusion

I’m really happy with my results for the lyrical analysis, but I wish there was more of a relationship between the artists’ lyrics and the emotions expressed in the reviews of their listeners. I should have seen this coming, since I used REVIEW data, which is intended to just express people’s opinions of the music, whether that be positive or negative.

If I could do it again, I would use the same lyrical analysis, but I would try to find a different source for listener reactions. I might use something like YouTube comments, where fans are typing their thoughts. I would assume that there would be more of a relationship there.
