Introduction

Spotify is a digital music, podcast and video streaming service that gives you access to millions of songs and other content from artists all over the world. As of April 2019, Spotify had 217 million active users, including 100 million paying subscribers. Spotify pays royalties based on an artist’s streams as a proportion of total song streams. When it was released, Spotify completely changed the music industry for the better. The application tracks top artists by their overall number of streams, which is a reasonable proxy for how successful each artist is. With this information, I completed a sentiment analysis of the top 10 most-streamed artists on Spotify to see if the sentiment of their lyrics changed between their first and most recent albums.

Hypothesis

Does musical success have an effect on the sentiment of lyrics?

Methodology

In this project, the following libraries were used: tidyverse, tidytext, gridExtra, dplyr, genius, and wordcloud2.

In order to carry out the analysis, I needed information on the top streaming artists on Spotify. Rolling Stone published an article on October 10th, 2018 that referenced Spotify’s “Decade Of Discovery”. The pictograph in the article, created by Spotify, shows the most-streamed artists since the app was created. Below are the top 10 most-streamed artists of all time on Spotify, taken from that pictograph.

Ranking Artist
1 Drake
2 Ed Sheeran
3 Eminem
4 The Weeknd
5 Rihanna
6 Kanye West
7 Coldplay
8 Justin Bieber
9 Calvin Harris
10 Ariana Grande

Not all of these artists are from the US, although all of them are very popular there. Ed Sheeran, Coldplay and Calvin Harris are from the UK; Drake, The Weeknd and Justin Bieber are from Canada; and Rihanna is from Barbados. I was not able to find a list of Spotify’s most-streamed artists restricted to US artists only.


From this list, I compiled each artist’s first and most recent albums using Wikipedia’s discography pages. Next, I used the genius package to pull the lyrics for each album from Genius into R. I used the afinn lexicon for this project, which assigns each word an integer score between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment.
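To make the scoring step concrete, here is a minimal sketch of how a lexicon join attaches scores to words. The toy_lexicon table and its scores are made up for illustration and are not the real afinn values; in the actual analysis the lexicon comes from get_sentiments("afinn"). The column is named score to match the code in this post.

```r
library(dplyr)

# Hypothetical mini-lexicon standing in for afinn (scores are illustrative only)
toy_lexicon <- tribble(
  ~word,   ~score,
  "love",   3,
  "happy",  3,
  "lost",  -3,
  "hate",  -3
)

# Hypothetical word counts, shaped like the output of count(word, sort = TRUE)
word_counts <- tribble(
  ~word,   ~n,
  "love",  12,
  "night",  9,
  "lost",   4
)

# inner_join keeps only the words that appear in the lexicon,
# so "night" (which has no score) drops out
scored <- inner_join(word_counts, toy_lexicon, by = "word")
scored
```

Words without an entry in the lexicon are dropped by the join, which is why an album can end up with fewer scored words than it has frequent words.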


Below is the code for one chart, built from Drake’s first album. I started by creating a variable D1, which contains all of the lyrics from Drake’s album “Thank Me Later”.

#Loading the packages used in this chunk
library(tidyverse)
library(tidytext)
library(genius)

#Creating a variable called D1 which contains all of the lyrics from Drake's album "Thank Me Later"
D1 <- genius_album(artist = "Drake", album = "Thank Me Later")

#Using the piping operator to create a count of the most popular words in the album, with stop words removed
D1Count <- D1 %>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words, by = "word") %>% 
  count(word, sort = TRUE)

#Joining the afinn sentiment scores onto the word counts
#Note: newer versions of tidytext name the afinn score column "value" instead of "score"
D1Sentiment <- D1Count %>% 
  inner_join(get_sentiments("afinn"), by = "word")

#Creating a subset of the data that includes the top 20 most popular scored words
D1Sentiment2 <- D1Sentiment %>% head(20)

#Creating a column called color that uses an ifelse statement to color the sentiment score red if it is below 0 and green otherwise
D1Sentiment2$color <- ifelse(D1Sentiment2$score < 0, "red", "green")

#Creating a bar graph that shows each sentiment. The "color=color" in the ggplot() and scale_color_identity() are what allow the graph to color by red and green based off of the ifelse statement.
Drake1 <- ggplot(D1Sentiment2, aes(reorder(word, -n), score, color=color)) + geom_col(fill="white") + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title="Drake - Thank Me Later", x="Top 20 Most Popular Words", y="Sentiment Score") +
  theme(plot.title = element_text(size=15,hjust = 0.5)) +
  scale_color_identity()

Drake1

I repeated this process for all 10 artists, running the code twice per artist: once for the first album and once for the most recent album.
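Because the same steps run 20 times, they could also be wrapped in a function. The sketch below is not the code I used; top_sentiment_words is a hypothetical helper, and the lyrics and lexicon are toy data standing in for the output of genius_album() and get_sentiments("afinn"). It assumes the lexicon has a score column, as in the code above.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper: takes a data frame with a `lyric` column and a lexicon
# with `word` and `score` columns, and returns the n_words most frequent scored words
top_sentiment_words <- function(lyrics, lexicon, n_words = 20) {
  lyrics %>%
    unnest_tokens(word, lyric) %>%              # one row per word
    anti_join(stop_words, by = "word") %>%      # drop common stop words
    count(word, sort = TRUE) %>%                # most frequent words first
    inner_join(lexicon, by = "word") %>%        # keep only words with a score
    head(n_words) %>%
    mutate(color = ifelse(score < 0, "red", "green"))
}

# Toy input: made-up lyrics and made-up scores, for illustration only
toy_lyrics  <- tibble(lyric = c("love love love", "lost and hate", "love is lost"))
toy_lexicon <- tibble(word = c("love", "lost", "hate"), score = c(3, -3, -3))

res <- top_sentiment_words(toy_lyrics, toy_lexicon)
res
```

With a helper like this, each album would need only one call, and the returned data frame could go straight into the ggplot() code.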





Results

For viewing purposes, I used the gridExtra package to group the artists by genre and examine the sentiments for each. Each artist has two bar graphs: the top graph represents the artist’s first album and the bottom graph represents the artist’s most recent album. I also created word clouds for each genre to visualize the most used words in each artist’s first and most recent albums. To do this, I created a data frame with the word counts for each artist’s first and most recent albums and used it to build the word clouds. Note that because words like “love” and “yeah” were popular across all genres, they were excluded from every wordcloud.
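The word cloud preparation can be sketched roughly as follows. The counts are made up, and summing the counts across albums is an assumption about how the data frame was combined, not a record of the exact steps.

```r
library(dplyr)

# Hypothetical word counts for a first and a most recent album
first_album  <- tibble(word = c("love", "yeah", "heart", "hold"), n = c(30, 20, 8, 5))
recent_album <- tibble(word = c("love", "heart", "life"),         n = c(25, 6, 9))

# Words left out of every cloud because they dominate all genres
excluded <- c("love", "yeah")

cloud_data <- bind_rows(first_album, recent_album) %>%
  filter(!word %in% excluded) %>%   # drop the excluded words
  group_by(word) %>%
  summarise(freq = sum(n)) %>%      # total count per word across both albums
  arrange(desc(freq))

cloud_data

# With the wordcloud2 package installed, the cloud itself could be drawn with:
# wordcloud2::wordcloud2(cloud_data)
```

wordcloud2() reads the word from the first column and the frequency from the second, so the column order of cloud_data matters.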

Pop Music

Included in this genre are three pop artists - Ed Sheeran, Justin Bieber and Ariana Grande.

Ed Sheeran’s first album, “+”, came out in 2011. The sentiment analysis shows its most popular words evenly split between positive and negative. His most recent album, “Divide”, however, has noticeably more words with a positive sentiment score. Justin Bieber and Ariana Grande have similar patterns, with similar positive and negative ratings for both of their albums. It is interesting to note that for all 6 albums in this genre, the most popular word had a positive sentiment rating. Besides “love”, the most popular words in the pop category are “hold”, “wanna”, “life” and “heart”.



Contemporary R&B

Drake, The Weeknd, and Rihanna were the artists included in this genre.

The change in sentiment for this genre is the most glaring of all the genres. The Weeknd shows a stark shift between positive and negative sentiments over time. His first album, “Kiss Land”, came out in 2013. Among the top words throughout the album are love, sh*t, diamond, leave and die, and the most popular words are split between positive and negative sentiment. However, his most recent album, “Starboy”, which came out in 2016, had mostly negative sentiment ratings: 16 of the top 20 words had a negative score. Drake and Rihanna follow a similar pattern, though it is less pronounced. All three artists go from fairly even sentiments to predominantly negative sentiments across their most popular words. Additionally, the wordcloud shows that “life” was the most popular word across all 6 albums.
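A tally like the number of negative words among the top 20 can be read off the bar graph or computed directly. The sketch below uses a made-up five-word table rather than the real Starboy data.

```r
library(dplyr)

# Hypothetical top scored words for one album (words and scores are illustrative)
top_words <- tibble(
  word  = c("love", "die", "leave", "pretty", "pain"),
  score = c(3, -3, -1, 1, -2)
)

# Label each word by the sign of its score, then count each label
polarity_counts <- top_words %>%
  mutate(polarity = ifelse(score < 0, "negative", "positive")) %>%
  count(polarity)

polarity_counts
```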



Hip Hop

Included in this group were rappers Eminem and Kanye West.

This category had the most negative sentiment ratings across all albums examined. Eminem’s and Kanye’s first albums were released about 5 years apart, and both had a roughly even number of positive and negative words. Their most recent albums both have considerably more negative words. It also appears that the afinn scores of the negative words are considerably lower in the most recent albums. This makes sense because the most popular words in this genre include many swear words, which afinn tends to score as strongly negative. According to the wordcloud, “life” is even more popular for Hip Hop than for Contemporary R&B.



Alternative Rock

Coldplay was the only artist in the alternative rock category.

Coldplay’s sentiments were evenly spread for their first album, “Parachutes”, which was released in 2000. Among the most popular words were “lost”, “beautiful”, “wrong”, “confidence” and “hide”, which I thought was interesting because the words were more varied compared to the other genres examined. Their most recent album, “A Head Full of Dreams”, has a greater number of positive words, and the scores of those positive words are also noticeably higher. I also thought it was interesting that “live” was the most popular word for this artist, compared to “life” in the previous two genres. The distinguishing factor between “life” and “live” is time: to live represents living in the present, while life can refer to the past, present and/or future, and is a bit more reflective. Overall, Coldplay has a more positive sentiment than the artists in the Contemporary R&B and Hip Hop genres.



EDM

Calvin Harris was the only artist in the electronic dance music category.

First, it’s important to note that there were only 12 identifiable positive and negative words for Calvin Harris’s first album, “I Created Disco”. To keep the comparison consistent, I examined only the top 12 words for both “I Created Disco” and “Funk Wav Bounces Vol. 1”, his most recent album. Even though only 12 words were evaluated, there was a fairly even spread of positive and negative words in each album. The most popular words themselves were fascinating: “80s” and “rock” were both very popular in Calvin Harris’s albums.







A Deeper Look Into Ed Sheeran and The Weeknd

Ed Sheeran and The Weeknd showed the largest differences between their first and most recent albums. I decided to look deeper into their albums to see if the lyric sentiments changed steadily over time or varied from album to album. To do so, I completed the same steps as above for each of their albums.

Ed Sheeran


To date, Ed Sheeran has three albums out: + (2011), x (2014), and Divide (2017). When I completed a sentiment analysis for his second album, ‘x’, I was surprised by the results. The most popular words in his second album are actually more negative than those in his first. As seen previously, the most popular words in his most recent album, ‘Divide’, are predominantly positive. The variability in sentiments across all three albums is quite interesting. Ed Sheeran once said that each song he writes is either an expression of himself as an artist or a form of therapy. After taking a deeper look into the songs on his second album, they sound more depressing compared to the songs in + and Divide. For example, there are songs titled “Runaway”, “Make It Rain”, and “I’m a Mess”. This could explain the predominantly negative sentiment.

## # A tibble: 814 x 2
##    word          n
##    <chr>     <int>
##  1 love         85
##  2 ye           75
##  3 baby         39
##  4 girl         22
##  5 barcelona    20
##  6 yeah         20
##  7 day          18
##  8 time         18
##  9 wanna        18
## 10 body         17
## # … with 804 more rows





The Weeknd

The Weeknd also has three albums: Kiss Land (2013), Beauty Behind the Madness (2015), and Starboy (2016). The sentiment analysis for his second album was similar to the first. His most recent album, Starboy, however, has predominantly negative sentiments. I decided to read more about Starboy. One article said that the album created a genre of its own, and many critics said that The Weeknd took more risks with his sound on it. It could be that The Weeknd was trying to explore a different side of himself as an artist through this album, which in turn produced more negative sentiment scores than his first two.
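As a rough sketch of how an album-by-album comparison could be condensed into a few numbers, the code below summarises each album by its share of negative words and by a frequency-weighted average score. This is not something I computed in the analysis above; the album labels, words, counts and scores are all made up.

```r
library(dplyr)

# Hypothetical scored word counts for three albums by one artist
albums <- tribble(
  ~album,   ~word,  ~n, ~score,
  "first",  "love", 10,  3,
  "first",  "lost", 10, -3,
  "second", "love",  5,  3,
  "second", "mess", 15, -2,
  "third",  "love", 20,  3,
  "third",  "lost",  5, -3
)

album_trend <- albums %>%
  group_by(album) %>%
  summarise(
    share_negative = mean(score < 0),          # fraction of scored words that are negative
    weighted_score = weighted.mean(score, n)   # average score weighted by word frequency
  )

album_trend
```

The weighted score lets a word that appears 20 times count for more than a word that appears 5 times, which the plain count of positive and negative words ignores. With real data, a release year column would be needed to keep the albums in chronological order, since group_by() sorts by album name.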





Conclusion

One similarity among all genres and albums is that words like “love” and “yeah” are among the most popular words. The most interesting remaining question is why some artists, such as The Weeknd, go from a fairly even sentiment in their first album to mostly negative words in their most recent album. Eminem and Kanye West show the same pattern. The Weeknd falls under the Contemporary R&B genre, while Eminem and Kanye fall under the Hip Hop genre. I would consider these genres more “hard-core” compared to the rest, so this pattern may be typical of these types of genres.

Similarly, why do the most popular words in some artists’ most recent albums have higher sentiment scores? Ed Sheeran and Coldplay show this pattern. Ed Sheeran falls under the Pop genre and Coldplay falls under the Alternative Rock genre. Both of these genres are more light-hearted compared to Contemporary R&B and Hip Hop.

Examining Ed Sheeran and The Weeknd further also produced interesting results. Sheeran’s sentiment scores varied widely from album to album. The Weeknd, on the other hand, had similar sentiment scores for his first two albums, and then predominantly negative sentiment scores for his most recent album.

So, does musical success have an effect on the sentiment of lyrics? Based on the afinn sentiment analysis, most artists, regardless of genre, started out with a fairly even sentiment in their first album. By the most recent album, the sentiments of the most popular words had become more positive for some artists, more negative for others, and stayed similar to the first album for the rest. In other words, for some artists, there seems to be a relationship between time and sentiment scores. This leads me to believe that the artists whose most recent albums were markedly more positive or more negative were influenced by something at some point in their career, although this analysis cannot show that success was the cause.





Future Research

I think there is a lot more to explore in this topic. First and foremost, with more time it would be interesting to complete a more in-depth analysis of every album for each artist to see how the sentiments changed from album to album. I did this for Ed Sheeran and The Weeknd because their first and most recent albums differed the most compared to the other artists. Looking at every album for each top artist would give a fuller picture of what happens over time. Additionally, I think using a different lexicon for the sentiment analysis would produce different results.