Hypotheses

For my final project, I wanted to look at how music has changed over time. My hypothesis is that over the years, music has become more energetic, louder, and more danceable.

Data

The data comes from Kaggle and covers the top Spotify songs from 2010 to 2019, categorized by year. The data set has the name and artist of each song and includes information about the genre of music, the bpm (beats per minute), energy, danceability, dB (loudness), and liveness (how likely it is that the song was a live recording).

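For energy, danceability, and liveness, a higher number means the song is more energetic, easier to dance to, or more likely to have been recorded live. Loudness is measured in dB; the values are negative, and a value closer to 0 means a louder song.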

The data set can be found here: https://www.kaggle.com/leonardopena/top-spotify-songs-from-20102019-by-year.

Danceability

As we can see from the graph and the trendline, danceability is increasing over time, though only very slightly. We can also see that each year has a few outliers with low danceability. This aligns with my hypothesis and is what I expected to see.
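A chart like the one described above could be drawn in R roughly as follows. This is a minimal sketch, not necessarily how the original graph was made. The data frame `songs_subset` is a hypothetical name, and its six rows are an illustrative subset copied from the ranking tables later in this post, so the line it draws says nothing about the trend in the full data set.

```r
library(ggplot2)

# Illustrative subset only: six (year, dnce) pairs copied from the ranking
# tables later in this post. Swap in the full data set for a real analysis.
songs_subset <- data.frame(
  year = c(2010, 2012, 2014, 2015, 2017, 2018),
  dnce = c(67, 72, 96, 58, 93, 97)
)

# Scatter plot of danceability by year with a straight least-squares trendline.
dance_plot <- ggplot(songs_subset, aes(x = year, y = dnce)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Year", y = "Danceability")

dance_plot
```

Setting `se = FALSE` hides the confidence band; leaving it on would show how uncertain a slight slope like this one is.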

Energy

It was unexpected to see that energy was actually decreasing over time, especially after finding that danceability was increasing, because I assumed those two would go hand in hand.

BPM

BPM was decreasing over time ever so slightly, which I could only tell by looking at the equation for the trendline. I didn’t really have any expectations as to how BPM might be changing over time, but it was interesting to look at.
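When a trend is too slight to see on the chart, the slope can be read directly from a fitted line. The sketch below assumes a simple linear model of BPM on year; `songs_subset`, `fit`, and `slope` are hypothetical names, and the six rows are an illustrative subset copied from the ranking tables later in this post, not the full data set.

```r
# Illustrative subset only: six (year, bpm) pairs copied from the ranking
# tables later in this post.
songs_subset <- data.frame(
  year = c(2010, 2012, 2014, 2015, 2017, 2018),
  bpm  = c(128, 127, 130, 206, 125, 121)
)

# Fit bpm = intercept + slope * year by ordinary least squares.
fit <- lm(bpm ~ year, data = songs_subset)

# The coefficient on year is the average change in BPM per year:
# negative means BPM is falling, positive means it is rising.
slope <- coef(fit)[["year"]]
slope
```

Running the same two lines on the full data set would give the slope that appears in the trendline equation.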

Loudness

Another surprise came with the results for loudness. This was another variable I expected to increase over time, but it did not.

Comparing Multiple Variables

Next, I wanted to look at whether these variables were correlated with each other.

Energy and Danceability

We can see that energy and danceability have a positive correlation, which makes sense. However, it is interesting that these two had opposite trends over time even though they are positively correlated.

BPM and Danceability

Next, I wanted to see if BPM had an impact on energy and danceability. I initially thought that songs with a midrange BPM would be easier to dance to, because a danceable song shouldn’t be too fast or too slow. Although BPM and danceability are negatively correlated overall, we can see some high outliers in danceability around 90 to 130 BPM, which fits my original theory.
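One way to check the midrange idea more directly is to group songs into tempo bands and compare average danceability per band. This is only a sketch: the band edges (90 and 130 BPM) come from the range mentioned above, the names `songs_subset`, `tempo`, and `by_tempo` are hypothetical, and the seven rows are an illustrative subset copied from the ranking tables later in this post.

```r
library(dplyr)

# Illustrative subset only: seven (bpm, dnce) pairs copied from the ranking
# tables later in this post.
songs_subset <- tibble(
  bpm  = c(91, 95, 95, 121, 130, 148, 206),
  dnce = c(64, 92, 91, 97, 97, 68, 58)
)

by_tempo <- songs_subset %>%
  # cut() uses right-closed intervals, so with whole-number BPM values these
  # breaks give: 89 or lower, 90 through 130, and 131 or higher.
  mutate(tempo = cut(bpm,
                     breaks = c(-Inf, 89, 130, Inf),
                     labels = c("below 90", "90 to 130", "above 130"))) %>%
  group_by(tempo) %>%
  summarise(mean_dnce = mean(dnce), n = n())

by_tempo
```

Bands with no songs are dropped from the result, so this small subset returns only two rows.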

BPM and Energy

I expected high-BPM songs to have high energy, and the data showed that positive correlation. However, it is interesting that BPM and danceability had a negative correlation even though BPM and energy are positively correlated and energy and danceability are positively correlated.
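This pattern is less contradictory than it looks, because correlation is not transitive: two variables can each be positively correlated with a third and still be negatively correlated with each other. All three pairwise correlations can be computed at once with base R's `cor()`. In the sketch below, `songs_subset` and `cor_matrix` are hypothetical names, and the eight rows are an illustrative subset copied from the ranking tables later in this post, so the numbers will not match the full data set.

```r
# Illustrative subset only: eight (bpm, nrgy, dnce) rows copied from the
# ranking tables later in this post.
songs_subset <- data.frame(
  bpm  = c(206, 202, 201, 192, 128, 148, 121, 130),
  nrgy = c(27, 39, 95, 77, 98, 98, 41, 59),
  dnce = c(58, 48, 36, 35, 67, 68, 97, 97)
)

# Pearson correlation for every pair of columns. The result is a symmetric
# 3 x 3 matrix with 1s on the diagonal.
cor_matrix <- cor(songs_subset)
round(cor_matrix, 2)
```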

Ranking Individual Songs

Next, I wanted to see if I could notice any other trends in music across the years. First, I loaded all my R packages.

library(tidyverse)
library(tidytext)
library(ggthemes)
library(readxl)

Then I loaded the data into R.

top_spotify_songs <- read_excel("~/Desktop/Final Project Data.xlsx")

I wanted to create tables of each of the variables to see if there were any songs or artists that consistently ranked highly for each variable.

Ranking BPM

top_spotify_songs %>%
  arrange(desc(bpm)) %>%
  mutate(rank = row_number()) %>%
  head(10) %>%
  knitr::kable()

| title | artist | top genre | year | bpm | nrgy | dnce | dB | live | val | dur | acous | spch | pop | rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FourFiveSeconds | Rihanna | barbadian pop | 2015 | 206 | 27 | 58 | -6 | 13 | 35 | 188 | 88 | 5 | 80 | 1 |
| L.A.LOVE (la la) | Fergie | dance pop | 2015 | 202 | 39 | 48 | -8 | 26 | 27 | 193 | 2 | 9 | 0 | 2 |
| How Ya Doin’? (feat. Missy Elliott) | Little Mix | dance pop | 2013 | 201 | 95 | 36 | -3 | 37 | 51 | 211 | 9 | 48 | 50 | 3 |
| Shot Me Down (feat. Skylar Grey) - Radio Edit | David Guetta | dance pop | 2014 | 192 | 77 | 35 | -4 | 12 | 4 | 191 | 6 | 5 | 61 | 4 |
| I’ll Show You | Justin Bieber | canadian pop | 2015 | 192 | 61 | 36 | -7 | 18 | 8 | 200 | 5 | 10 | 68 | 5 |
| The Greatest | Sia | australian dance | 2017 | 192 | 73 | 67 | -6 | 6 | 73 | 210 | 1 | 27 | 76 | 6 |
| Love Me Like You Do - From “Fifty Shades Of Grey” | Ellie Goulding | dance pop | 2015 | 190 | 61 | 26 | -7 | 13 | 28 | 253 | 25 | 5 | 79 | 7 |
| Animals | Maroon 5 | pop | 2015 | 190 | 74 | 28 | -6 | 59 | 33 | 231 | 0 | 9 | 76 | 8 |
| Chained To The Rhythm | Katy Perry | dance pop | 2017 | 190 | 80 | 45 | -5 | 20 | 47 | 238 | 8 | 17 | 72 | 9 |
| Whataya Want from Me | Adam Lambert | australian pop | 2010 | 186 | 68 | 44 | -5 | 6 | 45 | 227 | 1 | 5 | 66 | 10 |

Ranking Energy

top_spotify_songs %>%
  arrange(desc(nrgy)) %>%
  mutate(rank = row_number()) %>%
  head(10) %>%
  knitr::kable()

| title | artist | top genre | year | bpm | nrgy | dnce | dB | live | val | dur | acous | spch | pop | rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hello | Martin Solveig | big room | 2010 | 128 | 98 | 67 | -3 | 10 | 45 | 191 | 1 | 3 | 0 | 1 |
| Pom Poms | Jonas Brothers | boy band | 2013 | 148 | 98 | 68 | -2 | 28 | 90 | 198 | 7 | 9 | 52 | 2 |
| Don’t Stop the Party (feat. TJR) | Pitbull | dance pop | 2012 | 127 | 96 | 72 | -4 | 38 | 95 | 206 | 1 | 9 | 59 | 3 |
| Rock N Roll | Avril Lavigne | canadian pop | 2013 | 184 | 96 | 47 | -3 | 34 | 67 | 207 | 1 | 13 | 61 | 4 |
| All The Right Moves | OneRepublic | dance pop | 2010 | 146 | 95 | 53 | -4 | 28 | 65 | 238 | 26 | 5 | 65 | 5 |
| Written in the Stars (feat. Eric Turner) | Tinie Tempah | dance pop | 2010 | 91 | 95 | 64 | -4 | 18 | 57 | 220 | 6 | 7 | 52 | 6 |
| Written in the Stars (feat. Eric Turner) | Tinie Tempah | dance pop | 2011 | 91 | 95 | 64 | -4 | 18 | 57 | 220 | 6 | 7 | 52 | 7 |
| How Ya Doin’? (feat. Missy Elliott) | Little Mix | dance pop | 2013 | 201 | 95 | 36 | -3 | 37 | 51 | 211 | 9 | 48 | 50 | 8 |
| She Looks So Perfect | 5 Seconds of Summer | boy band | 2014 | 160 | 95 | 49 | -4 | 33 | 44 | 202 | 0 | 13 | 71 | 9 |
| Booty | Jennifer Lopez | dance pop | 2015 | 129 | 95 | 71 | -4 | 26 | 40 | 210 | 0 | 5 | 64 | 10 |

Ranking Danceability

top_spotify_songs %>%
  arrange(desc(dnce)) %>%
  mutate(rank = row_number()) %>%
  head(10) %>%
  knitr::kable()
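
| title | artist | top genre | year | bpm | nrgy | dnce | dB | live | val | dur | acous | spch | pop | rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bad Liar | Selena Gomez | dance pop | 2018 | 121 | 41 | 97 | -6 | 8 | 73 | 215 | 19 | 7 | 75 | 1 |
| Drip (feat. Migos) | Cardi B | pop | 2018 | 130 | 59 | 97 | -8 | 8 | 27 | 264 | 5 | 26 | 45 | 2 |
| Anaconda | Nicki Minaj | dance pop | 2014 | 130 | 60 | 96 | -6 | 21 | 65 | 260 | 7 | 18 | 50 | 3 |
| Come Get It Bae | Pharrell Williams | dance pop | 2014 | 120 | 80 | 93 | -6 | 10 | 90 | 202 | 27 | 8 | 59 | 4 |
| Me Too | Meghan Trainor | dance pop | 2016 | 124 | 69 | 93 | -6 | 48 | 84 | 181 | 10 | 10 | 73 | 5 |
| WTF (Where They From) | Missy Elliott | dance pop | 2016 | 120 | 82 | 93 | -3 | 6 | 56 | 193 | 2 | 20 | 58 | 6 |
| Bodak Yellow | Cardi B | pop | 2017 | 125 | 72 | 93 | -6 | 35 | 46 | 224 | 7 | 11 | 70 | 7 |
| Lemon | N.E.R.D | hip hop | 2018 | 95 | 73 | 92 | -7 | 12 | 20 | 220 | 0 | 9 | 68 | 8 |
| Fancy | Iggy Azalea | australian hip hop | 2014 | 95 | 72 | 91 | -4 | 5 | 38 | 200 | 9 | 7 | 70 | 9 |
| Dangerous | Jennifer Hudson | dance pop | 2015 | 109 | 53 | 90 | -5 | 8 | 65 | 255 | 0 | 5 | 18 | 10 |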

Ranking Loudness

top_spotify_songs %>%
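  # Note: this sorts by `val` (the data set's valence score), not by loudness.
  # The loudness column described in the Data section is `dB`.
  arrange(desc(val)) %>%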
  mutate(rank = row_number()) %>%
  head(10) %>%
  knitr::kable()

| title | artist | top genre | year | bpm | nrgy | dnce | dB | live | val | dur | acous | spch | pop | rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mmm Yeah (feat. Pitbull) | Austin Mahone | dance pop | 2014 | 126 | 92 | 71 | -4 | 27 | 98 | 232 | 0 | 4 | 65 | 1 |
| There’s Nothing Holdin’ Me Back | Shawn Mendes | canadian pop | 2018 | 122 | 81 | 87 | -4 | 8 | 97 | 199 | 38 | 6 | 84 | 2 |
| Happy - From “Despicable Me 2” | Pharrell Williams | dance pop | 2014 | 160 | 82 | 65 | -5 | 9 | 96 | 233 | 22 | 18 | 79 | 3 |
| All About That Bass | Meghan Trainor | dance pop | 2015 | 134 | 88 | 81 | -4 | 11 | 96 | 189 | 5 | 5 | 65 | 4 |
| Don’t Stop the Party (feat. TJR) | Pitbull | dance pop | 2012 | 127 | 96 | 72 | -4 | 38 | 95 | 206 | 1 | 9 | 59 | 5 |
| Lips Are Movin | Meghan Trainor | dance pop | 2015 | 139 | 83 | 78 | -5 | 11 | 95 | 183 | 5 | 5 | 68 | 6 |
| Sucker | Jonas Brothers | boy band | 2019 | 138 | 73 | 84 | -5 | 11 | 95 | 181 | 4 | 6 | 86 | 7 |
| Shake It Off | Taylor Swift | pop | 2014 | 160 | 80 | 65 | -5 | 33 | 94 | 219 | 6 | 17 | 78 | 8 |
| Treasure | Bruno Mars | pop | 2014 | 116 | 69 | 87 | -5 | 32 | 94 | 179 | 4 | 4 | 77 | 9 |
| Sing | Ed Sheeran | pop | 2015 | 120 | 67 | 82 | -4 | 6 | 94 | 235 | 30 | 5 | 71 | 10 |
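
The table above is ordered by the `val` column rather than by `dB`, which is the column the Data section describes as loudness. A ranking by loudness itself might look like the sketch below. It is illustrative only: `songs_subset` and `loudest` are hypothetical names, and the six rows are copied from the tables in this post, so the result is not the true top 10 for the full data set.

```r
library(dplyr)

# Illustrative subset only: six rows copied from the tables in this post.
songs_subset <- tibble(
  title  = c("Pom Poms", "Hello", "Bad Liar",
             "Drip (feat. Migos)", "L.A.LOVE (la la)", "Me Too"),
  artist = c("Jonas Brothers", "Martin Solveig", "Selena Gomez",
             "Cardi B", "Fergie", "Meghan Trainor"),
  dB     = c(-2, -3, -6, -8, -8, -6)
)

# dB values here are negative, and values closer to 0 are louder, so sorting
# in descending order puts the loudest songs first.
loudest <- songs_subset %>%
  arrange(desc(dB)) %>%
  mutate(rank = row_number()) %>%
  head(3)

loudest
```

On the full data set, replacing `head(3)` with `head(10)` would give a table comparable to the other rankings.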

There were a few overlaps between songs and artists across the variables. Little Mix’s song ‘How Ya Doin’? (feat. Missy Elliott)’ ranked in the top 10 for both BPM and energy. Several artists who ranked highly in danceability, such as Pharrell Williams and Meghan Trainor, also ranked highly in the loudness table (which the code above sorts by the `val` column), though for different songs. Meghan Trainor appeared most often of any artist, with three top 10 entries across the tables.
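Overlaps like these can also be counted in code instead of by eye. The sketch below stacks the variables into one long table, keeps the top songs for each variable, and counts how often each artist appears. The names `songs_subset` and `appearances` are hypothetical, the cutoff of two songs per variable is chosen only to keep the example small, and the six rows are an illustrative subset copied from the tables above.

```r
library(dplyr)
library(tidyr)

# Illustrative subset only: six rows copied from the tables above.
songs_subset <- tibble(
  artist = c("Rihanna", "Little Mix", "Martin Solveig",
             "Selena Gomez", "Cardi B", "Meghan Trainor"),
  bpm    = c(206, 201, 128, 121, 130, 124),
  nrgy   = c(27, 95, 98, 41, 59, 69),
  dnce   = c(58, 36, 67, 97, 97, 93)
)

appearances <- songs_subset %>%
  # One row per song per variable.
  pivot_longer(c(bpm, nrgy, dnce), names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  # Keep the two highest-scoring songs for each variable.
  slice_max(value, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  # Count how many top lists each artist appears in.
  count(artist, sort = TRUE)

appearances
```

In this subset Little Mix appears twice, once for BPM and once for energy, which matches the overlap noted above.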

When looking at these four tables, I noticed that an overwhelming number of the ranking songs were in the genre of ‘dance pop.’ I wanted to see how popular that genre was compared to other genres in the top Spotify songs.

top_spotify_songs %>% 
  count(`top genre`, sort = TRUE) %>%
  head(10) %>%
  knitr::kable()

| top genre | n |
|---|---|
| dance pop | 327 |
| pop | 60 |
| canadian pop | 34 |
| barbadian pop | 15 |
| boy band | 15 |
| electropop | 13 |
| british soul | 11 |
| big room | 10 |
| canadian contemporary r&b | 9 |
| neo mellow | 9 |

Dance pop is clearly the most popular genre among the top Spotify songs of the past 10 years. Let’s also graph the popularity of dance pop over the years to see if there are any trends across time. First, let’s make a table.

# Count dance pop songs per year, most frequent year first
top_spotify_songs %>%
  filter(`top genre` == "dance pop") %>%
  count(year, sort = TRUE) %>%
  knitr::kable()

| year | n |
|---|---|
| 2015 | 52 |
| 2016 | 46 |
| 2013 | 42 |
| 2011 | 38 |
| 2018 | 38 |
| 2010 | 31 |
| 2017 | 31 |
| 2014 | 27 |
| 2012 | 15 |
| 2019 | 7 |

Now, let’s create a graph.

# Bar chart of the number of dance pop songs in each year
top_spotify_songs %>%
  filter(`top genre` == "dance pop") %>%
  count(year) %>%
  ggplot(aes(year, n)) +
  geom_col()

Though dance pop was the most popular genre overall, it fluctuated in popularity over the last decade. The biggest spike was in 2015 with 52 ranking songs, and there was a severe drop off in 2019 with only 7 ranking songs.
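Raw counts can be misleading if the data set does not contain the same number of songs for every year, so it may be worth comparing dance pop's share of each year's songs as well. The sketch below shows one way to do that. It is illustrative only: `songs_subset` and `dance_pop_share` are hypothetical names, the `genre` column stands in for the data set's `top genre` column, and the ten rows are copied from the ranking tables above, so the shares are not the real yearly figures.

```r
library(dplyr)

# Illustrative subset only: ten (year, genre) pairs copied from the ranking
# tables above. In the real data the genre column is named `top genre`.
songs_subset <- tibble(
  year  = c(2010, 2010, 2010, 2013, 2013, 2013, 2015, 2015, 2015, 2015),
  genre = c("big room", "dance pop", "dance pop",
            "dance pop", "boy band", "canadian pop",
            "barbadian pop", "dance pop", "canadian pop", "pop")
)

dance_pop_share <- songs_subset %>%
  group_by(year) %>%
  summarise(
    n_songs     = n(),                       # all songs listed for the year
    n_dance_pop = sum(genre == "dance pop"), # dance pop songs for the year
    share       = n_dance_pop / n_songs      # fraction that is dance pop
  )

dance_pop_share
```

If the share for 2019 stays close to that of earlier years in the full data set, the drop to 7 songs would reflect a smaller sample for that year rather than a change in taste.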

Conclusion

We can tell that over the last 10 years, dance pop has been the most popular genre overall, and that several artists rank highly on more than one of the measured variables. The graphs of the variables over time did not always match my hypothesis: danceability rose slightly as expected, but energy and BPM decreased and loudness did not increase. Though we do have a lot of data for this 10-year time frame, a further opportunity for analysis would be to look at top Spotify songs over a longer period to see whether the trends stay consistent or fluctuate more. A longer time frame would likely give a better picture of trends over time.