From the Robot to the Dougie, and everything in between, dancing is a huge part of music culture. But what truly makes the perfect party song? Dance music seems to be constantly evolving and showing more of the influence of music technology. This project explores how dance music changed from 1980 to 2019 by analyzing Billboard music’s weekly Dance/Electronic chart. By the end of this analysis, you will know the most important factors in dance music and be able to curate the perfect party playlist.
This is an exploratory analysis of dance music. My preliminary assumption about the evolution of this industry is that there are more male artists on the top charts in recent decades, given the explosion in popularity of techno and EDM (a micro-industry dominated by men). I also predict that dance music has become more profane, especially considering the influence of the rap and R&B sectors on dance music, and that the introduction of music streaming has increased the size of the industry, allowing more artists to make the charts. As this is an exploratory project, I will look into a number of factors that may or may not have changed through the 4 decades under analysis.
R: tidyverse, ggplot2, dplyr, devtools, spotifyr, knitr, tidytext, textdata, ggthemes, lubridate, gridExtra, readr, radarchart, and fmsb
Spotify: developer account
Soundiiz: account
Tableau: Desktop and Public
library(tidyverse)
library(ggplot2)
library(dplyr)
library(devtools)
devtools::install_github('charlie86/spotifyr')
library(spotifyr)
library(knitr)
library(tidytext)
library(textdata)
library(ggthemes)
library(lubridate)
library(gridExtra)
library(readr)
knitr::opts_chunk$set(echo = FALSE)
To start this project, I collected data from the Billboard music site. The website only allows you to view the weekly top songs one at a time, so I took the data from a corresponding Wikipedia page that presented all top dance hits by year. While the data goes back to 1974, I decided to start with 1980 to have 4 full decades to research that are all of equal length. After cleaning this data, I also manually added the gender of the artist or group and where they were born in order to explore more facets of this topic.
From there, I created a Spotify developer account in order to access the API. To use this API, I needed to get all 1,639 songs from the Billboard charts onto a Spotify playlist. I used an application called Soundiiz to automatically add songs to a playlist. Only about 50% of the songs were recognized by the app, so I entered the other half manually. 72 of the songs were not listed on Spotify, but certain weeks had multiple number one hits. This resulted in a playlist of 1,592 songs.
# Credentials redacted: supply your own values, ideally via ~/.Renviron rather than in the script
Sys.setenv(SPOTIFY_CLIENT_ID = '<your-client-id>')
Sys.setenv(SPOTIFY_CLIENT_SECRET = '<your-client-secret>')
access_token <- get_spotify_access_token()
knitr::opts_chunk$set(echo = FALSE)
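Since the Spotify credentials are read from environment variables, it can help to fail early when one is missing. The following is a minimal sketch, not part of the original analysis; `require_env` is a hypothetical helper, and it assumes the two variables are defined outside the script (for example in `~/.Renviron`).

```r
# Hypothetical helper (not part of spotifyr): stop with a clear message
# if an environment variable is unset or empty.
require_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  invisible(value)
}

# Possible use before requesting a token:
# require_env("SPOTIFY_CLIENT_ID")
# require_env("SPOTIFY_CLIENT_SECRET")
# access_token <- spotifyr::get_spotify_access_token()
```

Keeping the values out of the script also means the report can be shared without exposing the client secret.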
Six data sources were referenced in the preliminary section of this analysis.
read_csv("~/Downloads/AllSongsDates.csv") -> all_songs_dates
read_csv("~/Downloads/AllSongs1.csv") -> all_songs
read_csv("~/Desktop/Folder/dance_hits_decade.csv") -> dance_hits_decade
read_csv("~/Desktop/Folder/key_decade.csv") -> key_decade
get_playlist_audio_features("All Dance Hits", "1M1iOJsMDGkrWjMrgPip81") -> dance_hits
my_lists <- get_user_playlists("alexkelly1112")
First, I wanted to explore how long, on average, songs stayed at the top of the charts. As seen in the graph below, songs lasted between 1 and 11 weeks. Eleven, the highest number of weeks for one song to stay at the top of the Billboard music dance chart, was for the Michael Jackson hit “Thriller”. It is no surprise that most songs only lasted 1 week on top, whereas fewer lasted multiple weeks. It appears as though the turning point is 3 weeks.
all_songs %>%
group_by(`Weeks on Top`) %>%
summarise(counts = n()) -> top_charts
top_charts %>%
ggplot(aes(`Weeks on Top`, counts)) + geom_line() +
xlab("Number of Weeks on the Chart") +
ylab("Number of Songs") +
ggtitle("How Long Songs Stay on Top") +
theme_igray() +
scale_colour_tableau('Classic Cyclic')
More importantly, I wanted to see how this metric was changing over time. Given the natural turning point of 3 weeks, I decided to only visualize songs lasting on the top chart for more than 2 weeks. As seen here, the 80’s had the most songs over the 2-week threshold, as well as the highest number of weeks on the chart. We can also see that the graph cuts off at 2005. This is because there is no song from 2006-2019 that has lasted on the chart for more than 2 weeks. This shows us that songs are not lasting on the charts as long as they did previously.
all_songs %>%
filter(`Weeks on Top` > 2) -> top_hits
top_hits %>%
ggplot(aes(Year, `Weeks on Top`, color = Decade)) + geom_point() +
xlab("Year") +
ylab("Number of Weeks on the Chart") +
ggtitle("How Long Songs Stay on Top by Year") +
theme_igray() +
scale_colour_tableau('Classic Cyclic', breaks=c("The 80s","The 90s","The 2000s"))
With songs not lasting on the charts as long, I wondered whether the number of songs charting per year was changing as well. The graph below shows the total number of songs that topped Billboard music’s weekly Dance/Electronic chart by year. The clear upward trend shows that there were significantly fewer songs in the 80s and 90s, meaning that each stayed on top for longer. Conversely, the trend line rises above 50 in the 2010s, showing that there was a new top song just about every week.
This is likely tied to the introduction of streaming (in the late 2000s), which made music more accessible for consumers and easier for artists to release. Considering that there is a new top song almost every week, it appears that artists today need to favor quantity over quality to stay on top of the charts for the greatest number of weeks in a year. The more music they produce and release, the better chance they have of staying on top.
all_songs %>%
group_by(Year) %>%
summarise(counts = n()) -> songs_year
songs_year %>%
ggplot(aes(Year, counts)) + geom_line() + geom_smooth() +
xlab("Year") +
ylab("Number of Songs on the Chart") +
ggtitle("Number of Songs on the Chart per Year") +
theme_igray() +
scale_colour_tableau('Classic Cyclic')
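The link between the number of chart-toppers and time spent on top is simple arithmetic: a year has 52 chart weeks, so roughly 50 number ones implies about one week each. The sketch below illustrates this on toy data; the `toy` values and the `weeks_on_top` column name are hypothetical, and the approximation ignores weeks with multiple number one hits.

```r
library(dplyr)

# Toy data with invented values; the real input would be `all_songs`.
toy <- data.frame(
  Year         = c(1985, 1985, 1985, 2015, 2015, 2015, 2015),
  weeks_on_top = c(4, 3, 2, 1, 1, 1, 1)
)

# Songs per year and the mean number of weeks each spent at number one
turnover <- toy %>%
  group_by(Year) %>%
  summarise(songs = n(), mean_weeks = mean(weeks_on_top))

# Rough average weeks on top if a full 52-week year is shared evenly
weeks_per_song <- function(songs_in_year) 52 / songs_in_year
weeks_per_song(50)  # 1.04
```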
Looking across all of the top songs, I wondered whether solo artists or groups, and male or female artists, were more common. This next graph shows the change in gender and group type of artists on the Billboard chart through the decades.
Female solo artists are the top gender and group type mix in all four decades. Aside from remaining number 1, they also had major spikes in the past two decades. Again, many of these increases are likely related to the increase in number of songs on the chart each year.
Groups in general are trending slightly downward, but male solo artists, like female solo artists, are also on the rise, just at a slower rate. The increase in male solo artists between the 2000s and 2010s is likely tied to the increasing popularity of EDM, a field dominated by male DJs. While EDM has been around since the 80s, it really exploded in the early 2010s as a way to revamp the rave culture scene, and it has had a major impact on the dance music charts as well.
all_songs %>%
group_by(Gender, Decade) %>%
summarise(counts = n()) %>%
mutate(Decade = factor(Decade, levels = c("The 80s", "The 90s", "The 2000s", "The 2010s")))-> song_gender
song_gender %>%
ggplot(aes(Decade, counts, color = Gender)) + geom_point() +
xlab("Decade") +
ylab("Number of Groups") +
ggtitle("Artist Group Types by Gender Through the Decades") +
theme_igray() +
scale_colour_tableau('Classic Cyclic')
Next, I wanted to see how dance music titles changed throughout the years. To analyze this factor, I took all of the songs that charted in each decade and removed all of the stop words. From there, I used the “afinn” sentiment lexicon to identify words that appeared in titles 2 or more times in each decade. Below is a table showing the frequencies of each word for the 4 decades. It is no surprise that, similarly to most genres, “love” was the number one word used each decade.
| Word | Number of Times Used | Decade |
|---|---|---|
| love | 35 | The 80s |
| cuts | 5 | The 80s |
| feeling | 3 | The 80s |
| kiss | 3 | The 80s |
| bad | 2 | The 80s |
| crazy | 2 | The 80s |
| fire | 2 | The 80s |
| free | 2 | The 80s |
| fresh | 2 | The 80s |
| fun | 2 | The 80s |
| super | 2 | The 80s |
| top | 2 | The 80s |
| love | 59 | The 90s |
| beautiful | 4 | The 90s |
| joy | 4 | The 90s |
| heaven | 3 | The 90s |
| dreams | 2 | The 90s |
| free | 2 | The 90s |
| miss | 2 | The 90s |
| pressure | 2 | The 90s |
| strong | 2 | The 90s |
| love | 43 | The 2000s |
| beautiful | 6 | The 2000s |
| stop | 4 | The 2000s |
| amazing | 3 | The 2000s |
| bad | 3 | The 2000s |
| bitch | 3 | The 2000s |
| lonely | 3 | The 2000s |
| crazy | 2 | The 2000s |
| feeling | 2 | The 2000s |
| free | 2 | The 2000s |
| fuck | 2 | The 2000s |
| hate | 2 | The 2000s |
| lost | 2 | The 2000s |
| perfect | 2 | The 2000s |
| reach | 2 | The 2000s |
| save | 2 | The 2000s |
| sunshine | 2 | The 2000s |
| love | 36 | The 2010s |
| beautiful | 5 | The 2010s |
| fire | 5 | The 2010s |
| kill | 4 | The 2010s |
| kiss | 4 | The 2010s |
| feeling | 3 | The 2010s |
| forget | 3 | The 2010s |
| free | 3 | The 2010s |
| pretty | 3 | The 2010s |
| sweat | 3 | The 2010s |
| alive | 2 | The 2010s |
| bad | 2 | The 2010s |
| bitch | 2 | The 2010s |
| crazy | 2 | The 2010s |
| dirty | 2 | The 2010s |
| dream | 2 | The 2010s |
| fun | 2 | The 2010s |
| god | 2 | The 2010s |
| hell | 2 | The 2010s |
| lost | 2 | The 2010s |
| lucky | 2 | The 2010s |
| mess | 2 | The 2010s |
| yeah | 2 | The 2010s |
Of the 61 words listed here, only 42 are unique, and 11 appear on 2 or more of the decade lists. Despite the recurrence of certain words through the years, we also see new words introduced, most prominently expletives, which only appear in the 2000s and 2010s. Finally, we can compare the number of words for each decade. From the 80s to the 2010s, the number of words that show up in 2 or more titles has roughly doubled (from 12 to 23). This is likely tied to the fact that more songs are reaching the top of the charts and staying there for fewer weeks during these decades.
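The title-word counts described above could be produced along the following lines with tidytext. This is a sketch under stated assumptions rather than the original code: the `titles` data and its `Song` column are hypothetical, and the small hand-typed `lexicon` stands in for `get_sentiments("afinn")`, which needs a one-time interactive download through textdata.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy titles; the real titles come from the Billboard data.
titles <- tibble::tibble(
  Decade = c("The 80s", "The 80s", "The 80s", "The 90s"),
  Song   = c("Love Is a Fire", "Crazy for Your Love",
             "The Fire Inside", "Free Your Love")
)

# Stand-in lexicon (assumed values) with the same columns as AFINN: word, value
lexicon <- tibble::tibble(
  word  = c("love", "fire", "crazy", "free"),
  value = c(3, -2, -2, 1)
)

title_words <- titles %>%
  unnest_tokens(word, Song) %>%           # one lowercase word per row
  anti_join(stop_words, by = "word") %>%  # drop stop words
  inner_join(lexicon, by = "word") %>%    # keep only words the lexicon scores
  count(Decade, word, sort = TRUE) %>%    # frequency per decade
  filter(n >= 2)                          # words used 2 or more times
```

With the toy titles, only “love” and “fire” in the 80s pass the 2-or-more threshold.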
I also calculated an overall sentiment score for each decade. For each word, I multiplied its afinn sentiment value by the number of times that word was used, added these products together, and divided the total by the number of distinct words listed for that decade. I again only included the words that were used 2 or more times.
| Decade | Sentiment Score |
|---|---|
| The 80s | 9.75 |
| The 90s | 23.22 |
| The 2000s | 7.12 |
| The 2010s | 4.78 |
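As a worked check of the scoring procedure, the sketch below recomputes the 90s score from the word counts in the table above. The sentiment values are typed in by hand and should be treated as assumptions about the afinn lexicon; with them, the calculation reproduces the reported 23.22.

```r
# 90s title words with counts from the table; `value` holds assumed afinn scores
nineties <- data.frame(
  word  = c("love", "beautiful", "joy", "heaven", "dreams",
            "free", "miss", "pressure", "strong"),
  n     = c(59, 4, 4, 3, 2, 2, 2, 2, 2),
  value = c(3, 3, 3, 2, 1, 1, -2, -1, 2)
)

# Sum of (count x sentiment value), divided by the number of distinct words
decade_score <- function(n, value) sum(n * value) / length(n)

score_90s <- decade_score(nineties$n, nineties$value)
round(score_90s, 2)  # 23.22

# Share of the weighted total contributed by "love" alone (roughly 0.85)
love_share <- (59 * 3) / sum(nineties$n * nineties$value)
```

Because the denominator is the number of distinct words rather than the number of uses, a single very frequent word such as “love” can dominate the score for a decade with a short word list.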
We see here a clear decrease in sentiment in the past two decades. Again, this is only the sentiment of the words used in song titles, but typically the types of words used in song titles give a decent picture of what types of words will be used in the song. I think the reason the 90s has such a high score is because only 9 words in that decade appeared in titles 2 or more times. This ultimately gave more weight to the positive word “love” which was used 59 times. I don’t think the 90s had significantly happier music based on this calculation.
For the next section of the analysis, I wanted to focus on the Spotify metrics. I chose to compare 5 metrics across the 4 decades, each of which is scored from 0 to 1, with the exception of tempo. First is valence, which rates the musical positiveness of each song. Second is danceability, which rates how danceable the song is based on factors like rhythm and beat. Third is energy, which rates the intensity and dynamics of the song. Fourth is speechiness, which rates how much spoken-word content is in the song. Finally, I looked at tempo, which measures the beats per minute of the song.
For the purposes of the graph, I needed all metrics to be in a similar range. To accomplish this, I multiplied all speechiness scores by 10 and divided all tempos by 100. The table below displays the original metrics given by Spotify.
| Decade | Valence | Danceability | Energy | Speechiness | Tempo |
|---|---|---|---|---|---|
| The 80s | .775 | .747 | .730 | .0594 | 118 |
| The 90s | .638 | .700 | .772 | .0608 | 120 |
| The 2000s | .599 | .689 | .773 | .0699 | 124 |
| The 2010s | .542 | .660 | .771 | .0735 | 123 |
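The rescaling described above can be sketched in base R as follows. The values are copied from the table; the object names `metrics` and `scaled` are placeholders rather than names from the original code.

```r
# Decade averages as reported in the table above
metrics <- data.frame(
  Decade       = c("The 80s", "The 90s", "The 2000s", "The 2010s"),
  Valence      = c(0.775, 0.638, 0.599, 0.542),
  Danceability = c(0.747, 0.700, 0.689, 0.660),
  Energy       = c(0.730, 0.772, 0.773, 0.771),
  Speechiness  = c(0.0594, 0.0608, 0.0699, 0.0735),
  Tempo        = c(118, 120, 124, 123)
)

# Bring speechiness and tempo into a range comparable to the other metrics
scaled <- transform(metrics,
                    Speechiness = Speechiness * 10,
                    Tempo       = Tempo / 100)
```

After scaling, tempo falls between 1.18 and 1.24, so a shared axis for the chart needs to extend slightly beyond 1.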
library(radarchart)
library(fmsb)
knitr::opts_chunk$set(echo = FALSE)
For valence we see a clear decrease over the 4 decades. This is likely tied to the decreasing sentiment of titles and increase of expletives. This means in general dance music is becoming less positive.
While not as drastic as valence, we also see a decrease in danceability. This is surprising because with techno music on the rise you would think the rhythm and beats would be more conducive to dancing. Perhaps despite the trends within the techno and beats music space, those songs are actually more difficult to dance to than the pop dance classics from the 80s.
Energy had a big spike from the 80s to the 90s, and has remained steady from there on out.
Speechiness has continued to increase over the 4 decades. This is likely tied to two factors: first, the influence of rap music on the electronic sector, and second, the decreasing number of lyrics in the EDM sector. With fewer total words in songs and more rapping, average speechiness could plausibly rise. That said, speechiness is by far the smallest factor in dance music. While it has increased numerically, the differences are likely not significant.
Tempo did not actually increase as much as I projected. It has slightly increased and plateaued over the years. Pop songs typically come in at around 116 bpm and EDM songs usually fall around 128 bpm. It is probable that the slight increases result from EDM and electronic music.
When looking at the top 10 most used keys within the dance sector, the top songs through the years have an even mix of major and minor keys. The most popular key used is C major, which is a widely used key in most genres.
dance_hits %>%
count(key_mode, sort = TRUE) %>%
head(10) -> keys
keys %>%
ggplot(aes(reorder(key_mode, n), n)) + geom_col(fill =c("cadetblue3", "cadetblue3", "cadetblue3", "cadetblue4", "cadetblue3", "cadetblue4", "cadetblue4", "cadetblue4", "cadetblue4", "cadetblue3")) +
coord_flip() +
xlab("Key") +
ylab("Number of Times Used") +
ggtitle("Most Popular Keys in Dance Songs") +
theme_igray()
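The bar colors above are listed by hand in row order, which can fall out of step with the bars if the data changes. One alternative, sketched below on toy counts, derives the mode from the `key_mode` label and maps it to the fill. It assumes labels of the form "C major" / "A minor", and the assignment of `cadetblue3` to major and `cadetblue4` to minor is a guess at the intended scheme.

```r
library(dplyr)
library(ggplot2)

# Toy counts with invented values; the real input would be count(dance_hits, key_mode)
keys_toy <- data.frame(
  key_mode = c("C major", "A minor", "G major", "B minor"),
  n        = c(40, 30, 25, 20)
)

# Derive the mode from the label instead of hand-ordering colors
keys_toy <- keys_toy %>%
  mutate(mode = if_else(grepl("major$", key_mode), "major", "minor"))

p <- ggplot(keys_toy, aes(reorder(key_mode, n), n, fill = mode)) +
  geom_col() +
  scale_fill_manual(values = c(major = "cadetblue3", minor = "cadetblue4")) +
  coord_flip()
```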
When breaking this down by decade, we see that the 80s, the 90s, and the 2010s used C major the most, whereas the 2000s used G major the most. There is still a relatively even split between major and minor keys. Overall, there does not appear to be a significant shift in key trends through the decades; C and G are consistently the most popular within each decade. It is important to read this graph within each decade rather than across decades, because each successive decade has more songs on the Billboard chart, which raises the number of times every key is used.
key_decade %>%
mutate(Decade = factor(Decade, levels = c("The 80s", "The 90s", "The 2000s", "The 2010s")))-> keys_decades
keys_decades %>%
ggplot(aes(key_mode, n, color = Decade)) + geom_point() +
coord_flip() +
xlab("Key") +
ylab("Number of Times Used") +
ggtitle("Most Popular Keys In Dance Songs") +
theme_igray() +
scale_colour_tableau('Classic Cyclic')
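Because raw counts grow with the number of charting songs, one way to compare key usage across decades is to convert counts to within-decade shares. The sketch below shows the idea on toy data; the values are invented, and only the column names `Decade`, `key_mode`, and `n` are taken from the code above.

```r
library(dplyr)

# Toy counts with invented values, shaped like `key_decade`
key_toy <- data.frame(
  Decade   = c("The 80s", "The 80s", "The 2010s", "The 2010s"),
  key_mode = c("C major", "G major", "C major", "G major"),
  n        = c(10, 10, 30, 10)
)

# Share of each key within its own decade
key_shares <- key_toy %>%
  group_by(Decade) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()
```

Here C major rises from a share of 0.50 to 0.75 even though the raw count for G major is unchanged, which is the kind of shift raw counts alone can hide.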
One global debate in Dance/Electronic music is who has better music: the United States or Europe? The following animated map shows the changes in where the artists behind Billboard’s Dance/Electronic number 1 hits came from. Solo artists are listed with their birth country, and groups are listed where the group was formed.
Each circle shows a sum of “weeks on top”: for each country, the total number of weeks that all of its artists or groups spent at the top of the chart in that year.
Here’s a link to the map, also shown below, which shows how the countries behind the Billboard Dance hits changed from 1980 to 2019.
While the United States is always the largest single contributor, European countries taken together do produce a competitive number of hits compared to the U.S. It makes sense that the U.S. is the number one country, especially considering that Billboard’s metrics are based on U.S. consumers.
My first instinct was that if Europe ever did produce more weeks on top than the U.S. it would likely be in the 1980s during the second British Invasion. While there are notable differences in the number of weeks that England had on top during that time, it did not tip the charts to make Europe the leader over the U.S.
Interestingly enough, there are only 3 years out of the 40 analyzed in which Europe has more weeks on top than the U.S.: 2013 (19 weeks versus 17), 2016 (20 weeks versus 19), and 2019 (25 weeks versus 23). While many countries are helping to tip the scale, one of interest is Sweden. Spotify’s headquarters are in Stockholm, and despite on-and-off contributions from 1980 to 2010, from 2011 on Sweden has consistently had at least 2 weeks on top.
It is no shock that the conclusion from this map is that American consumers listen to U.S.-based artists and groups the most; however, it is interesting to see the fluctuations through the decades.
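The Europe-versus-U.S. comparison amounts to summing weeks on top by region and year. A minimal sketch of that aggregation follows; every number in `weeks_toy` is invented, the `Year` values are just toy indices, and the `europe` vector is a hypothetical stand-in for a full list of European countries.

```r
library(dplyr)

# Invented numbers for illustration only, not taken from the Billboard data
weeks_toy <- data.frame(
  Year         = c(1, 1, 1, 2, 2, 2, 2),
  Country      = c("United States", "United Kingdom", "Sweden",
                   "United States", "United Kingdom", "Sweden", "France"),
  weeks_on_top = c(20, 10, 5, 17, 10, 5, 4)
)

europe <- c("United Kingdom", "Sweden", "France")  # hypothetical country list

# Total weeks on top per region and year
region_totals <- weeks_toy %>%
  mutate(Region = case_when(
    Country == "United States" ~ "United States",
    Country %in% europe        ~ "Europe",
    TRUE                       ~ "Other"
  )) %>%
  group_by(Year, Region) %>%
  summarise(weeks = sum(weeks_on_top), .groups = "drop")

# Years in which Europe's combined total exceeds the U.S. total
europe_lead <- region_totals %>%
  group_by(Year) %>%
  summarise(europe = sum(weeks[Region == "Europe"]),
            us     = sum(weeks[Region == "United States"])) %>%
  filter(europe > us)
```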
As mentioned above, it is important to note that the two major data frames used in this project differ by 47 songs (1,639 on the Billboard list versus 1,592 on the Spotify playlist). This is due to the 72 songs from the original chart that are missing on Spotify and to certain weeks having multiple number one hits. I do not believe the missing songs had a large impact on the output of the analysis; however, the two data sets are not a 100% match.
We also have to consider that Billboard is not the only organization to rank songs. While Billboard is a notable and long-lasting company, their charts do vary from others such as Nielsen or Spotify. Especially in the recent years with different songs as number one each week, there could be discrepancies as to what makes the top song of the week within the dance category.
From a geospatial perspective, it is important to remember that the solo artists are listed with their birthplace and the groups are listed where they were formed. It is possible that a solo artist was born in one location, but grew up somewhere else that had a larger impact on their music career. The same theory goes for groups that may have been musically influenced by their hometowns rather than the location the group was formed.
This analysis definitely brings to light trends we observe within the dance music industry. First, we see that more songs are hitting the charts, yet staying on top for less time, likely due to the influence from streaming. Next, we see the consistent popularity of female solo artists within the sector, and male solo artists slowly on the rise. This may be tied to the increasing presence of male EDM DJs. From there we can note that the sentiment of dance titles is becoming more negative, mostly correlating with the increasing use of expletives in titles.
From a Spotify perspective, we see decreases in valence and danceability, whereas energy, speechiness, and tempo have all increased. We can also see that the use of keys is relatively consistent through the decades.
Finally, from a geospatial perspective, we see the dominance of American artists in the U.S. market, with increasing popularity for artists from European countries. While the United Kingdom has been a consistent second-place participant, we now see hits spreading to other European countries such as Sweden.
Overall, this research shows us the influence that other genres are having on the dance/electronic sector. It also gives a clear picture of where this sector of the industry has evolved from, as well as clues as to where it will go in the future.
If I had the opportunity to conduct further research, I would like to complete this same process for other genres. This would then allow me to compare music on a greater scale. I would also be interested to see which of these songs appear on other genres’ top charts. Especially today, with the lines between genres blurring, I would be curious to know which genres most heavily influence dance music. Finally, I would like to dive deeper into the lyrics of dance songs and see how they have changed through the years. With the introduction of EDM, I would predict that the number of unique words in dance music lyrics has decreased.
Dance music is a relatively under-researched category. Much of dance music today is tied to clubs, parties, raves, and concerts. Post-coronavirus, this is likely to continue to evolve in order for artists to stay relevant despite changes in venue and demand for their music. It will be fascinating to see where this next decade of dance music takes us.