Introduction and Problem Statement
The music industry has evolved and grown significantly over the years, with each decade bringing new styles, artists, and songs to the forefront. In this data visualization project, we aim to shed light on the trends and patterns that have emerged in popular music, with a focus on the different elements that make up a song. Our dataset is a comprehensive collection of Spotify music data that includes attributes such as the song name, artist name, year, popularity, and genre, as well as musical features such as danceability, energy, key, and more.
In the first section of this project, we will analyze the popular music trends of each decade, exploring the most dominant genres and artists that have defined the different eras. We also explore explicitness as a factor in songs over the decades. Our goal is to provide an overview of the musical landscape of each decade, giving a better understanding of the cultural and social context in which the music was produced.
The second section of this project will delve deeper into the musicality of songs, providing insights into the different elements that contribute to the overall sound and feel of a song. We will explore various musical features such as energy, loudness, speechiness, and more, to gain a better understanding of how these elements interact and influence each other.
By utilizing data visualization techniques, we aim to present the insights and trends of popular music in a visually compelling and intuitive manner, providing a unique perspective on the history and evolution of music. This project serves as a valuable resource for music lovers, researchers, and industry professionals alike, offering a comprehensive analysis of popular music trends and musical features.
## # A tibble: 574,812 × 25
## track_id track…¹ track…² track…³ track…⁴ track…⁵ artis…⁶ track…⁷ track…⁸
## <chr> <chr> <int> <int> <int> <chr> <chr> <chr> <dbl>
## 1 35iwgR4jXetI… Carve 6 126903 0 ['Uli'] 45tIt0… 1922-0… 0.645
## 2 021ht4sdgPcr… Capítu… 0 98200 0 ['Fern… 14jtPC… 1922-0… 0.695
## 3 07A5yehtSnoe… Vivo p… 0 181640 0 ['Igna… 5LiOoJ… 1922-0… 0.434
## 4 08FmqUhxtyLT… El Pri… 0 176907 0 ['Igna… 5LiOoJ… 1922-0… 0.321
## 5 08y9GfoqCWfO… Lady o… 0 163080 0 ['Dick… 3BiJGZ… 1922 0.402
## 6 0BRXJHRNGQ3W… Ave Ma… 0 178933 0 ['Dick… 3BiJGZ… 1922 0.227
## 7 0Dd9ImXtAtGw… La But… 0 134467 0 ['Fran… 2nuMRG… 1922 0.51
## 8 0IA0Hju8CAgY… La Java 0 161427 0 ['Mist… 4AxgXf… 1922 0.563
## 9 0IgI1UCz84pY… Old Fa… 0 310073 0 ['Greg… 5nWlsH… 1922 0.488
## 10 0JV4iqw2lSKJ… Martín… 0 181173 0 ['Igna… 5LiOoJ… 1922-0… 0.548
## # … with 574,802 more rows, 16 more variables: track_energy <dbl>,
## # track_key <int>, track_loudness <dbl>, track_mode <int>,
## # track_speechiness <dbl>, track_acousticness <dbl>,
## # track_instrumentalness <dbl>, track_liveness <dbl>, track_valence <dbl>,
## # track_tempo <dbl>, track_time_signature <int>, artist_followers <dbl>,
## # artist_genres <chr>, artist_name <chr>, artist_popularity <int>,
## # track_time_mins <dbl>, and abbreviated variable names ¹track_name, …
## Trend Analysis
Question 1: Analyze Spotify’s performance since its inception in 2006
Aim: To provide a visual representation of the distribution of tracks released on Spotify by year from 2006 to 2020.
Conclusion: The visualization provides a quick and simple way to see how the number of tracks released on Spotify has changed from year to year. Comparing the size of each slice makes it easy to compare the number of tracks released in different years and to get an overall understanding of the data. The donut chart depicts the evolution of the number of tracks released on Spotify, from 5304 tracks in 2006 to 12331 tracks in 2020. The rise was not uninterrupted: the number of tracks released in 2017 was almost a thousand fewer than in the previous year.
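The per-year counts behind such a chart can be sketched as follows. This is a minimal illustration in plain Python, assuming release dates are strings that start with a four-digit year (the data preview shows both full dates and bare years). The function name `tracks_per_year` and the sample dates are hypothetical and not taken from the dataset.

```python
from collections import Counter


def tracks_per_year(release_dates, first_year=2006, last_year=2020):
    """Count tracks per release year, keeping only years in [first_year, last_year]."""
    counts = Counter()
    for date in release_dates:
        year_text = date[:4]  # works for both "2019-05-17" and "2019"
        if not year_text.isdigit():
            continue  # skip malformed dates
        year = int(year_text)
        if first_year <= year <= last_year:
            counts[year] += 1
    # Return the years in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample dates, used only to illustrate the counting step.
sample_dates = ["2006-03-01", "2006", "2017-11-24", "2020-01-10", "1922-02-22"]
yearly_counts = tracks_per_year(sample_dates)
print(yearly_counts)
```

Each entry of the resulting dictionary would correspond to one slice of the donut chart.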
Question 2:
Hypothesis: Covid-19, a highly infectious respiratory illness caused by a novel coronavirus, swept across the world, affecting millions of people and causing widespread panic and disruption. The pandemic has had a profound impact on every aspect of life, from the economy and public health to the music industry. It affected not only physical health but also mental health, particularly in terms of depression. We therefore expected the type of tracks released on Spotify to change between 2019 and 2020, and compared the tracks released in those two years.
Aim: Compare the years 2019 and 2020 to find a relation between the type of songs released on Spotify and people’s mental health.
Conclusion: The above graph is a box plot with jittered points for two time periods: pre-Covid (2019) and Covid (2020). To analyze the relation between the type of tracks and the pandemic, we use the valence metric.
Valence - A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
During our analysis we found that valence is correlated with other features. The following pairs show significant correlation: 1) valence and danceability, 2) valence and loudness, 3) valence and energy.
Contrary to our expectation, we see no significant change in the type of tracks released on Spotify between the two periods. Hence, our hypothesis was not supported.
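The comparison summarized by the box plot can be sketched numerically as a five-number summary of valence per period. This is a rough illustration under stated assumptions: `track_valence` is the column name shown in the data preview, while the `release_date` key, the helper names, and all sample values are hypothetical.

```python
from statistics import quantiles


def valence_by_period(tracks):
    """Split valence values into pre-Covid (2019) and Covid (2020) groups."""
    groups = {"pre-covid": [], "covid": []}
    for track in tracks:
        year = track["release_date"][:4]  # assumed key; dates start with the year
        if year == "2019":
            groups["pre-covid"].append(track["track_valence"])
        elif year == "2020":
            groups["covid"].append(track["track_valence"])
    return groups


def box_summary(values):
    """Return the five numbers a box plot is drawn from."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Hypothetical sample values, not taken from the dataset.
sample_tracks = [
    {"release_date": "2019-03-01", "track_valence": 0.2},
    {"release_date": "2019-07-15", "track_valence": 0.4},
    {"release_date": "2019-09-30", "track_valence": 0.6},
    {"release_date": "2019", "track_valence": 0.8},
    {"release_date": "2020-02-02", "track_valence": 0.3},
    {"release_date": "2020-06-20", "track_valence": 0.5},
    {"release_date": "2020-11-11", "track_valence": 0.7},
]
summaries = {
    period: box_summary(values)
    for period, values in valence_by_period(sample_tracks).items()
}
for period, summary in summaries.items():
    print(period, summary)
```

If the two summaries are close, as the report found for the real data, the box plots for the two periods will largely overlap.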
## Feature Analysis
Question 3: How are the features of a song related to one another?
From the above, it is observed that track loudness and track energy are positively correlated. We therefore visualized this relationship using a scatter plot.
Conclusion
The plot showcases the correlation between different characteristics of the song tracks, indicating how they are related to one another. If the values are close to 1, then the features are positively correlated and are depicted in green, such as between “track_loudness” and “track_energy.” On the other hand, if the values are close to -1, then the features are negatively correlated and are shown in red, for instance, between “track_acousticness” and both “track_energy” and “track_loudness”.
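The values shown in such a correlation plot can be sketched as Pearson correlation coefficients computed for every pair of features. The snippet below is a minimal illustration, assuming Pearson correlation is the measure used (the report does not name it). The column names come from the data preview, while the function names and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping feature name to its values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical sample values, not taken from the dataset.
features = {
    "track_energy": [0.2, 0.5, 0.8],
    "track_loudness": [-20.0, -10.0, -5.0],
    "track_acousticness": [0.9, 0.5, 0.1],
}
matrix = correlation_matrix(features)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

In this toy sample, energy and loudness rise together (a coefficient near 1), while acousticness falls as energy rises (a coefficient near -1), mirroring the green and red cells described above.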
Question 4: Now that we know how the various features are or are not related to each other, what drives the popularity of a track? Based on this analysis, if a new song is released on Spotify, will it be popular?
Conclusion:
From the heatmap used to analyze the different track features, we concluded that track popularity depends strongly on the popularity of the artist and on how many followers they have. This hexbin plot helps us understand this relation further.
A hexbin plot takes lists of X and Y values and returns something that looks similar to a scatter plot. The entire graphing space is divided into hexagons (like a honeycomb), all points are grouped into their respective hexagonal regions, and a color gradient indicates the density of each hexagon.
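The grouping step of a hexbin plot can be sketched as follows. This is one common way to do hexagonal binning, not necessarily what the plotting library used for this report does internally: each point is assigned to the nearest hexagon centre, where the centres lie on two staggered rectangular grids. The function name, the `width` parameter, and the sample points are hypothetical.

```python
from collections import Counter
from math import floor, sqrt


def hexbin_counts(xs, ys, width=1.0):
    """Count points per hexagon; `width` is the distance between neighbouring centres."""
    row_height = width * sqrt(3)  # vertical spacing within one of the two grids
    counts = Counter()
    for x, y in zip(xs, ys):
        u, v = x / width, y / row_height  # coordinates in grid units
        # Candidate 1: nearest centre on the grid at integer positions.
        i1, j1 = floor(u + 0.5), floor(v + 0.5)
        # Candidate 2: nearest centre on the grid shifted by half a cell.
        i2, j2 = floor(u), floor(v)
        # Squared distances in units of width (the factor 3 undoes the y scaling).
        d1 = (u - i1) ** 2 + 3 * (v - j1) ** 2
        d2 = (u - i2 - 0.5) ** 2 + 3 * (v - j2 - 0.5) ** 2
        if d1 <= d2:
            centre = (i1 * width, j1 * row_height)
        else:
            centre = ((i2 + 0.5) * width, (j2 + 0.5) * row_height)
        counts[centre] += 1
    return counts


# Hypothetical sample points: three near the origin, one near a neighbouring hexagon.
sample_x = [0.1, -0.2, 0.05, 0.5]
sample_y = [0.1, 0.05, -0.1, 0.9]
hex_counts = hexbin_counts(sample_x, sample_y, width=1.0)
for centre, count in hex_counts.items():
    print(centre, count)
```

The count for each centre is what the color gradient encodes: a hexagon holding more points is drawn in a more intense color.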
From this graph we understand that the vast majority of artists have very low or negligible popularity, and hence the popularity of their tracks is low. At the same time, artists with high popularity have hit tracks. Thus, if a new song is released on Spotify and the track artist is popular, the track has a good chance of becoming a hit.
Question 5: What factors contribute to the popularity of a popular artist’s tracks?
## # A tibble: 333 × 28
## track_id track…¹ track…² track…³ track…⁴ track…⁵ artis…⁶ track_re…⁷ track…⁸
## <chr> <chr> <int> <int> <int> <chr> <chr> <date> <dbl>
## 1 6eDApnV9J… One Ti… 71 215867 0 ['Just… 1uNFoZ… 2009-01-01 0.691
## 2 69ghzc538… One Le… 69 229107 0 ['Just… 1uNFoZ… 2009-01-01 0.58
## 3 0yIywEqux… Love Me 67 191573 0 ['Just… 1uNFoZ… 2009-01-01 0.729
## 4 4nTjkWK59… Favori… 58 256800 0 ['Just… 1uNFoZ… 2009-01-01 0.581
## 5 6epn3r7S1… Baby 79 214240 0 ['Just… 1uNFoZ… 2010-01-01 0.728
## 6 0aPZbnkMo… That S… 70 232720 0 ['Just… 1uNFoZ… 2010-01-01 0.552
## 7 3rLIv187B… Somebo… 70 220920 0 ['Just… 1uNFoZ… 2010-01-01 0.714
## 8 6Xw2FLih8… U Smile 64 196907 0 ['Just… 1uNFoZ… 2010-01-01 0.705
## 9 1DyV9obL0… Stuck … 60 222960 0 ['Just… 1uNFoZ… 2010-01-01 0.721
## 10 5GYbkDveR… Never … 72 227853 0 ['Just… 1uNFoZ… 2011-01-01 0.739
## # … with 323 more rows, 19 more variables: track_energy <dbl>, track_key <dbl>,
## # track_loudness <dbl>, track_mode <int>, track_speechiness <dbl>,
## # track_acousticness <dbl>, track_instrumentalness <dbl>,
## # track_liveness <dbl>, track_valence <dbl>, track_tempo <dbl>,
## # track_time_signature <dbl>, artist_followers <dbl>, artist_genres <chr>,
## # artist_name <chr>, artist_popularity <int>, track_time_mins <dbl>,
## # Year <chr>, Month <chr>, Date <chr>, and abbreviated variable names …
The line graph indicates that track characteristics do not play a significant role in determining the popularity of an artist’s tracks.
Question 6: What is the occurrence rate of track names for a leading artist?
Conclusion
The wordcloud visualization indicates that the track titles “hold_on” and “anyone” appear frequently for the artist Justin Bieber. The font size in the wordcloud allows us to determine which track titles occur most frequently and which do not. Hence, the wordcloud visual provides information about the occurrence rate of track names for a leading artist.
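The frequencies that drive the font sizes can be sketched as a simple count of normalized track titles. This is a minimal illustration, assuming titles are lowercased and their words joined with underscores (as the token “hold_on” suggests). The function name is hypothetical and the sample list is made up for illustration, so its counts do not reflect the dataset.

```python
import re
from collections import Counter


def title_counts(track_names):
    """Count how often each normalized track title occurs."""
    counts = Counter()
    for name in track_names:
        # Lowercase the title and join its words with underscores: "Hold On" -> "hold_on".
        token = "_".join(re.findall(r"[a-z0-9']+", name.lower()))
        if token:
            counts[token] += 1
    return counts


# Hypothetical sample titles; repeated entries stand for multiple versions of a track.
sample_titles = ["Hold On", "Hold On", "Anyone", "Baby"]
frequencies = title_counts(sample_titles)
print(frequencies.most_common())
```

A wordcloud then scales each token in proportion to its count.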
Question 7: What variation in track characteristics can be observed across the top 10 music genres?
Conclusion
According to the stacked bar chart, the “kleine hoerspie” genre has the highest values for several track features, including “track_danceability,” “track_energy,” “track_key,” “track_loudness,” “track_mode,” “track_speechiness,” and “track_acousticness.” The chart thus lets us observe the variation in each track characteristic among the top 10 music genres.
## Time Series Analysis
Question 8: What is the inclination of Spotify users towards particular artists, and does the year of release play a role in shaping these preferences?
Conclusion
We plot the top 5 artists in each decade between 1920 and 2020. The plot shows higher popularity for artists in more recent years than for those in the early ones. This could be due to several factors, such as the evolution of the music industry in terms of genre preferences, better technology used to create music, and the globalization of popular genres.
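The selection of top artists per decade can be sketched as follows. This is an illustration under stated assumptions: artists are ranked by the mean popularity of their tracks released in a decade, which may differ from the ranking actually used for the plot. `artist_name` appears in the data preview, while the `release_date` and `track_popularity` keys, the function names, and the sample records are hypothetical.

```python
from collections import defaultdict


def decade_of(release_date):
    """Map a date string starting with a four-digit year to its decade, e.g. 1922 -> 1920."""
    return int(release_date[:4]) // 10 * 10


def top_artists_by_decade(tracks, n=5):
    """Return the n artists with the highest mean track popularity in each decade."""
    totals = defaultdict(lambda: [0, 0])  # (decade, artist) -> [popularity sum, track count]
    for track in tracks:
        key = (decade_of(track["release_date"]), track["artist_name"])
        totals[key][0] += track["track_popularity"]
        totals[key][1] += 1

    by_decade = defaultdict(list)
    for (decade, artist), (total, count) in totals.items():
        by_decade[decade].append((total / count, artist))

    # Sort by mean popularity (highest first), breaking ties alphabetically.
    return {
        decade: [artist for _, artist in sorted(rows, key=lambda r: (-r[0], r[1]))[:n]]
        for decade, rows in sorted(by_decade.items())
    }


# Hypothetical sample records, not taken from the dataset.
sample_tracks = [
    {"release_date": "1922-02-22", "artist_name": "Artist A", "track_popularity": 6},
    {"release_date": "1922", "artist_name": "Artist B", "track_popularity": 0},
    {"release_date": "2010-01-01", "artist_name": "Artist C", "track_popularity": 79},
    {"release_date": "2011-01-01", "artist_name": "Artist C", "track_popularity": 71},
    {"release_date": "2015-06-01", "artist_name": "Artist D", "track_popularity": 80},
]
top_one = top_artists_by_decade(sample_tracks, n=1)
print(top_one)
```

Calling the function with `n=5` would give the five artists per decade that the plot displays.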
Question 9: What is the inclination of Spotify users towards different genres of music? Does the decade that this genre originated have an impact on whether people still listen to it?
Conclusion
We plot the top 5 genres by popularity that originated in each decade between 1920 and 2020. Interestingly, the most popular genres of all time on Spotify are those that originated in the 2000s. From the previous plot (Question 8), we saw that artists of older decades did not fare well in terms of popularity, but the same trend does not apply to genres. This suggests that artists of recent years have adapted older genres into their style of music to make it more appealing to the audience.
Question 10: What is the popular attitude towards explicit content in music?
To understand this better, we plot the mean explicitness over the decades from 1920 to 2020. The mean in this case represents the proportion of 1s (explicit tracks) in the data set. So, for a given decade, if the mean of the explicitness variable is 0.6, then 60% of the tracks in the data set for that decade are explicit. By aggregating the mean explicitness per decade, we can see how the proportion of explicit tracks has changed over time and whether there are any trends or shifts in attitudes towards explicit content. The plot shows that acceptance of explicit music and of an artist’s freedom of expression has grown considerably, with a much larger share of tracks from the 2000s onward carrying explicit lyrics.
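The aggregation described above can be sketched as follows. This is a minimal illustration, assuming explicitness is stored as 0 or 1 per track. The `release_date` and `explicit` keys, the function name, and the sample records are hypothetical.

```python
from collections import defaultdict


def explicit_share_by_decade(tracks):
    """Mean of the 0/1 explicit flag per decade, i.e. the proportion of explicit tracks."""
    flags = defaultdict(list)
    for track in tracks:
        decade = int(track["release_date"][:4]) // 10 * 10
        flags[decade].append(track["explicit"])
    return {decade: sum(values) / len(values) for decade, values in sorted(flags.items())}


# Hypothetical sample: two tracks from the 1920s and five from the 2010s.
sample_tracks = [
    {"release_date": "1922-02-22", "explicit": 0},
    {"release_date": "1925", "explicit": 0},
    {"release_date": "2010-01-01", "explicit": 1},
    {"release_date": "2012-05-05", "explicit": 1},
    {"release_date": "2015-08-08", "explicit": 1},
    {"release_date": "2017-03-03", "explicit": 0},
    {"release_date": "2019-12-12", "explicit": 0},
]
shares = explicit_share_by_decade(sample_tracks)
print(shares)
```

In this sample, three of the five tracks from the 2010s are explicit, so the mean for that decade is 0.6, which corresponds to the 60% reading described above.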
To further understand the portion of explicit lyrics in top songs of the artists that have dominated each decade of release, we plot the explicitness for the top song of each artist by decade.
Conclusion
We notice that most popular artists of recent decades do not in fact have explicit lyrics in their top tracks. This shows us that though explicit music is popular, it is not a necessity for the success of the song. This could be due to various factors such as the demographic of the audience, Spotify’s recommendation algorithms, etc.
Report Conclusion:
In conclusion, our data visualization project based on Spotify’s track data has revealed several interesting insights into the music industry. Through the use of various graphical representations, we were able to understand how the audio features of songs relate to one another, which artists and genres are popular, and how releases have changed over time. Our findings showed that the popularity of a track depends mainly on the popularity and follower count of its artist, while track characteristics play no significant role in the popularity of a popular artist’s tracks. The most popular genres on Spotify are those that originated in the 2000s. Moreover, the number of tracks released on Spotify per year grew from 5304 in 2006 to 12331 in 2020, and the proportion of explicit tracks has risen over the decades.
In summary, this project has provided a comprehensive view of the music industry and the trends that are shaping it. By utilizing data visualization techniques, we have been able to effectively communicate complex information and derive meaningful insights. We hope that these findings will be useful for artists, music labels, and other stakeholders in the industry to make informed decisions and shape the future of music.