Streaming Success: An Analysis of Music Trends and Popularity
This project explores key factors influencing music trends and popularity, providing insights for artists aiming to optimize their success on streaming platforms like Spotify and Apple Music. By analyzing metrics such as danceability, energy, and release timing, we examine how specific song attributes correlate with popularity and long-term sustainability on charts. Additionally, we investigate platform performance differences by comparing playlist presence and streaming trends on Spotify versus Apple Music. Lastly, we identify the most frequently charting artists and explore their commonalities, offering actionable recommendations for aspiring musicians. Through this analysis, we aim to empower artists with data-driven strategies to elevate their careers in a competitive music landscape.
Our dataset records how individual songs perform across multiple streaming platforms, along with each song’s musical attributes. It contains almost 1,000 entries and 25 attributes. Although this is a sizable number of observations, it covers only a small fraction of the songs available to stream, so gaps in the data could affect our analysis.
Year | Number of Songs |
---|---|
2000 | 4 |
2002 | 6 |
2003 | 2 |
2004 | 4 |
2005 | 1 |
2007 | 1 |
2008 | 2 |
2010 | 7 |
2011 | 10 |
2012 | 10 |
2013 | 13 |
2014 | 13 |
2015 | 11 |
2016 | 18 |
2017 | 23 |
2018 | 10 |
2019 | 36 |
2020 | 37 |
2021 | 119 |
2022 | 402 |
2023 | 175 |
The two line plots show the relationship between a song’s danceability and the average number of Spotify and Apple Music playlists it appears in. Danceability describes how suitable a song is for dancing, expressed as a percentage. For this analysis, I calculated the average number of playlist inclusions at each danceability percentage to highlight how this feature correlates with playlist additions. The trends in these graphs reveal the danceability levels that are most and least favored by each platform’s users and curators.
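The averaging step described above can be sketched in a few lines. This is a minimal Python illustration, not the dashboard’s own R code: the sample rows, the variable names, and the function name `mean_playlists_by_danceability` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows: (danceability %, number of playlists the song is in).
# These values are invented for illustration and are not taken from the dataset.
songs = [(40, 1200), (40, 800), (55, 3000), (55, 1000), (90, 150)]


def mean_playlists_by_danceability(rows):
    """Return {danceability %: average playlist count}, sorted by danceability."""
    totals = defaultdict(lambda: [0, 0])  # danceability -> [sum of playlists, song count]
    for danceability, playlists in rows:
        totals[danceability][0] += playlists
        totals[danceability][1] += 1
    return {d: s / n for d, (s, n) in sorted(totals.items())}


averages = mean_playlists_by_danceability(songs)
for danceability, avg in averages.items():
    print(f"{danceability}% danceability: {avg:.1f} playlists on average")
```

Plotting these averages against danceability, once per platform, would give line plots of the kind discussed here.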
The high points of the Spotify graph mark the danceability levels at which songs are most frequently included in playlists. These peaks fall mostly in the 30–60% range, which shows that tracks with moderate to moderately high danceability are the most popular among Spotify playlists. Songs with the highest danceability percentages have the fewest Spotify playlist additions.
The peak in the Apple Music graph shows that songs with 30–50% danceability are more popular than those with higher danceability. Apple Music playlists favor songs with below-average danceability (under 50%), whereas Spotify playlists extend to somewhat higher danceability levels.
Valence: Positivity of the song’s musical content
Energy: Perceived energy level of the song
Liveness: Presence of live performance elements
This stacked bar plot shows the metrics above for songs grouped by release year. The data runs from 2000 through the most recent release date in the dataset, 2023-07-14. Each metric was averaged by year because the raw data would have been too large to plot. The chart lets us analyze the relationship between these metrics and why songs released before 2023 continue to be streamed. The 2023 data was kept to provide a basis for comparison, offering insight into which metrics contribute to a song’s longevity in streams. From 2021 to 2023, the metrics stayed similar, with only slight differences, as the music has not changed significantly. Across the years in the chart, valence has the highest average of the three metrics. The metric that varies the most is liveness, which may be because technology is relied on more and more in the process of creating music.
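As a rough illustration of the per-year averaging behind the bar plot, the following Python sketch groups songs by release year and averages each metric. It is not the dashboard’s R code; the field names (`released_year`, `valence`, `energy`, `liveness`) and the sample values are assumptions for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; field names and values are assumptions, not dataset values.
rows = [
    {"released_year": 2021, "valence": 50, "energy": 60, "liveness": 10},
    {"released_year": 2021, "valence": 70, "energy": 80, "liveness": 30},
    {"released_year": 2022, "valence": 40, "energy": 65, "liveness": 15},
]

METRICS = ("valence", "energy", "liveness")


def yearly_metric_means(rows):
    """Return {year: {metric: mean value}} for the metrics listed in METRICS."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["released_year"]].append(row)
    return {
        year: {m: mean(r[m] for r in group) for m in METRICS}
        for year, group in sorted(by_year.items())
    }


yearly = yearly_metric_means(rows)
for year, metrics in yearly.items():
    print(year, metrics)
```

Each inner dictionary corresponds to one stacked bar: the three averaged metrics for a single release year.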
The scatterplots below compare the number of playlists each song appears in with the number of streams it has received, for songs released from 2000 to 2023. The first scatterplot, with green points, shows this data for Spotify. The second scatterplot, with sky blue points, shows this data for Apple Music. Both plots show a positive correlation between how many playlists a song is included in and how many streams it receives. There are a few outliers, but they do not change the general trend.
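One way to put a number on the trend visible in the scatterplots is the Pearson correlation coefficient. The Python sketch below is only an illustration under assumed data: the playlist and stream values are invented, and the source does not state that this statistic was computed for the dashboard.

```python
from math import sqrt

# Hypothetical paired observations, invented for illustration only.
playlists = [100, 200, 300, 400]      # playlists each song appears in
streams = [1.0e6, 2.1e6, 2.9e6, 4.2e6]  # total streams for the same songs


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(playlists, streams)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship; a few extreme outliers can pull the coefficient up or down, so it is best read alongside the scatterplot rather than in place of it.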
This graph is an interactive Highcharts chart that shows the summed number of streams for songs released on each date. The slider at the bottom of the graph lets you zoom into a specific time frame. In general, the number of streams across dates is fairly steady, except for two outliers on Jan 1, 2013 and May 6, 2022. After looking at the data, these are dates on which one of the top artists shown in the graph of the most popular artists released a new album. Because all of those top songs were released on the same day, their streams add up to a spike. The maximum number of streams across all the dates is 13,432,412,789. Looking at the graph more closely, almost all of the other peaks fall on Fridays. This is because, historically, producers have told artists to release new music right before the weekend, when people have more free time to listen. The website ‘Other Record Labels’ says labels started releasing music on Fridays instead of Tuesdays in 2015 because of the new age of streaming, with platforms like Apple Music and Spotify, as well as a change in when charts would be released. This new industry standard ensured that new music internationally could appear on the charts for the week.
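The per-date totals and the day-of-week check can be sketched as follows. This is a Python illustration rather than the dashboard’s R code; the stream counts are invented, and the date May 13, 2022 is a hypothetical extra row added only to give the example a second Friday.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical (release date, streams) rows; the stream counts are invented.
releases = [
    (date(2022, 5, 6), 500),
    (date(2022, 5, 6), 700),
    (date(2022, 5, 13), 300),
    (date(2013, 1, 1), 900),
]


def streams_by_release_date(rows):
    """Sum streams over all songs that share a release date."""
    totals = defaultdict(int)
    for released, streams in rows:
        totals[released] += streams
    return dict(sorted(totals.items()))


totals = streams_by_release_date(releases)
for released, total in totals.items():
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    print(released.isoformat(), WEEKDAYS[released.weekday()], total)
```

Songs that share a release date are added together, which is why an album released on a single day shows up as one tall peak.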
This bar graph displays artists who appear in the dataset 10 or more times over the span of 2000-2023. The threshold is set at 10 because many artists appear more than five times but fewer than 10. Only a select few in the dataset passed the 10-appearance mark, which I believe distinguishes them from the rest. These are the artists worth researching, as they were able to find musical success multiple times. Below are resources that allow you to look up a specific artist or group and see what genre of music they produce, as well as a biography covering their music’s history and background.
last.fm can be used to read biographies of the artists on the bar graph. It gives a great overview of the artists themselves as well as their work, such as tracks and albums. If you want to learn about the history of these artists, this is a great source to look at.
uDiscoverMusic is great for finding recent news about artists, such as album and track releases. It’s a source you can go to if you want to keep up with trends and with what is going on in the music world.
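For readers who want to reproduce the 10-appearance threshold used for the artist bar graph, here is a minimal Python sketch. It is not the dashboard’s R code: the artist names are placeholders, and the function name `frequent_artists` and its `min_appearances` parameter are assumptions for the example.

```python
from collections import Counter

# Hypothetical list with one artist name per song; the names are placeholders.
artist_per_song = ["Artist A"] * 12 + ["Artist B"] * 7 + ["Artist C"] * 10


def frequent_artists(names, min_appearances=10):
    """Return {artist: count} for artists appearing at least min_appearances times."""
    counts = Counter(names)
    return {artist: n for artist, n in counts.most_common() if n >= min_appearances}


top = frequent_artists(artist_per_song)
print(top)
```

Lowering `min_appearances` to 5 would also include the larger group of artists that fall between five and ten appearances.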
This dashboard was created using Quarto in RStudio, and the R Language and Environment.
The data used to create this dashboard were downloaded from:
Abdullah M (2023). Spotify Most Streamed Songs. Kaggle. Retrieved December 4, 2024,
Arnold J (2024). ggthemes: Extra Themes, Scales and Geoms for ‘ggplot2’. R package version 5.1.0, https://github.com/jrnold/ggthemes, https://jrnold.github.io/ggthemes/.
Auguie B (2017). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.3, https://CRAN.R-project.org/package=gridExtra.
Bache S, Wickham H (2022). magrittr: A Forward-Pipe Operator for R. R package version 2.0.3, https://CRAN.R-project.org/package=magrittr.
Dancho M, Vaughan D (2023). tidyquant: Tidy Quantitative Financial Analysis. R package version 1.0.7, https://github.com/business-science/tidyquant.
Kunst J (2022). highcharter: A Wrapper for the ‘Highcharts’ Library. R package version 0.9.4, https://CRAN.R-project.org/package=highcharter.
Neuwirth E (2022). RColorBrewer: ColorBrewer Palettes. R package version 1.1-3, https://CRAN.R-project.org/package=RColorBrewer.
Posit team (2024). RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA. URL http://www.posit.co/.
R Core Team (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Rinker T, Kurkiewicz D (2017). pacman: Package Management for R. Version 0.5.0, Buffalo, New York, http://github.com/trinker/pacman.
Vanderkam D, Allaire J, Owen J, Gromer D, Thieurmel B (2018). dygraphs: Interface to ‘Dygraphs’ Interactive Time Series Charting Library. R package version 1.1.1.6, https://CRAN.R-project.org/package=dygraphs.
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.
Xie Y (2024). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.48, https://yihui.org/knitr/.
Xie Y (2015). Dynamic Documents with R and knitr, 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963.
Xie Y (2014). “knitr: A Comprehensive Tool for Reproducible Research in R.” In Stodden V, Leisch F, Peng RD (eds.), Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595.
Zhu H (2024). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.4.0, https://github.com/haozhu233/kableExtra, http://haozhu233.github.io/kableExtra/.