MGM Label Data Analysis


By: Jonah Srulevich, Michaela Fry, Oliver Loewengart, Elijah Elster, Eoghan Daly, Jack Cavanagh

Date: 2023-12-12


Which artists, genres, and styles of music will increase MGM’s total music listening and, thereby, its sales?


Top Artists by Frequency on Spotify Top 50 Global Chart


Spotify’s Top 50 Global Song chart in 2020 is displayed here.

Travis Scott and Doja Cat have the most hits in the global top 50.


How does this compare with number-one hits by country?


Number of Top 50 Songs by Artist per Continent

Countries with DaBaby as Artist With Top Song

Countries with Bad Bunny as Artist With Top Song

Bad Bunny Named Top International Artist of 2023

Countries with The Weeknd as Artist With Top Song

Top 6 Artists by Geolocation 2020


What are the optimal values for each technical music element of a song by genre? What specific elements are most important to success?


Average Values for Relevant Variables in Spotify 2023 dataset

Average Values for Relevant Variables in Hot 100 dataset

Radar Plot of Hot 100 Song Variables

Calculating Weighted Averages

[1] 41.23555
Weighted Average for danceability: 0.6151191 
Weighted Average for valence: 0.5733629 
Weighted Average for acousticness: 0.2482113 
Weighted Average for energy: 0.6370948 
Weighted Average for liveness: 0.1857829 
Weighted Average for speechiness: 0.08313874 

Radar Plot of Hot 100 Weighted Averages

Weighted Average for danceability: 65.28485 
Weighted Average for valence: 50.35057 
Weighted Average for acousticness: 26.95035 
Weighted Average for energy: 63.79905 
Weighted Average for liveness: 17.48395 
Weighted Average for speechiness: 8.911907 
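
The report does not state which variable was used as the weight. As a minimal sketch, assuming each song's attribute value is weighted by a popularity score (an assumption, not confirmed by the output above), a weighted average could be computed as follows. The sample rows and the `weighted_average` helper are hypothetical, and the sketch is written in Python for illustration rather than in the R used for the analysis.

```python
def weighted_average(values, weights):
    """Return sum(w * v) / sum(w); the weights must not sum to zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical rows of (danceability, popularity); not taken from the dataset.
songs = [(0.80, 90), (0.60, 50), (0.40, 10)]
danceability = [d for d, _ in songs]
popularity = [p for _, p in songs]

plain = sum(danceability) / len(danceability)
weighted = weighted_average(danceability, popularity)

print(f"Plain average for danceability: {plain:.4f}")
print(f"Weighted average for danceability: {weighted:.4f}")
```

In this toy data the most popular song is also the most danceable, so the weighted average sits above the plain mean. That is the effect weighting is meant to capture: attributes of popular songs count for more.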

Models Showing These Variables' Effects on Spotify Track Popularity


Call:
lm(formula = spotify_track_popularity ~ danceability + valence + 
    acousticness + energy + liveness + speechiness, data = Hot100filtered)

Residuals:
    Min      1Q  Median      3Q     Max 
-68.481 -15.043   0.207  14.593  73.841 

Coefficients:
             Estimate Std. Error t value            Pr(>|t|)    
(Intercept)   36.6717     0.8396  43.676 <0.0000000000000002 ***
danceability  29.9664     0.9950  30.118 <0.0000000000000002 ***
valence      -34.6758     0.6310 -54.957 <0.0000000000000002 ***
acousticness -13.6588     0.5898 -23.158 <0.0000000000000002 ***
energy        17.6033     0.8615  20.432 <0.0000000000000002 ***
liveness      -7.8490     0.8232  -9.535 <0.0000000000000002 ***
speechiness   28.5932     1.6220  17.628 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.81 on 24323 degrees of freedom
Multiple R-squared:  0.223, Adjusted R-squared:  0.2228 
F-statistic:  1163 on 6 and 24323 DF,  p-value: < 0.00000000000000022

Call:
lm(formula = percentrank ~ danceability_. + valence_. + acousticness_. + 
    energy_. + liveness_. + speechiness_., data = Spotify2023_w_per)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.57639 -0.24154  0.00771  0.24394  0.61452 

Coefficients:
                 Estimate Std. Error t value             Pr(>|t|)    
(Intercept)     0.7694026  0.0704776  10.917 < 0.0000000000000002 ***
danceability_. -0.0017713  0.0007302  -2.426             0.015465 *  
valence_.       0.0002610  0.0004684   0.557             0.577509    
acousticness_. -0.0011444  0.0004546  -2.517             0.011990 *  
energy_.       -0.0012116  0.0007479  -1.620             0.105557    
liveness_.     -0.0012121  0.0006833  -1.774             0.076411 .  
speechiness_.  -0.0032265  0.0009511  -3.392             0.000722 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2855 on 945 degrees of freedom
Multiple R-squared:  0.02916,   Adjusted R-squared:  0.023 
F-statistic: 4.731 on 6 and 945 DF,  p-value: 0.00009292
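
To make the first model's coefficients concrete, the sketch below plugs the Hot 100 weighted averages reported earlier into the fitted equation. It is illustrative Python rather than the report's R code, and the `predict_popularity` helper is hypothetical. Because the model's R-squared is only about 0.22, the result is a rough central estimate, not a reliable forecast for any single song.

```python
# Coefficients copied from the Hot 100 lm() summary above.
INTERCEPT = 36.6717
COEFFICIENTS = {
    "danceability": 29.9664,
    "valence": -34.6758,
    "acousticness": -13.6588,
    "energy": 17.6033,
    "liveness": -7.8490,
    "speechiness": 28.5932,
}

# Hot 100 weighted averages reported in the weighted-average output (0-1 scale).
PROFILE = {
    "danceability": 0.6151191,
    "valence": 0.5733629,
    "acousticness": 0.2482113,
    "energy": 0.6370948,
    "liveness": 0.1857829,
    "speechiness": 0.08313874,
}


def predict_popularity(features):
    """Evaluate the fitted linear equation for one set of attribute values."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )


baseline = predict_popularity(PROFILE)

# Raise danceability by 0.1 while holding every other attribute fixed.
more_danceable = dict(PROFILE, danceability=PROFILE["danceability"] + 0.1)
delta = predict_popularity(more_danceable) - baseline

print(f"Predicted popularity at the weighted averages: {baseline:.2f}")
print(f"Change from +0.1 danceability: {delta:+.2f}")
```

The baseline comes out at roughly 44 on the popularity scale, and a 0.1 increase in danceability adds about 3 points. That is one tenth of the danceability coefficient, as expected for a linear model.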

Song Attributes Over Time 1


How do song attributes correlate with the time of year a song was released?


Spotify’s 2023 Song chart is displayed here.


Danceability rises in hotter months and falls in the winter.

Energy alternates between rising and falling from one quarter to the next.


Song Attributes Over Time 2


BPM is higher than in past years. In recent years, it peaks in the months leading up to summer.

Acousticness reaches its lowest points during the fall.


Spotify Stock Price Today:

Spotify Stock


Music Preferences Shift With Seasons


Monthly Total Song Streams



Visualizations of monthly song statistics from the spotify-2023.csv dataset.

Based on total streams per month in the Spotify 2023 dataset, January initially appeared to be the best month to release a song because it had the highest total. However, the average number of streams per song by month showed no clear relationship between release month and popularity.
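
The difference between the two measures can be sketched as follows. The rows are hypothetical and are not taken from spotify-2023.csv; they only illustrate how a month with many releases can lead in total streams without leading in streams per song. The sketch is in Python for illustration.

```python
from collections import defaultdict

# Hypothetical (release_month, streams) rows; not real data.
rows = [
    (1, 500), (1, 300), (1, 400), (1, 200),  # many January releases
    (6, 700), (6, 500),
    (11, 600),
]

totals = defaultdict(int)
counts = defaultdict(int)
for month, streams in rows:
    totals[month] += streams
    counts[month] += 1

# Average streams per song removes the effect of how many songs a month has.
averages = {month: totals[month] / counts[month] for month in totals}

for month in sorted(totals):
    print(
        f"month {month:2d}: total={totals[month]:5d} "
        f"songs={counts[month]} average={averages[month]:.1f}"
    )
```

In this toy example, January has the largest total only because it has the most songs. Its per-song average is the lowest of the three months.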

Is there an ideal time and genre to release a song to maximize its popularity?


Spotify Song Genre Popularity


The three most popular song genres in 2023 were Pop, R&B, and Reggae. Genres were determined by scraping data from multiple music databases.



Song Length Per Year


How has the duration of popular songs changed over the years?


The maximum duration among the top 30 popular songs dropped from 308 seconds in 2010 to 261 seconds in 2019. Average song duration declined steadily from 2010 to 2019.


Songs Under 3 Minutes

Popular Songs Under 3 Minutes by Year
YEAR COUNT
2010 0
2011 0
2012 1
2013 1
2014 3
2015 0
2016 1
2017 0
2018 1
2019 6

How many songs in the top 30 popular songs each year are under 3 minutes?


Across the nine years from 2010 to 2018, a total of 7 songs in the top 30 were under 3 minutes. In 2019 alone, there were 6 such songs, up from 1 in 2018. That is a 500% increase over the previous year.
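
Using the counts in the table above, the 2010-2018 total and the change from 2018 to 2019 can be checked with a short calculation. This is an illustrative Python sketch, and the `percent_change` helper is hypothetical.

```python
# Counts copied from the "Popular Songs Under 3 Minutes by Year" table.
under_three_minutes = {
    2010: 0, 2011: 0, 2012: 1, 2013: 1, 2014: 3,
    2015: 0, 2016: 1, 2017: 0, 2018: 1, 2019: 6,
}


def percent_change(old, new):
    """Percent change from old to new; undefined when old is zero."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100


total_2010_2018 = sum(
    count for year, count in under_three_minutes.items() if year <= 2018
)
change = percent_change(under_three_minutes[2018], under_three_minutes[2019])

print(f"Songs under 3 minutes, 2010-2018: {total_2010_2018}")
print(f"Change from 2018 to 2019: {change:.0f}%")
```

Going from 1 song to 6 is a sixfold count, which corresponds to a 500% increase.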


Citations

Chosic. “Music Genre Finder.” Chosic, n.d., https://www.chosic.com/music-genre-finder/.

Pebesma, E., & Bivand, R. (2023). Spatial Data Science: With Applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016

R Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

Posit team (2023). RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA. URL http://www.posit.co/.

Slowikowski K (2023). ggrepel: Automatically Position Non-Overlapping Text Labels with ‘ggplot2’. R package version 0.9.4, https://CRAN.R-project.org/package=ggrepel.

Grolemund G, Wickham H (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL https://www.jstatsoft.org/v40/i03/.

Massicotte P, South A (2023). rnaturalearth: World Map Data from Natural Earth. R package version 0.3.4, https://CRAN.R-project.org/package=rnaturalearth.

South A (2017). rnaturalearthdata: World Vector Map Data from Natural Earth Used in ‘rnaturalearth’. R package version 0.1.0, https://CRAN.R-project.org/package=rnaturalearthdata.

Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.