Introduction

Each genre comes from a distinctly different musical background, whether that means a different part of the world or starkly different themes and styles. Sleep music is characterized by its sedative qualities, which typically include a slow tempo, repetitive rhythm, gentle contours, and strings. Turkish music, on the other hand, is typically defined by its region-specific origins and is commonly played with a wide range of traditional instruments native to Turkey and the surrounding areas. Reggae music is known for its slow and steady tempo, offbeat rhythms, and socially conscious lyrics about politics, community, and positivity. In contrast, Anime music consists of songs from Anime TV shows, often sung from the perspective of the characters, with a style that depends on the type of show.

These genres were picked primarily because each is unique in its origin and in its musical characteristics. As we observed songs in each genre, we looked specifically at the popularity of each song, guided by the question: what traits do successful, modern songs in specific genres tend to share, and are these patterns the same across a diverse set of music genres?

The Dataset

Two datasets were merged in this study. The first dataset was found on Kaggle, where it is titled “Spotify Tracks Dataset”. We renamed this to “modern_songs” in our code. The data was collected using Spotify’s Web API. This dataset was used because of its wide range of music metadata variables and its size of 114,000 songs. The second dataset was also found on Kaggle and is titled “Spotify 1.2M+ Songs”. This was renamed to “song_dates” in our code. It was created by downloading the entire MusicBrainz catalog, a public, open music encyclopedia that collects music metadata. From there, the creator of the dataset queried each album’s Universal Product Code via the Spotify API and then obtained the tracks for each album found, again via the Spotify API. This dataset was used to add each song’s release date to the songs listed in the first dataset. After merging the datasets, 10,366 songs remained, each with the variables of the first dataset and the release date from the second dataset.
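The merge described above behaves like an inner join: only songs that appear in both datasets survive. The following is a minimal Python sketch of that idea, not the code used in this study. The join key `track_id`, the field name `release_date`, and the sample rows are hypothetical, since the report does not state which column the two datasets were matched on.

```python
def inner_join(modern_songs, song_dates, key="track_id"):
    """Keep only songs present in both datasets and attach the release date.

    The key name "track_id" is an assumption made for illustration.
    """
    dates_by_id = {row[key]: row["release_date"] for row in song_dates}
    return [
        {**song, "release_date": dates_by_id[song[key]]}
        for song in modern_songs
        if song[key] in dates_by_id
    ]


# Hypothetical sample rows, not taken from either Kaggle dataset.
modern_songs = [
    {"track_id": "a1", "genre": "reggae", "popularity": 48},
    {"track_id": "b2", "genre": "sleep", "popularity": 31},
]
song_dates = [
    {"track_id": "a1", "release_date": "2012-06-01"},
    {"track_id": "z9", "release_date": "1998-03-15"},
]

merged = inner_join(modern_songs, song_dates)
print(merged)
```

Only the song with id `a1` appears in both inputs, so it is the only row kept. This is why a merge of this kind can shrink a large dataset considerably.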

The popularity value that we used is based on a formula, described by the creator of our first dataset, that takes into account both the number of listens a song has and how recent those listens are. Because of this, older songs tend to have lower popularity values than newer songs with the same number of listens, solely because their listens are older. The variable ranges from 0 to 100.

All songs used in our study were filtered to have a popularity score greater than 25. This was done because the data contained a large number of songs with popularity values at or slightly above zero, which greatly disrupted the calculations and visualizations. In addition, all songs were filtered to have a release date in or after the year 2000. This was done to ensure that the patterns we analyzed were relatively modern, because our goal was to analyze the patterns of popularity for modern songs rather than historical trends.
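The two filters can be expressed as a single predicate. This is a minimal Python sketch under assumed field names (`popularity`, `year`) and made-up rows. It is not the code used in this study.

```python
def keep_song(song, min_popularity=25, min_year=2000):
    """True if popularity is strictly above the cutoff and the release year is min_year or later."""
    return song["popularity"] > min_popularity and song["year"] >= min_year


# Hypothetical rows chosen to exercise each boundary of the filter.
songs = [
    {"track": "A", "popularity": 0, "year": 2015},   # dropped: popularity too low
    {"track": "B", "popularity": 25, "year": 2010},  # dropped: 25 is not greater than 25
    {"track": "C", "popularity": 60, "year": 1995},  # dropped: released before 2000
    {"track": "D", "popularity": 41, "year": 2000},  # kept
]

filtered = [song for song in songs if keep_song(song)]
print([song["track"] for song in filtered])  # ['D']
```

Note that the popularity cutoff is strict (greater than 25) while the year cutoff is inclusive (2000 or later), matching the description above.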

Analysis

The visualization above is a correlation heatmap. In a correlation heatmap, each box corresponds to the relationship between the variables listed beside and below the box, and the color of the box indicates the strength of the correlation between those two variables. In the context of our project, each heatmap represents the relationships between the different variables of songs in one genre. The correlation coefficient is the Pearson correlation coefficient, so it measures the linear relationship between variables. The darker red the box is, the stronger and more positive the linear relationship between the two variables. The darker blue the box is, the stronger and more negative the linear relationship. Since we are looking at how different aspects of a song relate to its popularity, for now we only care about the relationship between popularity and the other variables (the bottom row or leftmost column of each heatmap).
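For reference, the Pearson correlation coefficient behind each box of a heatmap can be computed as in the following minimal Python sketch. The function name and the sample values are illustrative assumptions, not the code or data used in this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up values: popularity falls by the same amount for every step in tempo,
# so the coefficient is -1 up to floating-point rounding.
tempo = [60, 80, 100, 120]
popularity = [70, 60, 50, 40]
print(round(pearson(tempo, popularity), 6))
```

A coefficient near -1 or 1 indicates that the points lie close to a straight line. It says nothing about how steep that line is.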

Anime

The top left heatmap contains the relationship between popularity and different aspects of anime songs with a popularity score greater than 25. The strongest correlation between popularity and another variable is its relationship with tempo. It has a correlation coefficient of about -0.4044, meaning that the linear relationship between popularity and tempo of popular anime songs (popularity > 25) is moderate and negative: as the tempo of a song decreases, its popularity tends to increase.

Strongest two correlations of Anime song popularity with other variables:

      tempo speechiness 
 -0.4043568  -0.3951596 

Reggae

The top right heatmap contains the relationship between popularity and different aspects of reggae songs with a popularity score greater than 25. The strongest correlation between popularity and an aspect of the songs is its relationship with energy. It has a correlation coefficient of about -0.5347, meaning that the linear relationship between popularity and energy of popular reggae songs is moderately strong and negative: as the energy of a song decreases, its popularity tends to increase.

Strongest two correlations of Reggae song popularity with other variables:

    energy    valence 
-0.5347477 -0.4619058 

Sleep

The bottom left heatmap contains the relationship between popularity and different aspects of sleep tracks with a popularity score greater than 25. The strongest correlation between popularity and an aspect of the tracks is its relationship with tempo. It has a correlation coefficient of about -0.5052, meaning that the linear relationship between popularity and tempo of popular sleep tracks is moderately strong and negative: as the tempo of a track decreases, its popularity tends to increase.

Strongest two correlations of Sleep song popularity with other variables:

     tempo   liveness 
-0.5051951  0.4273387 

Turkish

The bottom right heatmap contains the relationship between popularity and different aspects of Turkish songs with a popularity score greater than 25. The strongest correlation between popularity and an aspect of the songs is its relationship with speechiness. It has a correlation coefficient of about -0.5595, meaning that the linear relationship between popularity and speechiness of popular Turkish songs is moderately strong and negative: as the speechiness of a song decreases, its popularity tends to increase.

Strongest two correlations of Turkish song popularity with other variables:

speechiness         key 
 -0.5594994   0.3619457 

Scatterplots

These heatmaps were created to compare, within each genre, how popularity relates to the other variables describing a song. After finding the two variables with the strongest correlations with popularity in each genre, we created scatterplots in order to visually analyze how the songs are distributed across these variables.

The visualization above contains the scatterplot of tempo vs. popularity for popular Anime songs (popularity > 25). In addition, the size of each data point is determined by the speechiness of the song, so as the speechiness increases, the size of the point increases. The tempo of a song describes how fast the song is moving, so a tempo of 150 means that the song moves at a pace of 150 beats per minute. The speechiness variable describes the density of spoken words in a track and is measured on a scale from 0.0 to 1.0. Tracks with speechiness values above 0.66 are most likely made up almost entirely of spoken words, while tracks with lower values consist mostly of music. Looking at the scatterplot above, the relationship between the tempo and popularity of an Anime song appears fairly linear, negative, and moderate in strength. As displayed before in the heatmap, the correlation coefficient of the relationship is approximately -0.4044, agreeing with this reading. One plausible explanation, based on our background knowledge, is that Anime songs tend to match the emotion of the characters or the scene in which they play, and Anime shows are often dramatic and include very sad scenes, which slower songs suit. The speechiness, however, does not appear to have a distinct relationship with popularity. In terms of unusual features, the most distinct outliers appear to be the two high-speechiness songs at approximately 180 bpm with popularity between 60 and 70. Those two songs are ALIVE by ClariS and Crossing Field by LiSA.

The visualization above contains the scatterplot of energy vs. popularity for popular Reggae songs (popularity > 25). In addition, the size of each data point is determined by the valence of the song, so as the valence increases, the size of the point increases. The energy of a song is a value from 0.0 to 1.0 that represents a measure of intensity and activity in the song. Songs with higher energy values feel fast, loud, and noisy compared to songs with lower energy values. The valence of a song is a value from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). Looking at the scatterplot above, the relationship between the energy and popularity of a Reggae song appears fairly linear, negative, and moderately strong. As displayed before in the heatmap, the correlation coefficient of the relationship is approximately -0.5347, agreeing with this reading. This is consistent with what we learned from our research: Reggae songs are not very intense and fast, but rather have slower, more relaxed tones. The valence appears to be very high across nearly all of the data points in the scatterplot. This makes sense because, while reggae songs may not be very intense, they tend to have very positive and happy tones behind them. Note that this describes the typical level of valence rather than its relationship with popularity; the heatmap showed a negative correlation of about -0.4619 between valence and popularity. In terms of unusual features, the most distinct outliers appear to be the three songs with low energy and high popularity, which, unlike most of the genre, have very low valence (all less than 0.25). Those three songs are Shower by Becky G, DÁKITI by Bad Bunny, and MIA by Bad Bunny.

The visualization above contains the scatterplot of tempo vs. popularity for popular sleep songs (popularity > 25). In addition, the size of each data point is determined by the liveness of the song, so as the liveness increases, the size of the point increases. Liveness detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live, and a value above 0.8 indicates a strong likelihood that the track is live. Looking at the scatterplot above, the relationship between the tempo and popularity of a sleep song appears fairly linear, negative, and moderately strong. As displayed before in the heatmap, the correlation coefficient of the relationship is approximately -0.5052, agreeing with this reading. This makes sense because slower tempos give a song a more calming and relaxing effect, and the intent of a sleep song is to put one to sleep. It is important to notice, however, that when looking at the distribution of tempo across sleep songs in a box plot (shown right), there is very little variation in the tempos. The median tempo is 72.92 bpm, and the IQR is only 6.41 bpm. The liveness tends to be fairly high across all sleep songs. This is surprising, because sleep songs are not a genre that tends to be performed live for entertainment. In terms of unusual features, the most distinct outliers appear to be the 2 songs in the upper left corner of the graph, with tempos above 125 bpm. These two songs are Blood Moon by Carufo and Aurora by The Destination. Upon listening, these songs do not seem to have a high bpm, but it is possible that there is a “hidden” aspect of the song with a very high tempo that may help people fall asleep.

The visualization above contains the scatterplot of log(speechiness) vs. popularity for popular Turkish songs (popularity > 25). In addition, the size of each data point is determined by the key of the song, so as the key value increases, the size of the point increases. The key variable indicates the key that the track is in, where integers map to pitches using standard Pitch Class notation (e.g. 0 = C, 1 = C♯/D♭, 2 = D). If no key was detected, the value is -1. Looking at the scatterplot above, the relationship between the log of the speechiness and the popularity of a Turkish song appears fairly linear, negative, and moderately strong. One possible explanation is that songs with less prominent speech or vocal elements place a greater emphasis on instrumental and melodic features. The key, however, does not appear to have a distinct relationship with popularity. In terms of unusual features, the most distinct outlier appears to be the song with a very high popularity score of 70. This song is Rockstar by Ilkay Sencan and Dynoro, a Turkish remix of the very popular song Rockstar by Post Malone.

Conclusion

These scatterplots provide valuable insights into how popular songs within each genre reflect distinct patterns of popularity across a range of music metadata variables. For example, in the Anime, Reggae, and Sleep genres, popularity showed a negative linear relationship with tempo, energy, and tempo (respectively). This suggests that in these genres, slower and less intense songs tend to be more successful. Turkish music, however, presents an interesting trend that differs from the others: when plotted directly, speechiness vs. popularity forms a parabolic pattern, in which both low and high popularity values correspond to high speechiness values whereas mid-range popularity values correspond to lower speechiness values. With the log transformation, however, the relationship becomes a clearer negative linear correlation. This suggests that, on the log scale, the more popular Turkish songs tend to have less speech-heavy elements. Overall, these varying patterns highlight that the factors associated with popularity are highly genre-dependent.

Testing

Linear Regression

We began our testing by running, for each genre, a simple linear regression of popularity on the variable that had the strongest correlation with popularity.

Genre    Strongest Variable  P-Value    R-Squared
Anime    tempo               0.0012130  0.1635
Reggae   energy              0.0001103  0.2860
Sleep    tempo               0.0001250  0.2552
Turkish  speechiness         0.0000217  0.3130

The obtained p-value for the simple linear regression of popularity on tempo in Anime songs was 0.0012130, a value well below our predetermined benchmark of 0.05. Because of this, we are able to reject the null hypothesis that there is no linear relationship between popularity and tempo in Anime songs. Together with the negative correlation found earlier, this result indicates that an increase in tempo is associated with lower popularity for an Anime song. These results confirm the relationship that was shown in the scatterplot of popularity and tempo for Anime songs.

Additional simple linear regression tests were done on the Reggae, Sleep, and Turkish songs, looking at the relationships between energy and popularity, tempo and popularity, and speechiness and popularity (respectively). The resulting p-values were 0.0001103, 0.0001250, and 0.0000217 (respectively). All of these p-values fell under our predetermined benchmark of 0.05, which allowed us to reject the three null hypotheses that there is no linear relationship between popularity and energy, tempo, or speechiness (depending on the genre being tested). These findings indicate that in the Reggae genre, an increase in energy is associated with lower popularity; in the Sleep genre, an increase in tempo is associated with lower popularity; and in Turkish music, an increase in speechiness is associated with lower popularity. These results for each genre confirm the relationship that was shown in the respective scatterplot of popularity versus energy, tempo, or speechiness.

While the calculated p-values all allow us to reject the null hypotheses that there is no relationship between popularity and the strongest variable in each genre, it is also important to note that all of the calculated R-squared values above are low. The R-squared value tells us how much of the variability in the response variable, in this case popularity, can be explained by the model. For example, the Anime model explains only about 16% of the variability in popularity. We will address this issue later in the testing process.
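The R-squared value of a simple linear regression can be computed directly from the fitted line, as in the following minimal Python sketch. The function name and the five data points are made up for illustration. This is not the code or data used in this study, and it does not compute p-values.

```python
def simple_linear_regression(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (unexplained variation) / (total variation).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical tempo (bpm) and popularity values.
tempo = [60, 80, 100, 120, 140]
popularity = [62, 50, 55, 38, 35]

slope, intercept, r_squared = simple_linear_regression(tempo, popularity)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

For a single predictor, R-squared equals the square of the Pearson correlation coefficient, which is why a correlation of about -0.40 corresponds to an R-squared of only about 0.16.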

Multiple Linear Regression

We then ran, for each genre, a multiple linear regression of popularity on the two variables most strongly correlated with popularity.

Genre    Strongest Variable  Second Strongest Variable  P-Value    Adjusted R-Squared
Anime    tempo               speechiness                0.0009839  0.1861
Reggae   energy              valence                    0.0003214  0.2729
Sleep    tempo               liveness                   0.0000061  0.3682
Turkish  speechiness         key                        0.0000004  0.4468

A multiple linear regression was conducted on Anime songs, with popularity as the dependent variable and tempo and speechiness as the independent variables. The p-value of 0.0009839 was well below our predetermined benchmark of 0.05. This indicates that we can reject the null hypothesis that there is no relationship between the popularity of an Anime song and the variables of speechiness and tempo. Therefore, there is evidence that, when considered together, speechiness and tempo are related to the popularity of an Anime song, and the correlations found earlier indicate that both relationships are negative. This evidence is consistent with the negative relationship seen in the scatterplot of the Anime songs.

Similarly, three additional multiple linear regression analyses were conducted on the Reggae, Sleep, and Turkish songs with popularity as the dependent variable and the following independent variables: energy and valence for Reggae, tempo and liveness for Sleep, and speechiness and key for Turkish music. The resulting p-values were 0.0003214, 0.0000061, and 0.0000004 (respectively). All of these values fall under our predetermined benchmark of 0.05, which indicates that we can reject the null hypothesis for each regression model, namely that, when considered together, these predictor variables are not related to a song’s popularity in the respective genre. The overall p-value does not by itself give the direction of each relationship; for that we refer to the correlations found earlier. Energy and valence are both negatively correlated with the popularity of Reggae songs; for Sleep songs, tempo is negatively and liveness positively correlated with popularity; and for Turkish songs, speechiness is negatively and key positively correlated with popularity. These results are consistent with the relationships shown in the scatterplots of the Reggae, Sleep, and Turkish genres.

Again, it is important to note the low adjusted R-squared values. This problem will be addressed in the next and final set of tests.
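Adjusted R-squared differs from plain R-squared by penalizing the model for each additional predictor, which is why it is the value reported for the multiple regression models. The standard formula can be sketched in Python as follows. The function name and the input values are illustrative assumptions and are not taken from the models in this report.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_observations - 1) / degrees_of_freedom


# Hypothetical model: R-squared of 0.5 from 11 songs and 2 predictors.
print(adjusted_r_squared(0.5, n_observations=11, n_predictors=2))  # 0.375
```

Because of the penalty, adding a second predictor can lower the adjusted value even when the plain R-squared rises slightly, so the adjusted values in a two-predictor table are not directly comparable to the plain R-squared values of single-predictor models.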

Finally, within ALL genres in the dataset, we ran a simple or multiple linear regression of popularity on every variable whose correlation with popularity had an absolute value greater than or equal to 0.5.

✅ Running model for genre: children | Vars: duration_ms, energy, loudness, acousticness, year
✅ Running model for genre: folk | Vars: danceability
✅ Running model for genre: funk | Vars: speechiness, acousticness
✅ Running model for genre: hardcore | Vars: danceability, energy, acousticness, year
✅ Running model for genre: punk-rock | Vars: mode
✅ Running model for genre: reggae | Vars: energy
✅ Running model for genre: salsa | Vars: year
✅ Running model for genre: sleep | Vars: tempo
✅ Running model for genre: trance | Vars: duration_ms, instrumentalness, tempo
✅ Running model for genre: turkish | Vars: speechiness
Genre      Number of Variables (Excluding Popularity)  P-Value    Adjusted R-Squared  HC P-Value
children   5                                           0.0000000  0.6255              0.0000002
folk       1                                           0.0000000  0.4798              0.6228000
funk       2                                           0.0006304  0.3777              0.0014410
hardcore   4                                           0.0000012  0.3938              0.0000008
punk-rock  1                                           0.0005594  0.2587              0.0223000
reggae     1                                           0.0002206  0.2685              0.4523000
salsa      1                                           0.0015240  0.2727              0.0126300
sleep      1                                           0.0002500  0.2390              NA
trance     3                                           0.0000000  0.4385              0.0000000
turkish    1                                           0.0000434  0.2978              NA

Looking at the results of the tests above, we now see higher adjusted R-squared values for some genres. The most notable are the children’s music genre and the folk music genre, with adjusted R-squared values of 0.6255 and 0.4798 (respectively). The variables in the children’s music genre whose correlation with popularity met the 0.5 threshold were the duration, energy, loudness, and acousticness of the song, and the year that the song was released. In contrast, the only variable in the folk music genre that met the threshold was danceability. The p-values for the two genres both round to 0.0000000, below the threshold of 0.05, allowing us to reject both null hypotheses that there is no relationship between popularity in each genre and the variables listed above. These tests demonstrate that different genres of music have starkly different components associated with the popularity of their songs.

However, there is now a new calculated column in the table: the Harvey-Collier (HC) p-value. The HC test is a test of linearity, where the null hypothesis states that the relationship between the predictors and the response is linear. For most of the genres in this table, the p-value is low enough (< 0.05) to reject this null hypothesis. The exceptions are folk and reggae, whose p-values are above 0.05, and sleep and turkish, for which no HC p-value was obtained (NA). This means that, for the remaining genres, we have enough evidence to conclude that the variables used in the testing of those genres do not have a linear relationship with popularity. To examine this further, we took the model for children’s songs and ran multiple polynomial regression for different degrees. We used the same five variables used in the table above (the duration, energy, loudness, and acousticness of the song, and the year that the song was released).

Multiple Polynomial Regression

Polynomial Degree  Adjusted R-Squared  P-Value
1                  0.6254674           0
2                  0.7391847           0
3                  0.7634318           0
4                  0.7919307           0
5                  0.8118535           0

In the table above, we can see that as the degree of the polynomial increases, the adjusted R-squared value also increases, from about 0.63 at degree 1 to about 0.81 at degree 5. If we were to train a model and test its accuracy with a train-test split, we would be wary of overfitting, and would likely not use the highest-degree polynomial despite its higher R-squared value.
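The overfitting concern can be illustrated with a small train-test experiment. The Python sketch below is a simplified, hedged illustration: it fits polynomials in a single synthetic predictor (the models above use five predictors), solves the normal equations with a hand-written Gaussian elimination, and reports R-squared on both the training and the held-out data. All names and data are assumptions made for illustration.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y] for the monomial basis 1, x, x^2, ...
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (a[i][m] - tail) / a[i][i]
    return coeffs


def r_squared(coeffs, xs, ys):
    """Share of the variation in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predictions = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data: a quadratic trend plus noise, split 40 / 20 into train and test.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(60)]
ys = [50 - 20 * x * x + rng.gauss(0, 5) for x in xs]
train_x, test_x = xs[:40], xs[40:]
train_y, test_y = ys[:40], ys[40:]

results = {}
for degree in range(1, 6):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (r_squared(coeffs, train_x, train_y),
                       r_squared(coeffs, test_x, test_y))
    print(degree, round(results[degree][0], 3), round(results[degree][1], 3))
```

The training R-squared can only stay the same or rise as the degree grows, because each higher-degree model contains the lower-degree ones. The held-out R-squared carries no such guarantee, which is the reason a train-test split is a more honest guide than the in-sample fit when choosing a degree.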

Final Conclusion

It is important to emphasize that our analysis does not aim to make predictions or develop a predictive model. Rather, our use of the data was strictly to uncover inferential trends and generalizations. In terms of extending this research, it would be valuable to incorporate additional datasets that include the same songs but with a more comprehensive range of variables. For instance, having access to revenue-related data could help attach a more tangible and quantifiable measure to the abstract concept of popularity. Speechiness, key, tempo, energy, valence, and liveness are only a few of the many components that constitute a song. Thus, further exploration with a broader range of variables would be necessary in order to find more patterns that could support inferential conclusions about popularity within each genre of music.

If one chose to pursue predictive analysis rather than statistical inference, using polynomial regression in each of the genres would be essential. The testing would involve regressing popularity on every variable for each genre at varying polynomial degrees, which would better meet the conditions for predictive modeling. Predictive modeling could then produce a formula for estimating, and ultimately maximizing, the popularity of a song from a range of variables in its genre. Ultimately, whether through inferential analysis or predictive modeling, the richness of musical data holds immense potential for uncovering the complex dynamics behind a song’s popularity.