SQL Analysis of Spotify Song Dataset

This SQL project explores a rich dataset of songs that includes a wide range of acoustic characteristics such as danceability, energy, valence, tempo, loudness, and instrumentalness, among others. These attributes provide valuable insights into the sonic qualities and emotional tone of music. By using SQLite for data analysis, this project demonstrates how structured queries can be used to uncover patterns in musical features, compare characteristics across different artists, and identify standout tracks based on specific traits. Whether you’re interested in finding the most danceable songs, comparing tempo across time signatures, or discovering artists with high-energy music, this project highlights the power of SQL in audio data exploration.


Idea 1: Find the Top 5 Most Danceable Songs

Why it’s interesting: Songs with the highest danceability scores are great for upbeat playlists or dance compilations.

SELECT song_title, artist, danceability
FROM user
ORDER BY danceability DESC
LIMIT 5;

Output:

song_title artist danceability
Flashwind - Radio Edit Ben Remember 0.984
SexyBack Justin Timberlake 0.967
Check Me Out Like Blaqstarr 0.962
Best Friend Young Thug 0.959
Ice Ice Baby Vanilla Ice 0.959

Idea 2: Compare Average Loudness by Mode (Major/Minor)

Why it’s interesting: This shows whether major or minor songs tend to be louder.

SELECT mode, AVG(loudness) AS avg_loudness
FROM user
GROUP BY mode;

Output:

mode avg_loudness
0 -6.87
1 -7.21

Idea 3: Find Artists with the Most High-Energy Songs

Why it’s interesting: Shows which artists produce the most energetic music (energy > 0.8).

SELECT artist, COUNT(*) AS high_energy_songs
FROM user
WHERE energy > 0.8
GROUP BY artist
ORDER BY high_energy_songs DESC
LIMIT 5;

Output:

artist high_energy_songs
FIDLAR 9
Fall Out Boy 8
Skrillex 7
WALK THE MOON 6
*NSYNC 6

Idea 4: Find the Top 5 Most “Valence” Songs (Happiest Songs)

Why it’s interesting: Songs with high valence tend to sound more cheerful.

SELECT song_title, artist, valence
FROM user
ORDER BY valence DESC
LIMIT 5;

Output:

song_title artist valence
Abataka - Original Mix Ida Engberg 0.992
I’m Walkin’ - 2002 Digital Remaster Fats Domino 0.975
To Roz Bikini (Itsy, Bitsy, Teenie, Weenie) Polina 0.974
Azon de ma gnin kpevi T. P. Orchestre Poly-Rythmo 0.973
Look at You George McCrae 0.973

Idea 5: Analyze the Average Tempo of Songs by Time Signature

Why it’s interesting: Highlights how rhythmic structures (time signatures) correlate with tempo.

SELECT time_signature, AVG(tempo) AS avg_tempo, COUNT(*) AS song_count
FROM user
GROUP BY time_signature
ORDER BY avg_tempo DESC;

Output:

time_signature avg_tempo song_count
3 130.181387096774 93
4 121.21439873083 1891
5 120.36003125 32
1 98.981 1

Idea 6: Find Artists with the Most “Instrumental” Songs

Why it’s interesting: Reveals artists who favor instrumental compositions (instrumentalness > 0.8).

SELECT artist, COUNT(*) AS instrumental_count
FROM user
WHERE instrumentalness > 0.8
GROUP BY artist
ORDER BY instrumental_count DESC
LIMIT 5;

Output:

artist instrumental_count
Parquet Courts 3
deadmau5 2
Washed Out 2
Todd Terje 2
Partial Arts 2