As this Vox article explains, pop song length has historically been shaped by the mechanics of getting a song on the radio, which meant fitting it on a 45 record. As a result, songs had to be around 3 minutes long. I decided to use the Spotify song dataset that Mark Hageman and I are using for our final project to see how long the roughly 33,000 songs in it tend to be. The plot seems to fit the narrative that most songs are 3-4 minutes long, and this holds for both popular and unpopular songs. The original track length data is in milliseconds, so I converted it to minutes for the x-axis.
library(dplyr)    # provides the %>% pipe
library(ggplot2)  # plotting functions

songs <- read.csv('spotify_songs.csv', stringsAsFactors = FALSE)
# duration_ms is in milliseconds; dividing by 60,000 gives minutes
songs %>% ggplot(aes(x = duration_ms / 60000, y = track_popularity)) +
  geom_point(alpha = .25) +
  scale_y_continuous(name = "Track Popularity") +
  scale_x_continuous(name = "Track Duration in Minutes") +
  geom_smooth() +
  ggtitle("Spotify Songs", subtitle = "Popularity of songs vs. the length of the song")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
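The scatter plot only suggests where the songs cluster, so a quick numeric check could back up the "3-4 minutes" reading. Below is a minimal sketch in base R. The helper names `ms_to_minutes` and `share_between` are my own, and the durations in `example_ms` are made up for illustration, not taken from the dataset.

```r
# Convert track lengths from milliseconds to minutes (60,000 ms per minute).
ms_to_minutes <- function(duration_ms) duration_ms / 60000

# Share of tracks whose length falls in [lo, hi) minutes.
# Hypothetical helper, not part of the original analysis.
share_between <- function(duration_ms, lo = 3, hi = 4) {
  minutes <- ms_to_minutes(duration_ms)
  mean(minutes >= lo & minutes < hi, na.rm = TRUE)
}

# Made-up durations for illustration only (not from spotify_songs.csv).
example_ms <- c(150000, 185000, 200000, 239000, 300000)
example_share <- share_between(example_ms)
```

With the real data, the same check would be `share_between(songs$duration_ms)`, assuming `songs` has been loaded as above.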