Introduced in 2018, TikTok is a social media app where users can post videos between five and 120 seconds long. One major draw of the platform is the ability to add music to videos, allowing users to dance or lip-sync as the song plays in the background. TikTok also makes it easier for users to discover new and trending songs in a multitude of ways. From the Discover tab to the For You page, it’s no wonder songs go viral seemingly out of nowhere. But even though its time on the market has been short, has TikTok actually made an impact on the music industry? For my passion project, I decided to see if there was a relationship between which songs do well on the Billboard charts and which songs go viral on TikTok.
To decide which songs to analyze, I looked at two articles listing songs that went viral on TikTok and also did well on Billboard. One was from Seventeen, the other was from Influencer Marketing Hub. In total, I found 30 songs that either made Billboard’s Top 10 chart for 2022 or made the Top 100 Year-End chart for 2020 or 2021. I compiled all of those songs into a playlist called “Post-TikTok,” meant to indicate these songs were released after the app gained popularity. This playlist was compiled in November 2022, so viral songs that saw Billboard success after that date are not included.
I then found the corresponding songs from the 2016 or 2017 Year-End chart. For example, if a song on “Post-TikTok” was ranked #21 in 2021, I added the song that was ranked #21 in 2017 to the second playlist. If a song was released in 2022, then I considered how many weeks it spent in the Billboard Top 10 as its “ranking.” These songs were compiled into a playlist called “Pre-TikTok.”
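The rank-matching step can be sketched in base R. This is only an illustration: the data frames `post` and `year_end_2017`, their column names, and every title in them are made-up placeholders, not the real chart data.

```r
# Hypothetical illustration: ranks and titles are placeholders, not real chart data.
post <- data.frame(
  rank = c(21, 45, 3),
  post_song = c("Post song A", "Post song B", "Post song C")
)

# Stand-in for a 2017 Year-End chart with 100 ranked entries.
year_end_2017 <- data.frame(
  rank = 1:100,
  pre_song = paste("2017 song ranked", 1:100)
)

# For each post-TikTok song, pick the earlier song that held the same rank.
matched <- merge(post, year_end_2017, by = "rank")
matched
```

Each row of `matched` pairs one post-TikTok song with the pre-TikTok song at the same chart position.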
Finally, to determine how exactly I would compare the two playlists, I used the factors listed in this LA Times article and matched them as closely as I could to the spotifyr features. The article determined that there are four factors that make it more likely for a song to become popular: sounding happier, “less relaxed,” more “party-like” and being sung by a female artist. I will explain more about what each of these factors means in its corresponding section. Once the playlists and metrics were finalized, it was time to start coding in R. Here are the packages I used for this project:
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.1 ✓ dplyr 1.0.6
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(knitr)
library(spotifyr)
And here is my code for getting the playlists:
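# Spotify API credentials
spotifyid <- "a255dd0e282247babbe606f8aadde492"
spotifyclientsecret <- "0e3dac9afe3449ca82fb9018ef17df4d"
Sys.setenv(SPOTIFY_CLIENT_ID = spotifyid)
Sys.setenv(SPOTIFY_CLIENT_SECRET = spotifyclientsecret)
access_token <- get_spotify_access_token()
# Username of the account that owns the playlists.
# Placeholder value: the real username is not shown in this post.
personalspotifyid <- "your_spotify_username"
# IDs of the two playlists
pre_id <- "0oieopOLNNaZndupPRu2m0"
post_id <- "19hcvxDaSuxijckzfRQ8xR"
# Download the audio features for every track in each playlist
pre_tiktok <- get_playlist_audio_features(personalspotifyid, pre_id, authorization = access_token)
post_tiktok <- get_playlist_audio_features(personalspotifyid, post_id, authorization = access_token)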
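As a side note, one way to keep the client ID and secret out of a published script is to store them as environment variables (for example in `~/.Renviron`) and only check that they exist. The helper below is a minimal sketch of that idea; `spotify_credentials_set()` is a name I made up for illustration and is not part of spotifyr. It relies on the assumption that `get_spotify_access_token()` reads `SPOTIFY_CLIENT_ID` and `SPOTIFY_CLIENT_SECRET` from the environment by default.

```r
# Hypothetical helper (not part of spotifyr): TRUE when both variables are set.
spotify_credentials_set <- function() {
  vars <- Sys.getenv(c("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET"))
  all(nzchar(vars))
}

if (spotify_credentials_set()) {
  message("Credentials found in the environment.")
} else {
  message("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET, e.g. in ~/.Renviron.")
}
```

With the variables set this way, the script itself never has to contain the secret.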
First, I determined the gender of the artist(s) of each song. Songs by male artists were marked with M and songs by female artists were marked with F. If a song featured both a male and a female artist, it was marked with a B. The results are shown in the table below:
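# Rows are playlists; columns are artist gender (M = male, F = female, B = both)
gendertable <- matrix(c(19, 4, 7, 12, 15, 3), ncol = 3, nrow = 2, byrow = TRUE)
colnames(gendertable) <- c("M", "F", "B")
rownames(gendertable) <- c("Pre-TikTok", "Post-TikTok")
# Render the counts as a table
kable(gendertable)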
As you can see, there are almost four times as many songs by female artists in the post-TikTok playlist as in the pre-TikTok playlist (15 versus 4). This includes both songs sung by a solo female artist and collaborations between two female artists. Songs by male artists are still fairly popular (12, down from 19), while the number of songs featuring both male and female artists fell by more than half (from 7 to 3). Therefore, having a song sung by a female artist appears to help a song perform better on TikTok.
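To put the counts on a common scale, they can be converted into row percentages. The sketch below rebuilds the same matrix from the counts above and uses base R’s `prop.table()`; the variable name `shares` is my own.

```r
# Same counts as above: rows are playlists, columns are artist gender.
gendertable <- matrix(
  c(19, 4, 7, 12, 15, 3), nrow = 2, byrow = TRUE,
  dimnames = list(c("Pre-TikTok", "Post-TikTok"), c("M", "F", "B"))
)

# Share of each gender category within each playlist, in percent.
shares <- round(prop.table(gendertable, margin = 1) * 100, 1)
shares

# Ratio of female-artist songs, post versus pre.
gendertable["Post-TikTok", "F"] / gendertable["Pre-TikTok", "F"]  # 3.75
```

Female artists account for 13.3% of the pre-TikTok playlist and 50% of the post-TikTok playlist.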
Next, I will be determining if songs are more or less relaxed. The LA Times article did not clearly define what makes a song more or less relaxed, so I chose the spotifyr audio feature that I think matches best: energy. Energy is a value from 0.0 to 1.0 that describes the intensity of the song. The higher the value, the more intense the song sounds and the less relaxed the listener should feel. I calculated the average energy of both playlists and listed the top five songs as well. Here are the values for the pre-TikTok playlist:
pre_tiktok %>% summarize(mean(energy))
## # A tibble: 1 x 1
## `mean(energy)`
## <dbl>
## 1 0.621
pre_tiktok %>% select(track.name, energy) %>%
top_n(5) %>%
arrange(-energy)
## Selecting by energy
## # A tibble: 5 x 2
## track.name energy
## <chr> <dbl>
## 1 Rolex 0.886
## 2 Treat You Better 0.819
## 3 Congratulations 0.804
## 4 24K Magic 0.803
## 5 Cheap Thrills (feat. Sean Paul) 0.8
And here are the results for the post-TikTok playlist:
post_tiktok %>% summarize(mean(energy))
## # A tibble: 1 x 1
## `mean(energy)`
## <dbl>
## 1 0.665
post_tiktok %>% select(track.name, energy) %>%
top_n(5) %>%
arrange(-energy)
## Selecting by energy
## # A tibble: 5 x 2
## track.name energy
## <chr> <dbl>
## 1 Big Energy 0.807
## 2 Dior 0.805
## 3 positions 0.802
## 4 Beggin' 0.8
## 5 Up 0.795
The energy of the post-TikTok songs did increase, but only slightly: the average rose by about 7.1%. However, the most energetic song came from the pre-TikTok playlist, with an energy almost 0.08 higher than the most energetic song on the post-TikTok playlist. So while the algorithm does appear to slightly favor high-energy songs, high energy alone does not guarantee a song will go viral.
The third factor I analyzed was whether songs are more “party-like” post-TikTok. As with the relaxed analysis, there was no clear definition of what makes a song more “party-like.” The closest audio feature I could find was danceability, which describes how well someone can dance along to a given song. The higher the value, the easier the song is to dance to and the more likely you are to hear it at a party. Given how frequently TikTok dances seem to go viral, I assumed the average for the post-TikTok playlist would be higher. I also displayed the top five songs for each playlist. Here are the pre-TikTok results:
pre_tiktok %>% summarize(mean(danceability))
## # A tibble: 1 x 1
## `mean(danceability)`
## <dbl>
## 1 0.707
pre_tiktok %>% select(track.name, danceability) %>%
top_n(5) %>%
arrange(-danceability)
## Selecting by danceability
## # A tibble: 5 x 2
## track.name danceability
## <chr> <dbl>
## 1 Bad and Boujee (feat. Lil Uzi Vert) 0.926
## 2 Bodak Yellow 0.926
## 3 HUMBLE. 0.908
## 4 Hotline Bling 0.891
## 5 Both (feat. Drake) 0.85
And here are the post-TikTok results:
post_tiktok %>% summarize(mean(danceability))
## # A tibble: 1 x 1
## `mean(danceability)`
## <dbl>
## 1 0.731
post_tiktok %>% select(track.name, danceability) %>%
top_n(5) %>%
arrange(-danceability)
## Selecting by danceability
## # A tibble: 5 x 2
## track.name danceability
## <chr> <dbl>
## 1 WAP (feat. Megan Thee Stallion) 0.935
## 2 Big Energy 0.935
## 3 First Class 0.902
## 4 The Box 0.896
## 5 Sunday Best 0.878
Even though TikTok is known for dance trends, danceability increased less than energy did: the average rose by only 3.4%. The top five values also look nearly identical across the two playlists, which suggests danceability does not affect the likelihood of a song going viral as much as I thought it would.
Finally, I examined whether songs have become happier since the release of TikTok. I used valence for this analysis, since it measures how positive a song sounds. The higher the value, the more upbeat and happy the song should sound. Here are the results for the pre-TikTok playlist:
pre_tiktok %>% summarize(mean(valence))
## # A tibble: 1 x 1
## `mean(valence)`
## <dbl>
## 1 0.508
pre_tiktok %>% select(track.name, valence) %>%
top_n(5) %>%
arrange(-valence)
## Selecting by valence
## # A tibble: 5 x 2
## track.name valence
## <chr> <dbl>
## 1 I'm the One (feat. Justin Bieber, Quavo, Chance the Rapper & Lil Wayn… 0.817
## 2 Rolex 0.789
## 3 Treat You Better 0.747
## 4 My House 0.74
## 5 Unforgettable 0.733
And here are the results for the post-TikTok playlist:
post_tiktok %>% summarize(mean(valence))
## # A tibble: 1 x 1
## `mean(valence)`
## <dbl>
## 1 0.597
post_tiktok %>% select(track.name, valence) %>%
top_n(5) %>%
arrange(-valence)
## Selecting by valence
## # A tibble: 5 x 2
## track.name valence
## <chr> <dbl>
## 1 INDUSTRY BABY (feat. Jack Harlow) 0.894
## 2 Up 0.819
## 3 Supalonely 0.817
## 4 Big Energy 0.813
## 5 Say So 0.779
Valence had the biggest increase out of the three averages, rising by 17.5% between pre-TikTok and post-TikTok. Only one of the top five pre-TikTok songs had a valence higher than 0.8, whereas only one of the top five post-TikTok songs had a valence lower than 0.8. Of the three audio features, valence appears to have the biggest impact on whether a song has a chance of going viral.
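The three percentage changes quoted in these sections can be reproduced from the playlist averages printed above. In this short sketch, `pct_change()` is a helper name of my own choosing, and the inputs are the rounded means from the output, so the results are approximate.

```r
# Percent change from a "before" value to an "after" value.
pct_change <- function(before, after) (after - before) / before * 100

# Rounded playlist means copied from the output above.
pre_means  <- c(energy = 0.621, danceability = 0.707, valence = 0.508)
post_means <- c(energy = 0.665, danceability = 0.731, valence = 0.597)

round(pct_change(pre_means, post_means), 1)
# energy: 7.1, danceability: 3.4, valence: 17.5
```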
Valence having the biggest impact may also explain why some of the other factors did not perform as well as I expected. If a song sounds happier, it is more likely to also have high energy and danceability values. A sad song is not going to sound intense, nor will the average person be willing to dance to it. All of those factors go hand-in-hand, with valence being the most impactful of the three.
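One way to check whether these three features really move together would be a correlation matrix. I did not run this as part of the project, so the sketch below uses a small toy data frame whose values are invented purely for illustration; `feature_cor()` is a hypothetical helper name. With the real data, it could be called as `feature_cor(pre_tiktok)` or `feature_cor(post_tiktok)`.

```r
# Toy values for illustration only; they are NOT the playlist data.
toy <- data.frame(
  valence      = c(0.82, 0.45, 0.60, 0.30, 0.75),
  energy       = c(0.80, 0.55, 0.62, 0.40, 0.70),
  danceability = c(0.90, 0.65, 0.70, 0.60, 0.85)
)

# Pairwise Pearson correlations between the three audio features.
feature_cor <- function(tracks) {
  round(cor(tracks[, c("valence", "energy", "danceability")]), 2)
}

feature_cor(toy)
```

Values close to 1 would support the idea that the features go hand-in-hand, while values near 0 would suggest they vary independently.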
One other trend I noticed in this analysis is that songs are getting shorter in length. Both playlists had 30 songs, yet the post-TikTok playlist was shorter by about 15 minutes. I decided to calculate the average length of the songs to see if this could potentially be a trend or just a result of the songs I picked.
pre_tiktok %>% summarize(mean(track.duration_ms))
## # A tibble: 1 x 1
## `mean(track.duration_ms)`
## <dbl>
## 1 220698.
post_tiktok %>% summarize(mean(track.duration_ms))
## # A tibble: 1 x 1
## `mean(track.duration_ms)`
## <dbl>
## 1 192513.
I was given the song length in milliseconds, so I converted the values to minutes and seconds to understand the averages better. The average song length in the pre-TikTok playlist was about 221 seconds, or 3 minutes and 41 seconds. The average song length in the post-TikTok playlist was about 193 seconds, or 3 minutes and 13 seconds. Using the millisecond averages, the average song length decreased by about 12.8% relative to the pre-TikTok playlist ((220698 − 192513) / 220698). This suggests shorter songs perform better, which makes sense since videos can be no longer than two minutes. Short sound bites are easier to create and share, so shorter songs with viral-sound potential may be favored by TikTok’s algorithm.
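The conversion and the percentage can be worked through in a few lines of base R. This is a sketch using the rounded averages printed above; `format_duration()` is a helper name of my own. Note that the percentage depends on the baseline: relative to the pre-TikTok average the drop is about 12.8%, while dividing by the post-TikTok average instead gives about 14.6%.

```r
# Convert milliseconds to a "minutes:seconds" string, rounded to the nearest second.
format_duration <- function(ms) {
  total_seconds <- round(ms / 1000)
  sprintf("%d:%02d",
          as.integer(total_seconds %/% 60),
          as.integer(total_seconds %% 60))
}

# Rounded playlist averages copied from the output above.
pre_ms  <- 220698
post_ms <- 192513

format_duration(pre_ms)   # "3:41"
format_duration(post_ms)  # "3:13"

# Decrease relative to the pre-TikTok average.
round((pre_ms - post_ms) / pre_ms * 100, 1)   # 12.8

# The same difference divided by the post-TikTok average.
round((pre_ms - post_ms) / post_ms * 100, 1)  # 14.6
```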
Though TikTok has only been on the market for four years, it has already started making an impact on the music industry. There were already some metrics to help decide if a song was likely to perform well, and TikTok appears to be upholding those standards. All three audio metrics increased, some more drastically than others. Gender and valence appeared to have the biggest impact on a song’s potential success on TikTok, whereas energy and danceability did not have as big of an impact as I predicted.
There were some limitations to this project. I only analyzed 60 songs in total across the span of six years. The LA Times, on the other hand, used an algorithm to analyze almost half a million songs released over a span of 30 years. While I did my best to collect as much data as possible, I understand I still have a limited amount to work with. Furthermore, TikTok does not release statistics about how well songs perform, which is why I cross-referenced the articles listed above with the Billboard performance. Neither article reported how well each song performed on TikTok, just that the song was popular. Having an official list from TikTok may have informed me better about what makes a song go viral, since I could confirm a song’s performance rather than trust a secondary source.
Finally, while a female artist singing a high valence song may have a better chance, having those factors does not automatically mean the song will go viral. There are a multitude of other potential factors that can impact virality, from how relatable the lyrics are to how well known the artist is. There are some commonalities between viral songs, but there can always be exceptions to trends. For now, TikTok appears to be making its mark on the music industry, and it will be interesting to see what the future holds as the app continues to grow.