1 Introduction

K-pop has become a truly global phenomenon thanks to its distinctive blend of addictive melodies, slick choreography and production values, and an endless parade of attractive South Korean performers. They spend years in grueling studio systems learning to sing and dance in synchronized perfection. As a music streaming and media services provider, Spotify gives millions of listeners across the globe access to this extraordinary music genre. According to Spotify*, the most-streamed K-pop artists on the platform include BTS, BLACKPINK, EXO, TWICE, and Red Velvet. Hence, we will build our analysis on these five groups.

2 Digging The Data

We will extract the data through Spotify’s API. To do this, we need a client ID and a client secret from Spotify for Developers*, along with the libraries loaded below.

library(spotifyr)   # R wrapper for the Spotify Web API
library(tidyverse)
library(lubridate)
library(cluster)
library(factoextra)
library(ggforce)
library(GGally)
library(scales)
library(cowplot)
library(FactoMineR)
library(plotly)
options(scipen = 100, max.print = 101)
# Replace the placeholders with your own credentials; never publish real ones
Sys.setenv(SPOTIFY_CLIENT_ID = '<your-client-id>')
Sys.setenv(SPOTIFY_CLIENT_SECRET = '<your-client-secret>')

access_token <- get_spotify_access_token()
BLACKPINK <- get_artist_audio_features('BLACKPINK')
TWICE <- get_artist_audio_features('TWICE')
RED_VELVET <- get_artist_audio_features('Red Velvet')
BTS <- get_artist_audio_features('BTS')
EXO <- get_artist_audio_features('EXO')
kpop <- rbind(BTS, BLACKPINK, EXO, TWICE, RED_VELVET)
rmarkdown::paged_table(kpop)
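
Hard-coding credentials in a script makes them easy to leak. A minimal sketch of an alternative, assuming the two variables have already been defined outside the script (for example in a `~/.Renviron` file), is to read them back with `Sys.getenv()` and stop early if either is missing. The helper name `require_env` is hypothetical and is not part of spotifyr.

```r
# Hypothetical helper (not part of spotifyr): return the value of an
# environment variable, or stop with a clear message if it is unset or empty.
require_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Assumed usage: both variables were defined beforehand, e.g. in ~/.Renviron
# client_id     <- require_env("SPOTIFY_CLIENT_ID")
# client_secret <- require_env("SPOTIFY_CLIENT_SECRET")
```

With the variables set this way, the script itself never contains the secret values.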

3 Feature Selection

From the data frame above, we now select the features for our analysis. The selection is based on audio features. Because the artists differ in how many songs they have released, we use only ten tracks per artist, taken from each artist’s top tracks on Spotify.

BTS_10 <- get_artist_top_tracks('3Nrfpe0tUJi4K4DXYWgMUX')
BLACKPINK_10 <- get_artist_top_tracks('41MozSoPIsD1dJM0CLPjZF')
EXO_10 <- get_artist_top_tracks('3cjEqqelV9zb4BYE3qDQ4O')
TWICE_10 <- get_artist_top_tracks('7n2Ycct7Beij7Dj7meI4X0')
RED_VELVET_10 <- get_artist_top_tracks('1z4g3DjTBBZKhvAroFlhOM')

kpop_10 <- rbind(BTS_10, BLACKPINK_10, EXO_10, TWICE_10, RED_VELVET_10)
rmarkdown::paged_table(kpop_10)
# Label every top track with its artist; the repeat counts come from the
# top-track tables, so the labels stay aligned with the rows of kpop_10
kpop_name <- data.frame(
  artist_name = rep(
    c("BTS", "BLACKPINK", "EXO", "TWICE", "Red Velvet"),
    times = c(nrow(BTS_10), nrow(BLACKPINK_10), nrow(EXO_10),
              nrow(TWICE_10), nrow(RED_VELVET_10))
  )
)

kpop_data <- cbind(get_track_audio_features(kpop_10$id),kpop_name, kpop_10 %>% select(name))
rmarkdown::paged_table(kpop_data)
kpop_data <- kpop_data %>% 
  select(danceability, energy, loudness, speechiness, acousticness, instrumentalness, liveness, valence, tempo, artist_name, name)

Each variable is explained below:

  • artist_name: Name of the artists
  • name: Name of the song
  • danceability: Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
  • energy: Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
  • loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
  • speechiness: Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
  • acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
  • instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
  • liveness: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
  • valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
  • tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
colSums(is.na(kpop_data))
#>     danceability           energy         loudness      speechiness 
#>                0                0                0                0 
#>     acousticness instrumentalness         liveness          valence 
#>                0                0                0                0 
#>            tempo      artist_name             name 
#>                0                0                0

The data contain no missing (NA) values, so we can use them as they are for our analysis.

4 Exploratory Analysis

4.1 Possibility of clustering

We first use a plot to check whether the songs can be segmented based on several features.

ggplot(kpop_data, aes(energy, tempo, color = artist_name, size = danceability)) + 
    geom_point(alpha = 0.8) + theme_minimal()

Some songs tend to form groups. Songs from BTS tend to separate from those of TWICE, while EXO’s songs split roughly into two groups.

4.2 Possibility of Principal Component Analysis (PCA)

We want to see whether the numeric variables are highly correlated. Strong relationships among some variables imply that we can reduce the dimensionality, that is, the number of features, using Principal Component Analysis (PCA).

ggcorr(kpop_data, label = TRUE, label_size = 2.9, hjust = 1, layout.exp = 2)

Some features are highly correlated, such as energy and loudness. Based on this result, we will try to reduce the dimensionality using PCA.

5 Data Pre-processing

We scale each numeric feature because k-means clustering relies on distances, and unscaled features with a large range (such as tempo) would otherwise dominate them.

kpop_z <- scale(kpop_data[,-c(10:11)])
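
To illustrate what `scale()` does, here is a minimal sketch on made-up toy values (not the Spotify data). It centers each column to mean 0 and rescales it to standard deviation 1, so that a feature measured in BPM no longer outweighs one measured on a 0 to 1 scale when distances are computed.

```r
# Toy data (made up for illustration): tempo is in BPM, energy is on a 0-1 scale
toy <- data.frame(
  tempo  = c(90, 120, 150, 100),
  energy = c(0.40, 0.80, 0.60, 0.90)
)

# Euclidean distances on the raw values are driven almost entirely by tempo
dist_raw <- dist(toy)

# scale() centers each column to mean 0 and rescales it to standard deviation 1
toy_z  <- scale(toy)
dist_z <- dist(toy_z)

round(colMeans(toy_z), 10)   # column means after scaling
apply(toy_z, 2, sd)          # column standard deviations after scaling
```

Comparing `dist_raw` with `dist_z` shows how much the ordering of pairwise distances can change once both features contribute on the same footing.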

Before clustering, we need to find the optimal number of clusters, k. We seek to minimize the total within-cluster sum of squares, meaning that observations in the same cluster lie as close to one another as possible.

fviz_nbclust(kpop_z , kmeans, method = "wss")

fviz_nbclust(kpop_z , kmeans, method = "silhouette")

From the plots, we can see that 2 is the optimal number of clusters. Beyond that point, increasing k yields neither a considerable decrease in the total within-cluster sum of squares (a sign of strong internal cohesion) nor a significant increase in the between-cluster sum of squares and the between/total sum-of-squares ratio (a sign of maximum external separation).
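
The quantity behind the "wss" plot can also be computed by hand. The sketch below uses simulated data (an assumption for illustration, not the Spotify features) and base R's `kmeans()` to obtain the total within-cluster sum of squares for k = 1 to 6, then the relative drop gained by each additional cluster.

```r
set.seed(11)
# Simulated data (an assumption for illustration): two well-separated groups
toy <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 4), ncol = 2)
)

# Total within-cluster sum of squares for k = 1, ..., 6
wss <- sapply(1:6, function(k) kmeans(toy, centers = k, nstart = 10)$tot.withinss)

# Relative drop in WSS when going from k - 1 to k clusters
rel_drop <- -diff(wss) / wss[-length(wss)]
round(rel_drop, 2)
```

The "elbow" is the value of k after which `rel_drop` becomes small; for data with two clear groups, the largest drop occurs when moving from one cluster to two.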

6 K-Means Clustering

# k-means clustering
set.seed(11)
kpop_k <- kmeans(kpop_z, 2)

# result analysis
kpop_k
#> K-means clustering with 2 clusters of sizes 39, 11
#> 
#> Cluster means:
#>   danceability     energy   loudness speechiness acousticness instrumentalness
#> 1    0.2480285  0.3749792  0.2613429   0.1676063   -0.2028869       0.07662187
#> 2   -0.8793738 -1.3294716 -0.9265793  -0.5942407    0.7193264      -0.27165937
#>     liveness    valence       tempo
#> 1  0.1156823  0.2895943  0.04897683
#> 2 -0.4101464 -1.0267436 -0.17364514
#> 
#> Clustering vector:
#>  [1] 1 1 2 2 2 1 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [39] 1 1 2 1 1 1 1 1 1 1 1 1
#> 
#> Within cluster sum of squares by cluster:
#> [1] 283.54718  78.53108
#>  (between_SS / total_SS =  17.9 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
#> [6] "betweenss"    "size"         "iter"         "ifault"

The songs fall into two clusters with distinct characteristics. The first cluster scores higher on energy and danceability, while the second cluster scores considerably higher on acousticness.
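
The `between_SS / total_SS` figure in the printout can be reproduced by hand. This sketch relies on one fact about `scale()`: every scaled column has variance 1, so its sum of squares is n - 1. The within-cluster values are copied from the output above.

```r
n <- 50   # songs: 5 artists x 10 top tracks
p <- 9    # scaled audio features

# After scale(), each column has variance 1, so its sum of squares is n - 1
total_ss <- (n - 1) * p

# Within-cluster sums of squares copied from the kmeans printout
within_ss <- c(283.54718, 78.53108)

between_ss <- total_ss - sum(within_ss)
round(100 * between_ss / total_ss, 1)
```

This matches the 17.9 % reported by `kmeans()`. A ratio this low means that most of the variation still lies within the two clusters rather than between them.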

6.1 Data Preparation for Visualization & Profiling

kpop_data$cluster <- as.factor(kpop_k$cluster)

rmarkdown::paged_table(kpop_data)
fviz_cluster(object = kpop_k, 
             data = kpop_z)

# Drop the text columns first so that only numeric features are averaged
kpop_data %>%
  select(-c(name, artist_name)) %>% 
  group_by(cluster) %>% 
  summarise_all(.funs = "mean")
prop.table(table(kpop_data$artist_name, kpop_data$cluster))
#>             
#>                 1    2
#>   BLACKPINK  0.18 0.02
#>   BTS        0.06 0.14
#>   EXO        0.16 0.04
#>   Red Velvet 0.18 0.02
#>   TWICE      0.20 0.00

From the profiling we find that:

  1. Cluster 1 contains songs with more energy, danceability, and loudness, while cluster 2 contains the slower, more relaxing songs.

  2. The female groups (BLACKPINK, Red Velvet, and TWICE) fall mostly into cluster 1, the more danceable cluster.

7 Building Biplot Using Principal Component Analysis

kpop_pca <- PCA(kpop_data %>% 
                  select(-c(artist_name, name,cluster)), scale.unit = T, ncp = 31, graph = F)

summary(kpop_pca)
#> 
#> Call:
#> PCA(X = kpop_data %>% select(-c(artist_name, name, cluster)),  
#>      scale.unit = T, ncp = 31, graph = F) 
#> 
#> 
#> Eigenvalues
#>                        Dim.1   Dim.2   Dim.3   Dim.4   Dim.5   Dim.6   Dim.7
#> Variance               2.784   1.356   1.159   0.980   0.899   0.744   0.508
#> % of var.             30.936  15.065  12.877  10.894   9.993   8.271   5.645
#> Cumulative % of var.  30.936  46.001  58.878  69.772  79.765  88.035  93.681
#>                        Dim.8   Dim.9
#> Variance               0.410   0.159
#> % of var.              4.557   1.763
#> Cumulative % of var.  98.237 100.000
#> 
#> Individuals (the 10 first)
#>                      Dist    Dim.1    ctr   cos2    Dim.2    ctr   cos2  
#> 1                |  1.785 |  0.401  0.115  0.050 | -0.995  1.461  0.311 |
#> 2                |  1.572 |  0.339  0.082  0.046 | -0.324  0.155  0.042 |
#> 3                |  2.483 | -0.875  0.550  0.124 | -0.448  0.296  0.033 |
#> 4                |  5.296 | -4.905 17.283  0.858 |  1.077  1.712  0.041 |
#> 5                |  2.132 | -1.370  1.348  0.413 | -0.676  0.675  0.101 |
#> 6                |  2.461 |  0.444  0.142  0.033 | -1.797  4.766  0.533 |
#>                   Dim.3    ctr   cos2  
#> 1                 0.687  0.813  0.148 |
#> 2                 0.160  0.044  0.010 |
#> 3                 0.157  0.042  0.004 |
#> 4                -0.181  0.057  0.001 |
#> 5                 0.664  0.761  0.097 |
#> 6                 0.781  1.053  0.101 |
#>  [ reached getOption("max.print") -- omitted 4 rows ]
#> 
#> Variables
#>                     Dim.1    ctr   cos2    Dim.2    ctr   cos2    Dim.3    ctr
#> danceability     |  0.543 10.598  0.295 | -0.401 11.865  0.161 |  0.362 11.289
#> energy           |  0.890 28.420  0.791 |  0.168  2.086  0.028 | -0.179  2.750
#> loudness         |  0.638 14.611  0.407 |  0.274  5.545  0.075 | -0.594 30.474
#> speechiness      |  0.317  3.618  0.101 |  0.325  7.780  0.105 |  0.532 24.391
#> acousticness     | -0.601 12.975  0.361 |  0.225  3.731  0.051 |  0.027  0.062
#> instrumentalness |  0.385  5.337  0.149 |  0.393 11.370  0.154 |  0.586 29.618
#> liveness         |  0.389  5.434  0.151 | -0.184  2.502  0.034 | -0.075  0.479
#>                    cos2  
#> danceability      0.131 |
#> energy            0.032 |
#> loudness          0.353 |
#> speechiness       0.283 |
#> acousticness      0.001 |
#> instrumentalness  0.343 |
#> liveness          0.006 |
#>  [ reached getOption("max.print") -- omitted 2 rows ]
fviz_eig(kpop_pca, ncp = 15, addlabels = T, main = "Variance explained by each dimension")

About 80% of the variance is explained by the first five dimensions, and the first two dimensions alone explain 46% of the total variance.
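
These percentages follow directly from the eigenvalues. As a small worked example, using the values copied from the `summary(kpop_pca)` output, each dimension's share is its eigenvalue divided by the sum of all eigenvalues, and the cumulative share is the running total.

```r
# Eigenvalues copied from the summary(kpop_pca) output
eig <- c(2.784, 1.356, 1.159, 0.980, 0.899, 0.744, 0.508, 0.410, 0.159)

pct     <- 100 * eig / sum(eig)   # share of variance per dimension
cum_pct <- cumsum(pct)            # cumulative share

# Cumulative share after the first two and the first five dimensions
round(cum_pct[c(2, 5)], 1)
```

Because the eigenvalues are rounded to three decimals in the printout, the results agree with the summary table only up to rounding.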

fviz_pca_var(kpop_pca, select.var = list(contrib = 31), col.var = "contrib", 
    gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"), repel = TRUE)

This plot covers the first two dimensions, which together explain 46% of the variance, and shows how the variables relate to one another. Songs with high danceability tend to have low acousticness, while loudness and energy behave similarly.

kpop_x <- data.frame(kpop_pca$ind$coord[,1:3])
kpop_xc <- cbind(kpop_x, cluster = kpop_data$cluster, song = kpop_data$name, kpop_name)

plot_ly(kpop_xc, x = ~Dim.1, y = ~Dim.2, z = ~Dim.3, color = ~cluster, colors = c('black', 'red', 'green'), text = ~paste(kpop_xc$artist_name, kpop_xc$song)) %>%
  add_markers() %>%
  layout(scene = list(xaxis = list(title = 'Dim.1'),
                     yaxis = list(title = 'Dim.2'),
                     zaxis = list(title = 'Dim.3')))