Introduction

The music industry is constantly growing and evolving, and this continuous transformation pushes artists to create songs that are original yet still appeal to audiences. My goal is to determine whether there are trends between musical elements and level of popularity. I have chosen to look at the works of Charlie Puth and Billie Eilish for this project. Both artists are known for their passion for music theory and their innovative thinking. All data for this analysis was gathered through the spotifyr package. Access to the Spotify API is needed for many of its functions. More information on the package can be found at: https://www.rdocumentation.org/packages/spotifyr/versions/2.2.3

Required Libraries
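
The library chunk itself is not shown in this report. Judging from the functions called later (get_playlist_audio_features(), %>%, select(), count(), ggplot(), and grid.arrange()), a plausible setup is sketched below. The package list is an assumption inferred from those calls, not the original chunk.

```r
# Assumed package list, inferred from the functions used later in this report;
# the original library chunk is not shown.
required <- c("spotifyr", "dplyr", "ggplot2", "gridExtra")

# requireNamespace() returns FALSE instead of erroring when a package is absent.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Install before running: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach the packages so the unqualified calls in the report resolve.
  invisible(lapply(required, library, character.only = TRUE))
}
```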

Charlie Puth

Biography

Charlie Puth

Charlie Puth is an accomplished singer, songwriter, and music producer. At only 30 years old, he has received numerous awards and worked with Selena Gomez, Pitbull, Meghan Trainor, Wiz Khalifa, James Taylor, Kehlani, and Blackbear, among several other musicians. He frequently writes songs for fellow singers, some of the most notable being “So Good” for Zara Larsson, “Lips on You” for Maroon 5, “Bedroom Floor” for Liam Payne, “Slow Motion” for Trey Songz, and “I Feel Good” for Thomas Rhett. Puth began studying music in his hometown of Rumson, New Jersey, and started to play piano at only four years old. He attended the Manhattan School of Music Pre-College, where he focused on jazz piano and classical studies, and graduated from the Berklee College of Music in 2013 with a degree in music production and engineering. He first gained popularity on YouTube in 2011, which led to Ellen DeGeneres signing him to her label. In 2015 he signed with Atlantic Records, where he co-created one of his most famous songs, “See You Again”. It spent 12 non-consecutive weeks at number one on the Billboard Hot 100. “Attention”, which is part of the Voicenotes album, has over 1,317,000,000 streams on Spotify. He has around 36,000,000 monthly listeners on Spotify alone. Charlie has announced on social media that he is planning to release a new album this year.

All Songs

I created a playlist of all of Charlie Puth’s songs on Spotify. Remixes and repeats from different versions of albums were excluded so that each song appears only once. The playlist’s Spotify ID and access to the API are necessary to import the songs into R.
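
The access_token object used in the import is not defined anywhere in this report. One way it could be created is sketched here, assuming the spotifyr package is installed and the app credentials are stored in the SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET environment variables. The helper name fetch_token is hypothetical.

```r
# Sketch: build the access_token object used by the import calls.
# fetch_token() is a hypothetical helper, not part of spotifyr.
fetch_token <- function() {
  if (!requireNamespace("spotifyr", quietly = TRUE)) {
    stop("The spotifyr package is not installed.")
  }
  id <- Sys.getenv("SPOTIFY_CLIENT_ID")
  secret <- Sys.getenv("SPOTIFY_CLIENT_SECRET")
  # Fail early with a readable message if the credentials are missing.
  if (!nzchar(id) || !nzchar(secret)) {
    stop("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET first.")
  }
  spotifyr::get_spotify_access_token(client_id = id, client_secret = secret)
}

# access_token <- fetch_token()  # uncomment once the credentials are set
```

Keeping the credentials in environment variables means they never have to appear in the report itself.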

# Pull audio features for every track in the playlist.
# Arguments: Spotify username, playlist ID, and the API access token.
charlie_raw <- get_playlist_audio_features('charlie',
                                           '7qa39szrp4n42SXxT0VgwP',
                                           access_token)
# Keep only the columns used in this analysis.
charlie <- charlie_raw %>%
  select(track.name, key_name, mode_name, key_mode, key, mode, tempo,
         time_signature, track.popularity, danceability, energy, loudness,
         speechiness, acousticness, instrumentalness, liveness, valence,
         track.album.name, track.album.release_date, track.album.total_tracks,
         track.duration_ms, track.explicit, track.track_number,
         track.album.album_type)

The get_playlist_audio_features() function produces 61 variables. I pulled the variables I believe will be useful for this analysis and put them into their own data frame, charlie. My goal is to find out whether there are trends in the characteristics of popular songs, for example, whether songs with a certain tempo are more popular.

Key Signatures

keys <- charlie %>% 
  count(key_mode, sort = TRUE)
ggplot(keys, aes(x = key_mode, y = n, fill = key_mode)) + geom_col() +
  labs(title = "Key Signatures", x = "key", y = "song count")+
  scale_fill_hue(h = c(180, 270)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  theme(legend.position = "none")

Charlie Puth has released songs in a wide variety of key signatures. Below is a picture of the circle of fifths. I included it because it is an organized way to see which key signatures contain sharps and which contain flats. It is not necessary to understand how it works, but it will help with my explanation.

The Circle of Fifths

There is not one key signature that Puth strongly favors. In the graph, however, far more of his keys are spelled with sharps than with flats (this is where the circle of fifths comes in handy). Some key signatures that appear in the graph are not shown on the circle of fifths. My guess is that this comes down to notation: a sharp is easier to type unambiguously than a flat. In plain text, “b” is often used to represent a flat because it is the closest-looking character available on a keyboard without special characters. To avoid any confusion, the key labels in this data appear to avoid flats and list the enharmonic equivalent instead. For example, G-sharp major and A-flat major contain the same tones but spell them differently. G-sharp major is actually a theoretical key that would normally be written as A-flat major, yet it is the label listed here. (Note that this is only speculation; to confirm which spelling each song actually uses, I would need to analyze the sheet music.)
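
To illustrate the enharmonic point, here is a small sketch of how a key number and mode can be turned into a label. Spotify's documentation describes key as a pitch-class integer (0 = C, 1 = C♯/D♭, 2 = D, and so on) and mode as 1 for major and 0 for minor. The sharp-only and flat-only name vectors and the key_label helper are my own illustration, so treat the exact spellings as an assumption about how the labels in this data were produced.

```r
# Two spellings of the same twelve pitch classes (index 1 = pitch class 0).
sharp_names <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")
flat_names  <- c("C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B")

# key_label() is a hypothetical helper: pitch class (0-11) plus mode
# (1 = major, 0 = minor) in, text label out.
key_label <- function(key, mode, names = sharp_names) {
  paste(names[key + 1], ifelse(mode == 1, "major", "minor"))
}

key_label(8, 1)              # "G# major"
key_label(8, 1, flat_names)  # "Ab major", the same pitches spelled with a flat
```

Because both vectors describe the same twelve pitch classes, a chart built from sharp-only labels cannot by itself show whether a song was written with sharps or flats.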

Charlie Puth has a nearly equal number of songs in the major and minor modes; the counts differ by only two. Half of his top 10 songs are major and the other half are minor. Major keys generally sound happy and upbeat, while minor keys have a more melancholy, sad sound. Charlie’s ability to produce songs that cater to both kinds of listeners speaks to his talent.
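
The mode comparison can be reproduced along these lines. This is a minimal sketch on made-up values rather than the real charlie data frame; only the column names mode_name and track.popularity are taken from the report.

```r
# Toy data standing in for the charlie data frame (values are made up).
songs <- data.frame(
  track.name = paste("song", 1:6),
  mode_name = c("major", "minor", "minor", "major", "major", "minor"),
  track.popularity = c(80, 75, 60, 55, 40, 30)
)

# Overall split between the two modes.
mode_counts <- table(songs$mode_name)

# Split among the most popular tracks (top 4 here; the report uses the top 10).
n_top <- 4
ranked <- songs[order(songs$track.popularity, decreasing = TRUE), ]
top_counts <- table(head(ranked, n_top)$mode_name)
```

With the real data, replacing songs with charlie and setting n_top to 10 would give the counts discussed above.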

Time Signatures

This graph shows an obvious preference for 4/4. Only four songs are in 3/4: “Mother”, “Through It All”, “Dangerously”, and “Suffer”. A limitation of the Spotify API must be noted for this variable. It assumes that the quarter note gets the beat in every song (x/4) and restricts the top number to values from 3 to 7. This excludes all compound meters. 6/8, for example, is a very common time signature, and it cannot be accurately represented in this data.
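
The code behind the time signature graph is not shown, so the following is only a sketch of how the counts could be tabulated. It uses made-up values and relies on the point above that time_signature holds just the top number of an x/4 signature.

```r
# Made-up time_signature values; each is the top number of an x/4 signature.
time_signature <- c(4, 4, 3, 4, 4, 3, 4)

# Attach the implied quarter-note denominator to get readable labels.
labels <- paste0(time_signature, "/4")

# Number of songs per time signature.
counts <- table(labels)

# Share of songs in 4/4.
share_44 <- mean(time_signature == 4)
```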

Fun Fact: Charlie Puth has many similarities to Mozart. Both were playing piano by four years old, have/had perfect pitch, had a musician parent, and started intensely studying music at a young age.

Averages

The ten variables in the table below measure different aspects of the way a song sounds. Acousticness, danceability, energy, speechiness, instrumentalness, liveness, and valence are all measured on a scale from 0.0 to 1.0. Loudness has a range of -60 to 0 dB. Tempo and duration do not have set ranges. Definitions for all of these terms can be found at: https://developer.spotify.com/documentation/web-api/reference/#/operations/get-audio-features

Variable               Average
Acousticness           0.3955
Tempo (BPM)            115.7482
Danceability           0.6765
Energy                 0.5853
Loudness (dB)          -6.3362
Speechiness            0.0885
Instrumentalness       0.0020
Liveness               0.1177
Valence                0.5281
Track Duration (ms)    199631.2000
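
A table like this can be built with column means. The sketch below uses three made-up columns in place of the ten real ones, and rounds to four decimal places to match the table.

```r
# Toy data with three of the audio features (values are made up).
songs <- data.frame(
  acousticness = c(0.10, 0.50, 0.90),
  energy = c(0.80, 0.60, 0.40),
  tempo = c(100, 120, 140)
)

features <- c("acousticness", "energy", "tempo")

# One row per feature, with the mean rounded to four decimal places.
averages <- data.frame(
  variable = features,
  average = unname(round(colMeans(songs[features]), 4))
)
```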

The ranges for these variables in the data can be gauged from the following boxplots:

More visualizations and further analysis for the above features are included later on in this report.

Billie Eilish

Biography

Billie Eilish is a 20-year-old singer-songwriter superstar. She rose to fame in 2015, at only 13, with the song “Ocean Eyes”. It was written by her older brother, Finneas, who remains Billie’s collaborator, co-writer, and producer. They were born into an LA-based musical family and began pursuing creative activities from a young age. Billie is the youngest artist to reach 1 billion streams on Spotify. She also became the youngest musician to write and record the title track for a James Bond movie. Eilish has a distinctive sound, and she often explores ‘darker’ topics, which contributes to her original style. Billie is iconic in the fashion world as well: her signature oversized look eventually developed into its own apparel line. One of Billie’s most popular songs is a collaboration with Khalid. “Lovely” has just under 1,780,500,000 streams on Spotify. Soon after that release, Billie was named Billboard’s Woman of the Year. In 2020 Eilish once again made history by becoming the first woman, and second person ever, to sweep the big four Grammys: Album of the Year, Record of the Year, Song of the Year, and Best New Artist (plus the Best Pop Vocal Album Grammy). Her documentary Billie Eilish: The World’s a Little Blurry came out in 2021. Her ethereal voice consistently sells out stadiums as she continues to dominate the music industry.

Billie Eilish

All Songs

# Pull audio features for every track in the Billie Eilish playlist.
billie_raw <- get_playlist_audio_features('billie',
                                          '36H5WwesgD0XtYkxTcfj18',
                                          access_token)
# Keep the same columns that were selected for charlie.
billie <- billie_raw %>%
  select(track.name, key_name, mode_name, key_mode, key, mode, tempo,
         time_signature, track.popularity, danceability, energy, loudness,
         speechiness, acousticness, instrumentalness, liveness, valence,
         track.album.name, track.album.release_date, track.album.total_tracks,
         track.duration_ms, track.explicit, track.track_number,
         track.album.album_type)

The process for creating the billie data frame is exactly the same as for the charlie data frame.

Key Signatures

keysb <- billie %>% 
  count(key_mode, sort = TRUE)
ggplot(keysb, aes(x = key_mode, y = n, fill = key_mode)) + geom_col() +
  labs(title = "Key Signatures", x = "key", y = "song count")+
  scale_fill_hue(h = c(180, 270)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  theme(legend.position = "none")

Billie has songs in even more key signatures than Charlie Puth. This is an interesting finding because many artists favor one or a few keys and write most of their songs in them. The diversity emphasizes the musical expertise of Billie and her fellow songwriter Finneas.

Billie also has a nearly even number of major and minor songs. Overall, there are four more songs in the major mode. In the top ten, there is one more major-key song than minor. This is slightly surprising to me, as many of Billie’s songs portray emotions that are more typical of minor keys.

Time Signatures

86% of Billie Eilish’s songs are in 4/4. The exceptions are “Everybody Dies”, “Happier Than Ever”, “!!!!!!!”, “listen before i go”, “goodbye”, “idontwannabeyouanymore”, and “hostage”. The same limitation that was mentioned in the Charlie Puth section applies to this data.

Fun Fact: Billie is a huge fan of The Office. Part of her song “My Strange Addiction” features clips from one of the episodes (“Threat Level Midnight”).

Averages

A quick reminder: the ten variables in the table below measure different aspects of the way a song sounds. Acousticness, danceability, energy, speechiness, instrumentalness, liveness, and valence are all measured on a scale from 0.0 to 1.0. Loudness has a range of -60 to 0 dB. Tempo and duration do not have set ranges.

Variable               Average
Acousticness           0.6631
Tempo (BPM)            112.0870
Danceability           0.6014
Energy                 0.3143
Loudness (dB)          -12.5209
Speechiness            0.1224
Instrumentalness       0.1422
Liveness               0.1688
Valence                0.2638
Track Duration (ms)    199844.0000

The ranges for these variables in the data can be gauged from the following boxplots:

Comparisons

Key Signatures

Both artists have released songs in a large variety of key signatures: Charlie Puth in 16 and Billie Eilish in 20. Some are used by one musician and not the other. Exclusive to Charlie are C minor, D minor, D# minor, and F major. Exclusive to Billie are A major, A minor, B minor, D# major, F minor, G major, G minor, and G# major.
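
Lists of exclusive keys like these can be found with set operations. The sketch below uses short made-up vectors; with the real data they would presumably come from unique(charlie$key_mode) and unique(billie$key_mode).

```r
# Made-up key lists standing in for the unique key_mode values of each artist.
charlie_keys <- c("C major", "C minor", "D minor", "E major")
billie_keys <- c("C major", "A minor", "E major", "G major")

# Keys used by one artist and not the other.
only_charlie <- setdiff(charlie_keys, billie_keys)  # "C minor" "D minor"
only_billie <- setdiff(billie_keys, charlie_keys)   # "A minor" "G major"

# Keys both artists have used.
shared <- intersect(charlie_keys, billie_keys)      # "C major" "E major"
```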

Time Signatures

The majority of Charlie Puth and Billie Eilish songs are in 4/4. Only simple meters are possible in these data frames; ideally, the time_signature variable would be more inclusive.

Correlations

Correlation is a measure of the strength of the linear relationship between two variables. It is represented by a number called the correlation coefficient, which ranges from -1 to 1. A value of 0 means there is no linear relationship, and values closer to -1 or 1 indicate a stronger one. The sign gives the direction of the relationship: a positive coefficient means popularity tends to rise with the variable, and a negative one means it tends to fall. Side-by-side scatter plots for each artist have been created. These types of graphs allow for quick recognition of a relationship. Each plot has a caption underneath with the correlation coefficient.
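
As a small worked example, the coefficient and a caption string can be computed with cor(). The values here are made up, and this is only one way the captions could be generated; the plots in this report use values typed in by hand.

```r
# Made-up feature values and popularity scores for four songs.
feature <- c(0.2, 0.4, 0.6, 0.8)
popularity <- c(50, 60, 65, 80)

# Pearson correlation coefficient (the default method of cor()).
r <- cor(feature, popularity)

# Caption text in the same style as the plots, rounded to 4 significant digits.
caption <- paste("Correlation Coefficient:", signif(r, 4))

# A perfectly inverse relationship gives a coefficient of -1.
r_inverse <- cor(feature, -feature)
```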

Acousticness

c1 <- ggplot(charlie, aes(x = acousticness, y = track.popularity,
                 color = factor(acousticness))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Acousticness vs. Popularity",
       x = "acousticness", y = "popularity",
       caption = "Correlation Coefficient: -0.1475382") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b1 <- ggplot(billie, aes(x = acousticness, y = track.popularity,
                 color = factor(acousticness))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Acousticness vs. Popularity",
       x = "acousticness", y = "popularity",
       caption = "Correlation Coefficient: 0.0357542") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c1, b1, nrow = 1)

Tempo

c2 <- ggplot(charlie, aes(x = tempo, y = track.popularity,
                 color = factor(tempo))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Tempo vs. Popularity",
       x = "tempo", y = "popularity",
       caption = "Correlation Coefficient: 0.04719725") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b2 <- ggplot(billie, aes(x = tempo, y = track.popularity,
                 color = factor(tempo))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Tempo vs. Popularity",
       x = "tempo", y = "popularity",
       caption = "Correlation Coefficient: 0.4273662") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c2, b2, nrow = 1)

Danceability

c3 <- ggplot(charlie, aes(x = danceability, y = track.popularity,
                 color = factor(danceability))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Danceability vs. Popularity",
       x = "danceability", y = "popularity",
       caption = "Correlation Coefficient: 0.2165476") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b3 <- ggplot(billie, aes(x = danceability, y = track.popularity,
                 color = factor(danceability))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Danceability vs. Popularity",
       x = "danceability", y = "popularity",
       caption = "Correlation Coefficient: 0.2796994") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c3, b3, nrow = 1)

Energy

c4 <- ggplot(charlie, aes(x = energy, y = track.popularity,
                 color = factor(energy))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Energy vs. Popularity",
       x = "energy", y = "popularity",
       caption = "Correlation Coefficient: -0.07762088") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b4 <- ggplot(billie, aes(x = energy, y = track.popularity,
                 color = factor(energy))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Energy vs. Popularity",
       x = "energy", y = "popularity",
       caption = "Correlation Coefficient: 0.014831") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c4, b4, nrow = 1)

Loudness

c5 <- ggplot(charlie, aes(x = loudness, y = track.popularity,
                 color = factor(loudness))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Loudness vs. Popularity",
       x = "loudness", y = "popularity",
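       # Computed from the data rather than typed in by hand.
       caption = paste("Correlation Coefficient:",
                       signif(cor(charlie$loudness,
                                  charlie$track.popularity), 7))) +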
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b5 <- ggplot(billie, aes(x = loudness, y = track.popularity,
                 color = factor(loudness))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Loudness vs. Popularity",
       x = "loudness", y = "popularity",
       caption = "Correlation Coefficient: 0.3713202") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c5, b5, nrow = 1)

Speechiness

c6 <- ggplot(charlie, aes(x = speechiness, y = track.popularity,
                 color = factor(speechiness))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Speechiness vs. Popularity",
       x = "speechiness", y = "popularity",
       caption = "Correlation Coefficient: 0.04769728") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b6 <- ggplot(billie, aes(x = speechiness, y = track.popularity,
                 color = factor(speechiness))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Speechiness vs. Popularity",
       x = "speechiness", y = "popularity",
       caption = "Correlation Coefficient: 0.1011319") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c6, b6, nrow = 1)

Instrumentalness

c7 <- ggplot(charlie, aes(x = instrumentalness, y = track.popularity,
                 color = factor(instrumentalness))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Instrumentalness vs. Popularity",
       x = "instrumentalness", y = "popularity",
       caption = "Correlation Coefficient: -0.2687716") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b7 <- ggplot(billie, aes(x = instrumentalness, y = track.popularity,
                 color = factor(instrumentalness))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Instrumentalness vs. Popularity",
       x = "instrumentalness", y = "popularity",
       caption = "Correlation Coefficient: -0.04022779") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c7, b7, nrow = 1)

Liveness

c8 <- ggplot(charlie, aes(x = liveness, y = track.popularity,
                 color = factor(liveness))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Liveness vs. Popularity",
       x = "liveness", y = "popularity",
       caption = "Correlation Coefficient: 0.02117672") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b8 <- ggplot(billie, aes(x = liveness, y = track.popularity,
                 color = factor(liveness))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Liveness vs. Popularity",
       x = "liveness", y = "popularity",
       caption = "Correlation Coefficient: -0.4432858") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c8, b8, nrow = 1)

Valence

c9 <- ggplot(charlie, aes(x = valence, y = track.popularity,
                 color = factor(valence))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Valence vs. Popularity",
       x = "valence", y = "popularity",
       caption = "Correlation Coefficient: 0.3442071") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
b9 <- ggplot(billie, aes(x = valence, y = track.popularity,
                 color = factor(valence))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Valence vs. Popularity",
       x = "valence", y = "popularity",
       caption = "Correlation Coefficient: 0.2283104") +
  scale_color_hue(h = c(180, 270))+
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c9, b9, nrow = 1)

Track Duration

c10 <- ggplot(charlie, aes(x = track.duration_ms, y = track.popularity,
                 color = factor(track.duration_ms))) +
  geom_point() +
  labs(title = "Charlie Puth",
       subtitle = "Duration vs. Popularity",
       x = "duration", y = "popularity",
       caption = "Correlation Coefficient: -0.2407805") +
  scale_color_hue(h = c(180, 270)) +
  scale_x_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(legend.position = "none")
b10 <- ggplot(billie, aes(x = track.duration_ms, y = track.popularity,
                 color = factor(track.duration_ms))) +
  geom_point() +
  labs(title = "Billie Eilish",
       subtitle = "Duration vs. Popularity",
       x = "duration", y = "popularity",
       caption = "Correlation Coefficient: 0.6260782") +
  scale_color_hue(h = c(180, 270)) +
  scale_x_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(legend.position = "none")
grid.arrange(c10, b10, nrow = 1)

The three variables most strongly correlated with popularity in Charlie Puth’s music are valence (0.344), instrumentalness (-0.269), and duration (-0.241). In Billie’s music, the three are duration (0.626), liveness (-0.443), and tempo (0.427).
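
A ranking like this can be produced by computing every correlation at once and sorting by absolute value. The sketch below uses a made-up data frame with three features and is not the code used for the figures above.

```r
# Toy data: popularity plus three audio features (values are made up).
songs <- data.frame(
  track.popularity = c(10, 20, 30, 40, 50),
  valence = c(0.1, 0.3, 0.2, 0.5, 0.6),
  liveness = c(0.9, 0.8, 0.6, 0.4, 0.2),
  tempo = c(120, 90, 130, 95, 125)
)

# Every column except popularity is treated as a feature.
features <- setdiff(names(songs), "track.popularity")

# Correlation of each feature with popularity.
r <- sapply(songs[features], cor, y = songs$track.popularity)

# Strongest relationships first, regardless of sign.
ranked <- r[order(abs(r), decreasing = TRUE)]
```

Sorting by abs(r) matters because a strongly negative coefficient, like liveness in this toy data, is just as strong a relationship as a positive one.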

Conclusion

The main conclusion that can be drawn from this analysis is that diversity in music is valuable. Charlie Puth and Billie Eilish have songs in an incredible array of key signatures, and their discographies are well balanced between modes. A sense of familiarity results from the frequent use of 4/4 as the time signature. This allows their music to be refreshing, yet not so unconventional that it would deter listeners. There are some limitations to the analysis. The Spotify API does not give information on key changes or time signature changes. I think those would have been a super interesting aspect to investigate, as they add such a fascinating element to a song. I for one get much more invested in songs with unique features like switches in tonality. This analysis was very thought-provoking and allowed me to develop an even greater appreciation for these artists. Charlie and Billie have both produced multiple chart-topping songs by exploring their own creative interests instead of following one basic mold.

Sources

Content:

https://www.allmusic.com/artist/charlie-puth-mn0003345212/biography

https://www.thefamouspeople.com/profiles/charlie-puth-29863.php

https://college.berklee.edu/people/charlie-puth

https://en.wikipedia.org/wiki/Charlie_Puth

https://www.biography.com/musician/billie-eilish

https://www.thefamouspeople.com/profiles/billie-eilish-42253.php

https://news.uchicago.edu/explainer/what-is-perfect-pitch

Images:

https://img.era.id/B-Zja5a24SpAAv0p1UScg_dBQMfZ6_cbdfTnV4dgG3s/rs:fill:1280:720/g:sm/bG9jYWw6Ly8vcHVibGlzaGVycy82NDk2Ni8yMDIxMDYxNjEyMDItbWFpbi5jcm9wcGVkXzE2MjM4MTk3ODAuanBn.jpg

https://www.nhme.org/_media/circle-of-fifths-treble-clef-v2.pdf

https://www.vox.com/culture/2019/4/18/18412282/who-is-billie-eilish-explained-coachella-2019