I want to find out more about the top songs enjoyed by music lovers. For example, who are the artists behind the most popular songs? Which albums are these songs on? What are they about? Gathering this information makes it possible to compare the songs’ characteristics. More specifically, in this assignment I will ask whether the Spotify and Genius stats for the top songs complement each other, which top artists had the most hits, and which gender dominated the charts this year.
How I Will Answer the Question
I will use data from Genius.com to answer these questions. The Genius website includes a wide range of information about songs. I will scrape the Genius page for each of the top 10 songs on the U.S. Spotify charts for 2023 and create tables about them. These tables will include the song title, the artist, the producers, a description of the song, the number of contributors to the song’s Genius page, the song’s album, and the other songs on that album.
The top 10 songs nationwide for 2023 on Spotify are as follows:
“Last Night” by Morgan Wallen
“Kill Bill” by SZA
“Flowers” by Miley Cyrus
“Ella Baila Sola” by Eslabon Armado & Peso Pluma
“Boy’s a liar Pt. 2” by PinkPantheress & Ice Spice
“Cruel Summer” by Taylor Swift
“Something in the Orange” by Zach Bryan
“You Proof” by Morgan Wallen
“Creepin’ (with The Weeknd & 21 Savage)” by Metro Boomin, The Weeknd & 21 Savage
“Anti-Hero” by Taylor Swift
Therefore, these are the songs that I will be collecting data on.
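The scraping step described above can be sketched with rvest. This is only an illustration of the general approach, not the scraping function actually used for this assignment: the function name `scrape_song_page` and both CSS selectors are hypothetical placeholders, since the markup of a Genius page is not shown here. The demonstration runs on an inline HTML snippet rather than a live request.

```r
library(rvest)

# Hypothetical helper: pull a title and an artist out of a parsed page.
# The default selectors are placeholders, NOT Genius's real markup.
scrape_song_page <- function(page, title_css = "h1", artist_css = ".artist") {
  data.frame(
    title  = page %>% html_element(title_css) %>% html_text2(),
    artist = page %>% html_element(artist_css) %>% html_text2(),
    stringsAsFactors = FALSE
  )
}

# Inline HTML stands in for a downloaded page
demo_page <- minimal_html('<h1>Kill Bill</h1><a class="artist">SZA</a>')
demo_row <- scrape_song_page(demo_page)
demo_row
```

For a live page, `read_html()` would be called on the song's Genius URL and the selectors replaced with ones found by inspecting that page.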
Data Wrangling
I performed the necessary data wrangling in a separate R script. Below is the final dataset for my analysis.
library(readr)
library(tidyverse)
library(rvest)
library(xml2)
library(httr)
library(magrittr)
New names:
• `` -> `...1`
Rows: 10 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (9): title, artist, producers, description, viewers, contributors, album...
dbl (3): ...1, viewerscount, contributorscount
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
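The column specification lists both a character `viewers` column and a numeric `viewerscount` column, which suggests that abbreviated counts such as "2.5M" were converted to numbers during wrangling. The wrangling script is not shown, so the following is only a sketch of one way to do that conversion. The function name `parse_count` is hypothetical, and it assumes each value is a number optionally followed by K or M.

```r
# Hypothetical helper: convert abbreviated counts such as "2.5M" or "411.9K"
# into plain numbers. Not the actual wrangling code for this assignment.
parse_count <- function(x) {
  x <- toupper(trimws(x))
  # Whatever follows the leading digits is treated as the suffix
  suffix <- sub("^[0-9.,]+", "", x)
  # Drop the suffix and any thousands separators before converting
  number <- as.numeric(gsub(",", "", sub("[KM]$", "", x)))
  multiplier <- ifelse(suffix == "K", 1e3, ifelse(suffix == "M", 1e6, 1))
  number * multiplier
}

parsed <- parse_count(c("2.5M", "411.9K", "87"))
parsed
```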
Analysis & Results
Visualization 1
The bar chart below shows the number of viewers each of the top 10 songs on U.S. Spotify charts has on Genius. I created this chart by scraping the Genius site for information on the top 10 songs, and then made a bar chart with the song titles on the x-axis and the number of viewers on the y-axis.
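A bar chart of this kind could be built roughly as follows. This is a sketch, not the original plotting code: it uses only the five view counts reported in this section (the other five songs are left out rather than guessed), and the column names `title` and `viewerscount` are taken from the dataset's column specification.

```r
library(ggplot2)

# Only the five view counts quoted in the text; the remaining songs are omitted
views <- data.frame(
  title = c("Kill Bill", "Anti-Hero", "Cruel Summer", "Flowers", "Ella Baila Sola"),
  viewerscount = c(2.5e6, 1.9e6, 1.9e6, 1.9e6, 411900),
  stringsAsFactors = FALSE
)

# Song titles on the x-axis (ordered by views), viewers on the y-axis
viewers_plot <- ggplot(views, aes(x = reorder(title, -viewerscount), y = viewerscount)) +
  geom_col() +
  scale_y_continuous(labels = function(v) paste0(v / 1e6, "M")) +
  labs(x = "Song", y = "Genius page viewers")

viewers_plot
```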
This shows us that “Kill Bill” has the most viewers (2.5M) on its Genius page, followed by “Anti-Hero,” “Cruel Summer,” and “Flowers,” which are tied for second place with 1.9M each. “Ella Baila Sola” has the fewest viewers at 411.9K. This is contrary to what I expected, because on the Spotify chart “Kill Bill” ranked second, “Anti-Hero” tenth, “Cruel Summer” sixth, “Flowers” third, and “Ella Baila Sola” fourth. None of these songs’ view-count rankings match their Spotify rankings.
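The rank comparison in this section can be written out explicitly. The sketch below uses only the five songs whose positions are stated here: Spotify ranks come from the top-10 list, and Genius view ranks come from the description of the chart (first, three tied for second, and last of ten). The column names are hypothetical.

```r
# Ranks for the five songs discussed in the text
ranks <- data.frame(
  title = c("Kill Bill", "Anti-Hero", "Cruel Summer", "Flowers", "Ella Baila Sola"),
  spotify_rank = c(2, 10, 6, 3, 4),
  genius_view_rank = c(1, 2, 2, 2, 10),
  stringsAsFactors = FALSE
)

# Negative gap: the song ranks higher on Genius views than on Spotify
ranks$rank_gap <- ranks$genius_view_rank - ranks$spotify_rank
ranks$same_rank <- ranks$rank_gap == 0

ranks
any(ranks$same_rank)
```

A gap of zero would mean the two rankings agree for that song; none of the five rows has one.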
Visualization 2
This bar chart examines the number of contributors to each hit’s Genius page. I used the data from my Genius page scrape function to compile the number of contributors for every song.
The above chart shows that “Anti-Hero” has the most contributors, followed by “Cruel Summer” and then “Kill Bill.” It is unexpected that “Anti-Hero” has the most contributors, since it ranked last in the top 10. “Something in the Orange” and “You Proof” have the fewest contributors, though they rank seventh and eighth on the chart, respectively.
Visualization 3
The chart below depicts the number of hits each artist has on the top 10 chart. I used the data I gathered from Genius using my web scraping function for this visualization.
All artists had one hit except Morgan Wallen and Taylor Swift, who each had two. This could be because both had big years in 2023. Morgan Wallen released the album “One Thing at a Time” and supported it with the “One Night at a Time” tour. Taylor Swift, meanwhile, released “Taylor’s Version” re-recordings of past albums and performed on the “Eras” tour. Both were very successful in their respective genres this past year.
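The hit counts in this section can be reproduced directly from the top-10 list. The sketch below assumes that each collaboration is treated as a single artist entry, exactly as it is written in the list; the original counting code is not shown, so this is one plausible way to get the same tallies.

```r
library(dplyr)

# Titles and artists as given in the top-10 list
songs <- data.frame(
  title = c("Last Night", "Kill Bill", "Flowers", "Ella Baila Sola",
            "Boy's a liar Pt. 2", "Cruel Summer", "Something in the Orange",
            "You Proof", "Creepin'", "Anti-Hero"),
  artist = c("Morgan Wallen", "SZA", "Miley Cyrus", "Eslabon Armado & Peso Pluma",
             "PinkPantheress & Ice Spice", "Taylor Swift", "Zach Bryan",
             "Morgan Wallen", "Metro Boomin, The Weeknd & 21 Savage", "Taylor Swift"),
  stringsAsFactors = FALSE
)

# One row per artist entry, with the number of top-10 songs credited to it
hits <- songs %>% count(artist, name = "hits", sort = TRUE)
hits
```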
Visualization 4
The visualization below shows the relationship between the number of viewers and the number of contributors on each song’s Genius page.
It appears that there is a positive correlation between the viewers and contributors of a song’s Genius page. As a song gets more views, it also tends to have more contributors. I think it’s important to note that both of Taylor Swift’s songs have high numbers of viewers and contributors, while both of Morgan Wallen’s songs have very few. This might give us a little insight into their fans.
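The visual impression of a positive correlation could be checked with a correlation coefficient. The helper below is a hypothetical sketch: the function name is invented, and it assumes a data frame with the `viewerscount` and `contributorscount` columns named in the column specification. No values are filled in here, because the contributor counts are not reported in the text.

```r
# Hypothetical helper: correlation between viewers and contributors.
# `songs` is assumed to have numeric viewerscount and contributorscount columns.
viewer_contributor_cor <- function(songs, method = "pearson") {
  stopifnot(all(c("viewerscount", "contributorscount") %in% names(songs)))
  cor(songs$viewerscount, songs$contributorscount,
      method = method, use = "complete.obs")
}
```

With the final dataset loaded as `songs`, `viewer_contributor_cor(songs)` returns a value between -1 and 1, and a value close to 1 would support the pattern described above. With only ten songs, `method = "spearman"` may be a reasonable alternative because it depends on ranks rather than raw counts.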
Visualization 5
To create this visualization, I made a new gender column in the dataset that tells us whether the main artist of each song is male or female. For example, “Creepin’” is technically by Metro Boomin but features The Weeknd and 21 Savage, so Metro Boomin is the main artist of that song.
The above chart shows that the top 10 songs in America this year are split evenly between male and female main artists. Because of this even split, neither gender dominated the chart, and the result doesn’t give much insight into the gender norms of popular music this past year.
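The gender coding described in this section can be sketched as a lookup on each song's main artist. The original code is not shown, so several things here are assumptions: the column name `main_artist`, the choice of the first-listed artist as the main artist for every collaboration (following the “Creepin’” example), and the gender assigned to each name, including coding the group Eslabon Armado as male. These assumptions are consistent with the even split reported above.

```r
library(dplyr)

# Main artist per song, in chart order (first-listed artist for collaborations)
songs <- data.frame(
  title = c("Last Night", "Kill Bill", "Flowers", "Ella Baila Sola",
            "Boy's a liar Pt. 2", "Cruel Summer", "Something in the Orange",
            "You Proof", "Creepin'", "Anti-Hero"),
  main_artist = c("Morgan Wallen", "SZA", "Miley Cyrus", "Eslabon Armado",
                  "PinkPantheress", "Taylor Swift", "Zach Bryan",
                  "Morgan Wallen", "Metro Boomin", "Taylor Swift"),
  stringsAsFactors = FALSE
)

# Assumed lookup table, not taken from the original dataset
gender_lookup <- c(
  "Morgan Wallen" = "male", "SZA" = "female", "Miley Cyrus" = "female",
  "Eslabon Armado" = "male", "PinkPantheress" = "female",
  "Taylor Swift" = "female", "Zach Bryan" = "male", "Metro Boomin" = "male"
)

# Add the gender column, then count songs per gender
songs <- songs %>% mutate(gender = unname(gender_lookup[main_artist]))
gender_counts <- songs %>% count(gender)
gender_counts
```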