These charts analyze features of the songs on Spotify’s Top 200 Chart from 2020-2021. Ever wonder which song appeared on the Top 200 Chart the most times in 2020-2021, or which artist had the most songs on it? The graphs in the tabs below answer these questions and more.
The dataset behind the following charts contains a broad range of descriptive information about the songs on Spotify’s Top 200 Chart from 2020-2021. This analysis focuses on five fields: Song Name, Artist Name, Song Duration in milliseconds, the Number of Followers an Artist has on Spotify, and the Year a Song was released.
This analysis presents the top Songs and Artists in several categories of the dataset. Tab 1 shows the Top 10 Artists with the most songs on Spotify’s Top 200 Chart from 2020-2021. Tab 2 builds on the results from Tab 1 and shows the percentage of the Top 200 Chart occupied by the Top 3 artists with the most songs on the chart in 2020-2021. Tab 3 shows the Top 10 longest songs (in milliseconds). Tab 4 displays the 10 songs with the most appearances on the Top 200 Chart in 2020-2021, along with the number of followers their artists have on Spotify. Tab 5 shows how the entries on the Top 200 Chart (2020-2021) are distributed by release year.
This bar chart shows the Top 10 Artists with the most songs on Spotify’s Top 200 Chart (2020-2021). Taylor Swift was the Artist with the most songs featured (52).
artistcount <- data.frame(count(spotify_dataset, Artist))
artistcount <- artistcount[order(artistcount$n, decreasing = TRUE),]
artistcount$n <- as.numeric(artistcount$n)
top_artists <- artistcount[1:10,]
ggplot(top_artists, aes(x= reorder(Artist, -n), y=n)) +
geom_bar(colour = "black", fill = "lightblue", stat = "identity") +
labs(title="Top 10 Artists with Most Songs on the Spotify Top 200 Chart (2020-2021)", x= "Artist", y= "Number of Songs") +
theme_light()+
theme(plot.title = element_text(hjust = 0.5))+
geom_label(aes(label=n),
size=4,
color='purple')
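The counting and ranking step behind this chart can be sketched in base R on a small made-up data frame. The `toy` data and the names `toy_count` and `top_toy` are illustrative assumptions, not part of the original analysis, which uses `dplyr::count()` on `spotify_dataset`.

```r
# Toy stand-in for spotify_dataset; these rows are made up for illustration.
toy <- data.frame(
  Artist = c("A", "B", "A", "C", "A", "B"),
  Song.Name = c("s1", "s2", "s3", "s4", "s5", "s6"),
  stringsAsFactors = FALSE
)

# table() counts rows per artist; as.data.frame() turns that into columns Artist and n.
toy_count <- as.data.frame(table(Artist = toy$Artist), responseName = "n",
                           stringsAsFactors = FALSE)

# Sort by count, largest first, and keep at most the first two artists.
toy_count <- toy_count[order(toy_count$n, decreasing = TRUE), ]
top_toy <- head(toy_count, 2)
print(top_toy)
```

Using `head()` rather than `[1:2, ]` avoids rows of `NA` when the data has fewer artists than requested.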
Based on the results from Tab 1, this pie chart displays the Top 3 Artists with the Most Songs on Spotify’s Top 200 Chart (Taylor Swift, Justin Bieber, and Lil Uzi Vert). The percentages are based on the number of times their songs were found on the Top 200 Chart compared to all other artists on the chart.
top_3 <- spotify_dataset %>%
select(Artist) %>%
dplyr::mutate(artists = ifelse(Artist=="Taylor Swift", "Taylor Swift", ifelse(Artist=="Justin Bieber", "Justin Bieber", ifelse(Artist=="Lil Uzi Vert", "Lil Uzi Vert", "Other")))) %>%
group_by(artists) %>%
dplyr::summarise(n= length(artists), .groups='keep') %>%
group_by(artists) %>%
mutate(percent_of_total = round(100*n/sum(artistcount$n), 2)) %>%
ungroup() %>%
data.frame()
top_3$artists = factor(top_3$artists, levels=c("Taylor Swift", "Justin Bieber", "Lil Uzi Vert", "Other"))
ggplot(data = top_3, aes(x="", y= n, fill = artists)) +
geom_bar(stat= "identity", position= "fill")+
coord_polar(theta= "y", start=0)+
labs(fill = "Artist Name", x= NULL, y= NULL, title= "Percentage of Spotify's Top 200 Chart Occupied by the Top 3 Most Featured Artists (2020-2021)")+
theme_light()+
theme(plot.title = element_text(hjust = 0.5),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank())+
scale_fill_brewer(palette = "Pastel1") +
geom_text(aes(x=1.5, label= paste0(percent_of_total, "%")),
size=3.4,
position=position_fill(vjust = 0.5))
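The share calculation behind the pie chart can be sketched in base R as follows. The counts and artist labels are made up, and `counts`, `top_names`, and `share` are hypothetical names used only for this illustration.

```r
# Made-up per-artist row counts, standing in for the real chart data.
counts <- c("Artist A" = 5, "Artist B" = 3, "Artist C" = 2, "Artist D" = 10)
top_names <- c("Artist A", "Artist B")

# Relabel every artist outside the chosen set as "Other", then sum per label.
label <- ifelse(names(counts) %in% top_names, names(counts), "Other")
share <- tapply(counts, label, sum)

# Percentage of all rows held by each label.
percent_of_total <- round(100 * share / sum(share), 2)
print(percent_of_total)
```

Because the denominator is the total over all labels, the percentages add up to 100 apart from rounding.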
This line graph displays the Top 10 Longest Songs (ordered alphabetically) along with the length of each song in milliseconds. The longest song featured on Spotify’s Top 200 Chart from 2020-2021 was SWEET / I THOUGHT YOU WANTED TO DANCE at 588,139 milliseconds.
songlength <- data.frame(count(spotify_dataset, Song.Name, Duration..ms.))
# Keep only rows with a known duration. A logical mask is safe even when no value is missing;
# indexing with -which(...) would drop every row in that case.
songlength <- songlength[!is.na(songlength$Duration..ms.),]
songlength <- songlength[order(songlength$Duration..ms., decreasing= TRUE),]
ggplot(songlength[1:10,], aes(x= Song.Name, y=Duration..ms., group=1)) +
geom_line(colour = "darkgreen", size= 1)+
geom_point(shape=21, size=4, color='red', fill='white')+
coord_flip()+
labs(x="Song Name", y= "Duration in ms", title= "Top 10 Longest Songs on Spotify's Top 200 Chart (2020-2021)")+
scale_y_continuous(labels=comma)+
theme_light()+
theme(axis.text.x = element_text(angle=-45, hjust=0, size=10,colour="black"))+
theme(plot.title = element_text(hjust=0.5))+
geom_label_repel(aes(label=scales::comma(Duration..ms.)),
box.padding = 1.3,
point.padding = 1,
size=4,
color='Grey50',
segment.color = 'green')
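A note on dropping rows with missing durations: in R, negative indexing with the result of `which()` removes every row when nothing is missing, because `-integer(0)` selects no rows. The sketch below shows this on a made-up data frame (`df` and `na_rows` are illustrative names) and contrasts it with a logical mask.

```r
# Made-up data with no missing durations.
df <- data.frame(song = c("x", "y", "z"), ms = c(200000, 180000, 240000))

na_rows <- which(is.na(df$ms))       # integer(0): nothing is missing here
dropped <- df[-na_rows, ]            # -integer(0) selects no rows at all
masked  <- df[!is.na(df$ms), ]       # the logical mask keeps all complete rows

print(nrow(dropped))
print(nrow(masked))
```

The logical mask behaves the same whether or not any `NA` values are present, which makes it the safer default.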
This dual-axis graph displays the Top 10 songs that appeared the most times on Spotify’s Top 200 Chart from 2020-2021. The line on the graph shows the number of Spotify followers of the artist behind each song. The song that appeared the most times on the Top 200 Chart was Falling by Harry Styles (142 times). Ed Sheeran, whose song Perfect had the third-most appearances on the Top 200 Chart (83), has the most followers of the artists featured on this graph, at just under 90 million.
charted <- data.frame(count(spotify_dataset, Artist, Song.Name, Number.of.Times.Charted, Artist.Followers))
charted <- charted[order(charted$Number.of.Times.Charted, decreasing= TRUE),]
top_charted <- charted[1:10,]
top_charted$song.name.artist <- paste0(top_charted$Song.Name, " (", top_charted$Artist, ")")
#top_charted$Artist.Followers <- as.factor(top_charted$Artist.Followers)
top_y <- seq(0, max(top_charted$Artist.Followers)/1e6, 5)
my_labels_first_axis <- paste0(seq(0,100,10),"M")
ggplot(top_charted, aes(x= reorder(song.name.artist, Number.of.Times.Charted, sum), y= Number.of.Times.Charted, fill=NULL)) +
geom_bar(colour = "black", fill = "purple", stat= "identity")+
geom_text(data= top_charted, aes(x= song.name.artist, y= Number.of.Times.Charted, label= Number.of.Times.Charted, fill= NULL), hjust = -0.1, size=4, color= "black")+
coord_flip()+
theme_light()+
scale_fill_brewer(palette= "Paired", guide= guide_legend(reverse= TRUE))+
geom_line(inherit.aes = FALSE,
data=top_charted,
aes(x= song.name.artist, y= Artist.Followers/150e3, colour= "Artist's Followers",
group=1),
size=1)+
scale_color_manual(NULL, values= "black")+
labs(title= "Top 10 Songs with Most Appearances on Spotify's Top 200 Chart (2020-2021) and Number of Followers per Artist", x= "Song Name", y= "Number of Times on Spotify's Top 200 Chart", fill= "Number of Followers per Artist")+
theme(plot.title = element_text(hjust=0.5)) +
scale_y_continuous(labels=comma, limits=c(0,650),
sec.axis=sec_axis(~. *150e3, name="Followers",
breaks = seq(0,100e6,10e6),
labels=my_labels_first_axis))
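The dual axis works by rescaling: follower counts are divided by 150e3 so the line fits on the appearance-count axis, and `sec_axis()` multiplies by the same factor to label the secondary axis in followers. A minimal sketch of that round trip, using made-up follower counts and illustrative variable names:

```r
scale_factor <- 150e3                     # same factor as in the plot code

followers  <- c(20e6, 45e6, 90e6)         # made-up follower counts
on_primary <- followers / scale_factor    # position of the line on the count axis
back       <- on_primary * scale_factor   # value the secondary axis reports

# Positions on the primary axis of the secondary-axis breaks (every 10M followers).
breaks_primary <- seq(0, 100e6, 10e6) / scale_factor
print(round(breaks_primary, 1))
```

With this factor, the 100M break sits at about 666.7 on the primary axis, just above the upper limit of 650 set in `scale_y_continuous()`, so it may not be drawn.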
This donut chart shows how the entries on the Top 200 Chart (2020-2021) are distributed by release year. It focuses on the 4 most recent years (2018-2021) and groups all earlier years as Other. Songs released in 2020 account for 51.2% (a total of 783) of the entries on the Top 200 Chart from 2020-2021.
songdate <- data.frame(dplyr::count(spotify_dataset, Song.Name, Release.Date))
release_date <- spotify_dataset %>%
select(Release.Date) %>%
dplyr::mutate(years = year(ymd(Release.Date))) %>%
group_by(years) %>%
dplyr::summarise(n= length(Release.Date), .groups = 'keep') %>%
data.frame()
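# Drop the group whose release year could not be parsed. A logical mask is safe even when
# no year is missing; indexing with -which(...) would drop every row in that case.
release_date <- release_date[!is.na(release_date$years),]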
top_4 <- release_date %>%
select(years, n) %>%
dplyr::mutate(recentyears = ifelse(years>= 2018, years, "Other")) %>%
group_by(recentyears) %>%
dplyr::summarise(tot= sum(n), .groups='keep') %>%
data.frame()
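The year extraction and bucketing above can be sketched in base R without `lubridate`. The dates below are made up, and the sketch assumes the release dates are stored as `YYYY-MM-DD` strings; `dates`, `bucket`, and `tot` are illustrative names.

```r
# Made-up release dates; the NA stands in for a row with no usable date.
dates <- c("2020-03-20", "2017-01-06", "2021-05-14", "2020-11-27", NA)

# Pull out the year and drop entries whose date is missing.
years <- as.integer(format(as.Date(dates), "%Y"))
years <- years[!is.na(years)]

# Keep recent years as their own label and fold everything older into "Other".
bucket <- ifelse(years >= 2018, as.character(years), "Other")
tot <- table(bucket)
print(round(100 * tot / sum(tot), 1))
```

Converting the year to character before the `ifelse()` makes the mixed year/"Other" labels explicit instead of relying on implicit coercion.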
plot_ly(top_4, labels= ~recentyears, values= ~tot) %>%
add_pie(hole=0.6) %>%
layout(title= "Total Appearances of Song Release Years on Spotify's Top 200 Chart (2020-2021)") %>%
layout(annotations=list(text=paste0("Total Appearances of Release Years: \n", scales::comma(sum(top_4$tot))),
showarrow=FALSE))
Based on this analysis, we can identify which songs and artists were the most successful on Spotify’s Top 200 Chart in 2020-2021 across a broad range of categories. I hope these visualizations were insightful and gave you an in-depth look at the songs and artists that dominate the music industry today.