Assignment 1

Author

Mohamed Ali

Introduction

The dataset used in this analysis is drawn from the Rolling Stone Top 500 Albums list, which features some of the most influential albums in music history. Each record includes the artist, album title, year of release, genre, and subgenre, so the data reflects how musical trends have evolved over time.

The purpose of this study is to identify the ten most common genres and the ten most common subgenres in the dataset. Seeing which genres and subgenres appear most often suggests which ones have had the biggest impact on the industry, at least as reflected in this list.

The frequency of the top 10 genres is shown in a bar chart, which makes it simple to compare how often each genre appears. A bubble chart gives a more creative view of the top 10 subgenres, using bubble size to show the number of albums in each subgenre.

By highlighting the most common genres and subgenres in Rolling Stone’s Top 500 Albums list, these visualizations present a broad view of the main musical patterns in the list.

First Categorical Analysis

Code

library(tidyverse)

# Load the Rolling Stone Top 500 Albums data
top_albums <- read_csv(
  "https://jsuleiman.com/datasets/Rolling_Stones_Top_500_Albums.csv",
  locale = locale(encoding = "ISO-8859-2", asciify = TRUE)
)

Code

# Count albums per genre and keep the ten most frequent (ties included)
top_genres <- top_albums |>
  count(Genre, sort = TRUE) |>
  slice_max(n, n = 10)

# Bar chart of genre counts, with bars ordered by count
ggplot(top_genres, aes(x = reorder(Genre, n), y = n)) +
  geom_col(fill = "skyblue") +
  labs(title = "Top 10 Most Common Genres", x = "Genre", y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Analysis & Reflection

I noticed that a few genres dominate Rolling Stone’s list of the Top 500 albums. The most common genres (Rock, Pop, and Hip-Hop, for example) have much higher counts than the others. That means these genres are more heavily represented in the dataset and have had a significant impact on music history. I also found a sharp drop in frequency after the top genres, signaling that far fewer albums from other genres are included in the rankings. This supports the idea that a small number of popular styles have the greatest influence on the makeup of the list.

Since a bar chart is one of the best tools for comparing categorical data, I decided to use it for the genre visualization. I only had ten categories to deal with, so it was simple to show differences in frequency using a bar chart. I used reorder(Genre, n) to sort the bars by count, which arranges them in ascending order so that the most common genres appear at the right end of the axis. By limiting the design to a single color (skyblue), I also made it easier to concentrate on the genre breakdown without distraction. I followed the Gestalt principles of proximity and similarity, which improve the audience’s ability to quickly organize and compare data. Using theme_minimal() kept the chart clear and clutter-free, which made it even simpler to understand.
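The effect of reorder() on bar order can be checked without plotting. The following is a minimal sketch that uses hypothetical genre names and counts (not values from the dataset) to show that reorder(Genre, n) sorts the factor levels in ascending order of n, and that negating n reverses the order.

```r
# Hypothetical counts for illustration only; not taken from the dataset.
genre_counts <- data.frame(
  Genre = c("Genre A", "Genre B", "Genre C"),
  n     = c(12, 40, 25)
)

# reorder(Genre, n) orders the factor levels by ascending n,
# so the largest bar is drawn last (far right on the x-axis).
ascending <- levels(reorder(genre_counts$Genre, genre_counts$n))

# Negating n reverses the order, so the largest bar is drawn first.
descending <- levels(reorder(genre_counts$Genre, -genre_counts$n))

print(ascending)
print(descending)
```

In a ggplot2 call, the same idea would be written as aes(x = reorder(Genre, -n), y = n) if the most common genre should appear on the left.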

The genre bar chart communicates the relative frequency of genres in the dataset clearly. Sorting the bars makes the ranking obvious at a glance, and the simple structure makes it easy to identify which genres dominate. The uncluttered layout keeps the message focused and simplifies interpretation. Although the chart shows the genre rankings, it doesn’t offer a deeper understanding of why certain genres are so common. I also noticed that placing percentage labels above the bars would improve readability by making comparisons easier. Finally, this graph doesn’t depict any historical trends, even though it’s helpful for giving an overall picture of each genre.
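One way percentage labels could be added is sketched below. It uses hypothetical counts rather than the real data, and the column names pct and label are introduced only for illustration. The percentages here are shares of the bars being displayed; to show each genre's share of all 500 albums, the percentage would need to be computed before keeping only the top ten.

```r
library(dplyr)
library(ggplot2)

# Hypothetical counts for illustration only; not taken from the dataset.
genre_counts <- data.frame(
  Genre = c("Genre A", "Genre B", "Genre C"),
  n     = c(10, 30, 60)
)

# Compute each genre's share of the displayed total and format it as a label.
genre_pct <- genre_counts |>
  mutate(
    pct   = n / sum(n),
    label = paste0(round(100 * pct), "%")
  )

# Same bar chart as before, with the label drawn just above each bar.
p <- ggplot(genre_pct, aes(x = reorder(Genre, n), y = n)) +
  geom_col(fill = "skyblue") +
  geom_text(aes(label = label), vjust = -0.5) +
  labs(x = "Genre", y = "Count") +
  theme_minimal()

print(genre_pct)
```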

Second Categorical Analysis

Code

top_albums <- read_csv(
  "https://jsuleiman.com/datasets/Rolling_Stones_Top_500_Albums.csv",
  locale = locale(encoding = "ISO-8859-2", asciify = TRUE)
)

Rows: 500 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): Album, Artist, Genre, Subgenre
dbl (2): Number, Year

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Code

top_subgenres_count <- top_albums |>  
  count(Subgenre, sort = TRUE) |>  
  slice_max(n, n = 10)  

ggplot(top_subgenres_count, aes(x = Subgenre, y = n, size = n, fill = Subgenre)) +  
  geom_point(alpha = 0.7, shape = 21, color = "black") +  
  scale_size(range = c(5, 20)) +  
  labs(title = "Top 10 Most Common Subgenres (Bubble Chart)", 
       x = "Subgenre", 
       y = "Count", 
       size = "Subgenre Count") +  
  theme_minimal() +  
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))

Analysis & Reflection

The subgenre bubble chart highlights distinct musical styles, which helped me understand how subgenres fit into the larger genre classifications. The chart appears to include subgenres of hip-hop (East Coast, West Coast Hip-Hop) and rock (Classic Rock, Alternative Rock). The wide range of bubble sizes implies that some subgenres are much more common than others. This shows that a few styles dominate while others, though less frequent, are still represented. Looking at the subgenre level revealed layers in the dataset that a genre-level analysis would have missed.

I chose a bubble chart rather than a bar chart for the subgenre display because it offers a more engaging way to illustrate relative size differences. A bubble chart suited the diversity and distribution of musical styles, since subgenres frequently overlap and share influences with other genres. To highlight which styles were most common, I set the bubble size according to how often each subgenre occurs. Besides helping to separate the bubbles, using a different color for each subgenre improved the chart’s aesthetic appeal. This is in line with the idea of preattentive processing: viewers notice size differences almost instantly and use them to judge order. Using angle = 45, I rotated the x-axis labels so that longer subgenre names stay visible and easier to read.

By highlighting the most common subgenres, the bubble chart gives a more detailed picture of the distribution of musical styles. The variation in bubble sizes creates a clear visual order, which makes it simple to see which subgenres dominate. Colors give an additional degree of distinction and keep the bubbles from looking too similar. Although the chart is visually appealing, I found that it is not as accurate as a bar chart for comparing actual quantities. When bubbles are close in size, it can be hard to discern small differences in count. Some subgenre names are also lengthy, and readability could still be improved despite the x-axis rotation. It would be easier to see how subgenres fit into the broader musical landscape if they were grouped according to their primary genres.
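Grouping subgenres under their primary genres could be sketched roughly as follows. The rows are hypothetical stand-ins for top_albums, and the sketch assumes each album has a single Genre and a single Subgenre value. It counts each genre and subgenre pair, then draws one panel per genre with horizontal bars, which also leaves more room for long subgenre names.

```r
library(dplyr)
library(ggplot2)

# Hypothetical rows standing in for top_albums; not the real data.
albums <- data.frame(
  Genre    = c("Genre A", "Genre A", "Genre A", "Genre B", "Genre B"),
  Subgenre = c("Sub 1", "Sub 1", "Sub 2", "Sub 3", "Sub 3")
)

# Count albums for every genre/subgenre combination.
subgenre_counts <- albums |>
  count(Genre, Subgenre, sort = TRUE)

# One panel per primary genre; horizontal bars keep subgenre names readable.
p <- ggplot(subgenre_counts, aes(x = n, y = reorder(Subgenre, n), fill = Genre)) +
  geom_col() +
  facet_wrap(~ Genre, scales = "free_y") +
  labs(x = "Count", y = "Subgenre") +
  theme_minimal()

print(subgenre_counts)
```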

Conclusion

Gestalt principles helped simplify the data used in this study. By applying the same color to every bar, the bar chart uses similarity to make it easier to see patterns and compare the frequency of different genres. To show the most common subgenres, the bubble chart relies on proximity, letting the size and placement of the bubbles group related subgenres together. Combining these two charts gives a fuller picture of the genre distribution. Whereas the bar chart supports direct comparison, the bubble chart emphasizes which subgenres dominate. Together, they show major patterns in the Rolling Stone Top 500 Albums list and offer a look at the musical preferences it reflects.

AI Use

To help me with my assignment, I used AI. In the code for my first visualization, for instance, I had put quotation marks around the column name: aes(x = reorder("Genre", n), y = n), which turned my visual into a single block. With AI’s help, I was able to remove the quotation marks. On top of that, the visual was pretty hard to read at first. I asked AI to help me make my code cleaner and easier to read, and with its help I was able to display only the top ten most common genres, which drastically simplified my visual. For my second visualization, I was initially unable to create a heatmap, since the code was not producing a traditional heatmap for some reason. Because of this, I changed it to a bubble chart with a little help from AI. I also adjusted it so that only the top 10 subgenres were displayed, in a different kind of visual. I finished my code with the aid of the paid version of ChatGPT.