Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
This data visualisation, titled “Distribution (%) of Lego brick colours in current Lego sets”, uses a stylised, minimalistic 3D bar chart to display the colour distribution of Lego bricks. Stacks of Lego bricks of different heights represent each colour's percentage share. The target audience for this chart is Lego enthusiasts and people who enjoy looking at interesting data visualisations.
The visualisation chosen had the following three main issues:
Reference
The following code was used to fix the issues identified in the original.
# Required libraries
library(readr)
library(ggplot2)
library(dplyr)
# Read in data
# Data downloaded from https://rebrickable.com/downloads/
inventory_parts_df <- read_csv(file = "inventory_parts.csv")
colors_df <- read_csv(file = "colors.csv")
# Generate summary data frame that holds distinct color counts
summary_df <- inventory_parts_df %>%
count(color_id)
# Rename the "id" variable in the "colors_df" data frame to "color_id" to match the "color_id" variable in the "summary_df" data frame
names(colors_df)[1] <- "color_id"
# Add color details to "summary_df" data frame (join on "color_id")
summary_df <- merge(summary_df, colors_df, by = "color_id")
# Calculate and add new "percent" variable to the "summary_df" data frame
summary_df$percent <- 100 * summary_df$n / sum(summary_df$n)
# Round "percent" variable to 2 decimal places
summary_df$percent <- round(summary_df$percent, 2)
# Reorder "summary_df" data frame using "n" (count)
summary_df <- summary_df[order(decreasing = TRUE, summary_df$n),]
# Create new "chart_df" data frame that only holds top 34 observations to match original chart
chart_df = summary_df[1:34,]
# Generate Bar Chart
# geom_col() already draws bars from the "percent" values, so a separate geom_bar(stat = "identity") layer is not needed
p1 <- ggplot(chart_df, aes(x = reorder(name, percent), y = percent, fill = name)) +
geom_col(colour = "black") +
scale_fill_manual(values=c(
"[No Color/Any Color]" = "#05131D",
"Black" = "#05131D",
"Blue" = "#0055BF",
"Bright Green" = "#4B9F4A",
"Bright Light Orange" = "#F8BB3D",
"Bright Pink" = "#E4ADC8",
"Brown" = "#583927",
"Dark Blue" = "#0A3463",
"Dark Bluish Gray" = "#6C6E68",
"Dark Brown" = "#352100",
"Dark Gray" = "#6D6E5C",
"Dark Pink" = "#C870A0",
"Dark Purple" = "#3F3691",
"Dark Red" = "#720E0F",
"Dark Tan" = "#958A73",
"Flat Silver" = "#898788",
"Green" = "#237841",
"Light Bluish Gray" = "#A0A5A9",
"Light Gray" = "#9BA19D",
"Lime" = "#BBE90B",
"Medium Azure" = "#36AEBF",
"Medium Dark Flesh" = "#CC702A",
"Orange" = "#FE8A18",
"Pearl Gold" = "#AA7F2E",
"Red" = "#C91A09",
"Reddish Brown" = "#582A12",
"Tan" = "#E4CD9E",
"Trans-Clear" = "#FCFCFC",
"Trans-Light Blue" = "#AEEFEC",
"Trans-Orange" = "#F08F1C",
"Trans-Red" = "#C91A09",
"Trans-Yellow" = "#F5CD2F",
"White" = "#FFFFFF",
"Yellow" = "#F2CD37"
))+
geom_text(aes(label = percent), hjust = -0.2, size = 3.5) +
labs(title = "Distribution (%) of brick colours in current Lego sets",
x = "Colour",
y = "Percentage") +
ylim(0, 20) +
coord_flip() +
theme_classic() +
theme(legend.position = "none")
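The data preparation steps above (count, join, percentage, top-N) can also be written as a single dplyr pipeline. The following is a minimal sketch on toy data, not the code used for the reconstruction: the `inventory_parts_toy` and `colors_toy` data frames and their values are hypothetical stand-ins for the Rebrickable CSV files, and `slice_head(n = 2)` takes the place of the top-34 cut.

```r
library(dplyr)

# Hypothetical stand-ins for inventory_parts.csv and colors.csv
inventory_parts_toy <- tibble(color_id = c(0, 0, 0, 15, 15, 4))
colors_toy <- tibble(id = c(0, 4, 15), name = c("Black", "Red", "White"))

summary_toy <- inventory_parts_toy %>%
  # Number of inventory rows per colour (stored in column "n")
  count(color_id) %>%
  # Attach colour names; matches "color_id" to "id" without renaming
  inner_join(colors_toy, by = c("color_id" = "id")) %>%
  # Share of each colour as a percentage, rounded to 2 decimal places
  mutate(percent = round(100 * n / sum(n), 2)) %>%
  # Most frequent colours first, then keep the top 2
  arrange(desc(n)) %>%
  slice_head(n = 2)

print(summary_toy)
```

Using `inner_join()` with a named `by` vector avoids renaming the `id` column beforehand. As with `merge()`, colours without a match are dropped before the percentages are calculated.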
Data Reference
The following plot fixes the main issues in the original.