Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://www.visualcapitalist.com/gen-z-unemployment-rate-chart/. (2021)


Objective

The visualisation above shows a substantial gap between the unemployment rates of Generation Z (people born between the mid-to-late 1990s and the early 2010s) and older generations (born before Generation Z) in OECD countries in 2020. Its main objective is to show how the COVID-19 pandemic affected Generation Z’s unemployment rate, and it is aimed at making the general public aware of the situation.

The visualisation chosen had the following three main issues:

  • A note in the visualisation states, “The data for Italy’s 25-74 year old unemployment rate is unavailable.” However, that value is missing for Mexico, not Italy. This may be an error in the original data pre-processing: during my own pre-processing, both datasets contained values for Italy, whereas the 25-74 year old unemployment rate for Mexico was missing.
  • The line connecting the unemployment rates of the two age groups, drawn with a colour gradient running between blue and red, is misleading. It suggests that the unemployment rate changes from one value to the other, rather than showing the gap between the rates of two distinct age groups.
  • A data visualisation should be self-explanatory, but this one relies on a lot of explanatory text to convey basic information.
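
The first issue can be checked directly against the data. The following is a minimal sketch, assuming a wide table with the column names used in the Code tab (`Country Name`, `15-24 year olds`, `25-74 year olds`). The data frame `toy` and its numbers are hypothetical placeholders, not the OECD figures; the missing value is placed at Mexico only to mirror the finding described above.

```r
# Toy stand-in for the wide unemployment table.
# All values are placeholders for illustration, not real OECD data.
toy <- data.frame(
  `Country Name` = c("Italy", "Mexico", "Japan"),
  `15-24 year olds` = c(10, 20, 30),
  `25-74 year olds` = c(1, NA, 3),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the rows where at least one age group has no value
missing_rows <- toy[!complete.cases(toy), ]

# Names of the countries affected
missing_countries <- missing_rows[["Country Name"]]
print(missing_countries)
```

Running the same check on the real table would show which country the note in the original should have named.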

Reference

Code

The following code was used to fix the issues identified in the original.

# importing required libraries
library(readr)
library(magrittr)
library(tidyr)
library(ggplot2)

# importing dataset
oecd_unemployment <- read_csv("Unemployment Rate.csv")

# transforming data from wide to long using gather() function
oecd_unemployment <- oecd_unemployment %>% gather(`15-24 year olds`, `25-74 year olds`, key = "Age Group", value = "Unemployment Rate")

# Using ggplot to reconstruct the data visualization
p1 <- ggplot(data = oecd_unemployment, mapping = aes(y = `Country Name`, x = `Unemployment Rate`, fill = `Age Group`))

p1 <- p1 + geom_bar(position = position_dodge(), width = 0.75, 
                    stat = "identity") +
  labs(
    title = "Generation Z vs Older Generation Unemployment rate (Year 2020)", 
    y = "", 
    x = "Unemployment Rate",
    caption = "* The data for Mexico 25-74 year olds unemployment rate is unavailable."
  )

# Display the reconstructed plot
p1
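
The wide-to-long step above uses `gather()`, which still works but which tidyr now marks as superseded in favour of `pivot_longer()`. As a hedged sketch of an equivalent reshape, assuming tidyr 1.0.0 or later and using a small hypothetical data frame (`toy`, with placeholder values rather than the real data):

```r
library(tidyr)

# Toy stand-in for the wide table; values are placeholders, not real data.
toy <- data.frame(
  `Country Name` = c("Italy", "Mexico", "Japan"),
  `15-24 year olds` = c(10, 20, 30),
  `25-74 year olds` = c(1, NA, 3),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Reshape from wide to long: one row per country and age group
toy_long <- pivot_longer(
  toy,
  cols = c(`15-24 year olds`, `25-74 year olds`),
  names_to = "Age Group",
  values_to = "Unemployment Rate"
)

print(toy_long)
```

The result has the same columns as the `gather()` output. The row order differs (rows are grouped by country rather than by age group), which does not affect the bar chart.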

Data Reference

Reconstruction

The following plot fixes the main issues in the original.