Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Workplace Gender Equality Agency (WGEA).


Objective

The objective of the original visualization is to show the state of gender equality at PPG Industries Australia, taken as a sample Australian workplace. The target audience is researchers and employees of the company who are curious about gender equality in their workplace.

The visualization chosen had the following three main issues:

  • This visualization fails to answer the question “are genders equal in my workplace?” because:

    1. Numbers shown in the plot are not defined clearly, and
    2. Genders are not compared side-by-side.
  • The data used for the plot is incomplete, as it covers only “Permanent Full-time staff”. As a result, 79 people, who make up 9.2% of the workers, are excluded.

  • The graph has perceptual issues. Its Effective Data Visualization score (by Stephanie Evergreen) is 63%.
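
As a rough check on the second issue, the two figures quoted there (79 excluded people, 9.2% of the workers) imply the size of the whole workforce. The sketch below is only arithmetic on those two numbers. The variable names are illustrative, and the resulting totals are estimates that are not reported in the source.

```r
# Figures quoted in the issue list above
excluded_staff <- 79      # people who are not permanent full-time staff
excluded_share <- 0.092   # their share of all workers (9.2%)

# Implied total workforce and the number of staff the plot does cover
# (estimates only, because 9.2% is a rounded figure)
implied_total   <- excluded_staff / excluded_share
implied_covered <- implied_total - excluded_staff

round(implied_total)    # about 859
round(implied_covered)  # about 780
```

In other words, the original plot appears to describe roughly 780 of about 860 workers.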

Reference

  • Workplace Gender Equality Agency, WGEA Data Explorer. Available at: https://www.wgea.gov.au/ (Accessed: November 19, 2022).

Code

The following code was used to fix the issues identified in the original.

library(ggplot2) #to draw plot
#read.csv() is part of base R, so no extra package is needed to read the CSV file
ppg <- read.csv('PPG_Data_c.csv', header = TRUE) #importing data from the data source
#categorize, label and order data
ppg$Type <- factor(ppg$Type, levels = c("Organisation value",
                                        "Industry comparison","All organisations"),
                   ordered = TRUE)
#creating plot area layer
po <- ggplot(data = ppg, aes(x = Year, y = Percentage, fill = Type))
po +
  #drawing the bars once; geom_col() uses the values as bar heights
  geom_col(aes(colour = Type), position = "dodge") +
  #selecting fill colour
  scale_fill_manual(values = c("#ffcc00",
                               "#fa9619",
                               "#4985EB")) +
  #selecting outline colour
  scale_colour_manual(values = c("#ffcc00",
                                 "#fa9619",
                                 "#4985EB")) +
  #inserting the separating vertical lines between the year groups
  geom_vline(xintercept = seq(0.495, 8.495, by = 1)) +
  #setting the main and sub titles and adding caption
  labs(title = "GENDER EQUALITY insights across time",
       subtitle = "Full-time permanent staff",
       caption = "np = Not publishable due to small sample size") +
  #assigning axis labels
  ylab("Percentage") + xlab("") +
  #scale and format y axis; Percentage is a proportion, so 0 to 1 shows 0% to 100%
  scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                     limits = c(0, 1), n.breaks = 6) +
  #placing x axis tick mark labels on the top
  scale_x_discrete(position = "top") +
  #placing the bars' values and customize
  geom_text(aes(label = paste0(Percentage * 100, " %")),
            colour = "black", angle = 90,
            position = position_dodge(0.9),
            vjust = 0.4, hjust = 1.1, size = 3) +
  #customize legends, caption and background
  theme(legend.position = "bottom", legend.title = element_blank(),
        legend.direction = "vertical", legend.justification = "left",
        plot.caption = element_text(hjust = 0), 
        panel.background = element_rect(fill = "white", colour = "white"))
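
The code above reads PPG_Data_c.csv, which is not included here. Below is a minimal sketch of the shape that file appears to need, inferred only from the column names used in the code (Year, Type, Percentage). The year labels and percentage values are made-up placeholders, not WGEA figures. The name ppg_demo is hypothetical, and using NA for "np" entries is an assumption about how the file encodes them.

```r
# Placeholder data with the three columns the plotting code refers to.
# All values are invented for illustration only; they are NOT WGEA data.
types <- c("Organisation value", "Industry comparison", "All organisations")
ppg_demo <- expand.grid(Type = types, Year = c("Year 1", "Year 2"),
                        stringsAsFactors = FALSE)

# Proportions between 0 and 1; NA is assumed to stand for an "np" entry
ppg_demo$Percentage <- c(0.20, 0.30, 0.40, NA, 0.32, 0.41)

# Same ordering step as in the code above, so bars and legend follow this order
ppg_demo$Type <- factor(ppg_demo$Type, levels = types, ordered = TRUE)

# Bar labels are built from the proportions in the same way as in geom_text()
bar_labels <- paste0(ppg_demo$Percentage * 100, " %")
```

Substituting ppg_demo for ppg should let the layers be tried without the CSV file, although the vertical separator positions assume more year groups than this placeholder has.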

Data Reference

Reconstruction

The following plot fixes the main issues in the original.