Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Visualizing gender gap by country in 2020


Objective

The main objective of this visualization is to show the countries with the largest and smallest gender gaps, together with their scores in four sub-categories: Health & Survival, Educational Attainment, Political Empowerment and Economic Participation. The visualization draws each of the top 10 and bottom 10 countries as a diamond whose four corners map the country's sub-index scores as percentages. A lower value indicates a wider gender gap, and a higher value a narrower one.

Target Audience

The target audience for this visualization is governing bodies and experts who advocate for and work on closing the gender gap in these countries.

The visualization chosen had the following three main issues:

Data Presentation - The data could have been presented more succinctly, which would have made it simpler to compare and interpret. As drawn, the visualization makes it difficult to see any trend or pattern within a category across the high- and low-performing countries. Bar charts for the individual countries would have made comparison and inference easier.

Color - The colored points used to represent the different categories are not sufficiently distinct, so it is hard to tell which value belongs to which category.

Irrelevant data - The visualization aims to show the countries with the smallest and widest gender gaps, but it also includes US data, which is irrelevant in this context because the US score does not fall in the top or bottom 10. Adding a separate layer for the US gives the misleading impression that the US data is significant for this comparison.
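The reasoning behind the third issue can be checked mechanically: a country belongs in the chart only if it appears in the top 10 or bottom 10 by overall score. The sketch below is a minimal illustration that assumes 30 placeholder countries with made-up scores (not World Economic Forum data); the names scores, top10, bottom10 and shown are illustrative.

```r
library(dplyr)

# Made-up overall scores for 30 placeholder countries; not WEF data.
scores <- data.frame(
  Countries = paste0("Country", 1:30),
  Overall   = seq(0.90, 0.32, length.out = 30)
)

# Countries that a top-10 / bottom-10 chart should contain.
top10    <- slice_max(scores, Overall, n = 10)
bottom10 <- slice_min(scores, Overall, n = 10)
shown    <- c(top10$Countries, bottom10$Countries)

# A mid-ranked country (standing in for a case like the US) is in neither group,
# so drawing it alongside the others adds data outside the chart's stated scope.
"Country15" %in% shown
```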

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(tidyr)
library(dplyr)
world_countries <- read.csv("worldeconomicforum_data.csv")
top_countries <- world_countries %>% slice_max(Overall,n=10) 
bottom_countries <- world_countries %>% slice_min(Overall,n=10)

top_countries <- top_countries %>% mutate(range = 'Top10')
bottom_countries <- bottom_countries %>%  mutate(range = "Bottom10")

#Merge the dataframes 

countries <- rbind(top_countries,bottom_countries) %>% arrange(Rank)

#factorising range to create different groups 

levels(factor(countries$range))
## [1] "Bottom10" "Top10"
countries$range <- countries$range %>% factor(levels=c("Top10","Bottom10"),ordered=TRUE)
levels(factor(countries$range))
## [1] "Top10"    "Bottom10"
range_names <- levels(countries$range)
#convert the data to long format 

countries_long <- gather(countries,subindexes,score,Health:Political,factor_key=TRUE)

#create labeller: readable facet titles for the sub-indexes 

variable_names <- c(
  "Health" = "Health \n & Survival" ,
  "Education" = "Educational \n Attainment",
  "Economy" = "Economic \n Participation",
  "Political" = "Political \n Empowerment"
)

# Named lookup for the subindexes rows; the range columns keep their level names.
# (Labeller functions taking variable and value arguments are deprecated in ggplot2.)
variable_labeller <- labeller(subindexes = variable_names)

# Factorising the countries based on their Ranks 

countries_long$Countries <- factor(countries_long$Countries, levels= unique(countries_long$Countries[order(countries_long$Rank)]))

#plot the data 

p1 <- ggplot(countries_long, aes(x=Countries, y=score, fill=range))+
  # one bar per country and sub-index; geom_col() uses the score values directly
  geom_col(alpha = 0.8, width = 0.3)+
  scale_fill_manual(values =c('aquamarine3','coral'))+
  facet_grid(subindexes~range, scales="free_x", space="free_x",labeller= variable_labeller)+
  labs(title = "Countries with highest and lowest gender gap based on different Categories", caption = "Source: World economic forum - https://www3.weforum.org",
       y = "Percentage score" , x = "Countries") +ggpubr::rotate_x_text() + theme(legend.position = "none")
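The reshaping and faceting steps above can be tried without the CSV file. The sketch below is a minimal example that assumes four placeholder countries with made-up scores (not World Economic Forum data) and the same column names as the code above; the names toy, toy_long and p_toy are illustrative and do not appear in the original code.

```r
library(ggplot2)
library(tidyr)

# Made-up scores for illustration only; not World Economic Forum data.
toy <- data.frame(
  Countries = c("A", "B", "C", "D"),
  Rank      = c(1, 2, 152, 153),
  range     = factor(c("Top10", "Top10", "Bottom10", "Bottom10"),
                     levels = c("Top10", "Bottom10")),
  Health    = c(0.97, 0.98, 0.95, 0.94),
  Education = c(1.00, 0.99, 0.80, 0.72),
  Economy   = c(0.84, 0.80, 0.33, 0.27),
  Political = c(0.70, 0.60, 0.05, 0.02)
)

# Long format: one row per country and sub-index.
# factor_key = TRUE keeps the sub-indexes in their column order.
toy_long <- gather(toy, subindexes, score, Health:Political, factor_key = TRUE)

# Order the x axis by rank rather than alphabetically.
toy_long$Countries <- factor(toy_long$Countries,
                             levels = toy$Countries[order(toy$Rank)])

# Rows are sub-indexes, columns are the Top10 / Bottom10 groups.
# free_x lets each column show only its own countries.
p_toy <- ggplot(toy_long, aes(x = Countries, y = score, fill = range)) +
  geom_col(width = 0.3) +
  facet_grid(subindexes ~ range, scales = "free_x", space = "free_x")
```

Printing p_toy should draw a 4 x 2 grid of panels, which makes it easy to confirm the facet layout before running the full data set.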

Reconstruction

The following plot fixes the main issues in the original.