Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Quora (2016).


Objective

The original visualisation compared average adult male heights (in feet and inches) in different countries against the average adult male height in the US. It focuses on identifying the tallest male populations by country and on whether climate, geography and ethnic background significantly influence the findings. The target audience is anyone interested in average male height data; the diagram may also interest health organisations, nutrition experts and scientists.

The chosen visualisation had the following three main issues:

  • Colour issue. The colours are darker for taller average male heights and lighter for shorter ones. However, the intermediate heights are hard to tell apart, because the shades run from near-black to a light bluish hue. Placed on a world map, the shades make it almost impossible to communicate the objective of the data visually. The palette is especially difficult for people with achromatopsia (monochromatic vision), who must differentiate the greyish shades from the light bluish ones.

  • Labelling and map issue. Lines connect each label to its country, which adds unnecessary complexity to the visualisation. Using a world map for this dataset is also misleading, since some countries cover much larger areas than others and the aim is not to compare average adult male heights across continents.

  • Source issue. The original dataset is not referenced properly: the visualisation cites only the website where the image was sourced. This raises concerns about the unknown source, the integrity of the data, and the risk of misleading the audience.

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(dplyr)


## Load the data and display
df <- read.csv("data.csv")
head(df)
##   place  pop2023 growthRate   area                country cca3 cca2 ccn3 region
## 1   528 17618299    0.00309  41850            Netherlands  NLD   NL  528 Europe
## 2   499   626485   -0.00095  13812             Montenegro  MNE   ME  499 Europe
## 3    70  3210847   -0.00701  51209 Bosnia and Herzegovina  BIH   BA   70 Europe
## 4   352   375318    0.00649 103000                Iceland  ISL   IS  352 Europe
## 5   208  5910913    0.00487  43094                Denmark  DNK   DK  208 Europe
## 6   203 10495295    0.00013  78865         Czech Republic  CZE   CZ  203 Europe
##         subregion landAreaKm  density densityMi Rank year meanHeightMale
## 1  Western Europe    33670.0 523.2640 1355.2538   72 2019       183.7824
## 2 Southern Europe    13450.0  46.5788  120.6391  169 2019       183.3022
## 3 Southern Europe    51200.0  62.7119  162.4237  137 2019       182.4740
## 4 Northern Europe   100830.0   3.7223    9.6407  179 2019       182.1016
## 5 Northern Europe    40000.0 147.7728  382.7316  115 2019       181.8927
## 6  Eastern Europe    77198.5 135.9521  352.1158   89 2019       181.1866
##   meanHeightFemale rank
## 1         170.3612    1
## 2         169.9609    2
## 3         167.4704    3
## 4         168.9135    4
## 5         169.4706    5
## 6         167.9635    6
## Keep only the columns relevant to the original visualisation
Male_height <- subset(df, select = c(country, meanHeightMale))

## Keep only the rows (countries) shown in the original visualisation
Male_height_sel <- filter(
  Male_height,
  country %in% c(
    "India", "China", "Mexico", "Japan", "Brazil",
    "Australia", "Canada", "Italy", "France", "Russia",
    "United States", "United Kingdom", "Spain", "Greece", "Germany"
  )
)
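Because `%in%` silently drops any name that is spelled differently in the data, it can be worth confirming that all fifteen countries were actually matched. The following is only a minimal sketch in base R: `missing_countries` is a hypothetical helper, and `toy_available` is a made-up vector of spellings used for illustration, not the contents of `data.csv`.

```r
# Countries requested in the reconstruction (copied from the filter() call).
wanted <- c("India", "China", "Mexico", "Japan", "Brazil",
            "Australia", "Canada", "Italy", "France", "Russia",
            "United States", "United Kingdom", "Spain", "Greece", "Germany")

# Hypothetical helper: return the requested names that do not appear
# in the vector of available country names.
missing_countries <- function(available, wanted) {
  setdiff(wanted, unique(available))
}

# Made-up example: every requested country is present except that the US
# is spelled differently, so it would be dropped by %in%.
toy_available <- c(wanted[wanted != "United States"], "United States of America")

missing_countries(toy_available, wanted)
```

With the real data, `missing_countries(Male_height$country, wanted)` should return `character(0)` if every country in the list was found.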


## Reconstructed visualisation: countries ordered from tallest to shortest
plot <- ggplot(Male_height_sel, aes(x = reorder(country, -meanHeightMale), y = meanHeightMale))

plot <- plot +
  ## geom_col() draws bars at the data values (same as geom_bar(stat = "identity"))
  geom_col(colour = "skyblue", fill = "steelblue", width = 0.6) +
  ## Print each height, rounded to one decimal place, just above its bar
  geom_text(aes(label = round(meanHeightMale, 1)), size = 3, vjust = -0.3) +
  labs(title = "Average Adult Male Height by Country", x = "Country", y = "Height (cm)") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
    axis.title.x = element_text(face = "bold"),
    axis.title.y = element_text(face = "bold")
  ) +
  ## Zoom the y-axis to 150-185 cm; note that the bars then no longer start at zero
  coord_cartesian(ylim = c(150, 185))
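The original visualisation reported heights in feet and inches, whereas the reconstruction uses centimetres. For readers who want to compare the two, the conversion can be sketched as below. `cm_to_ft_in` is a hypothetical helper and is not part of the code used for the reconstruction; it assumes 1 inch = 2.54 cm and rounds to the nearest tenth of an inch.

```r
# Hypothetical helper: convert a height in centimetres to a "feet and inches" label.
cm_to_ft_in <- function(cm) {
  # Round the total inches first so the inch part can never display as 12.0
  total_in <- round(cm / 2.54, 1)
  feet <- total_in %/% 12
  inches <- total_in - feet * 12
  sprintf("%d ft %.1f in", as.integer(feet), inches)
}

# Example using the first meanHeightMale value shown in head(df)
cm_to_ft_in(183.7824)
```

The function is vectorised, so under the same assumptions it could be applied to a whole column, for example `cm_to_ft_in(Male_height_sel$meanHeightMale)`.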

Data Reference

Reconstruction

The following plot fixes the main issues in the original.