Original

[Image: Avg Fem Height]

*Source: Reddit (2020).*

Objective

The objective of the original graph is to depict the average height of women across six countries: Latvia, Australia, Scotland, Peru, South Africa, and India. The graph was shared on Reddit on 19 January 2020, and the post listed morethanmyheight.com as the original source. I could not open that website because it was flagged as not secure and as a potential threat to my device. However, some information about the website was available. More Than My Height (MTMH) is a company built to promote the body positivity movement and improve the self-confidence of tall women, as well as to sell clothes designed for taller women. The graph was therefore probably intended to show women that average height differs across countries, so it is okay not to be of a similar height to their peers.

Based on the graph, the target audience is highly likely to be women, especially taller women.

The visualisation chosen had the following three main issues:

  • Misleading scale: The Y-axis did not start from 0, which distorted and exaggerated the difference between the average heights of women across the countries. In the graph, the average Latvian woman appeared approximately 4 times as tall as the average Indian woman, which was highly inaccurate because the actual height difference between Latvian women and Indian women was 5 inches.

  • Symbol and colour issues: The female symbols used to plot the average heights were inappropriate because they overlapped each other, making the graph hard to read. The symbols were coloured in different shades of pink, but the shades did not represent any information.

  • Data integrity: The graph did not mention the source of its data. In addition, neither the year in which the measurements were taken nor the age of the women sampled was stated. Both factors can influence the measurements.
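
The scale distortion described in the first point can be checked with a little arithmetic. The sketch below is an illustration only: the two heights are the 1996-cohort means used later in the reconstruction (not the figures from the original graph), and `apparent_ratio` is a hypothetical helper that returns how many times longer one bar or symbol is drawn than another when the axis starts at a given baseline.

```r
# Hypothetical helper: ratio of drawn lengths when the axis starts at `baseline`.
apparent_ratio <- function(tall, short, baseline = 0) {
  (tall - baseline) / (short - baseline)
}

# Mean heights in cm for the 1996 birth cohort (values from the reconstruction data).
latvia <- 169.7979
india <- 152.5896

# With a zero baseline the drawn ratio equals the true ratio of the heights.
true_ratio <- apparent_ratio(latvia, india, baseline = 0)
round(true_ratio, 2)  # 1.11

# Baseline at which Latvia would look four times as tall as India,
# found by solving (latvia - b) / (india - b) = 4 for b.
b <- (4 * india - latvia) / 3
apparent_ratio(latvia, india, baseline = b)
```

With these values, a true ratio of about 1.11 would look like a ratio of 4 once the axis starts near 147 cm, which is why a truncated Y-axis is so misleading for bar-like encodings.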

Reference

Code

The following code was used to fix the issues identified in the original.

#load packages
library(ggplot2)
library(dplyr)

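#import data and keep the six countries for the 1996 birth cohort
#(United Kingdom is used in place of Scotland from the original graph)
height <- read.csv("ahow.csv")
height1 <- height %>% filter(Entity %in% c("Australia", "India", "Latvia", "Peru", "South Africa", "United Kingdom"), Year == 1996)
#drop the columns that are not needed for the plot
height1$Code <- NULL
height1$Year <- NULL

#print the filtered data (read.csv already returns a data frame, so no conversion is needed)
height1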
##           Entity Mean.female.height..cm.
## 1      Australia                165.8582
## 2          India                152.5896
## 3         Latvia                169.7979
## 4           Peru                152.9325
## 5   South Africa                158.0261
## 6 United Kingdom                164.4021

#draw plot (refer to columns by name inside aes() rather than with height1$...)
p <- ggplot(data = height1, mapping = aes(x = Entity, y = Mean.female.height..cm.)) +
  geom_bar(stat = "identity", fill = "pink3") +
  geom_text(aes(label = round(Mean.female.height..cm., digits = 2)), position = position_dodge2(width = 0.9), vjust = -0.25)

#axis labels, title, subtitle and data source caption
p <- p + labs(x = "Country", y = "Mean Height (cm)", title = "Average Height of Females Born in 1996 by Country", subtitle = "Mean Height of 18-Year-Old Females Measured in 2014", caption = "Data Source: Our World in Data")
#centre the title and subtitle
p <- p + theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))

Data Reference

Reconstruction

The following plot fixes the main issues in the original.

  • Scale: The Y-axis now starts from 0, so the differences between countries are shown in proportion to the actual heights.

  • Symbol and colour: Non-overlapping bars were used to plot the average height of female samples, and the mean height for each country was stated above its bar to increase readability. Only one colour was used because additional colours were not necessary.

  • Data integrity: The reconstructed graph states the source of the data, the year in which the measurements were taken, and the age of the female samples.
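
The reconstruction can also be reproduced without the `ahow.csv` file. The following is a minimal sketch, assuming the six mean heights printed in the Code section are typed in by hand. The data frame name `heights`, its column names, and the explicit `scale_y_continuous()` call are illustrative additions and not part of the code used for the report.

```r
library(ggplot2)

# Mean heights (cm) copied from the printed output in the Code section.
heights <- data.frame(
  country = c("Australia", "India", "Latvia", "Peru", "South Africa", "United Kingdom"),
  mean_height_cm = c(165.8582, 152.5896, 169.7979, 152.9325, 158.0261, 164.4021)
)

p <- ggplot(heights, aes(x = country, y = mean_height_cm)) +
  geom_col(fill = "pink3") +
  geom_text(aes(label = round(mean_height_cm, 2)), vjust = -0.25) +
  # State the zero baseline explicitly and leave headroom for the labels.
  scale_y_continuous(limits = c(0, NA), expand = expansion(mult = c(0, 0.08))) +
  labs(x = "Country", y = "Mean Height (cm)", caption = "Data Source: Our World in Data")

print(p)
```

`geom_col()` is equivalent to `geom_bar(stat = "identity")`, and bars drawn by either start at 0 by default, so the scale call mainly documents the intent of the fix.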