Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
In my view, the objective of the original data visualisation is to show the differences in average female height between countries clearly and efficiently.
The visualisation is deliberately simple. The likely audience is non-technical, and a complex visualisation may confuse such readers and prevent them from extracting value. Visualisations of this kind are common online.
The visualisation chosen had the following three main issues:
It is not clear where the y-axis starts in this visualisation
The visualisation distorts the actual differences between these countries. For example, the heights for Latvia and India differ by 0.5 foot, yet the figure for India reaches only the knees of the figure for Latvia, which is impossible.
The use of the colour scale is inappropriate
The visualisation uses three different colours to represent the average heights of six countries, which is unnecessary and misleading. For example, Latvia and South Africa share one colour that differs from the other countries, but there does not seem to be any connection between them. Colour should be used only to differentiate important features.
The data is not presented accurately
The visualisation uses simple images of women to represent average heights, measured in feet, a non-standard unit. Readers cannot read the heights directly from the visualisation. For example, the average female heights in Australia, Scotland and Peru appear to be equal, but they are not. The choice of countries is also not rigorous or representative: Scotland is part of the United Kingdom rather than a sovereign state.
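The effect of the unclear y-axis start can be illustrated with a little arithmetic. The sketch below is only an illustration: the two heights and the cut-off baseline are assumed values chosen to match the 0.5 foot gap mentioned above, not numbers read from the original chart, and `drawn_ratio` is a hypothetical helper name.

```r
# Share of the taller figure's drawn height that the shorter figure reaches
# when the y-axis starts at `baseline` instead of zero.
drawn_ratio <- function(shorter, taller, baseline = 0) {
  stopifnot(baseline < shorter, shorter <= taller)
  (shorter - baseline) / (taller - baseline)
}

# Illustrative values in feet (assumptions): only the 0.5 ft gap
# comes from the text above.
taller  <- 5.5
shorter <- 5.0

honest    <- drawn_ratio(shorter, taller, baseline = 0)    # axis starts at zero
truncated <- drawn_ratio(shorter, taller, baseline = 4.8)  # hypothetical cut-off axis

cat(sprintf("Zero baseline:      shorter figure is %.0f%% as tall\n", 100 * honest))
cat(sprintf("Truncated baseline: shorter figure is %.0f%% as tall\n", 100 * truncated))
```

Under these assumptions the shorter figure is drawn at roughly nine tenths of the taller one when the axis starts at zero, but at well under a third of it once the axis is cut off, which is the kind of distortion described in the first issue.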
Reference
The following code was used to fix the issues identified in the original.
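library(ggplot2)
library(readxl)
# Load the height data (columns used below: Country, Height, Gender)
average_height <- read_excel("~/Desktop/DV/Assignment/assignment2template1950/average height.xlsx")
average_height$Country <- as.factor(average_height$Country)
average_height$Height <- as.numeric(average_height$Height)
# Map columns by name inside aes(); data$column references are not needed there
barplot <- ggplot(data = average_height, aes(x = Country, y = Height, fill = Gender))
# Dodge the labels by the same width as the bars so each label sits above its own bar
barplot <- barplot + geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = paste(Height, "(cm)", sep = "")), position = position_dodge(width = 0.9), vjust = -0.5) +
  labs(title = "Average Height by Country", x = "Country", y = "Height (cm)") + theme_minimal() + scale_y_continuous(limits = c(0, 200))
# Print the plot
barplot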
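The plotting code relies on the spreadsheet having one row per country and gender, with columns named Country, Height and Gender. A minimal sketch of a check that could run before plotting is shown below; `check_height_data` is a hypothetical helper, and the example rows are made-up placeholders that only show the expected shape, not NCD-RisC figures.

```r
# Hypothetical helper: verify the table has the columns the plotting code uses
# and coerce them to the types the plot expects.
check_height_data <- function(df) {
  required <- c("Country", "Height", "Gender")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  df$Country <- as.factor(df$Country)
  df$Gender  <- as.factor(df$Gender)
  df$Height  <- as.numeric(df$Height)
  if (anyNA(df$Height)) {
    stop("Height contains values that are not numeric")
  }
  df
}

# Placeholder rows only (made-up numbers) in the expected long format.
example <- data.frame(
  Country = c("A", "A", "B", "B"),
  Gender  = c("Female", "Male", "Female", "Male"),
  Height  = c("160", "172", "155", "168"),
  stringsAsFactors = FALSE
)
checked <- check_height_data(example)
str(checked)
```

Failing early with a clear message is easier to debug than a ggplot2 error about a missing aesthetic.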
Data Reference
Height > Data Download > NCD-RisC. Ncdrisc.org. (2020). Retrieved 17 September 2020, from http://www.ncdrisc.org/data-downloads-height.html.
The following plot fixes the main issues in the original.
Change the unit to centimetres
Extend the y-axis to run from 0 to 200 centimetres
Add male average height, in a different colour from female, for better comparison
Add more representative countries for better comparison
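The unit change in the first fix is a fixed conversion. As a small sketch (the helper name `feet_to_cm` is an assumption, not part of the original code), the 0.5 foot gap discussed for the original chart converts as follows:

```r
# One foot is 12 inches and one inch is exactly 2.54 cm, so 1 ft = 30.48 cm.
CM_PER_FOOT <- 30.48

# Hypothetical helper: convert a height (or height difference) from feet to centimetres.
feet_to_cm <- function(feet) {
  feet * CM_PER_FOOT
}

gap_cm <- feet_to_cm(0.5)
cat(sprintf("0.5 ft = %.2f cm\n", gap_cm))
```

On a y-axis running from 0 to 200 centimetres, a gap of this size stays visible without being exaggerated.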