Original


Source: Visual Capitalist (2022).


Objective

The objective of the original visualisation was to communicate the world’s top ten rice producers and to compare the amount (in tonnes) each of them produces. The intended audience is readers of Visual Capitalist and associated outlets: a well-educated but general audience who are unlikely to be subject matter experts in rice production.

The visualisation chosen had the following three main issues:

  • The tonnage of rice produced is encoded by area, which ranks low for perceptual accuracy when encoding a primary quantitative variable (Cleveland & McGill 1985). Because the area marks are drawn in 3D, their apparent size is not proportional 1:1 to the value they represent, which would be best practice when using area as an indicator.
  • Connectedness is used without any recorded relationship or connection behind it (it appears between Indonesia and India, between Bangladesh and Vietnam, and among the lowest four countries). As connectedness is a stronger grouping cue than proximity, colour, size and shape (Baglin 2020), this is deceptive: readers will perceive a non-existent relationship between these rice-producing countries more readily than they will read the area indicator for production.
  • In this context, the use of colour detracts from understanding the information. Colour is used to add shadow, shading and aspect rather than to highlight differences in rice production or a secondary variable. As such, it makes the area indicator for tonnage harder to read and draws the reader’s attention away from the purpose of the visualisation.
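The first issue can be illustrated with a little arithmetic. The sketch below uses made-up illustrative values (one quantity four times another, not figures from the rice data) and the hypothetical names `values`, `ratio` and `scaling`. It compares how much larger the linear size of a mark has to be when the value is encoded as a length, as a flat area, or as the volume of a 3D shape.

```r
# Illustrative values only (hypothetical, not taken from the rice data):
# the "large" quantity is four times the "small" one.
values <- c(small = 1, large = 4)
ratio  <- values[["large"]] / values[["small"]]

# How much wider or taller the mark for "large" must be drawn under each encoding
scaling <- data.frame(
  encoding    = c("length (bar)", "area (flat shape)", "volume (3D shape)"),
  linear_size = c(ratio, sqrt(ratio), ratio^(1 / 3))
)

print(scaling)
```

Under these assumptions the bar is four times as long, while the flat shape is only twice as wide and the 3D shape is only about 1.6 times as wide. This is one reason a bar chart, which encodes the value as length, may be easier to read accurately.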

References

  • Baglin, J 2020, Data Visualisation: From Theory to Practice.

  • Cleveland, W & McGill, R 1985, Graphical Perception and Graphical Methods for Analyzing Scientific Data, Science, vol. 229, no. 4716, pp. 828-833.

  • Wallach, O & Realey, A 2022, Visualizing the World’s Biggest Rice Producers, Retrieved March 19, 2022 from Visual Capitalist website: https://www.visualcapitalist.com/worlds-biggest-rice-producers/

Code

The following code was used to fix the issues identified in the original.

# libraries

library(ggplot2)
library(dplyr)
library(tidyverse)


## copied data from data table on https://www.visualcapitalist.com/worlds-biggest-rice-producers/ to CSV and imported

rice <- read_csv("Rice.csv", col_types = cols(Country = col_factor(levels = c("China", "India", "Indonesia", "Bangladesh", "Vietnam", "Thailand", "Myanmar", "Philippines", "Pakistan", "Brazil", "Others"))))

View(rice)

# rename variables

names(rice)[names(rice) == 'Million Tonnes Rice Produced'] <- 'Produced'

names(rice)[names(rice) == '% of Total'] <- 'Proportion'

# add colour variable for clearer visualisation
  
rice <- rice %>% mutate(category = case_when(
  Country == "Others" ~ "#c7eae5",
  Proportion >= 0.25 ~ "#01665e",
  Proportion < 0.25 ~ "#5ab4ac"
))

rice$category <- as.factor(rice$category)
  


# generating barplot
r <- ggplot(data = rice, aes(x = Produced, y = Country, fill = category))
r <- r +
  geom_bar(stat = "identity") +
  # label each bar with its share of total production
  geom_text(aes(label = scales::percent(Proportion)), hjust = 0, colour = "black", size = 2) +
  theme_minimal() +
  # category holds hex codes, so each factor level is used as its own fill colour
  scale_fill_manual(values = levels(rice$category)) +
  guides(fill = "none") +
  xlab("Million tonnes produced (2019)") +
  ylab("Country") +
  ggtitle("China is the leading rice producer, with over a quarter of total production",
          subtitle = "9 of 10 top rice producing countries are in Asia")

# display the plot
r

Data Reference

Reconstruction

The following plot fixes the main issues in the original.