Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Elements (2022).


Objective

To inform the reader about some of the largest Iron Ore producers in the world through an interesting visualisation (given the target audience and what Elements is known for).

Supporting note: Elements is a resources-focused visualisation website that appears to be a sub-division of Visual Capitalist, which specialises in data-driven visuals.

Targeted audience

  • Given the other visualisations published on Elements, the audience is likely investors looking at resources (commodities), either for investment or out of personal interest, with a high-level understanding of the subject matter.
  • Given the lack of jargon and the overall simplicity, the visuals could also be aimed at the general population (they are often re-shared through general social media).
  • Time availability is not straightforward to confirm. However, given how many of their other visuals are posted on social media and the characteristics of media there, it could be assumed that the audience is time poor (investors could generally be assumed to fall into this bucket as well).

Issues

The visualisation chosen had the following three main issues:

  1. Data used are estimates and not actuals: In the original USGS report, a small notation next to the heading for the 2021 figures states that they are estimates. However, nowhere on the visualisation is this made clear, and without knowing this it would appear the figures were “rounded up”, given how unlikely some of the exact values are. While the figures could be within a certain tolerance of the actual values and still give a reasonable view of reality, this detail still needs to be known. The omission may be because the visualisation could be seen as less credible if the figures displayed were known to be only estimates. This could present issues for the viewer when accuracy is important, e.g. reporting to shareholders, government bodies, or prospective investors (who are likely the target audience). On the other hand, it probably won’t have serious implications for a general audience just trying to understand the basics of varied production levels between countries, but it is good practice to note where estimates are used regardless.

Supporting note: When cross-checked against S&P Global Market Intelligence figures (DeCoff, S. 2022) the country figures differ to varying degrees (Australia 35% vs 37%, Brazil 15% vs 17%, China 14% vs 11% and so on).

  2. Unable to easily visualise ranking with random placement: While not meant to be deceptive (the priority appears to be aesthetics), the visualisation makes it easy to identify the largest iron ore producers, but from there it becomes a time-consuming search for the next largest producer. There is no logical ordering to the country placement on the sphere shape (even if we consider, at a stretch, each country’s location on the planet), which leaves no intuitive aspect to the design. In fact, the original website publishes an additional table listing production volumes sorted from largest to smallest to compensate for the visual.

  3. Inconsistent sizing and shapes: While we can immediately tell the “outlier”/larger producing countries, once the sizes become similar the sections are visually difficult to tell apart, as no two sections/shapes are identical. For example, ignoring the Mt label text, we might mistake Peru as being larger than Turkey, or Mexico as producing more than Chile, because the outer border is not clearly visible. This also makes it visually difficult to compare production amounts between countries. The problem extends to the country font sizes: judging by font size alone, one might think Brazil produces more iron ore than Australia (when the reality is significantly different). In this specific example, Australia’s area would have allowed for a larger font size, which makes this design choice even more confusing.
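The cross-check in the supporting note under the first issue can be made concrete with a small base R sketch. It uses only the three pairs of shares quoted in that note. The column names `Share_A` and `Share_B` are assumptions for illustration, because the note does not state which source each figure in a pair belongs to.

```r
# Shares (% of global production) as quoted in the supporting note.
# Share_A / Share_B are hypothetical column names: the note gives the
# pairs "35% vs 37%" etc. without saying which source comes first.
shares <- data.frame(
  Country = c("Australia", "Brazil", "China"),
  Share_A = c(35, 15, 14),
  Share_B = c(37, 17, 11)
)

# Difference in percentage points between the two quoted figures
shares$Diff_pp <- shares$Share_B - shares$Share_A

# Country with the largest absolute gap
largest_gap <- shares$Country[which.max(abs(shares$Diff_pp))]

print(shares)
print(largest_gap)
```

Gaps of two to three percentage points are small relative to the largest shares, but for the smaller producers they could plausibly be enough to change the ranking.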

References

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(dplyr)
library(readr) # Useful for importing data
library(magrittr) # Enables the use of pipe operators for readability
library(tidyr) # Reshapes and splits data/cells
library(stringr) # Handy string operations
library(ggthemes) #Additional ggplot themes
library(openxlsx) #To import xlsx
#install.packages("countrycode")
library(countrycode) #To look up the ISO code for each country
#install.packages("ggimage")
library(ggimage) #Provides geom_flag() for the country flags
#install.packages("jpeg")
library(jpeg) #Provides readJPEG() to read the background image
#install.packages("ggpubr")
library(ggpubr) #Provides background_image() to add the background image

IronOreProd2021 <- read.xlsx("USGS_mcs2022_p90_extract.xlsx", colNames = TRUE, startRow = 2)

colnames(IronOreProd2021)[1] <- "Country"
colnames(IronOreProd2021)[2] <- "Mt_Produced"

#Manual sort instead of reorder(Country,Mt_Produced)
IronOreProd2021$Country <- factor(IronOreProd2021$Country, levels = c("Other countries", "Peru", "Turkey", "Mexico", "Chile", "Sweden", "United States", "Iran", "South Africa", "Kazakhstan", "Canada", "Ukraine", "Russia", "India", "China", "Brazil", "Australia"))

IronOreProd2021 %<>%
  mutate(Prop = Mt_Produced/sum(Mt_Produced))

#summary(IronOreProd2021)

IronOreProd2021$ISO2 <- countrycode(IronOreProd2021$Country, "country.name", "iso2c")
IronOreProd2021$Continent <- countrycode(IronOreProd2021$ISO2, "iso2c", "continent")

# Code for using Continent dimension instead but creates potential colour blindness weaknesses. Will also need to update aes fill to Continent.
# IronOreProd2021[IronOreProd2021$Country == "Other countries",]$Continent <- "NA"
# Colors <- c("Americas" = '#b2182b',"Africa" = '#ef8a62', "Europe" = '#fddbc7', "Asia" = '#d1e5f0', "Oceania" = '#67a9cf', "NA" = '#2166ac')
# scale_fill_manual(values = Colors)

img <- readJPEG("Background.jpg")

p1 <- ggplot(data=IronOreProd2021, aes(x=Mt_Produced/1000, y=Country, fill=Prop))
p1 <- p1 + background_image(img) +
  geom_vline(xintercept = c(0,100,200,300,400,500,600,700,800,900), alpha = 0.2, colour = "white", size = 1) +
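  # Only one border width is kept: the original passed both size = 1.2 and lwd = 0.2, which are the same aesthetic
  geom_bar(stat = "identity", color = "black", size = 1.2, position = position_dodge(), alpha = 0.9) +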
  labs(x='Estimated Million Tonnes produced', y='Country', title='Australia is estimated to dominate global iron ore production for 2021', subtitle = 'Iron ore comprised roughly 93% of the 2.7 billion tonnes of metals mined in 2021. Its primary use is to make steel. \nThis is mined in more than 50 countries, but seven of these account for 82% of total world production.\n\nData is based on usable ore estimates for 2021 with labels indicating estimated percentage of global production.\n', fill = "Percentage of\nTotal Global Production", caption  = "Source: USGS Annual Summary for 2021. \nBackground image: Edited from Elements - Visualizing the World’s Largest Iron Ore Producers 2022.") +
  geom_flag(x = -50, aes(image = ISO2), size = 0.06, by = "height") + # by takes a string ("width" or "height")
  # expand_limits(x = -50000) +
  geom_text(aes(label = paste0(round(Prop*100,2),'%')), hjust = -0.1, color = "white", size = 8.5, position = position_dodge(width = .9)) +
  theme_stata(base_size = 20) + 
  theme(plot.title = element_text(size = 38, face="bold", colour = "white", hjust = 0),
        plot.title.position = "plot",
        plot.subtitle = element_text(size = 28, colour = "white", hjust = 0),
        plot.caption = element_text(face = "italic", size = 20, colour = "white", hjust = 0,  vjust = -7),
        axis.text=element_text(size=25, colour = "white"),
        axis.title=element_text(size=26, colour = "white"),
        axis.text.y = element_text(angle = 0, colour = "white"),
        axis.text.x = element_text(colour = "white"),
        axis.title.y = element_text(colour = "white"),
        axis.title.x = element_text(vjust = -2, colour = "white"),
        panel.grid.major.y = element_blank(),
        panel.border = element_blank(),
        # panel.grid.major.x = element_line(size = 1),
        plot.background = element_rect(fill = '#901e04'),
        # panel.background = element_rect(fill = '#fed39e'),
        legend.position = "none") +
        # legend.position=c(1,1),
        # legend.direction="horizontal",
        # legend.justification=c(1, -0.1),
        # legend.text = element_text(colour = "black"),
        # legend.title = element_text(colour = "black", size = 22),
        # legend.key.width = unit(1.5, 'cm'),
        # legend.background=element_rect(fill = alpha("white", 0.8))) +
  scale_x_continuous(breaks = c(0,100,200,300,400,500,600,700,800,900), limits = c(-50,950), labels=function(x) format(x, big.mark = ",", decimal.mark = ".", scientific = FALSE)) +
  # scale_fill_gradientn(colours = c("#d73027","#fc8d59","#fee090","#e0f3f8","#91bfdb","#4575b4"), labels = scales::percent)
  scale_fill_gradientn(colours = c('#f7fcfd','#e0ecf4','#bfd3e6','#9ebcda','#8c96c6','#8c6bb1','#88419d','#810f7c','#4d004b'), labels = scales::percent) +
  annotate("text", x = 950, y = "Brazil", label = "Australia and Brazil together dominate \nthe world's iron ore exports, each having \nabout one-third of total exports.", size = 9, colour = "white", hjust = "right", vjust = 0.8) 
  # + annotate("text", x = 985000, y = "Other countries", label = "Source: USDS Annual Summary for 2021.", size = 8, colour = "white", hjust = "right")
  # + annotate("text", x = 100000, y = "Other countries", label = "3.53%", size = 8, colour = "white", hjust = "left")

#library(colorblindr)
#cvd_grid(p1)
#Acceptable colour palette, may be hard to differentiate for "Other Countries" however but bar size should compensate to draw comparisons to Russia or Ukraine using the vertical grid lines. Labels in place to ensure this does not become an issue regardless.
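The manual factor ordering in the code above could also be derived from the data, which is what the comment about `reorder(Country, Mt_Produced)` alludes to. The following is a minimal base R sketch of one way to do this while still pinning “Other countries” to the bottom of a horizontal bar chart. The data frame `toy` and its values are hypothetical and are not the USGS figures.

```r
# Hypothetical toy data: NOT the USGS figures, for illustration only
toy <- data.frame(
  Country = c("B", "Other countries", "A", "C"),
  Mt_Produced = c(200, 500, 900, 50)
)

# Sort country names by ascending production. With Country on the y axis,
# the first factor level is drawn at the bottom and the last at the top.
lvl <- toy$Country[order(toy$Mt_Produced)]

# Force "Other countries" to be the first level (bottom of the chart),
# regardless of its production value.
lvl <- c("Other countries", setdiff(lvl, "Other countries"))

toy$Country <- factor(toy$Country, levels = lvl)
print(levels(toy$Country))
```

Compared with a hard-coded level vector, this avoids having to edit the ordering by hand if the underlying figures are revised, at the cost of being slightly less explicit.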

Data Reference

Note: No tabular data appears to be published for the 2022 report, so the data has been manually extracted from the PDF table. The original format has been kept to demonstrate pre-processing.

Other References

Reconstruction

The following plot fixes the main issues in the original.

Issues resolved

  1. Data used are estimates and not actuals: Where appropriate, the visualisation text has been marked with “estimated” so that the audience correctly understands the figures and does not misuse the information. Unfortunately, actual values appear to be commercially sensitive and are not consistently made public. Having them would have made the visualisation more credible for a target audience of investors.

  2. Unable to easily visualise ranking due to random placement: Through the use of an ordered bar chart, we are now immediately able to see the ranking of the countries. Supporting this are the flags of each country, which make it easier for readers to quickly identify their own country than through text alone (unfortunately, borders could not be added to the flags to improve visibility). “Other countries” has been intentionally placed at the bottom, as it could otherwise distract while reading through the known countries at the top (unfortunately, detail on what is included in this category is unknown).

  3. Inconsistent sizing and shapes: A consistent scale has been used across the bars, along with a consistent font size, so we are able to draw immediate visual comparisons between countries. For example, we can immediately see visually (without reading) that Australia produces more than Brazil and China combined. The colour gradient fill of the bars further supports this by scaling to the percentage of total global production (the palette was chosen with colour blindness in mind and to contrast against the background, and the legend has been removed because the labels already show the percentages).
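As a small illustration of how the bar labels and the gradient fill both derive from the same share of total production, the sketch below repeats the `Prop` calculation and the label formatting from the code tab in base R. The vector `mt` holds hypothetical values and is not the USGS data.

```r
# Hypothetical production values: NOT the USGS figures
mt <- c(600, 300, 100)

# Share of the total, as in mutate(Prop = Mt_Produced / sum(Mt_Produced))
prop <- mt / sum(mt)

# Label text, as in paste0(round(Prop * 100, 2), '%')
labels <- paste0(round(prop * 100, 2), "%")

print(labels)
```

Because the fill colour and the label are both driven by `prop`, the legend carries no information that the labels do not already show.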

In addition to the font size consistency, a hierarchy of sizing is largely followed, along with other guidelines as per Evergreen Data’s rating guidelines. The reconstruction deviates only slightly: a background image is used to try to retain the aesthetic appeal of the original, and grid lines are retained because they relate to a different variable (Mt) than the percentage labels.

Reference for background image (Edited)