Assignment 2

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original

Source: The Observatory of Economic Complexity (2020).

Objective

The visualisation that I have selected to critique was taken from the Observatory of Economic Complexity (OEC), an online data visualization and distribution platform focused on the geography and dynamics of economic activities (OEC 2022).

Specifically, I have selected a visualisation that charts the evolution of market concentration of exports of tomato ketchup and other tomato sauces from around the globe and aims to demonstrate trends over the past 10 years (OEC 2022).

According to Similarweb, a website and audience insights company, the main audience for the OEC website is visitors aged 25 to 34 who are referred from news and government websites (Similarweb 2022).

For the visualisation I have selected, however, I believe the audience would be business analysts looking for market trends, tomato producers looking for growth markets, and business futurists trying to predict how the value of tomato ketchup and other tomato sauce exports might change over the next 10 years.

Visualisation Critique

There are three key issues with the visualisation.

Stacked Area Graph

The author has chosen a stacked area graph to display the data, which has several disadvantages. The biggest is that readers tend to focus on the uppermost outline of the graph, which can lead them to misread the patterns in the individual series. This is because each area is drawn on top of the preceding set of values rather than on a fixed baseline, so its shape directly follows the trend of the series beneath it (VizWiz 2012).

Another disadvantage of the stacked area graph is that it is very difficult to judge how the size of each country or continent changes over time, and so to pick out distinct patterns.

Lastly, a stacked area graph with this many variables restricts the audience to readers who can already make sense of data this complex.
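The baseline effect described in the first point above can be illustrated with a small sketch in base R. The two series and their values are hypothetical and are not taken from the OEC data; they only show how a flat series appears to rise once it is stacked on a growing one.

```r
# Hypothetical export values for two series over four years (not OEC data)
bottom <- c(10, 20, 30, 40)   # rising series, drawn on the fixed baseline
top    <- c(15, 15, 15, 15)   # flat series, stacked above the first one

# In a stacked area graph the upper edge of the top band is the running total
top_edge <- bottom + top

# Year-on-year change in the edge the reader sees for the top band
diff(top_edge)

# Year-on-year change in the top band's actual values
diff(top)
```

The drawn edge climbs every year even though the stacked series itself never changes, which is why separate lines on a shared baseline are easier to read.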

List of Countries

At first glance, it is apparent that the visualisation shows the value of exports of tomato ketchup and other tomato sauces by country, grouped by continent, with the United States leading exports globally. However, the labels use inconsistent font sizes and appear for only 3 countries, making it impossible to compare values with other countries.

Legend

The legend consists of just 7 coloured boxes with no labels, making it nearly impossible to determine what it represents. Given that the United States is one colour and both the Netherlands and Italy are a different colour, we can assume that the colours represent continents, but for the countries at the top of the visualisation we cannot ascertain which continents the values belong to.

Solution

To remedy the issues that I have identified above, I have decided to create an individual line graph that aggregates the trade value by continent, instead of looking at the data by individual countries. This will condense the amount of visual information down to a more manageable level, while still demonstrating global value of tomato ketchup and tomato sauce exports.

Please refer to the Code tab to see the code that I wrote to create the new line graph.

Reference

(OEC) The Observatory of Economic Complexity (2020) Tomato ketchup and other tomato sauces, OEC website, accessed 11 November 2022. https://oec.world/en/profile/hs/tomato-ketchup-and-other-tomato-sauces
(OEC) The Observatory of Economic Complexity (2022) About the Site, OEC website, accessed 19 November 2022. https://oec.world/en/resources/about
Similarweb (2022) oec.world, Similarweb website, accessed 19 November 2022. https://www.similarweb.com/website/oec.world/
VizWiz (2012) Stacked area chart vs. line chart - The Great Debate, VizWiz website, accessed 19 November 2022. https://www.vizwiz.com/2012/10/stacked-area-chart-vs-line-chart-great.html

Code

The code below was created to remedy the issues identified in the original data visualisation.

As mentioned in the Original tab, I have decided to create an individual line graph that aggregates the trade value by continent, instead of looking at the data by individual countries.

To achieve this, I first imported the data using the read.csv() function and then used the aggregate() function to create a new data frame that sums the Trade.Value values by Continent and Year.

I then used ggplot2 to create a line graph with Year on the x-axis and Trade Value on the y-axis, with the data displayed by Continent. Using scale_x_continuous(), scale_y_continuous(), labs() and theme(), I adjusted the overall design of the graph so that the final visualisation is easy to read.
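As a minimal sketch of the grouping step described above, the following runs aggregate() on a small made-up data frame. The column names mirror those mentioned in the text, but the rows and values in toy are hypothetical and do not come from the OEC file.

```r
# Hypothetical rows mimicking the columns named above (values are made up)
toy <- data.frame(
  Continent   = c("Europe", "Europe", "Asia", "Europe", "Asia"),
  Year        = c(2019, 2019, 2019, 2020, 2020),
  Trade.Value = c(100, 50, 30, 120, 40)
)

# Sum Trade.Value within each Continent/Year pair
toy_totals <- aggregate(toy$Trade.Value,
                        by = list(toy$Continent, toy$Year),
                        FUN = sum)

# aggregate() names the output columns Group.1, Group.2 and x, so rename them
colnames(toy_totals) <- c("Continent", "Year", "TradeValue")

toy_totals
```

The two Europe rows for 2019 collapse into a single row with a total of 150, giving one value per continent per year, which is the shape a line graph by continent needs.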

# Open library packages 'ggplot2' and 'scales'
library(ggplot2)
library(scales)

# Import csv file into data frame
df <- read.csv('Value-of-Exports-in-Ketchup.csv')

# Aggregate data by continent
new_df <- aggregate(df$Trade.Value, by = list(df$Continent, df$Year), FUN = sum)

# Add headers to each column
colnames(new_df) <- c("Continent", "Year", "TradeValue")

# Create line graph using ggplot2 package, one line per continent
df_plot <- ggplot(new_df, aes(x = Year, y = TradeValue, group = Continent)) +
  geom_line(aes(color = Continent), size = 1) +
  geom_point(aes(color = Continent), size = 2) +
  # Label y-axis values in millions with an "M" suffix
  scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) +
  # Ask for roughly 10 evenly spaced breaks on the x-axis
  scale_x_continuous(breaks = scales::pretty_breaks(n = 10)) +
  labs(
    x = "Year",
    y = "Value of Global Exports ($USD)",
    title = "Value of Exports in Tomato Ketchup and other Tomato Sauces",
    subtitle = "by Continent from 2010-2020",
    caption = "Data Sourced from: The Observatory of Economic Complexity (2020)"
  ) +
  # Centre the title, subtitle and caption
  theme(
    plot.title = element_text(size = 12, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 10, hjust = 0.5),
    plot.caption = element_text(size = 9, hjust = 0.5)
  )

# Display the plot
df_plot

Data Reference

(OEC) The Observatory of Economic Complexity (2020) Tomato ketchup and other tomato sauces, OEC website, accessed 11 November 2022. https://oec.world/en/profile/hs/tomato-ketchup-and-other-tomato-sauces
CEPII (2022) BACI, CEPII website, accessed 11 November 2022. http://www.cepii.fr/CEPII/en/bdd_modele/bdd_modele_item.asp?id=37
H Wickham (2016) ggplot2: Elegant Graphics for Data Analysis, R Studio website, accessed 16 November 2022. https://ggplot2.tidyverse.org/
H Wickham and D Seidel (2022) scales: Scale Functions for Visualization, R Studio website, accessed 16 November 2022. https://scales.r-lib.org

Reconstruction

In this final section, you will be able to see the newly generated data visualisation, which fixes the issues with the original data visualisation listed in the Original tab.

Deconstruct, Reconstruct Web Report

Aaron Jewell s3207613
