Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: howmuch.net


Objective

The pandemic has affected the revenue of almost every industry. The original data visualisation shows the negative impact of the coronavirus pandemic on the tourism economy of the countries most frequented by travellers. It compares the tourism revenue each country generated in 2019 (before the pandemic) with the revenue generated in 2020 (after the pandemic began).
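The percentage change behind this comparison can be sketched as follows. The revenue figures here are hypothetical and are not taken from the source data; the calculation assumes the change is expressed relative to the 2019 baseline.

```r
# Hypothetical revenues in USD million (invented for illustration only)
revenue_2019 <- 1000
revenue_2020 <- 400

# Percentage change relative to the 2019 baseline;
# a negative value indicates a loss in tourism revenue
pct_change <- (revenue_2020 - revenue_2019) / revenue_2019 * 100

print(pct_change)
```

With these made-up numbers the change is -60, meaning 2020 revenue is 60% below the 2019 level.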

Targeted Audience: The visualisation presents an economic trend published in the mass media. It illustrates the crisis in the tourism industry and can serve as a case study for other industries. Its target audience therefore includes the general public as well as industry experts and economists.

The visualisation chosen had the following three main issues:

  • Issue 1: The visualisation fails to show a direct comparison between the revenue generated in 2019 and in 2020. Its purpose is to convey the negative impact on the tourism economy, but at first glance the viewer sees only large circles with country flags on them. To understand the loss in tourism revenue, the audience is forced to read the printed numeric values and compare them, which defeats the purpose of a visualisation.

  • Issue 2: The color scheme fails to convey meaningful differences, because colors are assigned by interval of percentage change in revenue between 2019 and 2020. Countries that fall within the same interval share a color and are therefore hard to compare. For example, a person who wants to compare the percentage change in tourism income for Australia and Germany has to dig further into the visualisation to find the answer.

  • Issue 3: The visualisation also suffers from the area-and-size-as-quantity issue. The area of each circle encodes the country's revenue in the previous year: the larger the circle, the higher the revenue. However, areas are hard to compare accurately. It is easy to see which country has the highest revenue and which has the lowest, but countries with similar 2019 revenues cannot be reliably distinguished.
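The area problem in Issue 3 can be illustrated with a minimal sketch. It assumes circle area is proportional to revenue, and the two revenue values are hypothetical rather than taken from the source data. Because radius grows with the square root of the value, a 10% difference in revenue becomes a difference of only about 5% in radius.

```r
# Hypothetical revenues (USD million) for two countries; not from the source data
revenue <- c(A = 50000, B = 55000)

# If circle area is proportional to revenue, radius is proportional to sqrt(revenue)
radius <- sqrt(revenue / pi)

# Relative difference in revenue versus relative difference in radius
revenue_ratio <- unname(revenue["B"] / revenue["A"])
radius_ratio  <- unname(radius["B"] / radius["A"])

print(c(revenue_ratio = revenue_ratio, radius_ratio = radius_ratio))
```

The radius ratio equals the square root of the revenue ratio, so visual differences between similar countries shrink. Bar length, by contrast, scales linearly with the value.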

Reference

Code

The following code was used to fix the issues identified in the original.

# Load the required libraries. tidyverse already attaches ggplot2, readr,
# tidyr, dplyr and forcats, so the individual calls are redundant but harmless.
library(ggplot2)
library(readr)
library(tidyr)
library(dplyr)
library(tidyverse)

# Load the data taken from the Data & Sources section of the visualisation
datafile <- read_csv('data.csv')
datafile <- data.frame(datafile)
datafile <- datafile[, 2:8]

# Data preparation: keep the country, the two revenue columns and the percentage change
data <- datafile[, c(1, 4:6)]
colnames(data) <- c('Country', '2019', '2020', 'Economic_Decrement')
# Strip the "%" sign so the percentage change can be sorted numerically
data$Economic_Decrement <- as.numeric(gsub("%", "", data$Economic_Decrement))
data <- data %>%
  arrange(Economic_Decrement) # arrange() sorts ascending, so the most negative change comes first

# Reshape to long format: one row per country and year
df <- data %>%
  pivot_longer(cols = 2:3, names_to = "Year", values_to = "USD_Million")

df <- as.data.frame(df) %>%
  mutate(USD_Million = round(USD_Million, 2))

# Grouped bar chart; fct_inorder() keeps the countries in the sorted order
plot <- ggplot(df, aes(x = fct_inorder(Country), y = USD_Million,
                       fill = Year)) +
  geom_bar(stat = "identity", position = 'dodge') +
  theme_bw() +
  labs(title = "Negative Impact of Covid-19 on Tourism",
       x = 'Countries by Tourism Income Change (High to Low)',
       y = 'Tourism Income (USD Million)') +
  scale_fill_manual(values = c('#29BAB0', '#E64141')) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.4,
                                   hjust = 0.8))
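The reshaping step in the code above is what makes the side-by-side bars possible. The following is a minimal sketch of that step on a toy data frame; the country names and values are invented for illustration and do not come from the source data.

```r
library(tidyr)

# Hypothetical wide-format data: one row per country, one column per year
wide <- data.frame(
  Country = c("A", "B"),
  `2019` = c(100, 80),
  `2020` = c(40, 50),
  check.names = FALSE  # keep the non-syntactic column names "2019" and "2020"
)

# Long format: one row per country and year, as needed for a dodged bar chart
long <- pivot_longer(wide, cols = c(`2019`, `2020`),
                     names_to = "Year", values_to = "USD_Million")

print(long)
```

Each country now appears once per year, so mapping `Year` to `fill` in ggplot2 draws the 2019 and 2020 bars next to each other.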

Data Reference

Reconstruction

The following plot fixes the main issues in the original.