Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The purpose of the visualization is to show the growth in international travel after the Second World War, from the 1950s onward. The data, sourced from the United Nations World Tourism Organization (UNWTO) Tourism Barometer, cover travel from 1950 to 2018 and show the boom in the travel industry over that period.
The visualization, sourced from Our World in Data (OWID), illustrates the distribution of international tourist arrivals by region. At a quick glance, Europe appears to be the most favored region because it occupies the largest area of the graph. In the early 1960s, Asia and the Pacific had a very small proportion of international travelers compared with Europe or America; however, the growing area of the graph shows a steady increase in the number of international travelers visiting Asia and the Pacific since the early 2000s.
The target audience could include tourism-related government departments and organisations such as Foreign Affairs and Trade; businesses related to inbound travel, such as tour, travel and expo organizers, the hotel industry, the wine and food industry and travel insurance providers; and border security and law enforcement authorities. It also extends to the many followers of OWID, a scientific online publication that focuses on large global problems and has a wide audience of students, researchers, journalists, policy makers and others interested in its publications.
The visualisation chosen had the following issues:
Incorrect choice of graph for visualization: The visualization was published to show readers the boom in the travel industry after the Second World War, from the 1950s onward; however, the type of graph chosen by the author does not simplify the information for the end user. The visualization plots international travel numbers as an area for each region; because arrivals are a quantitative variable recorded over time, the data should have been plotted as a time series graph to show the increase in numbers over the years.
Therefore, the visualization makes it difficult to understand the question the author was trying to answer.
Choice of color (hues): The hues used are shades of red, green, blue and yellow. While the author's intention may have been to captivate readers with eye-catching colors, the Evergreen and Emery (2016) Data Visualization Checklist recommends avoiding red-green and yellow-blue combinations when those colors touch one another.
Therefore, the hues selected for this visualization are not color-blind friendly, as the colors are not legible for people with color blindness.
Incorrect choice of visualization: The visualization does not pass the Trifecta Checkup.
Based on Fung (2014), even though the author had access to a good source of data, the visualization has not been executed well, and what is presented is rather confusing to the end user.
Therefore, based on all of the above, the visualization fails the Trifecta Checkup as Type QV. As Kaiser Fung puts it: “The data has been properly collected and processed. However, the question being addressed has not been clearly defined, and the graphical design fails to bring out the key features of the data.”
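The first two issues point towards a time series plot drawn with a color-blind-friendly palette. The following is a minimal sketch of that idea, not the author's reconstruction. It assumes a line chart is an acceptable form of time series graph, uses ggplot2's built-in viridis scale as one possible palette, and contains only the six Africa rows that appear in the head() output of this report; the name arrivals is illustrative.

```r
library(ggplot2)
library(scales)

# Africa rows copied from the head() output shown in this report.
# The full CSV contains more regions; they are omitted here on purpose.
arrivals <- data.frame(
  region = "Africa",
  year = c(1950, 1960, 1965, 1970, 1975, 1980),
  int_tourist_arrv = c(500000, 800000, 1400000, 2400000, 4700000, 7200000)
)

# One line per region over time, with a discrete viridis palette
# (an assumed choice of color-blind-friendly scale).
p <- ggplot(arrivals, aes(x = year, y = int_tourist_arrv, colour = region)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = comma) +
  scale_colour_viridis_d(end = 0.9) +
  labs(x = "Year", y = "International tourist arrivals", colour = "Region")

print(p)
```

With the full dataset, each region would get its own line and its own viridis color, so adjacent series are separated by lightness as well as hue.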
Reference
Roser, M. (2017). “Tourism”. Published online at OurWorldInData.org. Retrieved on 27 July 2021 from https://ourworldindata.org/tourism [Online Resource]
Evergreen, S. and Emery, A. (2016). “Data Visualization Checklist.” May 2016. Accessed on 28 July 2021 from http://stephanieevergreen.com/wp-content/uploads/2016/10/DataVizChecklist_May2016.pdf
Fung, K. (2014). “Junk Charts Trifecta Checkup: The Definitive Guide.” Junk Charts. Accessed on 31 July 2021 from https://junkcharts.typepad.com/junk_charts/junk-charts-trifecta-checkup-the-definitive-guide.html
Our World in Data. “About”. Accessed on 31 July 2021 from https://ourworldindata.org/about
The following code was used to fix the issues identified in the original.
# install.packages
library(ggplot2)
library(readr)
library(tidyr)
library(dplyr)
library(scales)
library(viridis)
library(reshape2)
#load the dataset
international_tourist_arrivals_by_world_region_2 <- read_csv("international-tourist-arrivals-by-world-region 2.csv")
View(international_tourist_arrivals_by_world_region_2)
#copy the dataset to a shorter name (data.frame() replaces spaces in column names with dots)
int_tourist_arrv_by_wrld_reg <- data.frame(international_tourist_arrivals_by_world_region_2)
str(int_tourist_arrv_by_wrld_reg)
## 'data.frame': 205 obs. of 4 variables:
## $ Entity : chr "Africa" "Africa" "Africa" "Africa" ...
## $ Code : chr NA NA NA NA ...
## $ Year : int 1950 1960 1965 1970 1975 1980 1981 1982 1983 1984 ...
## $ International.Tourist.Arrivals: int 500000 800000 1400000 2400000 4700000 7200000 8100000 7600000 8200000 8900000 ...
#keeping only the needed columns, which drops "Code" (taken from the original tibble, so column names keep their spaces)
keeps <- c("Entity", "Year", "International Tourist Arrivals")
int_tourist_arrv_by_wrld_reg <- international_tourist_arrivals_by_world_region_2[keeps]
#checking the data
head(int_tourist_arrv_by_wrld_reg)
## # A tibble: 6 x 3
## Entity Year `International Tourist Arrivals`
## <chr> <int> <int>
## 1 Africa 1950 500000
## 2 Africa 1960 800000
## 3 Africa 1965 1400000
## 4 Africa 1970 2400000
## 5 Africa 1975 4700000
## 6 Africa 1980 7200000
str(int_tourist_arrv_by_wrld_reg)
## tibble[,3] [205 × 3] (S3: tbl_df/tbl/data.frame)
## $ Entity : chr [1:205] "Africa" "Africa" "Africa" "Africa" ...
## $ Year : int [1:205] 1950 1960 1965 1970 1975 1980 1981 1982 1983 1984 ...
## $ International Tourist Arrivals: int [1:205] 500000 800000 1400000 2400000 4700000 7200000 8100000 7600000 8200000 8900000 ...
## - attr(*, "spec")=List of 2
## ..$ cols :List of 4
## .. ..$ Entity : list()
## .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
## .. ..$ Code : list()
## .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
## .. ..$ Year : list()
## .. .. ..- attr(*, "class")= chr [1:2] "collector_integer" "collector"
## .. ..$ International Tourist Arrivals: list()
## .. .. ..- attr(*, "class")= chr [1:2] "collector_integer" "collector"
## ..$ default: list()
## .. ..- attr(*, "class")= chr [1:2] "collector_guess" "collector"
## ..- attr(*, "class")= chr "col_spec"
#renaming the columns
names(int_tourist_arrv_by_wrld_reg)[names(int_tourist_arrv_by_wrld_reg) == "Entity"] <- "region"
names(int_tourist_arrv_by_wrld_reg)[names(int_tourist_arrv_by_wrld_reg) == "Year"] <- "year"
names(int_tourist_arrv_by_wrld_reg)[names(int_tourist_arrv_by_wrld_reg) == "International Tourist Arrivals"] <- "int_tourist_arrv"
#converting international tourist arrivals to numeric
int_tourist_arrv_by_wrld_reg$int_tourist_arrv<- as.numeric(int_tourist_arrv_by_wrld_reg$int_tourist_arrv)
#converting year to an ordered factor at five-year intervals
#note: years not listed in levels (e.g. 1950, 1981) become NA
int_tourist_arrv_by_wrld_reg$year <- factor(int_tourist_arrv_by_wrld_reg$year, levels = c("1960", "1965", "1970", "1975", "1980", "1985", "1990", "1995", "2000", "2005", "2010", "2015", "2020"), ordered = TRUE)
levels(int_tourist_arrv_by_wrld_reg$year)
## [1] "1960" "1965" "1970" "1975" "1980" "1985" "1990" "1995" "2000" "2005"
## [11] "2010" "2015" "2020"
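The factor conversion above keeps only the listed five-year levels, so it can help to check which years are turned into NA before plotting. A minimal base R sketch, using the first ten Year values shown in the str() output; the names years, keep_levels, year_f and dropped are illustrative and not part of the original script:

```r
# First ten Year values as printed by str() in this report
years <- c(1950, 1960, 1965, 1970, 1975, 1980, 1981, 1982, 1983, 1984)

# Same thirteen levels as in the script: 1960, 1965, ..., 2020
keep_levels <- as.character(seq(1960, 2020, by = 5))

# factor() matches the values against the levels as character strings;
# anything without a matching level becomes NA
year_f <- factor(years, levels = keep_levels, ordered = TRUE)

# Years that would be lost by the conversion
dropped <- years[is.na(year_f)]
print(dropped)
# [1] 1950 1981 1982 1983 1984
```

If the NA rows are not wanted in the plot, they could be removed afterwards, for example with dplyr's filter(!is.na(year)).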
Data Reference
Data published by the United Nations World Tourism Organization (UNWTO) in the Tourism Barometer. Retrieved on 27 July 2021 from https://ourworldindata.org/tourism
Roser, M. (2017). “Tourism”. Published online at OurWorldInData.org. Retrieved on 27 July 2021 from https://ourworldindata.org/tourism [Online Resource]
The following plot fixes the main issues in the original.