Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The objective of the visualisation is to show Australian inbound travel (inbound tourism) statistics for the financial year 2018-2019.
Audience:
Australian tourism-related bodies such as the Department of Foreign Affairs and Trade, the Australian Tourism Export Council and Tourism Australia; economists; and businesses that depend on inbound travel, such as tour and travel operators, the wine and food industry, and travel insurance providers.
The visualisation chosen had the following three main issues:
The objective is better served by adding the number of adult visitors (aged 15 and over) and their expenditure to the existing visitor counts by country of residence.
For example, New Zealand ranked 2nd by number of visitors and the USA ranked 3rd, a difference of almost 6.4%. In terms of expenditure, however, USA tourists spent more (approximately 3.1% more) during their trip to Australia than New Zealand tourists.
Thus, the number of visitors from a country does not, on its own, meet the objective of presenting Australia's inbound travel statistics for the year considered.
Unusual conventions and the use of colors without purpose may confuse the audience.
For example, the same colors are used for different countries that have no relationship in this context, and the unusual shell shape, with area representing the values, makes it difficult to compare countries and to check the accuracy of the data shown.
The colors used are also not color-blind safe. People with color blindness, especially protanomaly, may be unable to see the partitions between countries, because the chart relies on area and the shell-shape convention to represent the values.
The percentages shown did not match the numbers provided.
For example:
Total visitors = 9,344,000
China visitors = 1,432,800
China visitors percentage = 1,432,800 / 9,344,000 * 100 ≈ 15.33%
whereas it was shown as 15.8%.
In addition, the percentages provided in the image add up to 103%, which indicates a data accuracy issue. The incorrect values convey a wrong message to the audience.
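The arithmetic in the third issue can be checked with a few lines of R. This is a minimal sketch that uses only the figures quoted above; the variable names are chosen here for illustration and do not come from the original code.

```r
# Figures quoted in the text above
total_visitors <- 9344000
china_visitors <- 1432800
shown_percent  <- 15.8   # value printed on the original chart

# Share recomputed from the raw counts
china_percent <- china_visitors / total_visitors * 100
round(china_percent, 2)   # 15.33

# Gap between the published and the recomputed share, in percentage points
round(shown_percent - china_percent, 2)   # 0.47
```

A gap of roughly half a percentage point for a single country is consistent with the published shares adding up to more than 100%.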
Reference
The following code was used to fix the issues identified in the original.
setwd("C:/Users/sneha/Desktop/DataVisualisation&Communication/Assignment2")
# Loading libraries
library(readxl)    # To load data from Excel files
library(dplyr)     # For data manipulation
library(stringr)   # For string manipulation
library(lubridate) # For date manipulation
library(tidyr)     # To tidy up the data
library(ggplot2)   # For generating the visualisation
#Visitors Data Loading and Manipulation
myCols <- as.character(read_excel("340105.xls",sheet="Data1",
n_max = 1, col_names = FALSE))
myCols <- c("Date",myCols)
visitors<- read_excel("340105.xls",sheet="Data1", skip = 10,col_names =myCols)
#str(visitors)
colnames(visitors) <- colnames(visitors) %>%
str_replace_all(c("Number of movements ; " = "",
" ; Short-term Visitors arriving ;" = ""))
#str(visitors)
# Keep only financial year 2018-19 (July 2018 to June 2019)
visitors_FY19 <- visitors %>%
  filter((year(Date) == 2018 & month(Date) >= 7) |
         (year(Date) == 2019 & month(Date) <= 6))
visitors_FY19_Long <- visitors_FY19 %>% gather("Country","Total_Visitors",2:70)
visitors_by_country <- visitors_FY19_Long %>% select(Country,Total_Visitors) %>%
group_by(Country) %>% summarise_all(sum)
# Flag aggregate rows ("Total ..." and "Other ...") so they can be separated from single countries
visitors_by_country <- visitors_by_country %>%
  mutate(Category = ifelse(str_detect(Country, "Total|Other"),
                           "Group of Countries",
                           "Country"))
visitors_by_country$Category[visitors_by_country$Country=="Total (Country of stay/residence)"] <- "All Countries"
visitors_by_country$Country <- visitors_by_country$Country %>%
str_replace_all(c("UK, CIs & IOM"="UK",
"United States of America"="USA",
"Korea, South"="South Korea"))
visitors_by_country <- visitors_by_country %>%
arrange(Category,desc(Total_Visitors))
Total_Visitors=visitors_by_country$Total_Visitors[visitors_by_country$Category=="All Countries"]
Country_Visitors <- visitors_by_country %>%
filter(Category=="Country") %>%
select(Country,Total_Visitors)
#Expenditure Data loading and Manipulation
expenditure <- read_excel("IVS_TOURISM_RESULTS_YE_JUN_2019.xlsx",sheet="Table 1a",
col_names=c("Country"," Adult_Visitors('000)",
"Visitor_Nights('000)",
"Total_Trip_Expenditure($M)"),range="A34:D57")
expenditure$Country <- expenditure$Country %>%
str_replace_all(c("United States of America"="USA",
"United Kingdom"="UK",
"Korea"="South Korea"))
# Flag aggregate rows ("Other ...") so they can be separated from single countries
expenditure <- expenditure %>%
  mutate(Category = ifelse(str_detect(Country, "Other "),
                           "Group of Countries",
                           "Country"))
expenditure$Category[expenditure$Country=="Total"] <- "All Countries"
expenditure <- expenditure %>% arrange(Category,desc(`Total_Trip_Expenditure($M)`))
Total_Adult_Visitors=expenditure$` Adult_Visitors('000)`[expenditure$Category=="All Countries"]
Total_Expenditure=expenditure$`Total_Trip_Expenditure($M)`[expenditure$Category=="All Countries"]
Country_Expenditure <- expenditure %>%
filter(Category=="Country") %>%
select(Country,` Adult_Visitors('000)`,
`Total_Trip_Expenditure($M)`)
#Joining Visitors and expenditure data sets
Visitors_Expenditure <- Country_Visitors %>% left_join(Country_Expenditure,by="Country")
Visitors_Expenditure <- Visitors_Expenditure[complete.cases(Visitors_Expenditure),]
Visitors_Expenditure$Total_Visitors <- Visitors_Expenditure$Total_Visitors/1000
colnames(Visitors_Expenditure)[which(names(Visitors_Expenditure) == "Total_Visitors")] <- "Total_Visitors('000)"
Visitors_Expenditure$Visitors_Percent <- Visitors_Expenditure$`Total_Visitors('000)` /(Total_Visitors/1000)*100
Visitors_Expenditure$Adult_Visitors_Percent <- Visitors_Expenditure$` Adult_Visitors('000)`/Total_Adult_Visitors*100
Visitors_Expenditure$Expenditure_Percent <- Visitors_Expenditure$`Total_Trip_Expenditure($M)`/Total_Expenditure*100
Visitors_Expenditure$Visitors_Rank <- rank(-Visitors_Expenditure$`Total_Visitors('000)`)
Visitors_Expenditure$Adult_Visitors_Rank <- rank(-Visitors_Expenditure$` Adult_Visitors('000)`)
Visitors_Expenditure$Expenditure_Rank <- rank(-Visitors_Expenditure$`Total_Trip_Expenditure($M)`)
Visitors_Expenditure_Long <- Visitors_Expenditure %>%
select(Country,`Total_Visitors('000)`,
` Adult_Visitors('000)`,
`Total_Trip_Expenditure($M)`)%>%
gather(key=Category,value=Value,`Total_Visitors('000)`,
` Adult_Visitors('000)`,
`Total_Trip_Expenditure($M)`)
Visitors_Expenditure_Long$Percentage <- c(Visitors_Expenditure$Visitors_Percent,
Visitors_Expenditure$Adult_Visitors_Percent,
Visitors_Expenditure$Expenditure_Percent)
Visitors_Expenditure_Long$Ranks <- c(Visitors_Expenditure$Visitors_Rank,
Visitors_Expenditure$Adult_Visitors_Rank,
Visitors_Expenditure$Expenditure_Rank)
#Generating Visualisation
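# Order the countries by their first appearance (descending visitor count), reversed for coord_flip()
Visitors_Expenditure_Long$Country <- Visitors_Expenditure_Long$Country %>%
  factor(levels = rev(unique(Visitors_Expenditure_Long$Country)))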
Visitors_Expenditure_Long$Category <- factor(Visitors_Expenditure_Long$Category,
levels=c("Total_Visitors('000)",
" Adult_Visitors('000)",
"Total_Trip_Expenditure($M)"),
labels=c("Total Visitors('000)",
"Adult Visitors('000)",
"Total Trip Expenditure($M)"))
p1 <- ggplot(data=Visitors_Expenditure_Long,aes(x=Country,y=Value,fill=Percentage))
p2 <- p1+geom_bar(stat="identity") + coord_flip()+
facet_grid(.~Category,scales="free") +
labs(title = "Short term Visitors to Australia(FY:2018-19)",
subtitle="Visitors and their Expenditure By Country of Residence",
caption=c(paste(" Data Sources:\n",
"https://www.abs.gov.au/\n",
"https://www.tra.gov.au/"),
"Expenditure Estimates are for international Adult
visitors(15 years and over)",
paste("Total Visitors:",Total_Visitors,"\n",
"Total Adult Visitors",Total_Adult_Visitors*1000,"\n",
"Total Expenditure:",Total_Expenditure,"(Million Dollars)")),
fill="Percentage") +
theme(axis.title.x=element_blank(),
axis.title.y=element_blank(),
plot.caption = element_text(hjust =c(0,0.5,1)),
plot.title.position = "plot",
plot.caption.position = "plot") +
geom_text(aes(label=round(Percentage,2)),colour="#000700",
hjust = "inward",size = 3)
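The reshaping step in the code above uses gather(), which tidyr has since superseded with pivot_longer(). As a minimal sketch, and not the code used for the reconstruction, the same wide-to-long reshape combined with a color-blind-safe fill could look like the following. The data frame `toy` and its values are hypothetical placeholders that only make the block runnable, scale_fill_viridis_c() is one possible palette choice rather than the one used here, and horizontal bars via `aes(x = Value, y = Country)` assume ggplot2 3.3.0 or later.

```r
library(tidyr)
library(ggplot2)

# Hypothetical placeholder values, NOT the real ABS/TRA figures
toy <- data.frame(
  Country     = c("A", "B", "C"),
  Visitors    = c(10, 20, 30),
  Expenditure = c(5, 40, 15)
)

# Wide to long: one row per Country/Category pair
toy_long <- pivot_longer(toy,
                         cols      = c(Visitors, Expenditure),
                         names_to  = "Category",
                         values_to = "Value")

# Horizontal bars, one panel per measure, with a viridis fill
p <- ggplot(toy_long, aes(x = Value, y = Country, fill = Value)) +
  geom_col() +
  facet_grid(. ~ Category, scales = "free_x") +
  scale_fill_viridis_c()
```

Printing `p` draws the chart. The viridis scales are designed to remain distinguishable under the common forms of color vision deficiency, which addresses the second issue raised about the original.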
Data Reference
Australian Bureau of Statistics. Overseas Arrivals and Departures, Australia, Feb 2020. Date published: 15 Apr 2020. Catalogue number: 3401.0. Data downloads: Table 5: Short-term movement, visitors arriving - selected countries of residence: original.
Tourism Research Australia. Home / Data and Research. Data for the year ending June 2019 from the International Visitor Survey (IVS). Website: https://www.tra.gov.au/Data-and-Research/publications
The following plot fixes the main issues in the original.