Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The visualisation above was published by HowMuch to compare the most expensive skyscrapers built within the last 30 years. It lets the audience compare the buildings by height, location and cost and look for possible trends in the data.
To understand the target audience of the visualisation, it is important to first look at where the chart was published. HowMuch.net is a financial literacy website that focuses on presenting facts and data visualisations to teach its audience about money-related topics. This specific visualisation was found under the real estate tab of the website, which helps narrow down the target audience.
The visualisation was designed to catch the eye rather than solely to convey factual information, as shown by its bright colours and intriguing graphics. If the sole purpose were to inform, a much simpler bar chart would have been used, since the visuals make the chart somewhat harder to understand. For this reason, industry professionals and those seriously interested in skyscrapers are unlikely to be the main audience, as they would prefer a visualisation that prioritises information over distracting visuals.
Taking both of these factors into account, I concluded that the intended audience for this chart is members of the public interested in money and real estate.
The visualisation chosen had the following three main issues:
The images of the skyscrapers are very distracting and could mislead the viewer. The figures make it seem as though the height and size of each image represent the actual size of the building; however, this is not the case, as the size of the image is dictated by the cost. The sizing of the images also makes the graph seem very cluttered on the less expensive side, as the images are so small and close together.
The x-axis tick mark labels are difficult to read on the right-hand side of the chart, due to a combination of angled text, x-axis tick marks that are hard to see, and labels bunched closely together. On the expensive side the labels are quite easy to read, but labels such as those for the Antilia skyscraper or the Burj Khalifa are hard to read, and it is difficult to see which label belongs to which skyscraper. This is especially concerning because the target audience may not know the exact shape of each building well enough to match it to its label, so the building labels are very important here.
The colour on this graph holds no meaning, making it more distracting than helpful. The random colours used for each skyscraper are distracting and overwhelming at first glance, and once the graph is understood it is evident that colour was used purely to make the graph stand out. Colour in graphs should hold meaning and be used to make the graph simpler to read or to give the audience extra information.
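The first issue can be made concrete with a little arithmetic. The sketch below assumes that each image is scaled in both width and height in proportion to cost; this is an assumption, since the original chart does not document its scaling rule, and the two cost values are hypothetical rather than taken from the dataset.

```r
# Hypothetical costs in $ billions (illustrative values, not from the dataset)
cost_a <- 2
cost_b <- 4

# Assumption: width and height of each image are both proportional to cost,
# so the drawn area grows with the square of the cost ratio.
cost_ratio <- cost_b / cost_a
area_ratio <- cost_ratio^2

cost_ratio  # 2: the second building costs twice as much
area_ratio  # 4: but its image would cover four times the area
```

Under that assumption, a viewer who judges by area would overestimate the difference in cost, which is one reason a plain bar of fixed width is easier to read.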
Reference
The following code was used to fix the issues identified in the original.
library(ggplot2)
#Load in dataset
skyscraper <- read.csv("Skyscraper_data.csv", header = TRUE)
#Rename columns
colnames(skyscraper) <- c("Location", "Name", "Construction_year", "Cost")
#Create vector with continent data
continent <- as.factor(c("Asia", "Asia", "Asia", "Asia", "Asia", "Europe", "Europe", "Asia", "North America", "Asia", "Asia", "North America", "North America", "North America", "Asia", "Asia"))
#Add continent vector to skyscraper dataset
skyscraper$Continent <- continent
#Create plot: horizontal bars ordered by cost and coloured by continent
p1 <- ggplot(skyscraper, aes(x = Cost, y = reorder(paste(Name, Construction_year), Cost), fill = Continent))
#geom_col() draws bars from the values in the data (same as geom_bar(stat = "identity"))
p1 <- p1 + geom_col(width = 0.8) +
  labs(title = "16 Most Expensive Skyscrapers Built in the last 30 Years", x = "Cost ($Billions)", y = "Skyscraper") +
  #Remove grid lines and background to reduce clutter
  theme(plot.title = element_text(hjust = 1), panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_blank()) +
  #Print the cost just past the end of each bar
  geom_text(aes(label = paste(Cost, "B")), hjust = -0.25, size = 3) +
  #Leave room on the right for the cost labels
  xlim(c(0, 16)) +
  #One colour per continent
  scale_fill_manual(values = c("#39cedb", "#dba839", "#be03fc"))
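The continent vector in the code above is matched to the rows purely by position, so it relies on the CSV keeping its row order. One possible alternative is a lookup keyed on location. The following is a minimal sketch only: the `demo` rows, the building names and the `continent_lookup` vector are hypothetical and are not taken from the original dataset.

```r
# Hypothetical rows for illustration only; the real file has 16 skyscrapers
demo <- data.frame(
  Location = c("Dubai", "London", "New York"),
  Name = c("Tower A", "Tower B", "Tower C"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from location to continent
continent_lookup <- c(
  "Dubai" = "Asia",
  "London" = "Europe",
  "New York" = "North America"
)

# Index the named vector by Location so each row gets its own continent,
# whatever order the rows arrive in
demo$Continent <- factor(unname(continent_lookup[demo$Location]))

# Fail early if any location is missing from the lookup
stopifnot(!anyNA(demo$Continent))

demo
```

With this approach, reordering or filtering the rows does not change which continent each skyscraper is assigned.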
Data Reference
The following plot fixes the main issues in the original.