In this assignment, a flawed data visualization is deconstructed and then reconstructed to produce a cleaner visualization that avoids the original's issues.
Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The visualization aims to educate the audience on the characteristics of the world’s first all-electric passenger aircraft, the Alice, and to show how they compare with those of other popular aircraft. The target audience ranges from members of the general public who are intrigued by the topic to investors who want complete information on a subject before making financial decisions.
The visualization chosen had the following three main issues:
Inconsistency: The type of chart used to depict information varies for each characteristic: there are polar graphs, arrow bar charts, circular charts, etc. This is unnecessary and inconvenient, as the audience has to spend time working out how to read each chart. Also, the names of the aircraft aren’t consistent across charts. For example, the helicopter is called “Heli” in some charts and “Helicopter 300” in others. Such inconsistencies can confuse the reader and make the visualization look disorganized.
Low data-ink ratio: The picture of an airplane in the background is unnecessary; the focus should be on the data. The background colour likewise lowers the data-ink ratio. Such decorative elements add no value and distract the audience from the key insights that the visualization is trying to convey.
Poor choice of colours: In some cases, the colours used to highlight key information are so muted and light that they get missed. For example, at first glance, the chart for ‘Fuel cost’ appears to have no labels. It takes the reader some time to notice and follow the pale dotted white line that connects it to the first chart, which contains the labels. This line clearly lacks contrast with the background. Another example is the poor use of colour in the ‘Passenger Capacity’ chart: there is no clear distinction between the decorative (and unnecessary) bar-lines and the lines indicating the actual values.
Reference
Code
The following code was used to fix the issues identified in the original.
#Loading the required libraries
library(ggplot2)
library(tidytext)
library(dplyr)
#Reading the data extracted from the website mentioned in the Data Reference section
data <- read.csv('/Users/gauravprakashbharadwaj/Desktop/data.csv')
head(data)
##                     Feature    Aircraft              Type Value  X X.1 X.2
## 1 Max cruising altitude (m)  Boeing 747   Normal aircraft 10600 NA  NA  NA
## 2 Max cruising altitude (m) Airbus A320   Normal aircraft 10600 NA  NA  NA
## 3 Max cruising altitude (m)      Cessna   Normal aircraft 10600 NA  NA  NA
## 4 Max cruising altitude (m)       Alice Electric aircraft  3000 NA  NA  NA
## 5 Max cruising altitude (m) Airbus H175        Helicopter  1768 NA  NA  NA
## 6 Fuel cost for 160km (USD)  Boeing 747   Normal aircraft   644 NA  NA  NA
str(data)
## 'data.frame': 25 obs. of 7 variables:
## $ Feature : chr "Max cruising altitude (m)" "Max cruising altitude (m)" "Max cruising altitude (m)" "Max cruising altitude (m)" ...
## $ Aircraft: chr "Boeing 747" "Airbus A320" "Cessna" "Alice" ...
## $ Type : chr "Normal aircraft" "Normal aircraft" "Normal aircraft" "Electric aircraft" ...
## $ Value : int 10600 10600 10600 3000 1768 644 331 158 67 10 ...
## $ X : logi NA NA NA NA NA NA ...
## $ X.1 : logi NA NA NA NA NA NA ...
## $ X.2 : logi NA NA NA NA NA NA ...
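The `str()` output shows three columns (`X`, `X.1`, `X.2`) that contain only `NA`, which usually come from stray empty columns in the CSV file. They do not affect the plot, but dropping them keeps the data frame tidy. A minimal sketch of one way to do that, using a small stand-in data frame (`toy`, an assumption for illustration) built from rows printed above rather than the original file:

```r
# Stand-in for the imported data: rows taken from the head(data) output above
toy <- data.frame(
  Feature  = "Max cruising altitude (m)",
  Aircraft = c("Boeing 747", "Alice", "Airbus H175"),
  Value    = c(10600, 3000, 1768),
  X        = NA,
  X.1      = NA
)

# Keep only the columns that have at least one non-missing value
keep <- colSums(!is.na(toy)) > 0
toy_clean <- toy[, keep, drop = FALSE]

names(toy_clean)
```

The same two lines could be applied to `data` right after `read.csv()`, assuming the empty columns are not needed later.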
#Performing the necessary type conversions
#The order of the factor levels sets the legend order (Type) and the facet order (Feature)
data$Type <- factor(data$Type, levels=c("Normal aircraft", "Electric aircraft", "Helicopter"))
data$Aircraft <- as.factor(data$Aircraft)
data$Feature <- factor(data$Feature, levels=c('Range (km)',
  'Max cruising altitude (m)', 'Top speed (km/h)', 'Fuel cost for 160km (USD)', 'Passenger capacity'))
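`factor()` silently converts any value that is not listed in `levels` to `NA`, so a typo in a level name would drop bars from the plot without raising an error. A small check along the following lines can catch that. `to_factor_checked` is a hypothetical helper written for this illustration; it is not part of the original code.

```r
# Hypothetical helper: convert to a factor and fail loudly on unlisted values
to_factor_checked <- function(x, levels) {
  unknown <- setdiff(unique(x), levels)
  if (length(unknown) > 0) {
    stop("Values missing from levels: ", paste(unknown, collapse = ", "))
  }
  factor(x, levels = levels)
}

type_levels <- c("Normal aircraft", "Electric aircraft", "Helicopter")
types <- c("Helicopter", "Normal aircraft", "Electric aircraft")

f <- to_factor_checked(types, type_levels)
levels(f)  # level order follows type_levels, not alphabetical order
```

Calling `to_factor_checked("Heli", type_levels)` stops with an error instead of returning `NA`.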
#Using the ggplot function to create the final visualization
#Horizontal bars, one panel per feature, with each value printed inside its bar
s1 <- ggplot(data=data, aes(x=reorder(Aircraft, Value), y=Value, fill=Type)) +
  geom_bar(stat="identity") + coord_flip() + facet_wrap(vars(Feature), scales="free_x") +
  theme(strip.text.x=element_text(size=7.5, face="bold")) +
  geom_text(aes(label=Value), size=2.5, position=position_stack(vjust=0.5))
#Adding the title, subtitle and caption
s2 <- s1 + labs(title="Comparison of Alice with non-electric aircrafts",
  subtitle="Understanding the characteristics of Alice, the world's first electric airplane, versus popular non-electric aircrafts",
  caption="*Cessna refers to Cessna Citation XLS
Source: Information is Beautiful- https://informationisbeautiful.net/beautifulnews/1063-all-electric-plane/
")
#Removing the gridlines and using a plain white panel background
s3 <- s2 + theme(axis.text.x=element_text(size=7, angle=0, face="bold"),
  axis.text.y=element_text(size=7, face="bold"),
  panel.grid.major=element_blank(),
  panel.grid.minor=element_blank(),
  panel.background=element_rect(fill="white"))
#Setting the fill colours, the legend and the fonts
s4 <- s3 + scale_fill_manual(values=c("lightgreen", "gold", "violet")) +
  labs(fill="Type of aircraft") +
  theme(legend.title=element_text(size=7.5, face="bold"),
    legend.text=element_text(size=7),
    legend.background=element_rect(fill="grey85"),
    axis.title.x=element_blank(),
    axis.title.y=element_blank(), legend.position=c(1,0),
    legend.justification=c(1.3, -0.6)) +
  theme(text=element_text(family="Helvetica"),
    title=element_text(size=10, face="bold"),
    plot.subtitle=element_text(size=9, face="plain"),
    plot.caption=element_text(size=7, face="plain", hjust=1))
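`tidytext` is loaded but not used in the code shown. One common reason to load it alongside `ggplot2` is `reorder_within()`, which orders bars separately inside each facet, whereas `reorder(Aircraft, Value)` produces a single ordering shared by all facets. The sketch below shows that alternative on made-up values (`toy` and its numbers are placeholders, not the assignment's data). It may not be what the author intended, and it trades a shared aircraft axis for per-panel ordering.

```r
library(ggplot2)
library(tidytext)

# Placeholder data: two features that rank the same three aircraft differently
toy <- data.frame(
  Feature  = rep(c("Feature 1", "Feature 2"), each = 3),
  Aircraft = rep(c("A", "B", "C"), times = 2),
  Value    = c(30, 10, 20, 1, 3, 2)
)

# Build a per-facet ordering key: Aircraft ordered by Value within each Feature
toy$Aircraft_ordered <- reorder_within(toy$Aircraft, toy$Value, toy$Feature)

p <- ggplot(toy, aes(x = Value, y = Aircraft_ordered)) +
  geom_col() +
  scale_y_reordered() +                       # removes the facet suffix from axis labels
  facet_wrap(vars(Feature), scales = "free")  # each panel gets its own axes

p
```

Mapping `Value` to `x` and the aircraft to `y` gives horizontal bars directly, so `coord_flip()` is not needed in this variant.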
Data Reference
Reconstruction
The following plot fixes the main issues in the original.