Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Protovis: A Graphical Toolkit for Visualization, examples section.


Objective

After World War II, antibiotics earned the label “wonder drugs” because they cured many diseases that were previously considered incurable. Data was gathered to assess how well these drugs performed against different bacterial infections, which helped both medical practitioners and scientists in using and developing drugs for those infections. In 1951, Will Burtin published this visualization to assess the performance of three popular antibiotics (Neomycin, Penicillin and Streptomycin) against 16 different bacteria (Gram-negative and Gram-positive), measured in terms of minimum inhibitory concentration (the lowest concentration of a chemical that prevents visible growth of a bacterium).

The visualisation chosen had the following three main issues:

  • Confusing axis values: The axis values are shown in descending order and the units are not stated.
  • Improper scaling: Some values, for instance that of Penicillin against Mycobacterium tuberculosis, are hard to read from the graph.
  • Shape of the plot: The circular layout makes it difficult to compare values.

Reference

Code

The following code was used to fix the issues identified in the original.

library("readxl")
library(ggplot2)
library(scales)

#Solution for issue 1: Inverting the axis to ascending order
#Solution for issue 2: Rescaling the axis from micrograms per mL to picograms per mL (done in Excel on the raw data, multiplied by 1,000,000)
#Solution for issue 3: Converting the chart to a clustered column chart

d <- read_excel("C:/Users/vinay/Desktop/Data Visualisation/Assignment2/Data.xlsx", sheet = "Sheet1")

p <- ggplot(d, aes(fill = Antibiotic, y = picogram_per_milliliter, x = bacteria)) +
  geom_bar(position = "dodge", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90)) +
  # Log scale in ascending order, with breaks labelled as powers of 10
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x))) +
  # One panel per Gram stain group, each with its own set of bacteria on the x axis
  facet_wrap(~gram, strip.position = "bottom", scales = "free_x") +
  ggtitle("Burtin’s Antibiotics") +
  xlab("Bacteria") + ylab("Minimum inhibitory concentration (pg/mL)") +
  scale_fill_manual("Antibiotic", values = c("Neomycin" = "#929198", "Penicillin" = "#B2A1E9", "Streptomycin" = "#E1BCE4"))

# Display the plot
p
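
The rescaling for issue 2 was done in Excel before the data was loaded. As a rough sketch, the same unit conversion could be done in R instead. The data frame `mic` and the column `microgram_per_milliliter` below are hypothetical names, and the values are illustrative placeholders rather than the Burtin measurements; only `picogram_per_milliliter` matches a column used in the plotting code.

```r
# Illustrative placeholder values only; these are NOT the Burtin measurements.
# `mic` and `microgram_per_milliliter` are hypothetical names for this sketch.
mic <- data.frame(
  bacteria = c("Bacterium A", "Bacterium B", "Bacterium C"),
  microgram_per_milliliter = c(0.001, 0.5, 100)
)

# 1 microgram = 1,000,000 picograms, so multiply by 1e6
mic$picogram_per_milliliter <- mic$microgram_per_milliliter * 1e6

print(mic)
```

Doing the conversion in the script keeps the raw workbook unchanged and makes the factor of 1,000,000 visible alongside the plotting code.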

Data Reference

Reconstruction

The following plot fixes the main issues in the original.