The Original, Code and Reconstruction sections below describe the issues with the original visualisation and how they were fixed.

Original


Source: United States Geological Survey (USGS) (2010).


Objective

The original data visualisation uses a “cylinder and pipe” layout to show the source of North Carolina’s freshwater, either surface water or groundwater, and the purposes for which the water was used in 2010.

The general purpose of the article is to investigate the stress on existing supplies and to evaluate and improve possible water-supply management options.

The visualisation chosen had the following three main issues:

  • The original diagram does not clearly show how water from each source is distributed across the categories. From the diagram alone, we cannot tell which of the 8 categories received the most or the least water from each source.

  • The diagram depicts the flow of water from the sources hierarchically: pipes lead out of the surface water and groundwater cylinders to the top row of categories and then continue to the bottom row. This is deceptive because it misrepresents how the water is distributed.

  • The original image relies on green and blue, a colour pairing that should be avoided because colour-blind users may find the two hard to tell apart.

Reference

United States Geological Survey (USGS). (2010). Water Use in North Carolina, 2010. Retrieved September 17, 2019, from USGS.gov: https://www.usgs.gov/centers/sa-water/science/water-usenc-2010?qt-science_center_objects=0#qt-science_center_objects

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)

# One row per source/category pair; len is the amount withdrawn
# (million gallons per day, per the y-axis label of the plot).
df2 <- data.frame(Source_water = rep(c("Surface","Groundwater"), each = 8),
                  cat = rep(c("Public Supply","Domestic","Irrigation","Livestock","Aquaculture","Industrial","Mining","Thermal"),2),
                  len = c(766, 0.9, 279, 15, 1450, 188, 4.9, 7660, 194, 231, 88, 57, 12, 84, 28, 0.99))

head(df2)
##   Source_water           cat    len
## 1      Surface Public Supply  766.0
## 2      Surface      Domestic    0.9
## 3      Surface    Irrigation  279.0
## 4      Surface     Livestock   15.0
## 5      Surface   Aquaculture 1450.0
## 6      Surface    Industrial  188.0
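# Dodged bar chart of withdrawals by category and source. The y-axis uses the
# natural log of the values; the bar labels show the untransformed values.
plt <- ggplot(data=df2, aes(x=cat, y=log(len), fill=Source_water)) +
  geom_bar(stat="identity", position=position_dodge()) +
  geom_text(aes(label=len), vjust=1.6, color="black",
            position = position_dodge(0.9), size=3.5)+
  labs(title = "Source and Use of Freshwater in North Carolina, 2010",
       x = "Categories",
       y = "log of million gallons per day")

# Display the plot.
plt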
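The code above plots `log(len)`, so the y-axis ticks are in natural-log units rather than gallons. As a hedged alternative sketch (not the method used for the reconstruction), the untransformed values can be plotted and the axis itself put on a log scale with ggplot2's `scale_y_log10()`, which keeps the tick labels in the original units. The object name `plt_log10`, the label offset and the legend title are illustrative choices, and the unit is assumed from the original axis label.

```r
library(ggplot2)

# Same figures as above (assumed to be million gallons per day).
df2 <- data.frame(
  Source_water = rep(c("Surface", "Groundwater"), each = 8),
  cat = rep(c("Public Supply", "Domestic", "Irrigation", "Livestock",
              "Aquaculture", "Industrial", "Mining", "Thermal"), 2),
  len = c(766, 0.9, 279, 15, 1450, 188, 4.9, 7660,
          194, 231, 88, 57, 12, 84, 28, 0.99),
  stringsAsFactors = FALSE
)

# Map the raw values to y and let the scale apply the log10 transform,
# so the axis is labelled in the original units.
plt_log10 <- ggplot(df2, aes(x = cat, y = len, fill = Source_water)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = len), vjust = -0.4, color = "black",
            position = position_dodge(0.9), size = 3.5) +
  scale_y_log10() +
  labs(title = "Source and Use of Freshwater in North Carolina, 2010",
       x = "Categories",
       y = "Million gallons per day (log10 scale)",
       fill = "Source")

print(plt_log10)
```

As with `log(len)`, bars on a log axis are anchored at 1, so the two values below 1 (0.9 and 0.99) are drawn downward from that baseline.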

Data Reference

The code to plot the bar chart was adapted from: http://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization

Reconstruction

The following plot fixes the main issues in the original.

  • The reconstructed visualisation, a bar chart, is one of the most intuitive ways to compare how water from each source is distributed across the categories, and it makes differences in supply easy to spot. From groundwater, the most water was supplied for domestic use and the least for thermoelectric power; from surface water, the most went to thermoelectric power and the least to domestic use.

  • The bar chart shows how the water in North Carolina is distributed across the 8 categories, eliminating the perception of water flowing from one row to the other.

  • To address the colour-blindness issue, the colours chosen for the bar plot are red and turquoise.
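The largest and smallest uses quoted in the first point can be checked directly against the data rather than read off the bars. The following is a minimal base-R sketch using the same figures as the Code section; the helper object `extremes` and its column names are illustrative and not part of the original analysis.

```r
# Same figures as in the Code section.
df2 <- data.frame(
  Source_water = rep(c("Surface", "Groundwater"), each = 8),
  cat = rep(c("Public Supply", "Domestic", "Irrigation", "Livestock",
              "Aquaculture", "Industrial", "Mining", "Thermal"), 2),
  len = c(766, 0.9, 279, 15, 1450, 188, 4.9, 7660,
          194, 231, 88, 57, 12, 84, 28, 0.99),
  stringsAsFactors = FALSE
)

# For each source, find the category with the largest and smallest value.
extremes <- do.call(rbind, lapply(split(df2, df2$Source_water), function(d) {
  data.frame(
    Source_water = d$Source_water[1],
    max_cat = d$cat[which.max(d$len)],
    min_cat = d$cat[which.min(d$len)],
    stringsAsFactors = FALSE
  )
}))

print(extremes)
```

With these figures, the groundwater row reports Domestic as the largest use and Thermal as the smallest, and the surface water row reports the reverse.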