Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: ABC News (2019).


Objective

The objective of the original infographic was to display the number of laboratory-confirmed influenza infections and the infection rate (per 100 000 people) in Australia for 2019, compared across states and territories. The visualisation was aimed at the general Australian public.
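Because counts and rates answer different questions, it can help to see how a rate per 100 000 people is derived. The sketch below is illustrative only: the helper `rate_per_100k` and the figures for the two imaginary regions are made up and are not taken from the NNDSS data.

```r
# Hypothetical helper: notifications per 100 000 residents.
rate_per_100k <- function(cases, population) {
  cases / population * 100000
}

# Made-up figures for two imaginary regions (not real data).
small_region <- rate_per_100k(cases = 500, population = 250000)
large_region <- rate_per_100k(cases = 5000, population = 5000000)

# The region with fewer cases can still have the higher rate.
small_region > large_region
```

In this toy example the smaller region reports a tenth of the cases but has the higher rate, which is why the two measures are best read on separate scales.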

The visualisation chosen had the following three main issues:

  • Dual axes were used to show the number and the rate of confirmed infections on the same graph. This is easy to misread, especially because the left and right axes have different scales and different tick intervals.
  • Colour issues. Using a different colour for each state was unnecessary and added no value, because the states were already identified by the x-axis labels. The colours used were also too bright and saturated.
  • The scale of the left y-axis and the alphabetical order of the x-axis together make it difficult to compare states with low (ACT, NT, TAS) or similar (QLD, VIC) numbers of influenza cases.

Reference

Code

The following code was used to fix the issues identified in the original.

library(readr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(cowplot)

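# Read the wide tables: one row per year, one column per state or territory.
number <- read_csv("number.csv")
rate <- read_csv("rate.csv")

# Drop column 10 (assumed here to be the national total) and reshape to long format.
number <- number[, -10] %>% gather("state", "number", ACT:WA)
rate <- rate[, -10] %>% gather("state", "rate", ACT:WA)

# Combine counts and rates, then keep 2019 only.
data <- merge(number, rate, by = c("Year", "state"))
data <- data %>% filter(Year == 2019)

# Order the states by rate so the bars in the first plot are sorted.
data$state <- data$state %>% factor(levels = data$state[order(data$rate)])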

# Plot 1: rate per 100 000 people, one colour for all bars.
p1 <- ggplot(data, aes(x = state, y = rate))
p3 <- p1 + geom_col(fill = "thistle3") +
  labs(title = "Rate and number of influenza infections (laboratory confirmed), Australia",
       subtitle = "By State, 2019",
       y = "Rate per 100 000 people",
       x = "State") +
  coord_flip() +
  geom_text(aes(label = round(rate, 2)), hjust = -.2, size = 3) +
  # The upper limit leaves room for the value labels.
  scale_y_continuous(limits = c(0, 1500)) +
  theme_minimal()

# Re-order the states by number of infections for the second plot.
data$state <- data$state %>% factor(levels = data$state[order(data$number)])

# Plot 2: number of infections, on its own axis instead of a second y-axis.
p2 <- ggplot(data, aes(x = state, y = number))
p4 <- p2 + geom_col(fill = "slategray3") +
  labs(y = "Number of infections", x = "State") +
  coord_flip() +
  geom_text(aes(label = number), hjust = -.2, size = 3) +
  scale_y_continuous(limits = c(0, 110000)) +
  theme_minimal()

# Stack the two plots vertically and display the result.
p <- plot_grid(p3, p4, nrow = 2)
p
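The reordering of states by value, rather than alphabetically, is what makes the bars easy to compare. As a minimal sketch of that idea using only base R, `reorder()` sorts factor levels by an accompanying value. The data frame `toy` and its numbers are hypothetical and are not the NNDSS figures.

```r
# Hypothetical toy values, not the NNDSS figures.
toy <- data.frame(
  state = c("ACT", "NSW", "NT", "QLD"),
  rate  = c(40, 10, 30, 20),
  stringsAsFactors = FALSE
)

# Default factor levels are alphabetical, as in the original chart.
alphabetical <- factor(toy$state)

# reorder() sorts the levels by the accompanying value, in ascending order.
by_rate <- reorder(toy$state, toy$rate)

levels(alphabetical)
levels(by_rate)
```

Passing a factor ordered this way to `aes(x = ...)` should give sorted bars in the same way as the `factor(levels = ...)` calls in the code above.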

Data Reference

Notifications of influenza by State and Territory and Year (2019). Retrieved 12 September 2019 from the National Notifiable Diseases Surveillance System (NNDSS) website: http://www9.health.gov.au/cda/source/rpt_4.cfm

Reconstruction

The following plot fixes the main issues in the original.