Visualisation Reconstruction

Author

Zachary Ferlazzo


Source: Creative Spirits (2022).


Objective and Intended Audience

This data visualisation from Creative Spirits aims to highlight the over-representation of Indigenous Australians in the prison system across all Australian states and territories. It compares the relatively small Indigenous share of each state's population against the disproportionately high Indigenous share of its prison population. Creative Spirits designed it for the general Australian public, to raise awareness of this critical social issue. Their website indicates they specifically target students seeking accurate information, organisations requiring evidence-based data, and anyone interested in understanding the challenges facing Indigenous Australians.

Key Visualisation Issues

  • Dual Y-Axis The most significant flaw is the dual y-axis design, which unnecessarily complicates interpretation even though both variables are percentages with similar value ranges. Viewers may easily overlook the second axis denoting state Indigenous populations and misread the data using only the left-hand scale. The differing scales obscure the very disparity the visualisation aims to highlight. This confusion is particularly evident when comparing New South Wales and the Northern Territory. The visual distance between the yellow line and the top of the black bar appears similar in both states, yet the actual gaps differ greatly: approximately 20 percentage points for New South Wales but closer to 55 for the Northern Territory. Rather than clarifying the issue, the dual axis introduces ambiguity and increases the likelihood of misinterpretation.

  • Use of Colour The visualisation’s colour palette, inspired by the Aboriginal flag, is aesthetically striking and symbolically meaningful. However, in good data visualisation practice, colour should serve a strategic purpose beyond aesthetics, particularly when conveying important information. The highly saturated colours, while visually bold, are taxing during extended viewing. More significantly, the colours miss crucial opportunities to enhance comprehension. They could have been used to distinguish which population variable corresponds to which axis, or to emphasise the key disparities the visualisation aims to highlight. Instead, the colour scheme gives all elements equal visual weight and fails to guide the viewer’s attention to the most important information.

  • Use of Line Elements The line representing Indigenous population percentages across states undermines rather than supports the visualisation’s goal. While it successfully shows population variation between states, the primary objective of the plot is to reveal disparities within each state. Furthermore, the diagonal trajectory of the line means it intersects the bars at varying angles rather than at 90 degrees, making it difficult to accurately read population figures. This design element adds visual noise without contributing meaningful information, ultimately distracting from the intended message.
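The dual-axis problem described above comes down to simple arithmetic: a value's height on the page depends on which axis it is drawn against. The R sketch below uses invented axis limits and values for two placeholder states, "A" and "B". These are assumptions chosen for illustration, not figures read from the Creative Spirits chart. It shows how two very different gaps can be drawn with the same visual distance.

```r
# Height of a value as a fraction of the plot, for an axis running from 0 to axis_max.
plot_height <- function(value, axis_max) value / axis_max

# Hypothetical axis limits (assumptions for illustration only).
left_max  <- 100  # left axis: share of prison population (%)
right_max <- 40   # right axis: share of total population (%)

# Hypothetical values for two placeholder states.
states <- data.frame(
  state      = c("A", "B"),
  prison     = c(24, 84),  # drawn against the left axis
  population = c(3, 27)    # drawn against the right axis
)

# Gap as it appears on the page (fraction of plot height).
states$visual_gap <- plot_height(states$prison, left_max) -
  plot_height(states$population, right_max)

# Gap in the underlying data (percentage points).
states$actual_gap <- states$prison - states$population

print(states)
```

With these made-up numbers both states show the same visual gap, even though the underlying gap for state B is far larger than for state A. Plotting both variables on one shared axis removes this distortion.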

Conclusion

Despite addressing an important social issue, this visualisation’s technical execution restricts its effectiveness. Simplifying the axis structure, employing more purposeful colour choices, and reconsidering the line element would significantly improve the chart’s ability to communicate its crucial message about Indigenous Australian incarceration rates.

References

The following code was used to read in the ABS Census and prison data, preprocess it, and create a new plot using ggplot2.

# Load necessary packages
library(tidyverse)  
library(readxl)
library(scales)
library(janitor)

# Configuration settings
ROUNDING_DIGITS <- 1
Y_AXIS_MAX <- 90
BAR_WIDTH <- 0.8
LABEL_SIZE <- 3

# Read population data
pop_df <- read_excel("data/population_16.xls", sheet = "Table 1", skip = 5)

# Preprocess population data
pop_df <- pop_df %>% 
  # Remove NA rows and transpose
  na.omit() %>%
  t() %>%
  # Convert to data frame
  as.data.frame() %>%
  # Convert states to a column 
  rownames_to_column("state") %>%
  # Set first row as column names
  janitor::row_to_names(1) %>%
  # Clean column names before slice
  janitor::clean_names() %>% 
  # Keep only the 8 states/territories
  slice(1:8) %>%
  # Select the columns needed
  select(state = 1, indigenous_total = 5, total = 8) %>%
  # Calculate indigenous population percentage
  mutate(
    across(c(indigenous_total, total), as.numeric),
    indigenous_percentage = round(
      indigenous_total / total * 100, 
      digits = ROUNDING_DIGITS
    )
  )

# Read prison population data
prs_df <- read_excel("data/prison_stats_16.xls", sheet = "Table_13", skip = 4)

# Preprocess prison data
prs_df <- prs_df %>% 
  # Remove NA rows and transpose
  na.omit() %>%
  t() %>%
  # Convert to data frame
  as.data.frame() %>%
  # Convert states to a column 
  rownames_to_column("state") %>%
  # Set first row as column names
  janitor::row_to_names(1) %>%
  # Clean column names before slice
  janitor::clean_names() %>% 
  # Keep only the 8 states/territories
  slice(1:8) %>%
  # Select the columns needed
  select(
    state = 1,
    all_prisoners = 2,
    indigenous_prisoners = 5,
    non_indigenous_prisoners = 6,
    indigenous_prison_percentage = 20,
    non_indigenous_prison_percentage = 21
  ) %>%
  # Align state names with population data (done after select(), which
  # would otherwise replace the aligned names with the values in column 1)
  mutate(state = pop_df$state) %>%
  # Convert to numeric
  mutate(across(-state, as.numeric))


# Combine data for visualisation
plot_data <- tibble(
  State = pop_df$state,
  Population_Percentage = pop_df$indigenous_percentage,
  Prison_Percentage = prs_df$indigenous_prison_percentage
) %>%
  pivot_longer(
    cols = c(Population_Percentage, Prison_Percentage),
    names_to = "Population_Type",
    values_to = "Percentage"
  )

# Create visualisation
p <- ggplot(plot_data, aes(x = State, y = Percentage, fill = Population_Type)) +
  geom_col(
    width = 0.7,  
    position = position_dodge(width = 0.75),
    colour = NA 
  ) +
  geom_text(
    aes(label = percent(Percentage / 100, accuracy = 1)),
    size = 3.5,
    position = position_dodge(width = 0.75),
    hjust = -0.2,
    family = "sans",  
    fontface = "bold",
    colour = "#2c3e50"  
  ) +
  scale_fill_manual(
    values = c("Population_Percentage" = "#95a5a6",  
               "Prison_Percentage" = "#3498db"),     
    name = NULL, 
    labels = c("Share of total population", 
               "Share of prison population")
  ) +
  scale_y_continuous(
    limits = c(0, Y_AXIS_MAX),
    breaks = seq(0, 80, 20),
    labels = label_percent(scale = 1),
    expand = c(0, 0)  
  ) +
  coord_flip(clip = "off") +  
  labs(
    y = NULL,  
    x = NULL,
    title = "Indigenous Australians are dramatically\nover-represented in the justice system",
    subtitle = "Percentage of Indigenous population compared to prison population by state",
    caption = "Source: Australian Bureau of Statistics, 2016 Census"
  ) +
  theme_minimal(base_size = 12, base_family = "sans") +
  theme(
    # Grid and background
    panel.grid.major.y = element_blank(), 
    # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
    panel.grid.major.x = element_line(colour = "#ecf0f1", linewidth = 0.3),
    panel.grid.minor = element_blank(),
    plot.background = element_rect(fill = "white", colour = NA),
    panel.background = element_rect(fill = "white", colour = NA),
    
    # Title alignment
    plot.title.position = "plot",
    plot.caption.position = "plot",
    
    # Text elements
    plot.title = element_text(
      size = 16, 
      face = "bold", 
      margin = margin(b = 8),
      colour = "#2c3e50",
      hjust = 0
    ),
    plot.subtitle = element_text(
      size = 11, 
      margin = margin(b = 20),
      colour = "#7f8c8d",
      hjust = 0
    ),
    plot.caption = element_text(
      size = 9,
      colour = "#95a5a6",
      hjust = 0,
      margin = margin(t = 15)
    ),
    
    # Axis text
    axis.text.x = element_text(colour = "#7f8c8d", size = 10),
    axis.text.y = element_text(colour = "#2c3e50", size = 10, face = "bold"),
    
    # Legend
    legend.position = "top",
    legend.justification = "left",
    legend.text = element_text(size = 10, colour = "#2c3e50"),
    legend.key.size = unit(0.4, "cm"),
    legend.margin = margin(b = 15),
    legend.box.spacing = unit(0, "pt"),
    
    # Margins
    plot.margin = margin(15, 40, 15, 15)  
  )

# Display the plot
p
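The transpose-and-promote-header steps in the preprocessing code can be hard to picture without the spreadsheets to hand. The sketch below runs the same sequence of calls on a made-up two-state table. The `toy` data frame, its column positions and its numbers are hypothetical stand-ins, not ABS figures or the real sheet layout.

```r
library(dplyr)
library(tibble)
library(janitor)

# Hypothetical stand-in for an ABS sheet: one row per measure, one column per state.
toy <- tibble(
  Measure = c("Indigenous total", "Total"),
  NSW     = c(10, 200),
  Vic     = c(5, 250)
)

toy_long <- toy %>%
  # Transpose so each state becomes a row (this yields a character matrix)
  t() %>%
  as.data.frame() %>%
  # Move the state labels out of the row names into a column
  rownames_to_column("state") %>%
  # Promote the first row (the measure labels) to column names
  row_to_names(1) %>%
  clean_names() %>%
  # Pick columns by position, as in the code above
  select(state = 1, indigenous_total = 2, total = 3) %>%
  # Transposing turned the numbers into text, so convert them back
  mutate(
    across(c(indigenous_total, total), as.numeric),
    indigenous_percentage = round(indigenous_total / total * 100, 1)
  )

print(toy_long)
```

Because the columns are selected by position, it is worth printing the intermediate data frame against the real spreadsheets to confirm that the chosen indices still point at the intended measures.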