Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original



Objective

The original visualisation, titled “Monthly hours worked in all jobs, Seasonally adjusted” (ABS, 2021), aims to show the total hours worked by the Australian population in each month from February 2011 to February 2021, together with the yearly change in hours worked. The target audience is the government, workers, unemployed people and job recruiters. Government can draw insights into which periods of leadership coincided with higher levels of work. Workers, unemployed people and job recruiters can see which months were busiest for hours worked, and use this to judge better times to look for jobs or to recruit ahead of upcoming busy periods.

The visualisation chosen had the following three main issues:

  • Dual axes: Having dual axes creates confusion between the two sets of data, as it is easy to mix up which of the line graph and the bar graph belongs to the percentage axis and which to the hours worked axis. Reading the chart can require a double take and isn’t intuitive at first glance.
  • Truncated axis resulting in deception: Truncating the ‘Millions’ axis misleads the reader, as it makes it look like there is a massive drop in hours worked after February 2020 when in reality it’s a drop of 150,000 hours.
  • Poor use of a bar chart to represent continuous data: A bar chart is better suited to discrete data, and using one for continuous data over a time series is not generally accepted practice in data visualisation. It is not intuitive to the reader and can be confusing at first glance.
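
To make the truncated-axis point concrete, the sketch below estimates how large a drop looks relative to the visible height of the chart. The function name `apparent_drop` and all numbers are hypothetical, chosen only for illustration, and are not taken from the ABS data.

```r
# Share of the visible axis height that a drop occupies.
# `axis_min` is where the y-axis starts: 0 for a full axis,
# a higher value for a truncated one.
apparent_drop <- function(before, after, axis_min) {
  (before - after) / (before - axis_min)
}

# Hypothetical values in millions of hours (illustration only)
before <- 1780
after  <- 1630

truncated <- apparent_drop(before, after, axis_min = 1500)
full      <- apparent_drop(before, after, axis_min = 0)

# The same drop fills far more of the chart when the axis is truncated
print(round(c(truncated = truncated, full = full), 2))
```

With these made-up values the drop fills roughly half of the truncated chart but less than a tenth of the chart that starts at zero, which is the effect described in the second bullet.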

Reference

Code

The following code was used to fix the issues identified in the original.

#needed libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(lubridate)
library(scales)

# Import the data, skipping the first line of the file
hoursWorked <- read.csv('Monthly hours worked in all jobs, Seasonally adjusted.csv', skip = 1, stringsAsFactors = FALSE)
# Set column names
names(hoursWorked) <- c("Date","Hours Worked in Millions", "Yearly Change %")
# Remove the last 2 rows (a reference line and a blank line)
hoursWorked <- head(hoursWorked, -2)
# Use gsub to remove the thousands separators (commas)
hoursWorked$`Hours Worked in Millions` <- gsub(",",'', as.character(hoursWorked$`Hours Worked in Millions`))
# Convert hours worked from character to numeric
hoursWorked$`Hours Worked in Millions` <- as.numeric(hoursWorked$`Hours Worked in Millions`)
# Parse the month-year strings into Date objects
hoursWorked$Date <- my(hoursWorked$Date)

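# Plot 1: hours worked as a line chart, with the y-axis starting at zero
# (linewidth replaces size for lines in ggplot2 >= 3.4.0)
plot <- ggplot(data = hoursWorked, aes(x = Date, y = `Hours Worked in Millions`)) +
  geom_line(colour = "#11ACC6", linewidth = 1) +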
  ylim(0, max(hoursWorked$`Hours Worked in Millions`)+150) +
  scale_x_date(breaks =  date_breaks("1 year"), labels = date_format("%Y")) +
  xlab("Year") + ylab("Hours worked in Millions") + 
  ggtitle("Total Hours Worked by Australians", subtitle =  "(Seasonally Adjusted)") +
  labs(caption = "Australian Bureau of Statistics, Labour Force. (2021)")

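# Plot 2: yearly percentage change on its own chart instead of a second axis
# (linewidth replaces size for lines in ggplot2 >= 3.4.0)
plot2 <- ggplot(data = hoursWorked, aes(x = Date, y = `Yearly Change %`)) +
  geom_line(colour = "#0E23E4", linewidth = 1) +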
  ylim(-10, 10) +
  scale_x_date(breaks =  date_breaks("1 year"), labels = date_format("%Y")) +
  xlab("Year") + ylab("Percentage of Change") + 
  ggtitle("Percentage Change in Hours Worked by Australians", subtitle =  "(Seasonally Adjusted)") +
  labs(caption = "Australian Bureau of Statistics, Labour Force. (2021)")
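
The comma-stripping step in the code above matters because `as.numeric()` returns `NA` for strings that contain thousands separators. The sketch below shows this on a few invented values using base R only. The helper name `clean_hours` is hypothetical and is not part of the original code.

```r
# Invented sample values in the same "1,234.5" style as the raw CSV column
raw <- c("1,650.2", "1,702.9", "985.0")

# Converting directly fails for any value that contains a comma
direct <- suppressWarnings(as.numeric(raw))
print(is.na(direct))

# Hypothetical helper: drop the commas first, then convert
clean_hours <- function(x) {
  as.numeric(gsub(",", "", as.character(x), fixed = TRUE))
}

cleaned <- clean_hours(raw)
print(cleaned)

# A quick check that no value was lost in the conversion
stopifnot(!anyNA(cleaned))
```

A check like the final `stopifnot()` can be run after the import step to confirm the column converted cleanly before plotting.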

Data Reference

Reconstruction

The following plot fixes the main issues in the original.