Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Australian unemployment drops sharply article written by Scutt (2018), Business Insider Australia.


Objective

The Australian labour market and economy has performed a fluctuating change in employment and unemployment trends over a decade. The objective of the original Australian Employment and Unemployment data visualisation from Business Insider Australia (2018) was to demonstrate the sudden drop in unemployment rate while the employment rate has gradually increased in May from 2013-2018.

Targetted audience

  • The data visualisation focused on the general Australian public with the interests in Australian job market.

The visualisation chosen had the following three main issues:

  • Dual axes: The data series for Employment change and Unemployment rate showed two totally different values and scales which caused the misleading read in relationship between employment and unemployment change rate.

  • Poor aspect ratio of x-axis: The date breaks between each major break in year variable were not equal and may lead to inaccurate interpretation of the overall change rate in employment and unemployment rate. Since the starting time point was May, the month as May should be the breaks in timeline.

  • Inappropriate use of column chart for time series plot: It was difficult to see the trend in employment among the columns hidden under line graph of unemployment rate. The trend should be displayed as line with the same value as unemployment rate to emphasize the change in labour trend.

Reference

Code

The following code was used to fix the issues identified in the original.

library(readxl)
library(stringr)
library(lubridate)
library(tidyr)
library(dplyr)
library(ggplot2)

#Importing excel sheet into R studio
#The data file has been ammended from original excel sheet from ABS to easily manipulate the data
#The values of employment and unemployment rate are percent
employment_unemployment <- read_excel("employment_unemployment_aus.xls", col_names = c("Timeline", "Employment", "Unemployment"), skip =10)

#Round the decimal points of employment and unemployment variables up to 2 digits
employment_unemployment <- employment_unemployment %>% 
 mutate_if(is.numeric, round, digits = 2)

#Since the timeline of original data visualisation focused on the period between May 2013 and May 2018, the year from 2013 to 2018 were filtered from dataframe
#convert Year column to year-month-day format
employment_unemployment$Timeline <- ymd(employment_unemployment$Timeline)

#Using May as an indicator for each period year from 2013-2018
fixed_df <- employment_unemployment %>%
          mutate(year = format(employment_unemployment$Timeline, "%Y"),
                 month = format(employment_unemployment$Timeline, "%b"))  %>%
          filter(year > 2012 & year < 2019,
                 Timeline > "2013-04-01")

#Change value 5 into character context as "May"
fixed_df$month <- as.character(fixed_df$month)
fixed_df$year <- as.character(fixed_df$year)

#Converting the dataframe into long format 
final_df <- gather(fixed_df, "Variable", "Value", Employment:Unemployment)

final_df$MonthYear <- format(final_df$Timeline, "%b-%Y")


#Creating a timeseries plot
graph <- ggplot(data =final_df, aes(x = Timeline, y = Value))

final_graph <- graph + geom_line(color = "blue", size = 1) + 
  facet_grid(Variable ~ ., 
             scales = "free",
             labeller = label_value) +
  labs(title = "Australian Employment and Unemployment rate (%)",
       subtitle = "Seasonally Adjusted, ABS",
       y = "Change rate (%)") +
  scale_x_date(date_labels = "%b-%Y", 
               breaks = c(as.Date("2013-05-01"), as.Date("2014-05-01"), as.Date("2015-05-01"), as.Date("2016-05-01"), as.Date("2017-05-01"),as.Date("2018-05-01")),
               limits = c(as.Date("2013-05-01"), as.Date("2018-05-01"))) +
  stat_smooth(colour="red") 

Data Reference

Reconstruction

The following plot fixes the main issues in the original.