Original


Source: Paid Parental Leave: U.S. vs. The World (INFOGRAPHIC) (2017).

(https://www.huffingtonpost.com.au/entry/maternity-leave-paid-parental-leave-_n_2617284)


Objective

The pie chart (infographic) above was originally published by the Huffington Post on its website on 5 February 2013 and updated on 7 December 2016. It aims to show US-based readers that the US is the only rich country in the world without a national paid parental leave mandate. The infographic compares the US with 19 countries selected from the Asia-Pacific, North and South America, the Middle East and Europe. The pie chart was constructed using data sourced from the International Labour Organization and shows the length of mandated paid parental (maternity and/or paternity) leave and the rate of payment during that leave in the selected countries.

The visualisation chosen had the following three main issues:

  • Data integrity. Data is the most important part of a data visualisation, so it is imperative to maintain its validity. Although the Huffington Post revealed the source of its data, i.e. the International Labour Organization (ILO), it did not disclose which year the data refer to or the name of the dataset, so readers cannot quickly check the data's validity.

  • Unclear message. The Huffington Post did not explain what criteria it used to select the countries compared with the US. This lack of transparency makes it unclear what message the Huffington Post actually wanted to convey to its readers. Did it want to show the types of paid parental benefits offered in many countries (maternity, paternity or both) and the rate of payment applied during the leave? Or did it just want to tell readers how stingy the US is towards its workers, in that it cannot afford to give paid parental leave whilst other countries that are not even rich can? Judging by the article to which the infographic belongs, the latter seems to be the intended message. In that case, there is no need to complicate things by using two variables measured in two different units in the infographic. The message can be delivered using the most common parental benefit offered to new parents in many countries: the length of mandated paid leave.

  • Difficult to compare. It is hard to compare the areas of the slices of the pie chart, especially the small ones, as the pie chart has no scale. At a quick glance, readers may think the payment rate, which is given as a percentage (%), represents the proportion of each country. In reality, the payment rate is only one of two variables used to determine the area of each slice. Worse, readers are left to work out the total paid parental leave benefit (i.e. the length of paid maternity or paternity leave combined with the pay rate during leave) themselves when comparing the parental leave of those countries. The comparison would be much easier if the total paid parental leave benefit were calculated for them.

Reference

Code

The following code was used to fix the issues identified in the original.

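#set the working directory to the folder that contains Parental_leave.csv
setwd("~/Desktop/RMIT/MATH2270 - Data Visualisation/Assignments/Ass. 2/assignment2template1950")

#one-time install of ggalt (provides geom_dumbbell); uncomment if it is not installed yet
#devtools::install_github("hrbrmstr/ggalt")

library(tidyverse)  #attaches ggplot2, dplyr, readr and tibble
library(ggalt)      #geom_dumbbell

#read dataset
parental_leave <- read.csv("Parental_leave.csv")
View(parental_leave)  #optional: inspect the data interactively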

#create dataframe using data from parental_leave
#(assumes the rows of Parental_leave.csv are in the same order as the country names below)
#tibble() replaces the deprecated data_frame()
parent.df <- tibble(Country=c("Australia", "Brazil", "Canada", "China", "France", "Germany", "India", "Indonesia", "Italy", "Japan", "Mexico", "Netherlands", "Russia", "Saudi Arabia", "South Korea", "Spain", "Switzerland", "Turkey", "UK", "US"),
                    full_paid_maternity=parental_leave$ma_fullpay_in_weeks,
                    total_full_paid_parent=parental_leave$total_fullpay_for_parent_inweeks,
                    full_paid_paternity=sprintf("%d", as.integer(total_full_paid_parent - full_paid_maternity)))

#arrange data from the highest to the lowest total parental benefit, then lock that order into the Country factor
parent.df <- arrange(parent.df, desc(total_full_paid_parent))
parent.df$Country <- factor(parent.df$Country, levels = rev(parent.df$Country))

#plot
p <- ggplot() 

#x and y axis gridline
p <- p + geom_segment(data = parent.df, aes(y=Country, yend=Country, x=0, xend=.5), color="#b2b2b2", size=0.15)

#dumbbell plot
p1 <- p + geom_dumbbell(data = parent.df,
                        aes(y=Country, x=full_paid_maternity, xend=total_full_paid_parent),
                        size=1.5, colour="gray60",
                        size_x=2.5, colour_x="cornflowerblue",
                        size_xend=2.5, colour_xend="dodgerblue4",
                        dot_guide=TRUE, dot_guide_size=0.1)

#text above plots
p2 <- p1 + geom_text(data= filter(parent.df, Country=="Spain"), aes(x=full_paid_maternity, y=Country, label="full-pay maternity leave"), colour="cornflowerblue", size=2.1, vjust=-2.7, fontface="bold", family="Calibri") +
  geom_text(data =filter(parent.df, Country=="Spain"), aes(x=total_full_paid_parent, y=Country, label="total paid leave"), colour="dodgerblue4", size=2.1, vjust=-1.4, fontface="bold", family="Calibri")

#text next to plots
p3 <- p2 + geom_text(data = parent.df, aes(x=full_paid_maternity, y=Country, label=(full_paid_maternity)), colour="cornflowerblue", size=2.75,hjust=1.75 , family="Calibri") +
  geom_text(data = parent.df, aes(x=total_full_paid_parent, y=Country, label=(total_full_paid_parent)), colour="dodgerblue4", size=2.75,hjust=-0.75, family="Calibri")  


#title & labels
p4 <- p3 + labs(x=NULL, y=NULL,
                title="Mandated Paid Parental Leave in 2016: US vs. The World",
                subtitle="(total paid leave = full-pay maternity leave + full-pay paternity leave, measured in 'weeks')",
                caption="Source: ILO 'Indicators related to Maternity Protection, Paternity Leave and Parental Leave' & OECD 'Parental Leave System'")

#theme: start from theme_bw, then remove gridlines, border, ticks and x-axis text
p12 <- p4 + theme_bw(base_family="Calibri") +
  theme(panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
        panel.border=element_blank(),
        axis.ticks=element_blank(),
        axis.text.x=element_blank(),
        plot.title=element_text(face="bold"),
        plot.caption=element_text(size=7, margin=margin(t=12), colour="gray0"))

#display the final plot
p12

Data Reference

Reconstruction

To fix the issues with the original pie chart, the following actions were taken:

  • Data integrity.
    • Provide more details on the sources of the data used to design the chart. We sourced our 2016 Parental Leave System data from the Organisation for Economic Co-operation and Development (OECD) website for countries that are members of the OECD, such as Australia, Canada, France, Germany, Italy, Japan, (South) Korea, Mexico, Netherlands, Spain, Switzerland, Turkey, UK and US. Data for the remaining selected countries were sourced from the International Labour Organization (ILO) website, section Indicators related to Paternity Leave and Parental Leave. All data were retrieved on May 2, 2020.
  • Unclear message.
    • Disclose the criteria used in selecting the countries to compare with the US and the goal of constructing the chart. We keep the selection of countries as it is but restructure the chart to show the types of paid parental leave available in each selected country and their length, i.e. how long the leave would be if workers were paid at the full rate (100%).
  • Difficult to compare.
    • Provide readers with the calculated fully paid benefit for both mother and father, in weeks, using this formula:

      (length of maternity leave in weeks * weighted-average rate of payment during maternity leave) + (length of paternity leave in weeks * weighted-average rate of payment during paternity leave).

    • Rank the countries by total paid parental benefit using a dumbbell plot instead of a pie chart to give a more accurate and straightforward comparison. A dumbbell plot is essentially a grouped bar chart redrawn as lines and dots. It shares the benefits of a bar chart: it is easy to understand and shows rank more clearly than a pie chart. In addition, a dumbbell plot is especially useful when several bars have the same height, as it avoids a cluttered figure and puts more emphasis on the difference between groups.
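
The full-pay formula in the "Difficult to compare" item can be sketched in R as follows. This is only an illustration: the helper full_pay_weeks, the data frame example, its raw input columns and all of the numbers are hypothetical and are not taken from the ILO or OECD data. The derived column names simply mirror those used in parent.df.

```r
# Full-pay-equivalent weeks: leave length in weeks scaled by the average pay rate.
# full_pay_weeks() is a hypothetical helper written for this illustration.
full_pay_weeks <- function(leave_weeks, pay_rate) {
  stopifnot(all(pay_rate >= 0 & pay_rate <= 1))  # pay rates are proportions (1 = 100%)
  leave_weeks * pay_rate
}

# Made-up illustration values, NOT ILO or OECD figures
example <- data.frame(
  country         = c("Country A", "Country B"),
  maternity_weeks = c(16, 52),
  maternity_rate  = c(1.00, 0.50),
  paternity_weeks = c(2, 10),
  paternity_rate  = c(1.00, 0.80)
)

# Apply the formula: maternity part + paternity part
example$full_paid_maternity <- full_pay_weeks(example$maternity_weeks, example$maternity_rate)
example$full_paid_paternity <- full_pay_weeks(example$paternity_weeks, example$paternity_rate)
example$total_full_paid_parent <- example$full_paid_maternity + example$full_paid_paternity

print(example[, c("country", "full_paid_maternity", "full_paid_paternity", "total_full_paid_parent")])
```

In this made-up case, Country B offers far more calendar weeks than Country A, but because part of that leave is paid at a reduced rate, the gap narrows once both are expressed as full-pay weeks. That is the comparison the dumbbell plot is meant to make visible.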

The following plot fixes the main issues in the original.