Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://ourworldindata.org/time-use (2020).


Objective

The visualisation appears in the article “Time Use” by Esteban Ortiz-Ospina, Charlie Giattino and Max Roser. Its objective is to show how people in various countries divide an average day of 1,440 minutes among activities. It aims to highlight similarities and differences in time use, particularly across three activities: work, sleep and leisure (fun).
The topic and visualisation are simple enough for a general audience to understand, although they appear best suited to readers with an interest or background in social science.

The visualisation chosen had the following three main issues:

  • Issue 1 - The topic is about showing time use at a high level across work, sleep, eating and leisure, so the chart does not need to show almost every recorded category in detail for the average day in each of the 23 countries. The 10 listed categories make the chart busy and cluttered.
  • Issue 2 - The integrity of the data is questionable. New Zealand and the UK have totals that exceed 1,440 minutes per day yet still appear in the stacked bar chart, and Mexico appears with fewer than 1,440 minutes. Meanwhile, countries such as Australia, whose totals are exactly 1,440 minutes, are not shown. The authors must have noticed this as well, since the time values for some categories are not listed. At first glance, a reader would assume that the remaining time, found by subtracting the listed categories from 1,440 minutes, belongs to “Education in school & study”, which is not correct.
  • Issue 3 - Although this is a time-use chart, labelling it with a mix of minutes and hours does not represent the values consistently and can distort how the chart is read. Readers have to do mental arithmetic to convert the “Sleep” values into minutes. The value labels are not necessary.
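
The checks behind Issues 2 and 3 can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the countries "A", "B" and "C", their values, and the helper `to_minutes()` are hypothetical and do not come from the OECD data. The 0.3-minute tolerance is likewise an assumption, chosen to allow for rounding.

```r
# Hypothetical toy data: minutes per activity for three made-up countries.
toy <- data.frame(
  Country  = rep(c("A", "B", "C"), each = 3),
  Category = rep(c("Work", "Sleep", "Other"), times = 3),
  Minutes  = c(400, 500, 540,   # A adds up to 1440
               420, 510, 530,   # B adds up to 1460 (more than a day)
               380, 490, 560),  # C adds up to 1430 (less than a day)
  stringsAsFactors = FALSE
)

# Total minutes recorded per country, as a plain named vector.
totals <- vapply(split(toy$Minutes, toy$Country), sum, numeric(1))

# A country is "complete" if its total is within a small rounding
# tolerance (assumed here: 0.3 minutes) of a full day.
minutes_per_day <- 24 * 60
complete <- abs(totals - minutes_per_day) < 0.3

# Hypothetical helper: express an "hours and minutes" label in minutes,
# so that every value on the chart shares one unit.
to_minutes <- function(hours, minutes) hours * 60 + minutes
```

With this toy data only country "A" passes the check. A call such as `to_minutes(8, 30)` shows the kind of conversion a reader otherwise has to do mentally.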

Reference

Code

The following code was used to fix the issues identified in the original.

library(dplyr)
library(magrittr)
library(tidyr)
library(tidytext)
library(ggplot2)

# import data from the excel file into a dataset called <time_use>
time_use <- readxl::read_xlsx("Time-Use-in-OECD-Countries-OECD.xlsx")

# preparing the data
str(time_use) # explore structure to check the variables are correctly classified
## tibble [461 x 3] (S3: tbl_df/tbl/data.frame)
##  $ Country       : chr [1:461] "Australia" "Austria" "Belgium" "Canada" ...
##  $ Category      : chr [1:461] "Paid work" "Paid work" "Paid work" "Paid work" ...
##  $ Time (minutes): num [1:461] 211 280 194 269 200 ...
colnames(time_use) <- c("Country", "Category", "Time_minutes") # simplify column names

length(unique(time_use$Country)) # check how many unique countries
## [1] 33
# convert dataset from long to wide to tidy up the data first
time_use_wide <- pivot_wider(data = time_use, names_from = "Category", values_from = "Time_minutes")

# remove rows (countries) with an NA in any category
time_use_wide %<>% na.omit()

# rename Korea with South Korea
time_use_wide$Country[time_use_wide$Country == "Korea" ] <- "South Korea"

# reduce to 23 countries, the same number as in the original chart
time_use_wide %<>% mutate(Total_time = rowSums(time_use_wide[, 2:15])) # total time use per day across all categories
time_use_wide %<>% filter(Total_time > 1439.699, Total_time < 1440.300) # keep only countries whose total is 1,440 mins, allowing for rounding

# no change to sleep and eat, but group the other activities into broader categories
# group all activities categorised as Work in the filtered dataset, i.e. Paid work, Education, Care for household members, Other unpaid work & volunteering
time_use_wide %<>% mutate(Work = `Paid work` + Education + `Care for household members` + `Other unpaid work & volunteering`)

# group all activities categorised as Leisure in the filtered dataset, i.e. Sports, Attending events, Seeing friends, TV and Radio, Other leisure activities
time_use_wide %<>% mutate(Leisure = Sports + `Attending events` + `Seeing friends` + `TV and Radio` + `Other leisure activities`)

# new dataset with selected variables - Country, Work, Sleep, Leisure, Eat and Drink
time_use_new <- time_use_wide %>% select(Country, Work, Sleep, Leisure, `Eating and drinking`)

# convert wide to long
time_use_long <- pivot_longer(data = time_use_new, cols = 2:5, names_to = "Category", values_to = "Time_minutes")

# factoring Category
time_use_long$Category <- time_use_long$Category %>% 
factor(levels = c("Work","Sleep","Leisure","Eating and drinking"), ordered = TRUE)

# order the countries by time within each category
# (reorder_within() handles the per-category ordering, so no group_by() is needed)
time_use_long2 <- time_use_long %>%
    mutate(Country = reorder_within(Country, Time_minutes, Category))

# prepare the plot text
p1title <- "How do people spend their time?" # same title as original
# updated subtitle, based on the time-use diaries data extracted from the OECD
p1subtitle <- "Time spent daily against key activities for people aged 15 to 64 across countries, based on surveys mostly conducted between 2009 and 2016."

#create the base of the plot and enter the dataset and variables
plot1 <- ggplot(data = time_use_long2, aes(fill = Category, y = Time_minutes, x = Country))

# facet by category and make the plot horizontal
# (geom_col() always uses stat = "identity", so the argument is not passed)
plot1 <- plot1 + geom_col(width = 0.6) +
    facet_wrap(~Category, scales = "free_y", ncol = 4) +
    coord_flip() + scale_x_reordered()

# adopt same title, update subtitle, add caption, change the colours for the Categories
plot1 <- plot1 + theme_minimal() +
  theme(strip.background = element_blank(),
        strip.text = element_blank(),
        axis.text.x = element_text(size = 7),
        axis.title.x = element_text(vjust = -0.25),
        axis.title.y = element_text(vjust = 0.25),
        plot.subtitle = element_text(size = 8)) +
  labs(title = p1title, subtitle = p1subtitle,
       caption = "Data source: Our World in Data: Time Use, OECD Time Use Database, and OECD Gender Data Portal.",
       x = "Countries", y = "Time spent against key activities in a day (mins)", fill = "Activities") +
  scale_fill_manual(values = c("#b2abd2", "#5e3c99", "#fdb863", "#e66101"))

#updates to title and caption
plot1 <- plot1 +
  theme(plot.caption = element_text(size = 8, vjust = -0.25, face = "italic"), #move caption lower
        plot.title.position = "plot", #align title and subtitle to plot
        plot.caption.position =  "plot") #align caption to plot

Data Reference

Reconstruction

The following plot fixes the main issues in the original.
The visualisation uses a faceted chart to display the activities of most interest, in the order “Work”, “Sleep”, “Leisure” and “Eating and Drinking”. The detailed activities under “Work” and “Leisure” have been grouped to make the chart less busy. Only countries whose recorded activities add up to 1,440 minutes, allowing for rounding, are used. The countries are still sorted based on “Work” from the highest to the lowest. Finally, the activities are shown in separate panels so that similarities and differences across countries are easier to compare within each activity, with the time spent on each activity standardised in minutes.
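
The grouping and within-activity ordering described above can be illustrated with a small base R sketch. It is not the code used for the plot: the countries "A" and "B", their values, and the `group_of` lookup (which covers only a subset of the categories) are hypothetical, and base R functions stand in for the tidyverse ones.

```r
# Hypothetical toy data in long form; all values are made up.
toy <- data.frame(
  Country  = rep(c("A", "B"), each = 4),
  Category = rep(c("Paid work", "Education", "Sports", "Sleep"), times = 2),
  Minutes  = c(250, 30, 20, 500,
               200, 40, 35, 520),
  stringsAsFactors = FALSE
)

# Assumed lookup from detailed category to broad group (subset only).
group_of <- c("Paid work" = "Work", "Education" = "Work",
              "Sports" = "Leisure", "Sleep" = "Sleep")
toy$Group <- unname(group_of[toy$Category])

# Add up the minutes for each country within each broad group.
grouped <- aggregate(Minutes ~ Country + Group, data = toy, FUN = sum)

# Sort countries within each group from most to least time.
grouped <- grouped[order(grouped$Group, -grouped$Minutes), ]
print(grouped)
```

Collapsing the detailed categories first and sorting afterwards keeps each panel's ranking based on the grouped totals rather than on any single detailed activity.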