Objective
The visualisation is used in the article “Time Use” by Esteban Ortiz-Ospina, Charlie Giattino and Max Roser. Its objective is to show how people in various countries divide an average day of 1,440 minutes across activities. It attempts to demonstrate the differences and similarities in time use between countries, particularly for three categories of activity: work, sleep and leisure (fun).
The topic and visualisation are simple enough for a general audience to understand. However, they appear better suited to readers who have an interest in, or work in, the social sciences.
The visualisation chosen had the following three main issues:
Reference
The following code was used to fix the issues identified in the original.
library(dplyr)
library(magrittr)
library(tidyr)
library(tidytext)
library(ggplot2)
# import data from the excel file into a dataset called <time_use>
time_use <- readxl::read_xlsx("Time-Use-in-OECD-Countries-OECD.xlsx")
# preparing the data
str(time_use) # explore structure to check the variables are correctly classified
## tibble [461 x 3] (S3: tbl_df/tbl/data.frame)
## $ Country : chr [1:461] "Australia" "Austria" "Belgium" "Canada" ...
## $ Category : chr [1:461] "Paid work" "Paid work" "Paid work" "Paid work" ...
## $ Time (minutes): num [1:461] 211 280 194 269 200 ...
colnames(time_use) <- c("Country", "Category", "Time_minutes") # simplify column names
length(unique(time_use$Country)) # check how many unique countries
## [1] 33
# convert dataset from long to wide to tidy up the data first
time_use_wide <- pivot_wider(data = time_use, names_from = "Category", values_from = "Time_minutes")
# remove rows with NAs
time_use_wide %<>% na.omit()
# rename Korea to South Korea
time_use_wide$Country[time_use_wide$Country == "Korea" ] <- "South Korea"
# reduce to 23 countries as per original chart
time_use_wide %<>% mutate(Total_time = rowSums(time_use_wide[, 2:15])) # total recorded time per day across the 14 activity columns
time_use_wide %<>% filter(Total_time > 1439.699, Total_time < 1440.300) # keep only countries whose total is close to 1,440 minutes; this leaves the 23 countries
# no change in sleep and eat, but group the other activities into respective categories
# group all activities categorised as Work in filtered dataset. I.e. Paid work, Education, Care for household members, Other unpaid work & volunteering
time_use_wide %<>% mutate(Work = `Paid work` + Education + `Care for household members` + `Other unpaid work & volunteering`)
# group all activities categorised as Leisure in filtered dataset. I.e. TV and Radio, Seeing friends, Attending events, Sports, Other leisure activities
time_use_wide %<>% mutate(Leisure = Sports + `Attending events` + `Seeing friends` + `TV and Radio` + `Other leisure activities`)
# new dataset with selected variables - Country, Work, Sleep, Leisure, Eat and Drink
time_use_new <- time_use_wide %>% select(Country, Work, Sleep, Leisure, `Eating and drinking`)
# convert wide to long
time_use_long <- pivot_longer(data = time_use_new, cols = 2:5, names_to = "Category", values_to = "Time_minutes")
# factoring Category
time_use_long$Category <- time_use_long$Category %>%
  factor(levels = c("Work", "Sleep", "Leisure", "Eating and drinking"), ordered = TRUE)
# ordering the countries by time within each category (Category is already a factor, so no grouping or re-factoring is needed)
time_use_long2 <- time_use_long %>%
  mutate(Country = reorder_within(Country, Time_minutes, Category))
# prepare the plotting
p1title <- "How do people spend their time?" # same title as original
p1subtitle <- "Time spent daily against key activities for people aged 15 to 64 across countries, based on surveys mostly conducted between 2009 and 2016." # updated subtitle, based on time-use diaries data extracted from OECD
#create the base of the plot and enter the dataset and variables
plot1 <- ggplot(data = time_use_long2, aes(fill = Category, y = Time_minutes, x = Country))
# facet subgroup and make the plot horizontal
plot1 <- plot1 + geom_col(width = 0.6) + # geom_col() already uses stat = "identity"
  facet_wrap(~Category, scales = "free_y", ncol = 4) +
  coord_flip() + scale_x_reordered()
# adopt same title, update subtitle, add caption, change the colours for the Categories
plot1 <- plot1 + theme_minimal() +
  theme(strip.background = element_blank(),
        strip.text = element_blank(),
        axis.text.x = element_text(size = 7),
        axis.title.x = element_text(vjust = -0.25),
        axis.title.y = element_text(vjust = 0.25),
        plot.subtitle = element_text(size = 8)) +
  labs(title = p1title, subtitle = p1subtitle,
       caption = "Data source: Our World in Data: Time Use, OECD Time Use Database, and OECD Gender Data Portal.",
       x = "Countries", y = "Time spent against key activities in a day (mins)", fill = "Activities") +
  scale_fill_manual(values = c("#b2abd2", "#5e3c99", "#fdb863", "#e66101"))
# updates to title and caption
plot1 <- plot1 +
  theme(plot.caption = element_text(size = 8, vjust = -0.25, face = "italic"), # move caption lower
        plot.title.position = "plot", # align title and subtitle to plot
        plot.caption.position = "plot") # align caption to plot
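The per-category ordering that the code above prepares for the facets can be illustrated on a toy dataset. The sketch below uses base R only and rests on assumptions: the four rows are made-up values rather than the OECD data, and the `paste()` plus `reorder()` combination is a stand-in for the idea behind `reorder_within()`, not tidytext's actual source.

```r
# Toy data (hypothetical values, not taken from the OECD file)
toy <- data.frame(
  Country      = c("A", "B", "A", "B"),
  Category     = c("Work", "Work", "Sleep", "Sleep"),
  Time_minutes = c(300, 200, 480, 520)
)

# Make each country label unique per category, then order the labels by time.
# This imitates the idea of ordering within facets; it is an illustration only.
label <- paste(toy$Country, toy$Category, sep = "___")
toy$Country_in_category <- reorder(label, toy$Time_minutes)

# Levels run from the smallest to the largest time
levels(toy$Country_in_category)
```

Within “Work”, B comes before A, while within “Sleep”, A comes before B, so each facet can carry its own country order once the country axis is freed per facet.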
Data Reference
Esteban Ortiz-Ospina, Charlie Giattino and Max Roser (2020), Time Use in OECD Countries OECD.xlsx, OurWorldInData.org, viewed 23 July 2021, ‘https://ourworldindata.org/uploads/2020/12/Time-Use-in-OECD-Countries-OECD.xlsx’.
OECD, Gender Data Portal - Balancing paid work, unpaid work and leisure, Organisation for Economic Co-operation and Development, viewed 26 July 2021, ‘https://www.oecd.org/gender/data/balancingpaidworkunpaidworkandleisure.htm’.
OECD, Time Use, Organisation for Economic Co-operation and Development, viewed 26 July 2021, ‘https://stats.oecd.org/Index.aspx?DataSetCode=TIME_USE’.
The following plot fixes the main issues in the original.
The visualisation uses a faceted chart to display the activities of most interest, in the order “Work”, “Sleep”, “Leisure”, and “Eating and drinking”. The activities that make up “Work” and “Leisure” have been grouped to make the chart less busy. Only countries with more accurate data are included in the chart. In each facet the countries are sorted from the highest to the lowest time spent, so the first facet still ranks them by “Work”. Finally, the activities are separated into facets so that similarities and differences across countries are easier to compare within each activity, with the time spent on each activity standardised in minutes.
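As a rough illustration of what keeping only countries with more accurate data could look like, the sketch below retains entries whose activities add up to roughly a full day of 1,440 minutes. The country names, the totals and the 0.3-minute tolerance are all assumptions made for this example; the tolerance only approximates the bounds used in the code.

```r
# Hypothetical daily totals in minutes (made-up values for illustration)
totals <- c(Alpha = 1440.0, Beta = 1439.9, Gamma = 1452.0, Delta = 1440.2)

minutes_per_day <- 24 * 60   # 1,440 minutes in a day
tolerance <- 0.3             # assumed allowance for rounding in the source data

# Keep countries whose recorded activities sum to about a full day
keep <- abs(totals - minutes_per_day) < tolerance
kept_countries <- names(totals)[keep]
kept_countries
```

Here the made-up country Gamma is dropped because its total is 12 minutes over a full day, while the other three fall within the tolerance.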