Producing charts for ‘A single chart inflation explainer (July 2022)’
By: Dr. Chris Martin
Tools / packages used: R, R Markdown, ggplot2, tidyverse (inc. dplyr).
Techniques used: data visualisation, data cleaning/reshaping/manipulation.
This notebook produces the static data visualisation that features in my data storytelling project: a single chart explainer on the UK inflation picture.
A note on my data visualisation workflow
The chart produced in this notebook is a ‘skeleton’ with fairly minimal styling, but all the key structural components are in place. It is exported from this notebook as SVG and then edited in graphic design software (adding textures, photos, annotations etc.) to create the final version.
Setting up the notebook
# import packages
library(tidyverse) # for data manipulation and viz
library(knitr) # for formatting tables
library(kableExtra) # for formatting tables
library(lubridate) # for working with dates
# set default theme for exploratory plots
theme_set(theme_light()) # using a minimal theme to make it easier to edit
# the plots in graphic design software later on
# set default R markdown chunk options
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

Reading in and cleaning the data
The historic inflation data is read from an .xls file. The data source is the Bank of England. A little data cleaning is needed around date formats and variable types.
#-----------------------------------------------------------------------------
# read in and process the historic data
#-----------------------------------------------------------------------------
inflation_df <- readxl::read_xls("./data/Figure_9 _CPI_inflation_rate_driven_by_housing_and_household_services,_transport_and_food_.xls") %>%
  # for consistent naming style of variable
  janitor::clean_names() %>%
  # correcting variable types
  mutate(
    date = lubridate::my(date),
    cpi_12_month_inflation_rate = as.numeric(cpi_12_month_inflation_rate)
  ) %>%
  # focus on timeframe of interest
  filter(date > lubridate::my("June 2021"))
# output to markdown to check everything looks as expected
inflation_df %>%
  mutate(across(where(is.character), ~ as.numeric(.x))) %>%
  head() %>%
  kable(digits = 2)

| date | food_and_non_alcoholic_beverages | alcohol_and_tobacco | clothing_and_footwear | housing_and_household_services | furniture_and_household_goods | transport | recreation_and_culture | restaurants_and_hotels | other_goods_and_services | cpi_12_month_inflation_rate |
|---|---|---|---|---|---|---|---|---|---|---|
| 2021-07-01 | -0.05 | 0.07 | 0.08 | 0.27 | 0.18 | 1.05 | 0.12 | 0.13 | 0.21 | 2.0 |
| 2021-08-01 | 0.06 | 0.11 | 0.06 | 0.27 | 0.23 | 1.07 | 0.37 | 0.81 | 0.23 | 3.2 |
| 2021-09-01 | 0.10 | 0.12 | 0.05 | 0.29 | 0.28 | 1.14 | 0.41 | 0.43 | 0.27 | 3.1 |
| 2021-10-01 | 0.15 | 0.08 | 0.01 | 0.95 | 0.35 | 1.35 | 0.38 | 0.54 | 0.38 | 4.2 |
| 2021-11-01 | 0.29 | 0.21 | 0.27 | 0.98 | 0.38 | 1.69 | 0.49 | 0.43 | 0.38 | 5.1 |
| 2021-12-01 | 0.47 | 0.17 | 0.32 | 0.97 | 0.46 | 1.62 | 0.44 | 0.51 | 0.42 | 5.4 |
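The date handling above relies on `lubridate::my()`, which reads a month-then-year string and returns the first day of that month. The following is a minimal sketch of that behaviour and of the strict `>` filter, using made-up example strings (`example_dates` and `kept` are illustrative names, not part of the notebook).

```r
library(lubridate)

# my() parses "month year" strings to a Date on the first of the month
cutoff <- my("June 2021")
cutoff

# Illustrative strings only; the real values come from the .xls file
example_dates <- my(c("June 2021", "July 2021", "August 2021"))

# A strict ">" drops June 2021 itself, so the series starts in July 2021,
# which matches the first row of the table above
kept <- example_dates[example_dates > cutoff]
kept
```

Because the comparison is strict, changing the cutoff string is enough to move the start of the plotted series by whole months.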
Next, because they are very small datasets, I manually create the dataframes needed to show the inflation forecasts and the contributors to inflation.
#-----------------------------------------------------------------------------
# manually create forecasts and contribution dataframes
#-----------------------------------------------------------------------------
# inflation forecasts
forecast <- tribble(
  ~date,        ~cpi_12_month_inflation_rate, ~forecaster,
  my("Jul 22"), 10.1,                         "BOE",
  my("Jul 22"), 10.1,                         "Citi",
  my("Oct 22"), 13,                           "BOE",
  my("Jan 23"), 12.6,                         "BOE",
  my("Jan 23"), 18.6,                         "Citi"
)
# contributors to inflation
cont_to_inflation <- tribble(
  ~date,        ~cpi_12_month_inflation_rate, ~source,
  my("Jul 22"), 7.34,                         "other",
  my("Jul 22"), 2.76,                         "Housing inc energy"
) %>%
  # to enable ordering of bar stacking in the plot
  mutate(source = factor(source, levels = c("other", "Housing inc energy")))

Producing the chart
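Before plotting, a quick sanity check on the hand-typed values can be useful. The sketch below assumes the two stacked bar segments are meant to add up to the July 2022 rate of 10.1 used in the forecast data; that is inferred from the numbers rather than stated in the notebook. The names `contributions`, `headline_jul_22` and `total` are hypothetical.

```r
library(tibble)

# Values copied from the manually created contribution data
contributions <- tribble(
  ~source,              ~cpi_12_month_inflation_rate,
  "other",              7.34,
  "Housing inc energy", 2.76
)

# Assumed headline figure: the July 2022 value in the forecast data
headline_jul_22 <- 10.1

total <- sum(contributions$cpi_12_month_inflation_rate)

# Compare with a tolerance rather than ==, since these are floating-point values
stopifnot(isTRUE(all.equal(total, headline_jul_22)))
total
```

If the check fails after editing the figures, the top of the stacked bar will no longer line up with the July 2022 forecast points.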
Now we have everything in place to produce the chart itself.
#-----------------------------------------------------------------------------
# Produce the plot
#-----------------------------------------------------------------------------
ggplot(mapping = aes(date, cpi_12_month_inflation_rate)) +
  # core chart - contribution to inflation
  geom_col(data = cont_to_inflation,
           mapping = aes(fill = source), width = 20) +
  # core chart - historic inflation data
  geom_line(data = inflation_df, group = 1) +
  # core chart - forecast inflation data
  geom_point(data = forecast) +
  geom_line(data = forecast,
            mapping = aes(group = forecaster), linetype = "dashed") +
  # format axis
  scale_y_continuous(limits = c(0,20), expand = c(0,0), breaks = seq(0,20,2)) +
  scale_x_date(date_labels = "%b", date_breaks = "1 months") +
  # simplify the plot to make it easier to edit
  theme_light() +
  theme(legend.position = "none") +
  theme(panel.grid.minor.x = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.y = element_blank())

# export the chart for editing
ggsave("inflation-jul-22.svg", units = "mm", width=300, height=200)
ggsave("inflation-raw-jul-22.jpg", units = "mm", width=300, height=200)
ggsave("inflation-raw-jul-22-large.svg", units = "mm", width=423.333, height=245, dpi = 96)
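A note on the export step: `ggsave()` saves the most recently displayed plot unless a plot is passed explicitly, and writing `.svg` files relies on the svglite package being installed. The sketch below shows a slightly more defensive variant under those assumptions. The toy data, the object `p` and the temporary output path `out_file` are hypothetical stand-ins, not the notebook's own objects.

```r
library(ggplot2)

# Hypothetical toy data standing in for the notebook's data frames
toy <- data.frame(x = 1:3, y = c(2, 5, 9))

# Keep the plot in an object so the export does not depend on last_plot()
p <- ggplot(toy, aes(x, y)) +
  geom_line() +
  theme_light()

# Write to a temporary location for this illustration
out_file <- file.path(tempdir(), "toy-chart.svg")

if (requireNamespace("svglite", quietly = TRUE)) {
  # Same units and dimensions as the first export above
  ggsave(out_file, plot = p, units = "mm", width = 300, height = 200)
} else {
  message("svglite is not installed; install.packages('svglite') enables SVG export")
}
```

Passing `plot = p` makes the export independent of whichever plot happened to be drawn last, which can matter if exploratory plots are added to the notebook later.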