Producing charts for ‘A single chart inflation explainer (July 2022)’

By: Dr. Chris Martin

Tools / packages used: R, R Markdown, ggplot2, tidyverse (inc. dplyr).

Techniques used: data visualisation, data cleaning/reshaping/manipulation.

This notebook produces the static data visualisation that features in my data storytelling project: a single chart explainer on the UK inflation picture.

A note on my data visualisation workflow

The chart produced in this notebook is a ‘skeleton’ with fairly minimal styling, but with all the key structural components in place. It is exported from this notebook as an SVG file, which is then edited in graphic design software (adding textures, photos, annotations, etc.) to create the final version.

Setting up the notebook

# import packages
library(tidyverse)  # for data manipulation and viz
library(knitr)      # for formatting tables
library(kableExtra) # for formatting tables
library(lubridate)  # for working with dates

# set default theme for exploratory plots
theme_set(theme_light())  # using a minimal theme to make it easier to edit 
                          # the plots in graphic design software later on

# set default R markdown chunk options
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

Reading in and cleaning the data

The historic inflation data is read from an .xls file; the data source is the Bank of England. A little data cleaning was needed, mainly around date formats.

#-----------------------------------------------------------------------------
# read in and process the historic data
#-----------------------------------------------------------------------------

inflation_df <- readxl::read_xls("./data/Figure_9 _CPI_inflation_rate_driven_by_housing_and_household_services,_transport_and_food_.xls") %>% 
  
  # for consistent naming style of variable
  janitor::clean_names() %>% 
  
  # correcting variable types
  mutate(date = lubridate::my(date),
         cpi_12_month_inflation_rate = as.numeric(cpi_12_month_inflation_rate)) %>% 
  
  # focus on timeframe of interest (July 2021 onwards)
  filter(date > lubridate::my("June 2021"))

# output to markdown to check everything looks as expected
# (any remaining character columns are converted to numeric for display only)
inflation_df %>% 
  mutate(across(where(is.character), ~ as.numeric(.x))) %>% 
  head() %>% 
  kable(digits = 2)
date food_and_non_alcoholic_beverages alcohol_and_tobacco clothing_and_footwear housing_and_household_services furniture_and_household_goods transport recreation_and_culture restaurants_and_hotels other_goods_and_services cpi_12_month_inflation_rate
2021-07-01 -0.05 0.07 0.08 0.27 0.18 1.05 0.12 0.13 0.21 2.0
2021-08-01 0.06 0.11 0.06 0.27 0.23 1.07 0.37 0.81 0.23 3.2
2021-09-01 0.10 0.12 0.05 0.29 0.28 1.14 0.41 0.43 0.27 3.1
2021-10-01 0.15 0.08 0.01 0.95 0.35 1.35 0.38 0.54 0.38 4.2
2021-11-01 0.29 0.21 0.27 0.98 0.38 1.69 0.49 0.43 0.38 5.1
2021-12-01 0.47 0.17 0.32 0.97 0.46 1.62 0.44 0.51 0.42 5.4
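
As a rough illustration of the date handling above, the sketch below shows how `lubridate::my()` turns month-year strings into dates pinned to the first of the month, and why the strict `>` filter drops June 2021 itself. The `example_df` data frame and its values are made up for illustration and are not taken from the Bank of England file.

```r
library(dplyr)
library(lubridate)

# hypothetical month-year strings, standing in for the spreadsheet's date column
example_df <- data.frame(
  date  = c("May 2021", "June 2021", "July 2021"),
  value = c("1.5", "2.0", "2.5")   # made-up numbers stored as text
)

cleaned <- example_df %>%
  # parse month-year text into Date values and text into numbers
  mutate(date = my(date), value = as.numeric(value)) %>%
  # strict inequality: June 2021 parses to 2021-06-01, so it is excluded
  filter(date > my("June 2021"))

cleaned$date
```

Only the July 2021 row should survive the filter, matching the first row of the table above.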

Next, because they are very small datasets, I manually create the dataframes needed to show the inflation forecasts and the contributors to inflation.

#-----------------------------------------------------------------------------
# manually create forecasts and contribution dataframes
#-----------------------------------------------------------------------------

# inflation forecasts
forecast <- tribble(~date, ~cpi_12_month_inflation_rate, ~forecaster,
        my("Jul 22"), 10.1, "BOE",
        my("Jul 22"), 10.1, "Citi",
        my("Oct 22"), 13, "BOE",
        my("Jan 23"), 12.6, "BOE",
        my("Jan 23"), 18.6, "Citi")

# contributors to inflation
cont_to_inflation <- tribble(~date, ~cpi_12_month_inflation_rate, ~source,
                             my("Jul 22"), 7.34, "other",
                             my("Jul 22"), 2.76, "Housing inc energy") %>% 
  
  # to enable ordering of bar stacking in the plot
  mutate(source = factor(source, levels = c("other", "Housing inc energy")))
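
A quick consistency check can be useful with hand-entered figures like these. The sketch below re-enters the two contribution values from above and confirms that they add up to the 10.1 figure entered for July 2022 in the forecast dataframe; `all.equal()` is used rather than `==` to allow for floating-point rounding. It also shows how the level order passed to `factor()` is stored. The names `contributions`, `july_headline`, `totals_match` and `source_fct` are illustrative and not part of the original notebook.

```r
# values copied from the cont_to_inflation dataframe above
contributions <- c(other = 7.34, `Housing inc energy` = 2.76)
july_headline <- 10.1   # July 2022 rate entered in the forecast dataframe

# compare with a tolerance rather than exact equality
totals_match <- isTRUE(all.equal(sum(contributions), july_headline))
totals_match

# explicit levels fix the order, instead of R's default sorted order
source_fct <- factor(names(contributions),
                     levels = c("other", "Housing inc energy"))
levels(source_fct)
```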

Producing the chart

Now we have everything in place to produce the chart itself.

#-----------------------------------------------------------------------------
# Produce the plot
#-----------------------------------------------------------------------------

ggplot(mapping = aes(date, cpi_12_month_inflation_rate)) +
  
  # core chart - contribution to inflation
  geom_col(data = cont_to_inflation,
           mapping = aes(fill = source),
           width = 20) +  # on a date axis, width is measured in days
  
  # core chart - historic inflation data
  geom_line(data = inflation_df, group = 1) +
  
  # core chart - forecast inflation data
  geom_point(data = forecast) +
  geom_line(data = forecast,
            mapping = aes(group = forecaster), linetype = "dashed") +
  
  # format axis
  scale_y_continuous(limits = c(0,20), expand = c(0,0), breaks = seq(0,20,2)) +
  scale_x_date(date_labels = "%b", date_breaks = "1 months") +
  
  # simplify the plot to make it easier to edit
  theme_light() +
  theme(legend.position = "none") +
  theme(panel.grid.minor.x = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.y = element_blank())

# export the chart for editing
ggsave("inflation-jul-22.svg", units = "mm", width=300, height=200)
ggsave("inflation-raw-jul-22.jpg", units = "mm", width=300, height=200)
ggsave("inflation-raw-jul-22-large.svg", units = "mm", width=423.333, height=245, dpi = 96)
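
`ggsave()` saves the most recently displayed plot unless told otherwise, so the three calls above all export the chart that was just drawn. A slightly more defensive pattern, sketched below with a toy plot, is to store the plot in an object and pass it through the `plot` argument. The object name `p`, the toy data and the temporary output path are illustrative assumptions; SVG output also depends on the svglite package being installed, so the sketch checks for it first.

```r
library(ggplot2)

# toy data, for illustration only
toy_df <- data.frame(month = 1:3, rate = c(2, 4, 8))

# keep the plot in an object so it can be passed to ggsave() explicitly
p <- ggplot(toy_df, aes(month, rate)) +
  geom_line()

# write to a temporary folder to avoid cluttering the project
out_file <- file.path(tempdir(), "toy-chart.svg")

# ggsave() relies on the svglite package for .svg files
if (requireNamespace("svglite", quietly = TRUE)) {
  ggsave(out_file, plot = p, units = "mm", width = 300, height = 200)
}
```

Naming the plot explicitly means the export no longer depends on which plot happened to be displayed last.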