Producing a Covid-19 risk gradient chart
By: Dr. Chris Martin
Tools / packages used: R, R Markdown,
ggplot2, tidyverse, ggdist.
Techniques used: data visualisation, data cleaning/reshaping/manipulation.
Chart types used: gradient chart.
This notebook produces the a data visualisation I produced for the Makeover Monday challenge. It shows the relative risks of dying from Covid-19 for people in the UK with different ethnic backgrounds.
A note on my data visualistion workflow
The chart produced in this notebook is a ‘skeleton’ with fairly minimal styling, but all the key structural components in places. The chart was exported from this notebook as a svg. This was then edited - adding textures, photos, annotations etc. - using graphic design software to create the final version.
Setting up the code
library("tidyverse")
library("httr") # for getting data
library("readxl") # for excel file formats
# for distribution gradient plots
library("ggdist")
library("distributional")
# for custom ggplot2 theme
source("../makeover_monday_functions.R")
# set default R markdown chunk options
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
# enable custom fonts in ggplot2
sysfonts::font_add_google("Lato", "Lato")Import the data
# get data from remote
GET("https://query.data.world/s/v32sjc4huvuevkorlrvbdcojxxr6jl", write_disk(tf <- tempfile(fileext = ".xlsx")))## Response [https://download.data.world/file_download/makeovermonday/2021w29/COVID%20Morality%20Rate%20by%20Ethnicity.xlsx?auth=eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJwcm9kLXVzZXItY2xpZW50OnRiazAzIiwiaXNzIjoiYWdlbnQ6dGJrMDM6OjE1ZDA1ZWExLWNmNDktNGU2MC1hNzA0LTkzNTYyZDFhYjk0OCIsImlhdCI6MTYyNjY4NTc0NCwicm9sZSI6WyJ1c2VyIiwidXNlcl9hcGlfYWRtaW4iLCJ1c2VyX2FwaV9lbnRlcnByaXNlX2FkbWluIiwidXNlcl9hcGlfcmVhZCIsInVzZXJfYXBpX3dyaXRlIl0sImdlbmVyYWwtcHVycG9zZSI6ZmFsc2UsInVybCI6IjFhMzRjMjY1NmQzOTU4M2I4ZGJkODcyMGE5YmUzM2FiZWU0NmEwMjIifQ.yz-VywcICmLyMZaucvyaX56xZinxhajLC4UE2dpsMFfzlNoPCmt9fsSzETEU3dqzoAJaqJ6TUog_Wjh_9CI7tA]
## Date: 2022-12-12 21:28
## Status: 200
## Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
## Size: 11.4 kB
## <ON DISK> C:\Users\chris\AppData\Local\Temp\RtmpGqe9IZ\file44a85c3b24e8.xlsx
# standardise variable naming
df <- read_excel(tf) %>%
janitor::clean_names()Creating the data visualisation
# ------------------------------------------------------------------------------
# Reshape data for plotting
# ------------------------------------------------------------------------------
# white group not shown because it the baseline
plotting_df <- df %>%
filter(ethnic_group != "White (baseline)") %>%
filter(measure == "Average")
# calculate the standard deviation
plotting_df <- plotting_df %>%
mutate(sd = (value - lower) / 2,
num_sims = 1000) %>%
rename(mean = value)
# enable custom fonts
showtext::showtext.auto()
# ------------------------------------------------------------------------------
# Create the chart
# ------------------------------------------------------------------------------
plotting_df %>%
ggplot(aes(y = fct_reorder(ethnic_group, mean))) +
# core chart - probability density gradients
stat_dist_gradientinterval(aes(dist = dist_normal(mean, sd), fill = sex),
show_interval = TRUE,
position = "dodge",
point_colour = "black",
fill_type = "gradient",
point_size = 3,
show_interval = FALSE,
.width = 0)+
# make gradients more transparent as probability reduces
scale_slab_alpha_continuous(range = c(0,3)) +
# colours
scale_fill_manual(values = c("#7365AC", "#f27e58")) +
# seperate female and male risks
facet_wrap(~sex) +
# tidy up plot for editting
explanatory_theme_2() +
theme(panel.grid.major.y = element_blank(),
panel.grid.major.x = element_line()) +
guides(fill = FALSE)# turn off custom font support
showtext::showtext.auto(FALSE)
# export for editting
ggsave("covid_ethnicity.svg")For some reason the gradients don’t appear in visualisation in the notebook output (see above). In the actual svg export they do.