Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The visualization used in this project was produced as an alternative to the common cumulative chart, which shows the number of deaths attributed to COVID-19 as a total that is always increasing and can never turn negative. The author wanted to compare the daily change in deaths with the average number of deaths over a three-day window, so that a decreasing trend would eventually become visible. The data cover COVID-19 related deaths in seven countries from January 23rd to April 4th.
This publication had the general public as its target audience, so the visualization is expected to be an easy-to-understand source of information even for readers with little or no background in the area.
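As a rough illustration of the two quantities being compared, the sketch below computes a three-day average and a day-to-day change from a vector of daily deaths. The values are made up, and the exact formulas (a centred three-day window, and the difference from the previous day) are assumptions for illustration, not necessarily those used in the original article.

```r
# Hypothetical daily death counts for one country (illustrative values only)
deaths <- c(2, 4, 9, 12, 10, 8, 5)

# Centred three-day average: mean of the previous, current and next day.
# stats::filter() returns NA where the window runs off either end.
day_avg <- as.numeric(stats::filter(deaths, rep(1 / 3, 3), sides = 2))

# Day-to-day change: today's count minus yesterday's (NA for the first day)
difference <- c(NA, diff(deaths))

result <- data.frame(day = seq_along(deaths), deaths, day_avg, difference)
print(result)
```

A negative `difference` next to a falling `day_avg` is the kind of decreasing trend that a cumulative chart cannot show.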
The visualization chosen had the following three main issues:
Another issue is the use of brackets to represent negative values on the x-axis. This notation is not easy for the general public to understand.
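To make the bracket issue concrete, the following minimal sketch converts accounting-style labels such as "(25)" into ordinary negative numbers, which can then be drawn on an axis with a minus sign. The helper name `parse_bracketed` and the sample labels are hypothetical and are not taken from the original data.

```r
# Hypothetical helper: turn accounting-style labels such as "(25)" into -25
parse_bracketed <- function(x) {
  negative <- grepl("^\\(.*\\)$", x)         # TRUE where the label is wrapped in brackets
  value <- as.numeric(gsub("[(),]", "", x))  # drop brackets and thousands separators
  ifelse(negative, -value, value)
}

parse_bracketed(c("(25)", "0", "50", "(1,200)"))
# returns -25, 0, 50, -1200
```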
Reference
Dorling, D. (2020). Three graphs that show a global slowdown in COVID-19 deaths. Retrieved July 27, from https://theconversation.com/three-graphs-that-show-a-global-slowdown-in-covid-19-deaths-135756 (Visualization)
Dorling, D. (2020). SLOWDOWN Covid-19: Phase-portrait diagrams showing mortality rates of Covid-19 virus 2020. Retrieved July 27, from http://www.dannydorling.org/books/SLOWDOWN/Covid19.html#collapseEleven1 (Visualization Data Source)
Center for Systems Science and Engineering. (2021). COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. Retrieved July 25, from GitHub repository: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv (Original Data Source)
The following code was used to fix the issues identified in the original.
## Loading the Libraries
library(tidyr) # for reshaping data
library(ggplot2) # for data visualization
library(dplyr) # for data wrangling
library(readr) # for reading data
library(magrittr) # for pipes
library(lubridate) # for date-time handling
library(RColorBrewer) # for color palettes
library(knitr) # for report generation
library(hrbrthemes) # for the plot theme
library(extrafont) # for extra text fonts
library(extrafontdb) # font database used by extrafont
library(fontawesome) # for icons
library(directlabels) # for direct line labels
update_geom_font_defaults(font_rc_light)
import_roboto_condensed()
theme_set(theme_ft_rc(
base_family = "Roboto Condensed",
base_size = 11.5*2,
plot_title_family = "Roboto Condensed",
plot_title_size = 18*2,
plot_title_face = "bold",
plot_title_margin = 10*2,
subtitle_family = if (.Platform$OS.type == "windows") "Roboto Condensed" else
"Roboto Condensed Light",
subtitle_size = 13*2,
subtitle_face = "plain",
subtitle_margin = 15*2,
strip_text_family = "Roboto Condensed",
strip_text_size = 12*2,
strip_text_face = "bold",
caption_family = if (.Platform$OS.type == "windows") "Roboto Condensed" else
"Roboto Condensed Light",
caption_size = 9*2,
caption_face = "plain",
caption_margin = 10*2,
axis_text_size = 11.5*2,
axis_title_family = "Roboto Condensed",
axis_title_size = 9*2,
axis_title_face = "plain",
axis_title_just = "center",
plot_margin = margin(30, 30, 30, 30),
grid = TRUE,
axis = FALSE,
ticks = FALSE
))
# Data wrangling
# Load the data and rename the columns for readability
covid <- read.csv('Seven_April_20.csv') %>%
  rename(date = time, deaths = Deaths.day, difference = X.change, day_avg = Y..3.day.average, d_labels = dates..labels.)
# Drop the "UK_occured" series and replace NA values with 0
covid <- covid %>% filter(country != "UK_occured") %>% mutate_all(~replace(., is.na(.), 0))
covid$date <- as.Date(covid$date, format = "%d/%m/%y")
covid <- covid %>% filter(date <= as.Date("2020-04-06")) # keep observations up to April 6, 2020
covid_new <- gather(covid, key = "measure", value = "value", c("difference", "day_avg"))
facet_labs = as_labeller(c("day_avg" = "Average Deaths Per Day", "difference" = "Difference in Deaths Per Day"))
data_curve_start <- as.Date("2020-03-15")
annotation_start <- as.Date(c("2020-03-15"))
#Plot:
p1 <- ggplot(covid_new,
aes(x=date,
y=value,
color = country)) +
geom_line(stat='identity',
size = 2) +
scale_x_date(date_breaks = "2 week",
date_labels = "%b %d") +
facet_wrap(~measure,
ncol = 1,
scales = "free",
strip.position = "top",
labeller = facet_labs) +
geom_vline(xintercept = as.numeric(date("2020-03-15")),
colour = "#8ba3ba",
size = 1.5,
linetype = "dotdash") +
geom_vline(xintercept = as.numeric(date("2020-03-22")),
colour = "#8ba3ba",
size = 1.5,
linetype = "dotdash") +
geom_rect(aes(xmin=as.Date(c("2020-03-15")), xmax=as.Date(c("2020-03-22")),ymin = -Inf, ymax = Inf), fill = "#8ba3ba", color = NA, alpha = 0.004) +
guides(color = guide_legend(nrow = 1)) +
scale_color_brewer(palette = "Set1",
name = "Country") +
labs(title = "Mortality in seven countries attributed to COVID-19",
subtitle = "(January 23 to April 6, 2020)",
y = "Difference in Number of Deaths Average Number of Deaths",
x = "Date (2020)",
caption = "Source: COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University")+
geom_dl(aes(label = country),
method = list(dl.trans(x = x + 0.2), "last.points", cex = 1.2)) +
theme(strip.background = element_rect(fill = "#333C4A", linetype = "blank"),
strip.placement = "outside",
panel.spacing.y = unit(5, "lines"), # panel.margin.y was renamed to panel.spacing.y in ggplot2
legend.position = "none",
axis.title.y = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0)),
axis.title.x = element_text(margin = margin(t = 30, r = 0, b = 0, l = 0)))+
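scale_x_date(date_breaks = "2 week", date_labels = "%b %d", limits = as.Date(c("2020-01-23", "2020-04-09"))) # set the limits on the date scale itself; xlim() would replace the scale and drop the breaks and labels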
p2 <- p1 + annotate("text", label = "Period where lockdowns started to happen in each country",
family="Roboto Condensed",
fontface="italic",
angle=0,
size=6,
colour='#8ba3ba', # the text style is set by fontface above; "face" is not a text annotation parameter
x = annotation_start,
y = -Inf,
hjust=1.1,
vjust=-40)
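The reshaping step in the code above, where `gather()` stacks the `difference` and `day_avg` columns into a single `measure`/`value` pair so that `facet_wrap()` can draw one panel per measure, can be sketched on a toy table. This sketch uses `pivot_longer()`, which tidyr now recommends in place of `gather()`. The country names and values are made up for illustration.

```r
library(tidyr)

# Toy wide table: one row per country and date (values are made up)
wide <- data.frame(
  country    = c("A", "A", "B", "B"),
  date       = as.Date(c("2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02")),
  difference = c(1, -2, 3, 0),
  day_avg    = c(4, 3, 6, 7)
)

# Stack the two measure columns into key/value form:
# each original row becomes two rows, one per measure
long <- pivot_longer(
  wide,
  cols      = c(difference, day_avg),
  names_to  = "measure",
  values_to = "value"
)

print(long)
```

Each country-date pair now appears once per measure, which is the shape `facet_wrap(~measure)` expects.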
Data Reference
Dorling, D. (2020). SLOWDOWN Covid-19: Phase-portrait diagrams showing mortality rates of Covid-19 virus 2020. Retrieved July 27, from http://www.dannydorling.org/books/SLOWDOWN/Covid19.html#collapseEleven1 (Visualization Data Source)
Center for Systems Science and Engineering. (2021). COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. Retrieved July 25, from GitHub repository: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv (Original Data Source)
The following plot fixes the main issues in the original.