Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://theconversation.com/three-graphs-that-show-a-global-slowdown-in-covid-19-deaths-135756 (Dorling/McClure, 2020)


Objective

The visualization examined in this project was produced as an alternative to the common cumulative chart, in which the number of deaths attributed to COVID-19 can only increase and the plotted change is never negative. The authors wanted to compare the daily change in deaths with the average number of deaths over a three-day window and, eventually, show a decreasing trend. The data covers COVID-19 related deaths in seven countries from January 23rd to April 4th, 2020.
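
As a rough illustration of the two measures being compared, the sketch below derives a three-day average and a day-to-day change from a short series of daily death counts. The counts and variable names are made up for illustration, and the centered moving average is an assumption; the original authors may have smoothed their data differently.

```r
# Hypothetical daily death counts for one country (illustrative values only)
deaths <- c(2, 4, 9, 12, 18, 21, 19, 16)

# Centered three-day moving average; the first and last days have no full window.
# stats::filter is called explicitly because dplyr::filter masks it when dplyr is loaded.
day_avg <- as.numeric(stats::filter(deaths, rep(1 / 3, 3), sides = 2))

# Day-to-day change in the smoothed series; negative values mean deaths are falling
difference <- c(NA, diff(day_avg))

measures <- data.frame(day = seq_along(deaths), deaths, day_avg, difference)
print(measures)
```

Unlike a cumulative count, the change column can drop below zero, which is what allows a slowdown to show up on a chart.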

The target audience of this publication was the general public, so the visualization is expected to be an easy-to-understand source of information even for readers with little or no background in the area.

The visualization chosen had the following three main issues:

  • Issue 1: The choice of visualization method (a phase portrait) makes the graph confusing. Without an explanation of what is being represented, the reader needs considerable effort at first look to understand how the variables change in relation to each other.
  • Issue 2: Using two dependent variables as the X and Y axes, with the date series treated as a dependent variable, creates a confusing and deceptive visualization. This works much like a dual-axis chart, which is misleading because the two variables are on different scales. A time series should be the independent variable on the X axis of a single X-Y chart.
  • Issue 3: Misuse of size. The line thickness has no direct relation to the information being represented. Instead, it makes the bottom portion of the graph almost unreadable.

Another issue is the use of brackets to represent negative values on the x axis, a notation that is not easy for the general public to understand.

Reference

Code

The following code was used to fix the issues identified in the original.

## Loading the Libraries
library(tidyr)        # for reshaping data
library(ggplot2)      # for data visualization
library(dplyr)        # for data wrangling
library(readr)        # for reading data
library(magrittr)     # for pipes
library(lubridate)    # for date-time handling
library(RColorBrewer) # for color palettes
library(knitr)        # for report generation
library(hrbrthemes)   # for the theme and Roboto Condensed helpers
library(extrafont)    # for extra text fonts
library(extrafontdb)  # font database used by extrafont
library(fontawesome)  # for Font Awesome icons
library(directlabels) # for direct line labels
update_geom_font_defaults(font_rc_light)
import_roboto_condensed()

theme_set(theme_ft_rc(
  base_family = "Roboto Condensed",
  base_size = 11.5*2,
  plot_title_family = "Roboto Condensed",
  plot_title_size = 18*2,
  plot_title_face = "bold",
  plot_title_margin = 10*2,
  subtitle_family = if (.Platform$OS.type == "windows") "Roboto Condensed" else
    "Roboto Condensed Light",
  subtitle_size = 13*2,
  subtitle_face = "plain",
  subtitle_margin = 15*2,
  strip_text_family = "Roboto Condensed",
  strip_text_size = 12*2,
  strip_text_face = "bold",
  caption_family = if (.Platform$OS.type == "windows") "Roboto Condensed" else
    "Roboto Condensed Light",
  caption_size = 9*2,
  caption_face = "plain",
  caption_margin = 10*2,
  axis_text_size = 11.5*2,
  axis_title_family = "Roboto Condensed",
  axis_title_size = 9*2,
  axis_title_face = "plain",
  axis_title_just = "center",
  plot_margin = margin(30, 30, 30, 30),
  grid = TRUE,
  axis = FALSE,
  ticks = FALSE
))
# Data wrangling 

# Load the data and rename the columns for readability
covid <- read.csv('Seven_April_20.csv') %>%
  rename(date = time,
         deaths = Deaths.day,
         difference = X.change,
         day_avg = Y..3.day.average,
         d_labels = dates..labels.)

# Drop the "UK_occured" series and replace NA with 0
covid <- covid %>%
  filter(country != "UK_occured") %>%
  mutate_all(~replace(., is.na(.), 0))

covid$date <- as.Date(covid$date, format = "%d/%m/%y") # parse dd/mm/yy strings
covid <- covid %>% filter(date <= as.Date("2020-04-06"))
# Reshape to long format: stack the two measures into key/value columns
covid_new <- gather(covid, key = "measure", value = "value", c("difference", "day_avg"))
facet_labs <- as_labeller(c("day_avg" = "Average Deaths Per Day", "difference" = "Difference in Deaths Per Day"))

data_curve_start <- as.Date("2020-03-15") # not used in the plot below
annotation_start <- as.Date("2020-03-15")

#Plot:

p1 <- ggplot(covid_new, 
             aes(x=date, 
                 y=value, 
                 color = country)) +
  geom_line(stat='identity', 
            size = 2) +
  scale_x_date(date_breaks = "2 week",
               date_labels = "%b %d") +
  facet_wrap(~measure, 
             ncol = 1, 
             scales = "free", 
             strip.position = "top", 
             labeller = facet_labs) +
  geom_vline(xintercept = as.numeric(as.Date("2020-03-15")),
             colour = "#8ba3ba", 
             size = 1.5, 
             linetype = "dotdash") +
  geom_vline(xintercept = as.numeric(as.Date("2020-03-22")),
             colour = "#8ba3ba",
             size = 1.5,
             linetype = "dotdash") +
  # The rectangle is drawn once per data row, hence the very low alpha
  geom_rect(aes(xmin = as.Date("2020-03-15"), xmax = as.Date("2020-03-22"),
                ymin = -Inf, ymax = Inf),
            fill = "#8ba3ba", color = NA, alpha = 0.004) +
  guides(color = guide_legend(nrow = 1)) +
  scale_color_brewer(palette = "Set1", 
                     name = "Country") +
  labs(title = "Mortality in seven countries attributed to COVID-19", 
       subtitle = "(January 23 to April 6, 2020)", 
       y = "Difference in Number of Deaths                                                                                                             Average Number of Deaths",
       x = "Date (2020)",
       caption = "Source: COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University")+
  geom_dl(aes(label = country), 
          method = list(dl.trans(x = x + 0.2), "last.points", cex = 1.2)) +
  theme(strip.background = element_rect(fill = "#333C4A", linetype = "blank"),
        strip.placement = "outside",
        panel.spacing.y = unit(5, "lines"),
        legend.position = "none",
        axis.title.y = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0)),
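        axis.title.x = element_text(margin = margin(t = 30, r = 0, b = 0, l = 0))) +
  # expand_limits() widens the x range without replacing scale_x_date() above,
  # whereas xlim() would add a new x scale and discard the date breaks and labels
  expand_limits(x = as.Date(c("2020-01-23", "2020-04-09")))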


# annotate("text") has no `face` argument; the font style is set with `fontface`
p2 <- p1 + annotate("text", label = "Period where lockdowns started to happen in each country",
                    family = "Roboto Condensed", 
                    fontface = "italic", 
                    angle = 0,
                    size = 6, 
                    colour = '#8ba3ba', 
                    x = annotation_start, 
                    y = -Inf, 
                    hjust = 1.1, 
                    vjust = -40)

# Display the final plot
p2
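
The reshaping step in the code above relies on gather(), which tidyr now marks as superseded in favor of pivot_longer(). A minimal sketch of the same wide-to-long reshape with pivot_longer() follows. The data frame covid_small and its values are made up for illustration; only the column names difference and day_avg come from the code above. The rows come out in a different order than with gather(), which should not matter for plotting.

```r
library(tidyr)

# Made-up values in the same wide layout as the wrangled data (illustrative only)
covid_small <- data.frame(
  country    = c("Country A", "Country A"),
  date       = as.Date(c("2020-03-20", "2020-03-21")),
  difference = c(50, -20),
  day_avg    = c(600, 650)
)

# Stack the two measure columns so each row holds one measure for one date,
# which is the layout facet_wrap(~measure) works with
covid_long <- pivot_longer(covid_small,
                           cols = c(difference, day_avg),
                           names_to = "measure",
                           values_to = "value")
print(covid_long)
```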

Data Reference

Reconstruction

The following plot fixes the main issues in the original.