Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: World Health Organisation - Coronavirus (COVID-19)



Objective

The target audience is the general population worldwide.

COVID-19 is the infectious disease caused by the most recently discovered coronavirus. It is now a pandemic affecting many countries globally. WHO is continuously monitoring and responding to the COVID-19 pandemic.

This visualisation shows the number of new cases confirmed each day. It helps put the gravity of the pandemic into context: the number of new cases is rising every day. Looking at this visualisation can help the general public understand the need to take precautions and grasp the severity of the pandemic.

However, the chosen visualisation had three main issues:

  • The stacked bar chart makes it difficult to compare the number of cases in different regions on the same day, because only the bottom segment sits on a common baseline. Stacking also gives the impression that the region drawn at the top has the highest number of cases, even when it does not.
  • There is no scale on the y-axis. Only the relative sizes of the bars can be compared; the actual value represented by each bar is unknown.
  • The x-axis tick labels lack detail, which makes it difficult to identify the date of each bar in the plot.
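
The first issue can be illustrated with a small sketch. The region names and counts below are hypothetical values chosen for illustration, not WHO data, and the segments are assumed to be stacked in row order. The point is only that a segment's position in a stacked bar is a running total, so the region with the fewest cases can end up highest on the chart.

```r
# Hypothetical counts for a single day (illustrative values, not WHO data)
toy <- data.frame(
  Region    = c("A", "B", "C"),
  Confirmed = c(500, 300, 100)
)

# A stacked bar draws each segment on top of the previous ones,
# so a segment's position is the running total, not its own value
toy$segment_bottom <- c(0, head(cumsum(toy$Confirmed), -1))
toy$segment_top    <- cumsum(toy$Confirmed)

print(toy)
```

Here region C has the fewest cases but is drawn between 800 and 900, the highest position in the bar. Plotting each region as its own line from a shared zero baseline avoids this.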

Reference

Code

The following code was used to fix the issues identified in the original.

# load required packages
library(ggplot2)
library(readr)
library(dplyr)

# read and preprocess data
who.data <- read_csv("who_covid_data.csv") %>% 
  select(c(Region,day,Confirmed)) %>% 
  group_by(Region, day) %>% 
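  # cases are given for each country, so calculate the total for each region and day;
  # .groups = "drop" (dplyr >= 1.0.0) returns an ungrouped result
  summarise(RegionConfirmed = sum(Confirmed), .groups = "drop")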
    
# remove missing data
who.data <- na.omit(who.data) 

# plot the data
p1 <- ggplot(data = who.data, aes(x = day, y = RegionConfirmed, group = Region, color = Region)) + 
      geom_line(alpha = 0.9, linewidth = 0.6) +  # plot lines (linewidth replaces size for lines in ggplot2 >= 3.4.0)
      geom_point(size = 1.2) +              # plot points
      theme_minimal() +
      # change plot title and subtitle
      ggtitle("Daily Confirmed cases", subtitle = "Coronavirus (COVID-19) (11 Jan 2020 - 10 May 2020)") + 
      # change axis labels and add figure caption
      labs(x = "Time", y = "Confirmed cases", caption = "Source: World Health Organisation - Coronavirus (COVID-19)") + 
      # change legend title
      labs(color = "WHO Regions") +   
      # modify x-tick labels
      scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +    
      # change font sizes
      theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),      
            plot.title = element_text(size = 20, face = "bold"),
            plot.subtitle = element_text(size = 15),
            legend.title=element_text(size=15),
            legend.text=element_text(size=12)) +
      # change legend colors and labels
      scale_color_manual(values = c("#238A8DFF", "#44015EFF", "#3CBB75FF", "#ED5983", "#952EA0", "#FDD725FF"), 
                         labels = c("Africa", "Americas", "Eastern Mediterranean", "Europe", "South-East Asia", "Western Pacific")) 
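
The preprocessing step above can be sketched on a small table to show how the regional totals and the missing-value handling interact. The rows below are hypothetical values in the same three-column shape (Region, day, Confirmed), not WHO data, and the region codes are placeholders. Because `sum()` returns `NA` when any input is `NA`, a single missing country value makes the whole region-day total missing, and `na.omit()` then drops that row.

```r
library(dplyr)

# Hypothetical country-level rows (illustrative values, not WHO data)
toy <- data.frame(
  Region    = c("EURO", "EURO", "AFRO", "AFRO"),
  day       = as.Date(rep("2020-03-01", 4)),
  Confirmed = c(10, 5, 3, NA)
)

# Same aggregation as in the code above: one total per region and day
by_region <- toy %>%
  group_by(Region, day) %>%
  summarise(RegionConfirmed = sum(Confirmed), .groups = "drop")

# AFRO's total is NA because one of its inputs is NA, so na.omit() removes it
cleaned <- na.omit(by_region)

# Alternative: ignore missing inputs and keep a partial total instead
partial <- toy %>%
  group_by(Region, day) %>%
  summarise(RegionConfirmed = sum(Confirmed, na.rm = TRUE), .groups = "drop")
```

Which behaviour is preferable depends on the data: dropping the row leaves a gap in the line, while `na.rm = TRUE` keeps the point but may understate the regional total.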

Data Reference

  • WHO Coronavirus Disease (COVID-19) Dashboard. (2020). COVID-19 Map Data. Retrieved May 10, 2020, from World Health Organisation website: https://covid19.who.int/

Reconstruction

The following plot fixes the main issues in the original.