Click the Original, Code, and Reconstruction tabs to read about the issues and how they were fixed.
The primary objective of the data visualization is to provide up-to-date information on deaths due to COVID-19. The news article that carries this visualization reports that Asia was the epicentre of COVID-19-related deaths in March. The epicentre then shifted gradually to Europe, with France, the United Kingdom, and Spain becoming the new hotspots. It later reached the Americas, where the USA was hit hardest because it saw the fastest growth in both the number of new COVID-19 cases per day and the number of deaths per day.
Targeted Audience
Members of the general public who want an update on COVID-19, especially on the deaths caused by the pandemic around the world.
Three Main Issues
The data visualisation above has the following three main issues:
Deceptive and hard to read. The visualization, a variant of an area graph, doesn’t show numerical data. In particular, it lacks a y-axis, so the number of deaths cannot be read off the chart. Since the graph is meant to show death counts, omitting the y-axis makes it ineffective. It also labels only the total count and the count for the USA at the right-hand end of the x-axis, leaving the audience to guess the numbers for the other countries/regions. Finally, the plot titled “Daily confirmed deaths (% by region)” lacks some of the labels needed to read it, which adds to the confusion.
Improper data distribution. The visualization mixes countries, continents, and regions or parts of regions arbitrarily, with no logical grouping of the data. For example, it shows the USA and the rest of North America, then three European countries and the rest of Europe, and then the Mideast and Asia as separate categories. It could have shown death counts per country within each continent or, for simplicity, per continent. Failing that, it should have shown death statistics for the worst-affected countries or per continent.
Inadequate use of colors. The colors used for several countries/regions are hard to tell apart. For example, in the plot titled “Daily confirmed deaths (% by region)”, it’s hard to distinguish the green and light grey areas where they nearly meet. Placing colors from a similar part of the spectrum side by side also creates something of an optical illusion, which confuses and misleads the audience.
The following code was used to fix the issues identified in the original.
# Import the necessary libraries
library(ggplot2)
library(readr)
# Read and prepare data for the plots
# Data was cleaned and prepared in Python
df2 <- read_csv("COVID19 Death Cases - ALL-e4 (6).csv")
# Create, customize, and show the plots
p2 <- ggplot(df2, aes(fill = Region, y = Deaths, x = Date)) +
  # Stacked bars: one bar per week, one segment per region
  geom_bar(position = "stack", stat = "identity") +
  theme(text = element_text(size = 14)) +
  # One distinct fill color per region
  scale_fill_manual(values = c("#73c2fb", "#0198e1", "#f96abe", "#dd1c8e", "#ffc059", "#f29600",
                               "#db4841", "#ac2923", "#36c948", "#258c32", "#857ada", "#6c61c0")) +
  # Center the title and subtitle
  theme(plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5)) +
  ggtitle(label = "COVID-19 Deaths around the World",
          subtitle = "covering most unfortunate countries per continent.") +
  # Weekly ticks labelled as day and abbreviated month, rotated for legibility
  scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  xlab("Week (Monday)") +
  ylab("Total Number of Deaths per Week")
# Show the plot
p2
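The axis labels in the code above suggest that the input holds weekly death totals keyed by the Monday of each week. The original preparation was done in Python and is not shown here, so the following is only a minimal sketch of how such a weekly roll-up might look in base R. The `daily` data frame, its values, and the `week_start` helper are hypothetical and exist purely for illustration; only the column names `Date`, `Region`, and `Deaths` are taken from the plot code.

```r
# Hypothetical daily data (made-up values, not the real data set).
# Column names mirror those used in the plot code.
daily <- data.frame(
  Date   = as.Date("2020-03-30") + 0:13,  # two full weeks, starting on a Monday
  Region = "Example region",
  Deaths = rep(c(1, 2), each = 7)
)

# Hypothetical helper: snap each date back to the Monday of its week.
# format(x, "%u") returns the ISO weekday number (1 = Monday, ..., 7 = Sunday).
week_start <- function(x) x - (as.integer(format(x, "%u")) - 1)

daily$Week <- week_start(daily$Date)

# Sum the daily deaths per region and week
weekly <- aggregate(Deaths ~ Region + Week, data = daily, FUN = sum)
print(weekly)
```

A data frame shaped like `weekly`, with `Week` renamed to `Date`, would then fit the `aes(fill = Region, y = Deaths, x = Date)` mapping used for the stacked bars.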
Data Reference
The following plot fixes all the listed issues for the original plot shown under the Original tab.