Hi! In this short piece I will try to fix a chart of coronavirus statistics for Saint Petersburg.
So, the chart itself:
‘Anything wrong?’
In general, the graph is fine if it is all you’re getting as an update on the coronavirus situation. But it can be improved, of course.
The bars on the current graph are easy to read but can be misleading. The counts of recovered covid patients are placed below the counts of infected ones, so if we mentally draw an x-axis where the dates sit right now, the recovered counts look negative. The brain might erroneously read those numbers as going below zero. The element indicating the current date (“08 сентября”, i.e. 8 September, in this case) is also misleading. Although the counts of recovered versus infected are shown for 11 days, this element makes it seem as if the reader is looking at the statistics for that one day only.
If I were given these statistics to visualize, I would probably make a grouped bar chart like the following:
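# Daily counts for Saint Petersburg, 28 August to 7 September
date <- c("28/08", "29/08", "30/08", "31/08", "01/09", "02/09", "03/09", "04/09", "05/09", "06/09", "07/09")
infected <- c(189, 186, 188, 185, 189, 191, 193, 195, 192, 196, 210)
recovered <- c(95, 68, 96, 86, 106, 112, 101, 96, 75, 65, 123)
C19_spb <- data.frame(date, infected, recovered)

library(reshape2)
library(ggplot2)

# Reshape to long format: one row per date and variable (infected / recovered)
C19_long <- melt(C19_spb, id.vars = "date")
# Set factor levels in order of appearance so the x-axis stays chronological, not alphabetical
C19_long$date <- factor(C19_long$date, levels = unique(C19_long$date))

ggplot(C19_long, aes(x = date, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge", col = "gray34") +
  theme_bw() +
  # Labels are in Russian: title "Statistics on infected and recovered",
  # subtitle "Saint Petersburg: 28 August to 7 September", y-axis "Number of cases"
  labs(title = "Статистика по зараженным и выздоровевшим",
       subtitle = "Санкт-Петербург: с 28го августа по 7е сентября",
       x = "",
       y = "Количество случаев",
       fill = "") +
  # Legend labels: "infected" (reddish) and "recovered" (greenish)
  scale_fill_manual(values = c("coral2", "aquamarine3"),
                    labels = c("заразилось", "выздоровело")) +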
  # width = 0.9 matches the default dodge width of the bars, so each label sits over its own bar
  geom_text(aes(label = value), position = position_dodge(width = 0.9), vjust = -0.25, size = 3.2)
None of the numbers is perceived as negative now, because all of the bars rise above zero from the x-axis. I also chose the colors so that infected patients, seen as somewhat “dangerous”, are reddish and recovered patients, regarded as “safe and treated”, are greenish. Finally, I adjusted the title and subtitle of the chart so that it is clear exactly which period the statistics cover.