Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The original data visualisation was aimed at the general public and compared the number of COVID-19 deaths with the number of non-COVID-19 deaths. Its purpose was to highlight COVID-19 as one of the leading causes of death in the United States.
The visualisation chosen had the following three main issues:
Reference
The following code was used to fix the issues identified in the original.
# Loading relevant packages
library(readr)
library(ggplot2)
library(ggrepel)
# Data cleaning and wrangling was done to combine two datasets (National Center for Health Statistics, 2020a, 2020b) and create the current csv file
covid <- read_csv("data.csv")
# Storing the colorblind-friendly palette that will be used for the plot
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# Labelling the factor variable "death_causes"
covid$death_causes <- factor(covid$death_causes,
                             levels = c("Non-COVID Deaths", "COVID-19 Comorbidities Deaths", "COVID-19 Deaths"),
                             labels = c("Non-COVID Deaths", "COVID-19 Comorbidities Deaths", "COVID-19 Deaths"))
# Creating the plot
p1 <- ggplot(data = covid, aes(fill = death_causes, y = cases, x = age_group))
p1 <- p1 + geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = cbPalette) +
  labs(title = "COVID and Comorbidities vs Non-COVID Deaths",
       subtitle = "Feb 1 - Sep 6, 2020 \n United States",
       y = "Number of Cases",
       x = "Age Groups") +
  scale_y_continuous(labels = scales::comma) +
  geom_text_repel(data = covid,
                  aes(x = age_group, y = cases, label = paste0(ratio,"%")),
                  size = 3,
                  vjust = 0.7,
                  position = position_stack()) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5, lineheight = 1.2),
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank())
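The percentage labels drawn by geom_text_repel() come from a `ratio` column that is already present in data.csv; the wrangling that produced it is not shown here. The following is only a minimal sketch of how such a column could be derived, assuming `ratio` is each cause's share of all deaths within its age group. The `toy` data frame and its numbers are hypothetical and are not CDC figures; only the column names match the plotting code.

```r
# Hypothetical toy data using the same column names as the plotting code.
# The counts are made up for illustration and are NOT taken from the CDC data.
toy <- data.frame(
  age_group    = rep(c("45-54", "55-64"), each = 3),
  death_causes = rep(c("Non-COVID Deaths",
                       "COVID-19 Comorbidities Deaths",
                       "COVID-19 Deaths"), times = 2),
  cases        = c(700, 200, 100, 1200, 600, 200)
)

# Assumption: `ratio` is the percentage of an age group's deaths
# attributed to each cause, rounded to one decimal place.
# ave() returns the per-group total repeated for every row of that group.
group_totals <- ave(toy$cases, toy$age_group, FUN = sum)
toy$ratio <- round(100 * toy$cases / group_totals, 1)

print(toy)
```

Computing the share within each age group, rather than across the whole table, keeps the labels in every stacked bar summing to roughly 100%, which is what makes the bars comparable across age groups.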
Data Reference
National Center for Health Statistics. (2020a). Conditions contributing to deaths involving coronavirus disease 2019 (COVID-19), by age group and state, United States [Data set]. Centers for Disease Control and Prevention. https://data.cdc.gov/NCHS/Conditions-contributing-to-deaths-involving-corona/hk9y-quqm
National Center for Health Statistics. (2020b). Provisional COVID-19 Death Counts by Sex, Age, and State [Data set]. Centers for Disease Control and Prevention. https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku
The following plot fixes the main issues in the original.