Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Bloomberg Businessweek (2012).


Objective

This data visualisation appeared in a 2012 edition of the weekly magazine Bloomberg Businessweek, released in the wake of Hurricane Sandy, one of the largest and most destructive storms in American history. Its intent was to show the general public how the frequency and cost of different types of natural disasters are trending over time in the United States.

The visualisation chosen had the following three main issues:

  • The time axis uses a non-continuous scale. The author appears to have evenly spaced the data points within each one-year interval, rather than evenly spacing the yearly intervals themselves. While the viewer can compare periods by counting the disasters in each, it is difficult to see at a glance any trend in the frequency or cost of disasters over time. The visualisation relies on text annotations to communicate this information.
  • To save space, the author has also segmented the time scale and stacked the segments diagonally. This makes trends in the frequency or cost of disasters even harder to interpret. Because all disaster types share this single stacked axis, comparing disaster types over time is also very difficult.
  • While eye-catching, the plot is very busy and overwhelming. This is an example of visual bombardment. There are many large overlapping data points, each represented as a series of concentric circles. These data points are often annotated with unnecessary additional information and are positioned on a visually complicated stacked time axis. While visually interesting, the result is that it is not immediately obvious what the author is trying to communicate. If simplified, the visualisation could communicate the general trends in the data more effectively, without such a reliance on text annotations.
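To make the first issue concrete, the sketch below compares two ways of placing events along a time axis. The dates are hypothetical and chosen only for illustration, and the rank-based placement is an approximation of what the original chart appears to do, not a reconstruction of the author's method.

```r
# Hypothetical event dates, for illustration only (not taken from the NOAA data)
dates <- as.Date(c("1980-06-01", "1980-08-15", "1983-03-10",
                   "1991-10-20", "1992-08-24", "1992-09-11"))

# Continuous scale: position is proportional to the days elapsed since the first event
continuous_pos <- as.numeric(difftime(dates, min(dates), units = "days"))

# Rank-based scale: events are evenly spaced, so the gaps between them disappear
rank_pos <- seq_along(dates)

# Side-by-side comparison of the two placements
data.frame(date = dates, continuous_pos = continuous_pos, rank_pos = rank_pos)
```

On the rank-based scale every step between neighbouring events is the same size, so a quiet period of several years looks identical to a gap of a few weeks. On the continuous scale, clusters and quiet periods stay visible, which is what lets a reader judge frequency at a glance.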

Reference

Code

The following code was used to fix the issues identified in the original.

library(readr)
library(ggplot2)
library(dplyr)
library(tidyr)
library(magrittr)
library(forcats)

# Read the NOAA events file; skip = 1 drops the line above the column headers
events <- read.csv("https://www.ncdc.noaa.gov/billions/events-US-1980-2020.csv", skip = 1, header = TRUE)

events_clean <- events %>%
  mutate(Disaster = factor(Disaster),
         # Convert cost from millions to billions of dollars
         Cost = Total.CPI.Adjusted.Cost..Millions.of.Dollars./1000,
         # Parse the event end date into a date-time
         Date = parse_datetime(as.character(End.Date)),
         Year = format(Date, "%Y")) %>%
  select(Name, Disaster, Cost, Deaths, Year, Date)

events_clean$Disaster <- fct_collapse(events_clean$Disaster,
                                      Drought_Heat_wave = "Drought",
                                      Blizzard_Freeze_Ice = c("Winter Storm", "Freeze"),
                                      Fire = "Wildfire",
                                      Hurricane = "Tropical Cyclone",
                                      Tornado_Storm = "Severe Storm",
                                      Flood = "Flooding")

events_clean$Disaster <- factor(events_clean$Disaster,
                                levels = c("Drought_Heat_wave",
                                           "Flood",
                                           "Blizzard_Freeze_Ice",
                                           "Tornado_Storm",
                                           "Hurricane",
                                           "Fire"),
                                labels = c("Drought/Heat wave",
                                           "Flood",
                                           "Blizzard/Freeze/Ice",
                                           "Tornado/Storm",
                                           "Hurricane",
                                           "Fire"))

background <- "#F5F5F5"

pal <- c("#B4A602",
         "#0AACED",
         "#03A655",
         "#646881",
         "#F06518",
         "#EB048E")


p1 <- events_clean %>%
  ggplot(aes(x = Date, y = Disaster)) +
  geom_point(aes(size = Cost, colour = Disaster), alpha = 0.75) +
  geom_point(aes(size = Cost), shape = 1, colour = "black", alpha = 0.25) +
  geom_point(size = 0.3, colour = "black") +
  geom_rug(aes(x = Date), sides = "t", size = 0.4)+
  scale_size_continuous(range = c(5,25),
                        name = "Cost ($ billion)",
                        breaks = c(1,10,50, 100, 150))+
  guides(alpha = "none", colour = "none")+
  scale_colour_manual(values = pal) +
  labs(
    title = "U.S. natural disasters with a cost of $1 billion or more (1980-2020)",
    subtitle = "Each point represents a disaster. Each | corresponds to a point.\n\n",
    caption = "Source: ncdc.noaa.gov/billion/events/US/1980-2020\nNote: Cost is CPI adjusted"
    )+
  theme(
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks = element_blank(),
    panel.grid.major.x = element_line(colour = "grey80"),
    plot.background = element_rect(fill = background),
    panel.background = element_rect(fill = background),
    legend.background = element_rect(fill = background),
    legend.key = element_rect(fill = background, colour = background),
    plot.subtitle = element_text(face="italic"),
    plot.title = element_text(face="bold"))
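Because the aim of the plot is to show how frequency and cost change over time, a per-year summary is a useful numerical cross-check on what the plot shows. The sketch below is a minimal example in base R. It uses a small hypothetical data frame, `toy`, with invented values and the same `Year` and `Cost` column names as `events_clean`; it is not part of the original script.

```r
# Hypothetical stand-in for events_clean; the values are invented for illustration
toy <- data.frame(
  Year = c("1980", "1980", "1992", "2012", "2012", "2012"),
  Cost = c(2, 30, 50, 1.5, 70, 35)   # cost in $ billion
)

# Number of events per year and total cost per year
# (table() and tapply() both order their results by the sorted Year values)
events_per_year <- table(toy$Year)
cost_per_year <- tapply(toy$Cost, toy$Year, sum)

yearly <- data.frame(
  Year = names(cost_per_year),
  Events = as.vector(events_per_year),
  TotalCost = as.vector(cost_per_year)
)

yearly
```

Assuming the download and cleaning steps have run, the same two summaries could be computed from `events_clean$Year` and `events_clean$Cost`.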

Data Reference

Reconstruction

The following plot fixes the main issues in the original.