Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: r/dataisbeautiful


Objective

The objective of the original data visualisation is to break down and display the countries of origin of unauthorised Steam logins after an account's password had been compromised.

The target audience is Reddit users.

The visualisation chosen had the following three main issues:

  • The small text on a large pie chart makes the country names difficult to read
  • It is difficult to see how many times the account was accessed from each country
  • Although the countries are ordered by count, the pie chart format makes it hard to compare them against each other

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(dplyr)
if (!require("ggflags")) {
  devtools::install_github("rensa/ggflags")
  library(ggflags)
}

# Counts and country names were read off the zoomed-in original visual; the corresponding ISO country codes were entered manually
df <- data.frame(
  Country = c(
    "Oman",
    "Italy",
    "Mozambique",
    "Bermuda",
    "Yemen",
    "Austria",
    "Taiwan",
    "Romania",
    "Ukraine",
    "Honduras",
    "Lesotho",
    "Bosnia and Herzegovina",
    "Iran",
    "Nepal",
    "Moldova",
    "Israel",
    "Turkey",
    "Colombia",
    "Bangladesh",
    "Estonia",
    "Georgia",
    "Albania",
    "Vietnam",
    "Thailand",
    "USA",
    "China",
    "Malaysia",
    "Russia",
    "Brazil",
    "Indonesia"
  ),
  iso_country = c(
    "om",
    "it",
    "mz",
    "bm",
    "ye",
    "at",
    "tw",
    "ro",
    "ua",
    "hn",
    "ls",
    "ba",
    "ir",
    "np",
    "md",
    "il",
    "tr",
    "co",
    "bd",
    "ee",
    "ge",
    "al",
    "vn",
    "th",
    "us",
    "cn",
    "my",
    "ru",
    "br",
    "id"
  ),
  Count = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 5, 7, 8, 10, 15, 20, 22)
)

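p1 <- (
  # Order the countries by count so the bars are sorted
  ggplot(data=df, aes(x=reorder(Country, Count), y=Count, country=iso_country)) +
  # Bar length and colour both encode the number of logins
  geom_bar(stat="identity", aes(fill=Count)) +
  scale_fill_gradient(low = "green", high = "red") +
  # Draw each country's flag just below zero, next to its label
  geom_flag(y=-0.5, size=6) +
  labs(
    title = "Steam logins after password was compromised",
    x = "Country of origin",
    y = "Logins"
  ) +
  # The fill legend repeats the axis, so hide it
  theme(legend.position="none") +
  # Flip to horizontal bars so the country names are readable
  coord_flip()
)

# Display the plot
p1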

Reconstruction

The following plot fixes the main issues in the original:

  • Country flags make the countries easier to identify
  • The bar chart makes each count easy to read against a scaled axis
  • The horizontal, sorted bars make comparison between countries clear
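To put numbers on the comparison that the bar chart makes visible, the counts can also be tabulated as shares of all logins. The following is a minimal sketch in base R and is not part of the original code. It reuses the counts from the data frame in the Code section. The names `multi`, `n_single`, `total_logins`, `summary_table` and `top5_share` are assumptions made for illustration, as is the choice to treat the 19 single-login countries as one group.

```r
# Counts copied from the data frame in the Code section.
# Countries with more than one login, largest first:
multi <- c(
  Indonesia = 22, Brazil = 20, Russia = 15, Malaysia = 10, China = 8,
  USA = 7, Thailand = 5, Vietnam = 4, Albania = 3, Georgia = 2, Estonia = 2
)
# Number of countries with exactly one login each (illustrative grouping)
n_single <- 19

# Total number of logins across all 30 countries
total_logins <- sum(multi) + n_single

# Share of all logins per country, in percent, rounded to one decimal
share_pct <- round(100 * multi / total_logins, 1)

summary_table <- data.frame(
  Country = names(multi),
  Count = as.integer(multi),
  SharePct = as.numeric(share_pct)
)

# Combined share of the five countries with the most logins
top5_share <- round(100 * sum(sort(multi, decreasing = TRUE)[1:5]) / total_logins, 1)

print(summary_table, row.names = FALSE)
```

With these counts the total is 117 logins, and the five most frequent countries account for 75 of them, or roughly 64%.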