This is a template file. The example included is not considered a good example to follow for Assignment 2. Remove this warning prior to submitting.

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: QS Supplies. Wall, K. (2023)


Objective

The objective is to show the quality of tap water in countries worldwide. Scores run from 0 to 100, where 100 is the safest and 0 is the least safe. The intended audience includes individuals who drink tap water, as well as researchers and policymakers who are interested in improving water quality standards.
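To make the 0 to 100 scale concrete, the sketch below groups scores into 20-point bands. The helper name `score_band`, the band labels and the example scores are illustrative assumptions, not part of the original visualisation or the dataset.

```r
# Hypothetical helper: label a 0-100 safety score with a 20-point band.
# Each band includes its upper boundary, so a score of exactly 20 falls in
# "0-20"; a score of exactly 0 is also kept in the lowest band.
score_band <- function(score) {
  cut(score,
      breaks = c(0, 20, 40, 60, 80, 100),
      labels = c("0-20", "20-40", "40-60", "60-80", "80-100"),
      include.lowest = TRUE,
      right = TRUE)
}

# Illustrative scores only, not values taken from the dataset
example_scores <- c(0, 20, 55.5, 100, NA)
data.frame(score = example_scores, band = score_band(example_scores))
```

Missing scores stay `NA` rather than being forced into a band, so countries without data are not mistaken for the least safe group.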

The visualisation chosen had the following three main issues:

  • Inappropriate chart type: It is difficult for the audience to identify and locate specific countries of interest.
  • Overcrowding: The visualisation contains too much information in a single view, making it difficult to interpret the data. The large number of data points and labels creates a cluttered, hard-to-read figure.
  • Incorrect reference to measurement: The measurement displayed is not the EPI score. It is the Unsafe Drinking Water (UWD) score, 1 of the 40 indicators that make up the EPI score (Welcome | Environmental Performance Index 2022).

Reference

Code

The following code was used to fix the issues identified in the original.

# Import the required packages
library(plotly)
library(dplyr)
library(readr)
# Load the dataset
df <- read_csv("epi2022results05302022.csv")
# Select required columns
data <- df[, c("iso", "country", "UWD.new", "UWD.rnk.new")]
# Rename columns
data <- data %>% rename("UWD" = "UWD.new", "rank" = "UWD.rnk.new")
# Sort the data by UWD score
data_sorted <- data %>% arrange(UWD)
# Create the plot
fig <- plot_geo(data_sorted, locationmode = "ISO-3",
                colors = "Blues",
                hoverinfo = "text",
                text = ~paste(country, "<br>",
                              "UWD Score: ", ifelse(is.na(UWD), "n/a", UWD), "<br>",
                              "Rank: ", rank)) %>%
  add_trace(z = ~UWD, locations = ~iso, type = "choropleth") %>%
  # The colour scale is already set by `colors = "Blues"` in plot_geo();
  # only colour bar attributes are passed here
  colorbar(title = "UWD Score",
           ticks = "outside",
           tickmode = "array",
           tickvals = c(0, 20, 40, 60, 80, 100),
           ticktext = c("0", "20", "40", "60", "80", "100")) %>%
  # Plotly fonts have no `bold` attribute, so the title is bolded with <b> tags
  layout(title = list(text = "<b>Quality of Tap Water by 2022 Unsafe Drinking Water (UWD) Score</b>",
                      font = list(size = 16, color = "black", family = "Arial")),
         annotations = list(
           list(x = 1, y = 0, xanchor = "right", yanchor = "bottom",
                text = "Source: Yale, edu. EPI2022 Results (2022)",
                font = list(size = 10, color = "gray", family = "Arial"),
                showarrow = FALSE),
           list(x = 0, y = 1, xanchor = "left", yanchor = "top",
                text = paste("The hover label displays the country name, UWD score, and rank"),
                font = list(size = 10, color = "black", family = "Arial"),
                showarrow = FALSE),
           list(x = 0, y = 0, xanchor = "left", yanchor = "bottom", align = "left",
                # <br> sets the line breaks explicitly
                text = paste("The UWD score ranges from 0 to 100, with a score of 100 indicating that a country has the safest",
                             "<br>drinking water and a score of 0 indicating that a country has the least safe drinking water.",
                             "<br>The countries are ranked based on their UWD score, 1 being safest."),
                font = list(size = 10, color = "black", family = "Arial"),
                showarrow = FALSE)))
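Before plotting, it can help to confirm that the data meets the assumptions the map relies on: three-letter ISO codes, one row per country, and scores within 0 to 100. The following is a minimal sketch of such a check. The helper name `check_map_data` and the two-row example data frame (with placeholder codes and values) are hypothetical and are not part of the code used for the reconstruction.

```r
# Hypothetical helper: stop early if the map inputs look wrong.
# Expects the columns produced above: iso, country, UWD and rank.
check_map_data <- function(d) {
  stopifnot(
    all(c("iso", "country", "UWD", "rank") %in% names(d)),
    all(grepl("^[A-Z]{3}$", d$iso)),                # ISO-3 codes: three capital letters
    !anyDuplicated(d$iso),                          # one row per country
    all(d$UWD >= 0 & d$UWD <= 100, na.rm = TRUE)    # missing scores are allowed
  )
  invisible(d)
}

# Placeholder rows for illustration only, not real EPI results
example_data <- data.frame(
  iso     = c("AAA", "BBB"),
  country = c("Country A", "Country B"),
  UWD     = c(100, NA),
  rank    = c(1, NA)
)
check_map_data(example_data)
```

With the real data, the call would be `check_map_data(data_sorted)`, placed just before `plot_geo()`. Missing scores pass the check because the hover text already shows them as "n/a".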

Data Reference

Reconstruction

The following plot fixes the main issues in the original.