Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: National Payments Corporation of India (2022).


Objective

The objective is to visualise the expansion of the Unified Payments Interface (UPI), together with the growth in product sales and company value, between 2016 and 2022. The target audience is potential customers, such as banks or independent businesses.

The visualisation chosen had the following three main issues:

  • Deceptive Method (Dual Axes) - The left-hand axis shows the number of banks and the number of products sold; the right-hand axis shows company value in US dollars. The scales of the two axes are arbitrary, which makes it easy to deceive the audience.
  • Deceptive Method (Truncated Axis) - Both y-axes have been truncated, which exaggerates the differences between the variables.
  • Data Integrity (Citing the Data Source) - The visualisation does not cite the source of the original data.
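The effect of a truncated axis can be checked with simple arithmetic. The sketch below is illustrative only: the numbers are hypothetical and are not taken from the NPCI data, and `apparent_ratio` is a helper name introduced here. It compares how much taller one bar is drawn than another when the axis starts at zero and when it starts at a higher baseline.

```r
# Ratio of the drawn heights of two bars when the axis starts at `baseline`.
# `low` and `high` are the two plotted values; `baseline` is where the axis begins.
apparent_ratio <- function(low, high, baseline = 0) {
  (high - baseline) / (low - baseline)
}

# Hypothetical values, chosen only for illustration
low  <- 90
high <- 100

ratio_full      <- apparent_ratio(low, high)                 # axis starts at 0
ratio_truncated <- apparent_ratio(low, high, baseline = 80)  # axis starts at 80

print(c(full = ratio_full, truncated = ratio_truncated))
```

With these made-up values the true difference is about 11%, but on an axis that starts at 80 the taller bar is drawn twice as high as the shorter one.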

Reference

Code

The following code was used to fix the issues identified in the original.

# Plotting, data wrangling and figure arrangement (ggarrange/annotate_figure are in ggpubr)
library(ggplot2)
library(tidyr)
library(dplyr)
library(ggpubr)

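# Read the raw data (the path is specific to the author's machine)
NPCI = read.csv("C:/Users/barcus/OneDrive - RMIT University/Desktop/Data Visualisation & Communication/DataForDataVis.csv")
colnames(NPCI)[1] = 'month'
# Remove thousands separators so the columns can be converted to numeric
NPCI$volume_Mn = as.numeric(gsub(",","",NPCI$volume_Mn))
# Manually overwrite the volume in row 62
NPCI$volume_Mn[62] = 10.35
NPCI$value_Cr = as.numeric(gsub(",","",NPCI$value_Cr))
# Split the date into day/month/year and keep only the year
NPCI_SepDate = NPCI %>% separate(month, into = c('day','month','year'), sep = "/")
NPCI = subset(NPCI_SepDate,select = -c(day,month))
# Convert volume from millions to a raw count
NPCI$volume_Mn = NPCI$volume_Mn * 1000000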

plot1 = ggplot(NPCI, aes(year,no_banks_living_on_UPI, color = year)) +
  geom_boxplot(outlier.shape = 8, lwd = 0.5) +
  labs(
    title = 'Number of Banks Live on UPI',
    caption = 'UPI - Unified Payment Interface',
    x = 'Year',
    y = " ",
    color = 'Year'
    )+
  theme_classic() +
  theme(
    plot.title = element_text(color = "#0099f8", size = 10, face = "bold", hjust = 0.5),
    plot.caption = element_text(face = "italic")
  )

plot2 = ggplot(NPCI, aes(year,volume_Mn, color = year)) +
  geom_boxplot(outlier.shape = 8, lwd = 0.5) +
  labs(
    title = 'Number of Products Sold',
    x = 'Year',
    y = " ",
    color = 'Year'
  )+
  theme_classic() +
  theme(
    plot.title = element_text(color = "#0099f8", size = 10, face = "bold", hjust = 0.5)
  )

plot3 = ggplot(NPCI, aes(year,value_Cr, color = year)) +
  geom_boxplot(outlier.shape = 8, lwd = 0.5) +
  labs(
    title = 'Company Value in Crore Rupees',
    caption = '1 Crore Rupees = 10 million Rupees\n1 Rupee = 0.018 Australian Dollars',
    x = 'Year',
    y = " ",
    color = 'Year'
  )+
  theme_classic()+
  theme(
    plot.title = element_text(color = "#0099f8", size = 10, face = "bold", hjust = 0.5),
    plot.caption = element_text(face = "italic")
  )

# Arrange the three plots on a 2 x 2 grid; legend = "none" removes the redundant year legends
plot4 = ggarrange(plot1, plot2, plot3,
          ncol = 2, nrow = 2,
          legend = "none")

title <- expression(atop(bold("Product Statistics for the National Payments Corporation of India"), scriptstyle(bolditalic("2016-2022"))))

# Add the overall title and cite the data source beneath the figure
final_plot = annotate_figure(plot4,
                top = text_grob(title, color = "dark blue", face = "bold", size = 16),
                bottom = text_grob("Data source: https://www.npci.org.in/what-we-do/upi/product-statistics", color = "blue", hjust = 1, x = 1, face = "italic", size = 10))

# Display the assembled figure
final_plot
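The cleaning steps in the code above can be tried on a small toy data frame without access to the original CSV. The sketch below is a minimal illustration, not the author's method: the values are made up, the column names simply mirror those used above, the day/month/year date layout is assumed from the `separate()` call, and base R's `strsplit()` stands in for `tidyr::separate()` so that the example runs with no extra packages.

```r
# Toy data with made-up values; real NPCI figures are not reproduced here
toy <- data.frame(
  month     = c("01/04/2016", "01/05/2016", "01/01/2017"),
  volume_Mn = c("0.09", "1,234.50", "4.46"),
  stringsAsFactors = FALSE
)

# Remove thousands separators, then convert the text to numeric
toy$volume_Mn <- as.numeric(gsub(",", "", toy$volume_Mn))

# Keep only the year: the third "/"-separated field of each date
# (base R stand-in for tidyr::separate followed by dropping day and month)
parts     <- strsplit(toy$month, "/", fixed = TRUE)
toy$year  <- vapply(parts, function(p) p[3], character(1))
toy$month <- NULL

# Convert volume from millions to a raw count
toy$volume <- toy$volume_Mn * 1e6

print(toy)
```

Because `year` stays a character column, ggplot2 treats it as a discrete variable, which is what produces one box per year in the plots.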

Data Reference

Reconstruction

The following plot fixes the main issues in the original.