Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: HowMuch.net, a financial literacy website


Objective

The main objective of this visualization is to show how consumers in the USA shifted from in-person to e-commerce shopping between 2000 and 2020.

Targeted Audience

The main audience for this visualization is e-commerce companies that want to estimate the growth of e-commerce shopping so they can manage their resources accordingly. It is also aimed at financial analysts who want to learn more about changes in the population's purchasing behaviour.

The visualization chosen had the following three main issues:

  • Ignoring Convention: The visualization has no x or y axis, which makes it difficult to compare sales across years.
  • Area to show percentage: The widths of the pink bands on top of each bar do not represent the e-commerce percentages accurately. For example, the pink bands for 2015 and 2016 look almost identical even though there is a significant difference in the values. The same problem occurs where the percentages are below 4%.
  • Ambiguous Trend: The circular layout is confusing and makes the trend hard to see. Comparing the trend in total sales with the trend in the e-commerce percentage of sales is even more difficult.

Reference

Code

The following code was used to fix the issues identified in the original.

# Importing libraries and reading file
library(ggplot2)
library(reshape2)
library(TSstudio)
library(dygraphs)
library(lubridate)
library(dplyr)
library(readxl)

# Read the sales data (the path is specific to the author's machine)
AdjustedSales <- read_excel("/Users/kushagrasingh/Sem3/DataViz/AdjustedSales.xlsx")

# Data Preprocessing: convert Year to Date, then reshape from wide to long
# so that each row holds one (Year, variable, value) observation
AdjustedSales$Year <- as.Date(as.character(AdjustedSales$Year), format = "%Y-%m-%d")
SalesDF <- melt(AdjustedSales, id.vars = c("Year"))

# Plotting time series: one line per variable, with a point at each year
p1 <- ggplot(SalesDF, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  geom_point()

# One panel per variable, each with its own axis scales
p2 <- p1 +
  geom_jitter(position = position_jitterdodge()) +
  facet_wrap(~variable, ncol = 1, scales = "free") +
  labs(title = "Total Sales and E-commerce percentage of total sales growth by Year",
       x = "Year (2000-2020)", y = "Percentage of Total Sales (%)          Sales (millions of dollars)") +
  theme(axis.line = element_line()) +
  scale_y_continuous(labels = scales::label_number_si()) +
  # Values above 100 are treated as sales and scaled by 1e6; the rest as percentages
  geom_text(aes(label = ifelse(value > 100, paste(round(value / 1e6, 2)), paste(round(value, 1)))),
            vjust = -0.6, size = 2.2, position = position_dodge(width = 0.6))

# Display the final plot
print(p2)
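
The preprocessing step can be checked on a small example. The sketch below uses a made-up table: the column names `TotalSales` and `EcommercePct` and all numbers are placeholders, not values from the spreadsheet. It assumes the `Year` column might be stored as a bare year number, which the source does not confirm. In that case the `"%Y-%m-%d"` format does not match and the conversion yields `NA`, so one possible workaround is to append a month and day before parsing.

```r
library(reshape2)

# Hypothetical wide table; column names and numbers are placeholders.
wide <- data.frame(
  Year = c(2018, 2019, 2020),
  TotalSales = c(100, 110, 125),
  EcommercePct = c(9, 11, 14)
)

# A bare year such as "2018" does not match "%Y-%m-%d", so the parse yields NA.
bad <- as.Date(as.character(wide$Year), format = "%Y-%m-%d")

# Appending a month and day gives a string that the format can parse.
wide$Year <- as.Date(paste0(wide$Year, "-01-01"), format = "%Y-%m-%d")

# melt() stacks the measure columns into variable/value pairs, one row per
# (Year, variable) combination.
long <- melt(wide, id.vars = "Year")
print(long)
```

If `read_excel()` already returns `Year` as a date, the original conversion works as written and this workaround is not needed.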

Data Reference

Reconstruction

The main issues identified in the original are addressed by:

  • Adding conventional x and y axes to the chart.
  • Using separate charts for total retail sales and the e-commerce percentage of sales.
  • Using a time series plot to show the trend more clearly.

The following plot fixes the main issues in the original.
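
As a rough illustration of that layout, the sketch below draws one panel per series, each with its own y-axis, on made-up data. The series names and numbers are placeholders and do not come from the original dataset. It is a simplified variant for illustration, not the exact code used for the reconstruction.

```r
library(ggplot2)

# Hypothetical long-format data; the numbers are placeholders, not real sales figures.
toy <- data.frame(
  Year = rep(as.Date(c("2018-01-01", "2019-01-01", "2020-01-01")), times = 2),
  variable = rep(c("Total sales", "E-commerce share (%)"), each = 3),
  value = c(100, 110, 125, 9, 11, 14)
)

# One panel per series; scales = "free_y" gives each panel its own y-axis,
# so a series measured in percent is not flattened by one measured in dollars.
p <- ggplot(toy, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  geom_point() +
  facet_wrap(~variable, ncol = 1, scales = "free_y") +
  labs(x = "Year", y = NULL) +
  theme(legend.position = "none", axis.line = element_line())

print(p)
```

Stacking the panels vertically keeps the years aligned, so the two trends can be compared directly even though their units differ.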