Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://howmuch.net/articles/breakdown-each-state-debt-capita (2019).


Objective

The objective of the original data visualisation is to analyse the amount of debt that Americans owe. The data come from the Center for Microeconomic Data of the Federal Reserve Bank of New York and contain the total debt per capita for all 50 states. The debt is broken down into auto, credit card, mortgage, student and other debt.

Target Audience

The target audience is the general public, but the visualisation is most relevant to people working in the banking sector, since the data come from the Federal Reserve Bank of New York.

The visualisation chosen had the following three main issues:

  • Issue 1: The first issue is that the chosen visualisation method is deceptive. Pie charts are generally not considered an accurate method of visualisation, because readers have to compare angles and areas rather than lengths. The original data visualisation is a complex pie chart.
  • Issue 2: The second issue is perceptual and concerns colour. The visualisation uses almost the same shade of pink to represent two different classes.
  • Issue 3: The third issue is the labelling. Abbreviations are used for the names of the US states, and it is usually difficult for a reader to remember the abbreviations of all 50 states.

Reference

Juan Carlos (2020). Visualizing America’s Debt per Capita by State. Retrieved from howmuch.net website: https://howmuch.net/articles/breakdown-each-state-debt-capita

Code

The following code was used to fix the issues identified in the original.

# Loading the necessary packages
library(ggplot2)
library(readxl)
library(tidyr)

# Reading Dataset
US_debt <- read_excel("Final_dataset.xlsx")


# Replacing the state abbreviations used in the data with full state names
US_debt$State <- factor(US_debt$State,
  levels = c("AK","AL","AR","AZ","CA","CO","CT","DC","DE","FL","GA","HI","IA","ID","IL","IN","KS","KY","LA","MA","MD","ME","MI","MN","MO","MS","MT","NC","ND","NE","NH","NJ","NM","NV","NY","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VA","VT","WA","WI","WV","WY"),
  labels = c("Alaska","Alabama","Arkansas","Arizona","California","Colorado","Connecticut","District of Columbia","Delaware","Florida","Georgia","Hawaii","Iowa","Idaho","Illinois","Indiana","Kansas","Kentucky","Louisiana","Massachusetts","Maryland","Maine","Michigan","Minnesota","Missouri","Mississippi","Montana","North Carolina","North Dakota","Nebraska","New Hampshire","New Jersey","New Mexico","Nevada","New York","Ohio","Oklahoma","Oregon","Pennsylvania","Rhode Island","South Carolina","South Dakota","Tennessee","Texas","Utah","Virginia","Vermont","Washington","Wisconsin","West Virginia","Wyoming"))


# Converting dataset to long format using tidyr package
long_df <- US_debt %>% gather(debt_type , value , 2:6)
long_df$debt_type <- as.factor(long_df$debt_type)

# Creating Visualisation

p1<- ggplot(long_df, aes(x = reorder(State,`Total Debt per Capita`), y = value, fill = factor(debt_type)))
# Draw the stacked bars once: geom_col() is equivalent to geom_bar(stat = "identity"), so only one layer is needed
p1 <- p1 + coord_flip() + geom_col(width = 0.8, position = "stack") +
  geom_text(aes(label = paste0("$",`Total Debt per Capita`)), size = 2.5, hjust = 0.5, vjust = 0.5, position = position_fill(vjust = 0)) +
  labs(title = "Breakdown of US States' Debt per Capita", y = "Amount of Debt (in US Dollars)", x = "State", subtitle = "Total Debt Balance in Q4, 2019", fill = "Debt Type") +
  scale_y_continuous(breaks =seq(0, 90000, by = 3000) )+ 
  scale_fill_manual(values = c("#EEC900", "#00008B", "#e31a1c", "#2E8B57", "#FF8C00")) +
  theme(
  axis.text.x=element_text(angle=45, hjust = 1),
  axis.text.y=element_text( hjust = 1, vjust = 0.5),
  legend.title = element_text(size = 11, face = "bold"),
  legend.key.width = unit(0.7, "cm"),
  legend.key.height = unit(0.7, "cm"),
  legend.position="right",
  legend.direction="vertical",
  legend.background = element_rect(size=0.7, linetype="solid",colour ="black"),
  axis.title.x = element_text(size = 14, face = "bold"),
  axis.title.y = element_text(size = 14, face = "bold"),
  plot.title = element_text(size = 20,face = 'bold', hjust = 0.53) ,
  plot.subtitle  = element_text(size = 18,face = 'bold', hjust = 0.53) 
  )
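
The two data preparation steps above (expanding the state abbreviations and reshaping to long format) can also be sketched more compactly. The example below is an alternative, not the method used for the reconstruction. It builds the abbreviation lookup from R's built-in state.abb and state.name vectors, which cover only the 50 states, so the District of Columbia is added by hand. It then reshapes with tidyr's pivot_longer(), the newer replacement for gather(). The toy data frame, its column names and its values are hypothetical and are not taken from Final_dataset.xlsx.

```r
library(tidyr)

# Hypothetical toy data: two debt columns with made-up values
toy <- data.frame(
  State = c("CA", "DC", "TX"),
  Auto = c(10, 20, 30),
  Mortgage = c(100, 200, 300),
  stringsAsFactors = FALSE
)

# Named lookup vector: abbreviation -> full name.
# state.abb / state.name ship with base R and hold the 50 states only,
# so the District of Columbia is appended manually.
state_lookup <- c(setNames(state.name, state.abb), DC = "District of Columbia")

# Replace each abbreviation with its full name
toy$State <- unname(state_lookup[toy$State])

# Reshape from wide to long: one row per state and debt type
toy_long <- pivot_longer(
  toy,
  cols = -State,
  names_to = "debt_type",
  values_to = "value"
)

print(toy_long)
```

Building the lookup from the built-in vectors avoids hand-typing 51 labels, which is where slips such as a missing space in a state name tend to creep in.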

Data Reference

Reconstruction

The following plot fixes the main issues in the original.