Assignment 2

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original

Source: ScarletteOScare (2022).

Objective

The objective of this visualization is to show the income tax rate, as a percentage, for each state in the United States.

Target Audience

The target audience is US residents who pay income tax there, as well as people who are considering moving to the US.

The visualization chosen had the following three main issues:

Visualization layout - the layout makes it hard to identify which states have the highest and lowest income tax percentages. Readers may need a long time to work out which color band a state belongs to, because each state's color has to be compared against the color scale one by one.
Color scale - a sequential color scheme is better suited to ordinal data, whereas the income tax percentage is quantitative. Readers can only get a rough idea of the range a state falls into because some of the colors are too similar to one another.
Data integrity - the legend does not explain what the yellow color means or what the values on the scale represent, and no unit is given (e.g. percentage or US dollars?). This could confuse or mislead readers, who might take away wrong information from the visualization.

Reference

ScarletteOScare (2022). If you’re colorblind in the U.S you pay the same tax as everyone else, according to this graph, [Reddit]. Available at: https://www.reddit.com/r/dataisugly/comments/wxw7ks/if_youre_colorblind_in_the_us_you_pay_the_same/?utm_source=share&utm_medium=ios_app&utm_name=iossmf [29 August 2022].

Code

The following code was used to fix the issues identified in the original.

library("rjson")
library("dplyr")
library("ggplot2")
library("shadowtext")

# Read the JSON file and convert it to a data frame
myData <- fromJSON(file = "newdata.json")
data <- as.data.frame(myData)

# Keep only the columns needed for the plot
new <- select(data, c("State", "IncomeTax"))
colnames(new) <- c("state", "IncomeTax")

# Order the state factor by income tax so the bars are drawn in sorted order
new$state <- factor(new$state,
                    levels = new$state[order(new$IncomeTax, decreasing = FALSE)])

p2 <- ggplot(data = new, aes(x = IncomeTax, y = state, fill = IncomeTax))

# geom_col() draws one bar per state whose length is the value itself
p2 <- p2 + geom_col() +
  theme(
    axis.text = element_text(size = 13),
    axis.title = element_text(size = 25),
    plot.title = element_text(size = 30),
    panel.grid.major.x = element_line(color = "gray10", size = 0.5, linetype = 1),
    panel.grid.minor.x = element_line(color = "gray20", size = 0.5, linetype = 2)) +
  # The bars are colored through the fill aesthetic, so the fill scale is the one to set
  scale_fill_gradient(low = "blue", high = "red") +
  geom_shadowtext(aes(label = paste(IncomeTax, "%")), color = "white", size = 5,
                  fontface = "bold", vjust = 0.4, nudge_x = .5) +
  labs(
    title = "Income Tax for Each State in US 2022",
    y = "States",
    x = "Income Tax %") +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 15), breaks = seq(0, 15, 2))
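
The factor() step in the code above is what puts the bars in sorted order, which addresses the layout issue. As a minimal sketch of the idea, the example below uses made-up values for three placeholder states (the data frame toy and its values are assumptions for illustration, not the report's data) and needs only base R.

```r
# Made-up values for three placeholder states (not the report's data)
toy <- data.frame(state = c("A", "B", "C"),
                  IncomeTax = c(5.0, 9.9, 2.5),
                  stringsAsFactors = FALSE)

# order() returns the row positions from lowest to highest tax,
# so the factor levels end up listed in that same order
toy$state <- factor(toy$state, levels = toy$state[order(toy$IncomeTax)])

levels(toy$state)
# "C" "A" "B"
```

ggplot2 places the first level of a discrete y axis at the bottom, so with this ordering the state with the highest value should appear at the top of the chart.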

Note that the source data is in JSON format; I modified its structure so that it could be loaded with the rjson library.
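
The exact layout of newdata.json is not shown in this report. As a hedged sketch, one structure that converts cleanly with as.data.frame() is a single JSON object with one equal-length array per column. The JSON text and its values below are hypothetical placeholders, not the real data.

```r
library("rjson")

# Hypothetical two-row excerpt: one key per column, one array entry per state
json_text <- '{"State": ["State A", "State B"], "IncomeTax": [4.5, 9.9]}'

# fromJSON() returns a named list of vectors, which maps directly onto columns
parsed <- fromJSON(json_str = json_text)
example <- as.data.frame(parsed)

str(example)
```

A file in which each state is its own nested object would not map onto columns this directly, which may be why some restructuring was needed before loading.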

I have uploaded the Python preprocessing code to GitHub: https://github.com/LukeHii97/Visualization-Assignment-2--Preporcessing-json-data/tree/main

Data Reference

World Population Review. (2022). States with Lowest Taxes 2022. Retrieved September 2, 2022, from: https://worldpopulationreview.com/state-rankings/states-with-lowest-taxes

Reconstruction

The following plot fixes the main issues in the original.

Deconstruct, Reconstruct Web Report

HII LU TECK (s3939509)
