Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Howmuch.net 2020


Objective

The objective of this graph is to portray the 20 fastest-growing occupations in the U.S. The target audience includes the general public of the United States of America, as well as a wider audience with an interest in the country's occupational growth.

The visualization chosen had the following three main issues:

  • Visual Bombardment: The circular graph shows the median pay for each occupation in addition to its predicted growth percentage. It also contains images that only vaguely represent each occupation and distract the viewer from the main objective. The amounts and the occupation names are also not written horizontally, which makes them harder to read.

  • Lack of Visual Accuracy: The uneven lengths of the bars are misleading when presented in circular form. The percentages written on top of each bar are uneven in size, making them confusing and misleading to read. The 34% growth for home health and personal care aides appears to be much lower than the 31% growth shown for physician assistants. This is mainly due to the larger orange area allocated to the field with the lower percentage. The occupation with the highest growth is given the least importance in the visualization because of its shorter bar, whereas physician assistants and nurse practitioners are very prominent because of their longer bars. This takes the focus away from the main objective of the visualization.

  • Colour Scheme: The colour scheme of the visualization adds to the confusion. The blue gradient used to represent the growth percentage ranges is unnecessary, as the visualization only shows the top 20 occupations in the country. Writing the median pay values on top of each bar, colour coded in orange and blue, is also unnecessary for conveying the objective of the visualization. The blue used to write the pay values matches the dark blue used for the higher growth percentages, and the white text on the lighter blue shades is harder to read.

Reference

Code

The following code was used to fix the issues identified in the original.

library(rmarkdown)
library(dplyr)
library(ggplot2)
library(magrittr)
library(knitr)
library(tidyr)

## Import data from CSV
setwd('C:/Users/oashw/OneDrive/Desktop/New folder/UNI/Sem 5/Data Vis/Assignment 2/assignment2template')
data <- read.csv("Fastest_Growing_Occupations_csv.csv", header=TRUE)

## Rename the columns
names(data)[1] <- 'Occupation'
names(data)[2] <- 'MedianPay2019'
names(data)[3] <- 'GrowthRate'
names(data)[4] <- 'MedianPay2029'

## Convert Occupation to a factor (done after renaming so the column is guaranteed to exist)
data$Occupation <- factor(data$Occupation)

## Order the occupations by growth rate
data$Occupation <- factor(data$Occupation,levels = data$Occupation[order(data$GrowthRate)])

##convert to numeric value
data$MedianPay2029 <- as.numeric(as.character(data$MedianPay2029))
data$MedianPay2019 <- as.numeric(as.character(data$MedianPay2019))

##Gather Median Pay
data <- data %>% gather(`MedianPay2029`,`MedianPay2019`,key='MedianPay',value='Value')

## Plot the graph
p1 <- ggplot(data, aes(fill = MedianPay, y = Value, x = Occupation)) +
  geom_bar(width = 0.7, stat = "identity", position = "dodge") +
  scale_fill_manual(values = c("#1338be", "#f05e16")) +
  coord_flip(ylim = c(0, 180000)) +
  labs(title = "Top 20 Occupations in the U.S. Predicted to have the Highest Growth Rate by 2029",
       x = "Occupation", y = "Median Pay in $", fill = "Median Pay") +
  ## Dodge width matches the bar width so each pay label lines up with its bar
  geom_text(aes(label = paste0(" $", " ", Value), y = Value),
            position = position_dodge(width = 0.7), hjust = 0.01, size = 6, color = "black") +
  theme_minimal() +
  theme(text = element_text(size = 20),
        axis.text.x = element_text(angle = 90),
        legend.position = "right")

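## Growth rate label at the base of each occupation's bars
## (geom_text has no fill aesthetic, so it is dropped; distinct() keeps one row
## per occupation so each label is drawn once rather than once per pay column)
p1 <- p1 +
  geom_text(data = distinct(data, Occupation, GrowthRate),
            aes(x = Occupation, y = 0, label = GrowthRate),
            inherit.aes = FALSE, hjust = "right", family = "Georgia", size = 5.5)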
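
The ordering and reshaping steps above can be hard to picture without the data. The sketch below repeats them on a small made-up data frame. The `toy` rows and their values are hypothetical and are not taken from the assignment's CSV. It also uses `pivot_longer()`, tidyr's newer alternative to `gather()`, rather than the call used in the code above.

```r
library(tidyr)

# Hypothetical toy rows; the values are invented for illustration only
toy <- data.frame(
  Occupation    = c("A", "B", "C"),
  MedianPay2019 = c(100, 200, 300),
  GrowthRate    = c(10, 30, 20),
  MedianPay2029 = c(110, 260, 360)
)

# Order the factor levels by growth rate; after coord_flip() the last level
# (the highest growth rate) is drawn at the top of the chart
toy$Occupation <- factor(toy$Occupation,
                         levels = toy$Occupation[order(toy$GrowthRate)])

# Wide to long: one row per occupation and pay column, so that the pay year
# can be mapped to the fill colour of a dodged bar chart
toy_long <- pivot_longer(toy,
                         cols = c(MedianPay2019, MedianPay2029),
                         names_to = "MedianPay",
                         values_to = "Value")

print(levels(toy$Occupation))
print(toy_long)
```

After reshaping, each occupation appears twice, once per pay column. This is why any per-occupation label, such as the growth rate, should be drawn from one row per occupation.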

Data Reference

Reconstruction

The following plot fixes the main issues in the original.