Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://howmuch.net/articles/international-trade-as-a-share-of-state-GDP (Raul Amoros, 27 March 2018)


Introduction

State GDP and trade shares matter because they indicate the size of a state's economy and how it is performing. Because the growth rate of real GDP is used as an indicator of the general health of an economy, researchers use it when analysing international trade for each US state. Markets are down as political rhetoric continues to escalate around everything from tariffs to trade deficits. With that attention on international trade, this complex and problematic multivariate data visualisation sets out to show which state economies depend most on imports and exports, using two parameters for the year 2017: trade share of GDP and GDP in billions of dollars.

Objective

The Bureau of Economic Analysis keeps track of GDP figures at the state level, the US Census Bureau tabulates trade figures for each state for each year, and the American Enterprise Institute synthesised this information, which the data visualisation presents. The main objective of the visualisation is to show which state economies depend most on exports and imports. The original is a circular bar chart in which each blue area denotes a state's total GDP in billions of dollars. Because states vary significantly in size, a colour-coded red area highlights the percentage of each state's economy that depends on trade. These highlighted values are meant to make it easy to find the states where international trade makes up the greatest percentage of the local economy. In other words, the main goal is to highlight how important trade is to the overall economy of both large and small states.
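
The relationship between the two figures can be checked with a short calculation. The sketch below assumes that the trade share is total trade (exports plus imports) divided by state GDP, which the source does not state explicitly. The figures are those of the six largest states in the dataset used for this report, and the column names are illustrative.

```r
# Figures for the six largest states, taken from the dataset used in this report
states <- data.frame(
  state          = c("CALIFORNIA", "TEXAS", "NEW YORK", "FLORIDA", "ILLINOIS", "PENNSYLVANIA"),
  gdp_bn         = c(2734, 1692, 1550, 971, 818, 746),    # GDP, $ billions
  trade_bn       = c(613, 527, 202, 130, 201, 122),       # exports + imports, $ billions
  reported_share = c(22.4, 31.2, 13.0, 13.4, 24.6, 16.3)  # trade share of GDP, %
)

# Assumed definition: trade share (%) = 100 * (exports + imports) / GDP
states$computed_share <- 100 * states$trade_bn / states$gdp_bn

# Compare computed and reported shares side by side
print(states[, c("state", "reported_share", "computed_share")])
```

The computed values agree with the reported ones to within about 0.1 percentage points. The small gaps are what one would expect if the reported shares were calculated from unrounded inputs.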

Targeted Audience

The graph tries to capture the attention of:

*Consumers in the US, who get access to the best products in the world at the lowest price and greatest value.

*Global market finders and traders, such as US exporters and importers.

*Researchers and economists who want to study trade shares of GDP and GDP values to ascertain whether the domestic economy is in trade surplus or trade deficit, an important indicator of the country's level of economic growth. Also economics students and individuals who want to explore the American economy.

*US trade statisticians who use the Census APIs, and the Government, which focuses on GDP values when deciding whether to continue importing or exporting a particular commodity from a state and when making further policy, such as setting trade duties and tariffs. The Government also needs this information to study the country's balance of trade, which directly affects currency value and could in turn impact exchange rates, GDP and inflation rates.

*Global traders, data enthusiasts, foreign investors and stock market employees.

Issues

The visualisation chosen had the following three main issues:

  • Information Overload and Visual Complexity: The spiral shape draws the observer's eye but delivers a burst of information all at once. It is hard to concentrate on the objective and hard to compare GDP with trade share. A regular bar plot would make that comparison much easier. Viewers can be misled because, in plots like this, they try to read values from angles and partitioned areas. Too many variables and picture icons in a single visualisation increase its complexity and the time needed to comprehend it. Meanwhile the most important information, the state names, is not shown in full: states are denoted by abbreviations, which is a burden for people unfamiliar with them, and no reference is included to resolve them. For example, a viewer confused by “CA” has no way to find out that it stands for California. Likewise, a viewer searching for the trade share value 13.2% has to strain their eyes, because it sits at the end of a circular bar where it is not eye-catching and is difficult to find.

  • Deceptive Issues: The visualisation is sorted by one variable, GDP, so the viewer cannot easily find the 2017 trade share of GDP, which is printed at the edge of the spiral even though it is as important a variable as GDP. This makes comparison haphazard and difficult. For example, when comparing the top five states where international trade makes up the greatest percentage of the local economy, the viewer first notices Texas, with a trade share of 31.2%, because of its size and its eye-catching position near the top of the spiral, and may take it for the highest trade share. According to the data, however, the highest value belongs to Michigan, with a trade share of 38.9%. Effects like this leave the analyst misinformed.

  • Colour Issues: Although the two colours used are strongly contrasting and eye-catching, the four gradient shades used for the trade share of GDP are confusing, and the contrast may be disturbing to the eye. Towards the end of the spiral, where GDP decreases, it is hard to see the difference between shades and to read the values denoted. For example, Montana, with a trade share of 12.4%, and South Dakota, with a trade share of 5.1%, are not clearly distinguishable, which is especially difficult for a person with a colour vision deficiency.
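
The ranking problem described under Deceptive Issues can be illustrated with a few lines of base R. This is a minimal sketch using only the trade shares quoted in this report, so it covers a subset of states and is not the full ranking; the variable names are illustrative.

```r
# Trade shares (%) quoted in this report, listed in the original GDP order
trade_share <- c(CALIFORNIA = 22.4, TEXAS = 31.2, "NEW YORK" = 13.0,
                 FLORIDA = 13.4, ILLINOIS = 24.6, PENNSYLVANIA = 16.3,
                 MICHIGAN = 38.9, "SOUTH DAKOTA" = 5.1, MONTANA = 12.4)

# Re-rank by trade share instead of GDP, highest first
ranked <- sort(trade_share, decreasing = TRUE)

# The three highest trade shares within this subset
print(head(ranked, 3))
```

Within this subset, sorting by trade share moves Michigan ahead of Texas, which the GDP ordering of the original chart obscures.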

Reference

Code

The following code was used to fix the issues identified in the original.

#Packages required (install once)
install.packages("dplyr") # package to use %>% pipe operator
install.packages("readr") # package to import dataset
install.packages("tidyr") # package to tidy data
install.packages("tidyverse") # collection of data science packages (includes dplyr, readr, tidyr and ggplot2)
install.packages("hablar") # package used to deal with data types
install.packages("ggplot2") # package to plot data visualisation
install.packages("readxl") # package to read Excel files
#Libraries
library(tidyverse) # loads the core tidyverse packages
library(hablar) # useful to work with data types
library(dplyr) # useful to use %>% pipe operator
library(readr) # useful to import dataset
library(tidyr) # useful to tidy data
library(ggplot2) # useful to plot data visualisation
library(readxl) # useful to read Excel files

#Importing the file and assigning it to 'df'

df <- read_excel("C:/Users/Geena George/Desktop/Sem2/DataVisualisation/International Trade as a Share of State GDP (2017).xlsx")

#We can inspect the first few records to get more information about the data
head(df)
## # A tibble: 6 x 4
##   `States, Ranked by~ `GDP($ billions),~ `Exports + Imports ~ `Trade Shares of ~
##   <chr>                            <dbl>                <dbl>              <dbl>
## 1 CALIFORNIA                        2734                  613               22.4
## 2 TEXAS                             1692                  527               31.2
## 3 NEW YORK                          1550                  202               13  
## 4 FLORIDA                            971                  130               13.4
## 5 ILLINOIS                           818                  201               24.6
## 6 PENNSYLVANIA                       746                  122               16.3
#Data processing/Tidying Steps

#Renaming column names for ease of data plotting
colnames(df)[1] <- 'States'  #first column

colnames(df)[2] <- 'StateGDP_2017' #second column

colnames(df)[3] <- 'Exports_and_Imports' #third column

colnames(df)[4] <- 'Trade_Shares_2017' #fourth column

#Checking the renamed column names by displaying first six rows
head(df)
## # A tibble: 6 x 4
##   States       StateGDP_2017 Exports_and_Imports Trade_Shares_2017
##   <chr>                <dbl>               <dbl>             <dbl>
## 1 CALIFORNIA            2734                 613              22.4
## 2 TEXAS                 1692                 527              31.2
## 3 NEW YORK              1550                 202              13  
## 4 FLORIDA                971                 130              13.4
## 5 ILLINOIS               818                 201              24.6
## 6 PENNSYLVANIA           746                 122              16.3
#Converting States to a factor whose level order follows the GDP ranking
state_levels <- c("CALIFORNIA", "TEXAS", "NEW YORK", "FLORIDA", "ILLINOIS",
                  "PENNSYLVANIA", "OHIO", "NEW JERSEY", "GEORGIA", "NORTH CAROLINA",
                  "MASSACHUSETTS", "MICHIGAN", "VIRGINIA", "WASHINGTON", "MARYLAND",
                  "INDIANA", "MINNESOTA", "TENNESSEE", "COLORADO", "WISCONSIN",
                  "ARIZONA", "MISSOURI", "CONNECTICUT", "LOUISIANA", "OREGON",
                  "SOUTH CAROLINA", "ALABAMA", "KENTUCKY", "OKLAHOMA", "IOWA",
                  "UTAH", "NEVADA", "KANSAS", "DISTRICT OF COLUMBIA", "ARKANSAS",
                  "NEBRASKA", "MISSISSIPPI", "NEW MEXICO", "HAWAII", "NEW HAMPSHIRE",
                  "WEST VIRGINIA", "DELAWARE", "IDAHO", "MAINE", "RHODE ISLAND",
                  "NORTH DAKOTA", "ALASKA", "SOUTH DAKOTA", "MONTANA", "WYOMING",
                  "VERMONT")

#The labels are the same as the levels, so only the levels are needed
df$States <- factor(df$States, levels = state_levels)

#Assigning the State GDP values ($ billions, in ranked order) to the column as a numeric vector
df$StateGDP_2017 = c(2734,1692,1550,971,818,746,651,589,555,543,527,515,511,503,396,360,354,345,341,324,320,307,262,243,238,215,204,190,189,166,153,153,132,127,119,112,98,88,80,77,75,72,61,59,55,52,49,48,41,32)

#Gathering the two measure columns into key-value pairs (long format)
Plot <- gather(df, key="measure", value="value", c("StateGDP_2017", "Trade_Shares_2017"))
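
To make the reshaping step concrete, the sketch below applies the same gather() call to the first three rows of the dataset, using a small hand-built data frame rather than the Excel file. It also shows pivot_longer(), which the tidyr documentation recommends in place of the superseded gather(). The object names wide, long_gather and long_pivot are illustrative.

```r
library(tidyr)

# First three rows of the dataset, as printed by head(df) above
wide <- data.frame(
  States            = c("CALIFORNIA", "TEXAS", "NEW YORK"),
  StateGDP_2017     = c(2734, 1692, 1550),
  Trade_Shares_2017 = c(22.4, 31.2, 13.0)
)

# Same call shape as in the report: stack the two measure columns
long_gather <- gather(wide, key = "measure", value = "value",
                      c("StateGDP_2017", "Trade_Shares_2017"))

# Equivalent reshape with pivot_longer(); only the row order differs
long_pivot <- pivot_longer(wide,
                           cols = c(StateGDP_2017, Trade_Shares_2017),
                           names_to = "measure", values_to = "value")

print(long_gather)
```

Each state now appears once per measure, which is the shape needed to draw one facet per measure.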

#Faceting State GDP and trade shares of 2017, with the values as labels
Graph <- ggplot(Plot, aes(y = States, x = value)) +
  geom_bar(stat = 'identity', fill = "light blue") +
  facet_wrap(. ~ measure, scales = "free_x") +
  geom_text(aes(label = value), size = 3, color = 'black') +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 10, face = "bold"),
        axis.title.y = element_blank(),
        axis.text.y = element_text(size = 7, face = "bold"),
        legend.position = "top",
        plot.title = element_text(color = "black", size = 10, face = "bold"),
        strip.text = element_text(size = 10, face = "bold"))

Data Reference

Reconstruction

The following plot fixes the main issues in the original.