With the growth of population, networks and business, the amount of stored data has increased dramatically. Cloud computing promises to assist society with managing and re-deriving that stored data. Recently, cloud computing has been changing the economy and transforming the work environment of millions of people and many private companies. The objective of the original data visualisation is to show the top 25 private cloud companies in the world in 2019 and to observe how their ranking relates to the number of employees at each company. The target audience is general readers who are interested in cloud computing and cloud computing companies.
The visualisation chosen had the following three main issues:
Visual Bombardment- The data visualisation uses too many images, such as company logos, and the numerical values are displayed inside a cloud shape. Some of the company logos and names are hard to read because all of the information is squeezed into the cloud shape. This confuses the audience and obscures the purpose of the visualisation. In addition, the bar at the bottom of the graph, which indicates the rank of each company, is an unnecessary visual element that can distract the audience.
Misleading visualisation- According to the dataset, the company ‘UiPath’ has the highest funding amount as well as the highest number of employees. However, the visualisation is misleading because it shows the company ‘Stripe’ at rank 1. Also, one of the companies has no numerical value displayed yet still appears in the top 25 ranking.
Poor choice of colour scale- The visualisation uses a colour gradient to show the number of employees in each company. However, the colours are not easy to tell apart, so it is hard to differentiate between the employee ranges.
Reference
Howmuch.net. (2019). Retrieved April 25, 2022, from https://howmuch.net/articles/best-private-cloud-companies-2019
The following code was used to fix the issues identified in the original.
# Load the required libraries
library(tidyr)
library(dplyr)
library(ggplot2)
# Importing dataset
Data <- read.csv("top25company.csv")
head(Data)
## Company Industry Funding....M.
## 1 Stripe Payment processing 785
## 2 Snowflake Cloud data warehouse 920
## 3 UiPath Robotic process automation 1100
## 4 HashiCorp Cloud infrastructure automation 174
## 5 Datadog Data monitoring and analytics 148
## 6 Procore Construction management 250
## Headquarters Employees CEO
## 1 San Francisco, California 2000 Patrick Collison
## 2 San Mateo, California 1400 Frank Slootman
## 3 New York, New York 3200 Daniel Dines
## 4 San Francisco, California 600 David McJannet
## 5 New York, New York 1200 Olivier Pomel
## 6 Carpinteria, California 1800 Tooey Courtemanche
# Renaming the Funding column (read.csv replaced the special characters in its header with dots)
Data <- rename(Data, Funding = "Funding....M.")
head(Data)
## Company Industry Funding Headquarters
## 1 Stripe Payment processing 785 San Francisco, California
## 2 Snowflake Cloud data warehouse 920 San Mateo, California
## 3 UiPath Robotic process automation 1100 New York, New York
## 4 HashiCorp Cloud infrastructure automation 174 San Francisco, California
## 5 Datadog Data monitoring and analytics 148 New York, New York
## 6 Procore Construction management 250 Carpinteria, California
## Employees CEO
## 1 2000 Patrick Collison
## 2 1400 Frank Slootman
## 3 3200 Daniel Dines
## 4 600 David McJannet
## 5 1200 Olivier Pomel
## 6 1800 Tooey Courtemanche
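The column name `Funding....M.` arises because `read.csv()` by default passes headers through `make.names()`, which replaces every character that is not valid in an R name with a dot. The sketch below reproduces that behaviour. The header text `Funding ($ M)` is only a guess, since the CSV file itself is not shown; it was chosen because it yields the same mangled name.

```r
# Hypothetical header text: the real header in top25company.csv is not shown.
raw_header <- "Funding ($ M)"

# read.csv() applies make.names() to headers when check.names = TRUE (the default).
mangled <- make.names(raw_header)
print(mangled)

# Base R alternative to dplyr::rename(): match the old name and overwrite it.
demo <- data.frame(Company = "Stripe", Funding....M. = 785)
names(demo)[names(demo) == "Funding....M."] <- "Funding"
print(names(demo))
```

Passing `check.names = FALSE` to `read.csv()` would keep the original header text instead, at the cost of needing backticks to refer to the column.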
# Selecting required columns from the dataset
df <- Data %>% select(Company, Funding, Employees)
# Sorting by Funding in descending order
topcompanydf <- df %>% arrange(desc(Funding))
topcompanydf
## Company Funding Employees
## 1 UiPath 1100 3200
## 2 Snowflake 920 1400
## 3 Tanium 800 1200
## 4 Stripe 785 2000
## 5 Rubrik 553 1600
## 6 Gusto 516 1000
## 7 Databricks 499 800
## 8 Toast 496 2000
## 9 TripActions 480 800
## 10 Cloudflare 404 1000
## 11 InVision 350 840
## 12 Illumio 333 325
## 13 ServiceTitan 326 702
## 14 Plaid 310 390
## 15 Segment 284 440
## 16 Squarespace 279 1000
## 17 Procore 250 1800
## 18 Intercom 241 600
## 19 Confluent 206 500
## 20 Darktrace 177 1000
## 21 HashiCorp 174 600
## 22 Vlocity 163 800
## 23 nCino 150 750
## 24 Datadog 148 1200
## 25 Canva 140 650
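The sorted table above can be used to check the claim that UiPath, not Stripe, leads on both funding and headcount. The following is a minimal sketch that uses only the first five rows of that table, copied by hand; the names `top5`, `top_funding`, `top_employees` and `FundingRank` are illustrative and not part of the original code.

```r
# First five rows of the sorted table, copied by hand for illustration.
top5 <- data.frame(
  Company   = c("UiPath", "Snowflake", "Tanium", "Stripe", "Rubrik"),
  Funding   = c(1100, 920, 800, 785, 553),
  Employees = c(3200, 1400, 1200, 2000, 1600)
)

# Company with the largest funding and with the largest headcount.
top_funding   <- as.character(top5$Company[which.max(top5$Funding)])
top_employees <- as.character(top5$Company[which.max(top5$Employees)])

# Rank by funding, where 1 means most funded.
top5$FundingRank <- rank(-top5$Funding)
print(top5)
```

In this subset Stripe ranks fourth by funding, which is consistent with its position in the full sorted table.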
# Creating data frame with funding values and employee bands
companydf <- data.frame(
  Company = c("UiPath","Snowflake","Tanium","Stripe","Rubrik","Gusto","Databricks","Toast","TripActions","Cloudflare","InVision","Illumio","ServiceTitan","Plaid","Segment","Squarespace","Procore","Intercom","Confluent","Darktrace","HashiCorp","Vlocity","nCino","Datadog","Canva"),
  Funding = c(1100,920,800,785,553,516,499,496,480,404,350,333,326,310,284,279,250,241,206,177,174,163,150,148,140),
  Employees = c("3k and More","1k-1.99k","1k-1.99k","2k-2.99k","1k-1.99k","1k-1.99k","Less than 1k","2k-2.99k","Less than 1k","1k-1.99k","Less than 1k","Less than 1k","Less than 1k","Less than 1k","Less than 1k","1k-1.99k","1k-1.99k","Less than 1k","Less than 1k","1k-1.99k","Less than 1k","Less than 1k","Less than 1k","1k-1.99k","Less than 1k")
)
# Converting to factor
companydf$Company <- as.factor(companydf$Company)
companydf$Employees <- as.factor(companydf$Employees)
companydf
## Company Funding Employees
## 1 UiPath 1100 3k and More
## 2 Snowflake 920 1k-1.99k
## 3 Tanium 800 1k-1.99k
## 4 Stripe 785 2k-2.99k
## 5 Rubrik 553 1k-1.99k
## 6 Gusto 516 1k-1.99k
## 7 Databricks 499 Less than 1k
## 8 Toast 496 2k-2.99k
## 9 TripActions 480 Less than 1k
## 10 Cloudflare 404 1k-1.99k
## 11 InVision 350 Less than 1k
## 12 Illumio 333 Less than 1k
## 13 ServiceTitan 326 Less than 1k
## 14 Plaid 310 Less than 1k
## 15 Segment 284 Less than 1k
## 16 Squarespace 279 1k-1.99k
## 17 Procore 250 1k-1.99k
## 18 Intercom 241 Less than 1k
## 19 Confluent 206 Less than 1k
## 20 Darktrace 177 1k-1.99k
## 21 HashiCorp 174 Less than 1k
## 22 Vlocity 163 Less than 1k
## 23 nCino 150 Less than 1k
## 24 Datadog 148 1k-1.99k
## 25 Canva 140 Less than 1k
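The employee bands above were typed by hand. As an alternative sketch, they could be derived from the numeric `Employees` column with `cut()`. This is not the author's method, and the break points are inferred from the band labels. Unlike `as.factor()`, which sorts levels alphabetically and so places "Less than 1k" last, `cut()` keeps the bands in size order for the legend.

```r
# Employee counts for a few companies, taken from the sorted table earlier in the code.
employees <- c(UiPath = 3200, Stripe = 2000, Gusto = 1000, Databricks = 800)

# right = FALSE makes each bin include its lower bound, so 1000 falls in "1k-1.99k".
bands <- cut(
  employees,
  breaks = c(0, 1000, 2000, 3000, Inf),
  labels = c("Less than 1k", "1k-1.99k", "2k-2.99k", "3k and More"),
  right  = FALSE
)

print(data.frame(Company = names(employees), Employees = bands))
print(levels(bands))
```

With these breaks the derived bands match the hand-typed ones for the companies shown, including the boundary cases at exactly 1000 and 2000 employees.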
# Drawing the data visualisation using ggplot
plot <- ggplot(companydf, aes(x = reorder(Company, Funding), y = Funding, fill = Employees)) +
  # Bar lengths are taken directly from the Funding column
  geom_bar(stat = "identity", position = "dodge", width = 0.7, color = "black") +
  # Flip the axes so company names read horizontally; ylim zooms the funding axis
  coord_flip(ylim = c(130, 1150)) +
  labs(title = "The World's top 25 private cloud companies in 2019",
       caption = "Source: https://howmuch.net/articles/best-private-cloud-companies-2019",
       subtitle = "By Funding & Number of Employees",
       x = "Company name", y = "Funding Value ($M)") +
  # Label each bar with its funding value; text colour is set with colour, not fill
  geom_text(aes(label = Funding), hjust = -0.5, size = 2, colour = "black", family = "Times New Roman")
Data Reference
The following plot fixes the main issues in the original.