Principles of Data Visualization and Introduction to ggplot2
I have provided you with data about the 5,000 fastest-growing companies in the US, as compiled by Inc. magazine. Let's read this in:
library(ggplot2)
library(scales)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ tibble 2.1.3 ✔ purrr 0.3.2
## ✔ tidyr 1.0.0 ✔ dplyr 0.8.3
## ✔ readr 1.3.1 ✔ stringr 1.4.0
## ✔ tibble 2.1.3 ✔ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ readr::col_factor() masks scales::col_factor()
## ✖ purrr::discard() masks scales::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header= TRUE)
And let's preview this data:
head(inc)
## Rank Name Growth_Rate Revenue
## 1 1 Fuhu 421.48 1.179e+08
## 2 2 FederalConference.com 248.31 4.960e+07
## 3 3 The HCI Group 245.45 2.550e+07
## 4 4 Bridger 233.08 1.900e+09
## 5 5 DataXu 213.37 8.700e+07
## 6 6 MileStone Community Builders 179.38 4.570e+07
## Industry Employees City State
## 1 Consumer Products & Services 104 El Segundo CA
## 2 Government Services 51 Dumfries VA
## 3 Health 132 Jacksonville FL
## 4 Energy 50 Addison TX
## 5 Advertising & Marketing 220 Boston MA
## 6 Real Estate 63 Austin TX
summary(inc)
## Rank Name Growth_Rate
## Min. : 1 (Add)ventures : 1 Min. : 0.340
## 1st Qu.:1252 @Properties : 1 1st Qu.: 0.770
## Median :2502 1-Stop Translation USA: 1 Median : 1.420
## Mean :2502 110 Consulting : 1 Mean : 4.612
## 3rd Qu.:3751 11thStreetCoffee.com : 1 3rd Qu.: 3.290
## Max. :5000 123 Exteriors : 1 Max. :421.480
## (Other) :4995
## Revenue Industry Employees
## Min. :2.000e+06 IT Services : 733 Min. : 1.0
## 1st Qu.:5.100e+06 Business Products & Services: 482 1st Qu.: 25.0
## Median :1.090e+07 Advertising & Marketing : 471 Median : 53.0
## Mean :4.822e+07 Health : 355 Mean : 232.7
## 3rd Qu.:2.860e+07 Software : 342 3rd Qu.: 132.0
## Max. :1.010e+10 Financial Services : 260 Max. :66803.0
## (Other) :2358 NA's :12
## City State
## New York : 160 CA : 701
## Chicago : 90 TX : 387
## Austin : 88 NY : 311
## Houston : 76 VA : 283
## San Francisco: 75 FL : 282
## Atlanta : 74 IL : 273
## (Other) :4438 (Other):2764
Think a bit about what these summaries mean. Use the space below to add some more relevant non-visual exploratory information that you think helps you understand this data:
# Statistical summary by Industry
by(inc, inc$Industry, summary)
## inc$Industry: Advertising & Marketing
## Rank Name Growth_Rate
## Min. : 5 (Add)ventures : 1 Min. : 0.350
## 1st Qu.:1058 2020 Exhibits : 1 1st Qu.: 0.820
## Median :2276 206inc : 1 Median : 1.610
## Mean :2359 29 Prime : 1 Mean : 6.225
## 3rd Qu.:3643 33Across : 1 3rd Qu.: 4.010
## Max. :4972 352 Media Group: 1 Max. :213.370
## (Other) :465
## Revenue Industry Employees
## Min. : 2000000 Advertising & Marketing :471 Min. : 2.00
## 1st Qu.: 4100000 Business Products & Services: 0 1st Qu.: 22.00
## Median : 7900000 Computer Hardware : 0 Median : 37.00
## Mean : 16528662 Construction : 0 Mean : 84.36
## 3rd Qu.: 14950000 Consumer Products & Services: 0 3rd Qu.: 74.00
## Max. :356800000 Education : 0 Max. :5637.00
## (Other) : 0
## City State
## New York : 43 CA : 91
## San Francisco: 18 NY : 57
## Chicago : 17 FL : 31
## Austin : 12 IL : 28
## Los Angeles : 10 TX : 24
## Seattle : 10 MA : 19
## (Other) :361 (Other):221
## --------------------------------------------------------
## inc$Industry: Business Products & Services
## Rank Name Growth_Rate
## Min. : 24 1-Stop Translation USA : 1 Min. : 0.340
## 1st Qu.:1718 3Pillar Global : 1 1st Qu.: 0.615
## Median :3066 4Wall Lighting : 1 Median : 1.095
## Mean :2878 Access Information Management: 1 Mean : 3.518
## 3rd Qu.:4189 Acclaris : 1 3rd Qu.: 2.257
## Max. :4996 Acquirent : 1 Max. :85.850
## (Other) :476
## Revenue Industry Employees
## Min. :2.000e+06 Business Products & Services:482 Min. : 4.0
## 1st Qu.:4.700e+06 Advertising & Marketing : 0 1st Qu.: 24.0
## Median :9.850e+06 Computer Hardware : 0 Median : 49.0
## Mean :5.471e+07 Construction : 0 Mean : 244.5
## 3rd Qu.:2.540e+07 Consumer Products & Services: 0 3rd Qu.: 127.0
## Max. :2.400e+09 Education : 0 Max. :32000.0
## (Other) : 0 NA's :2
## City State
## New York : 14 CA : 69
## San Diego : 10 TX : 40
## Bellevue : 9 WA : 29
## Chicago : 9 FL : 27
## Seattle : 9 NY : 26
## Washington: 9 IL : 25
## (Other) :422 (Other):266
## --------------------------------------------------------
## inc$Industry: Computer Hardware
## Rank Name Growth_Rate
## Min. : 16 3D-P : 1 Min. : 0.3500
## 1st Qu.:2128 Advanced Assembly : 1 1st Qu.: 0.6075
## Median :3154 Ambir Technology : 1 Median : 1.0450
## Mean :3008 Amped Wireless : 1 Mean : 4.0898
## 3rd Qu.:4195 Apposite Technologies: 1 3rd Qu.: 1.7450
## Max. :4991 Arteris : 1 Max. :110.6800
## (Other) :38
## Revenue Industry Employees
## Min. :3.800e+06 Computer Hardware :44 Min. : 6.00
## 1st Qu.:1.295e+07 Advertising & Marketing : 0 1st Qu.: 27.00
## Median :2.235e+07 Business Products & Services: 0 Median : 45.50
## Mean :2.701e+08 Construction : 0 Mean : 220.77
## 3rd Qu.:4.878e+07 Consumer Products & Services: 0 3rd Qu.: 85.25
## Max. :1.010e+10 Education : 0 Max. :6800.00
## (Other) : 0
## City State
## Atlanta : 2 CA :17
## Fremont : 2 IL : 5
## Grand Rapids: 2 CO : 3
## Los Angeles : 2 GA : 3
## San Diego : 2 VA : 3
## Aurora : 1 MI : 2
## (Other) :33 (Other):11
## --------------------------------------------------------
## inc$Industry: Construction
## Rank Name Growth_Rate
## Min. : 50 123 Exteriors : 1 Min. : 0.350
## 1st Qu.:1642 4Corners Homes : 1 1st Qu.: 0.730
## Median :2701 ABC Supply : 1 Median : 1.300
## Mean :2642 Absolute Concrete Construction: 1 Mean : 3.367
## 3rd Qu.:3861 Air Force One : 1 3rd Qu.: 2.380
## Max. :4978 Air Genie Air Conditioning : 1 Max. :53.280
## (Other) :181
## Revenue Industry Employees
## Min. :2.100e+06 Construction :187 Min. : 3.0
## 1st Qu.:7.000e+06 Advertising & Marketing : 0 1st Qu.: 24.0
## Median :1.400e+07 Business Products & Services: 0 Median : 47.0
## Mean :7.045e+07 Computer Hardware : 0 Mean : 155.6
## 3rd Qu.:3.430e+07 Consumer Products & Services: 0 3rd Qu.: 113.5
## Max. :4.700e+09 Education : 0 Max. :6549.0
## (Other) : 0
## City State
## Austin : 4 FL : 18
## San Diego : 4 CA : 17
## Canton : 3 TX : 16
## Nashville : 3 IL : 10
## Oklahoma City: 3 GA : 9
## Baton Rouge : 2 NC : 9
## (Other) :168 (Other):108
## --------------------------------------------------------
## inc$Industry: Consumer Products & Services
## Rank Name Growth_Rate
## Min. : 1.0 3d Lacrosse : 1 Min. : 0.350
## 1st Qu.: 786.5 4Moms : 1 1st Qu.: 0.900
## Median :2056.0 4U2U Brands : 1 Median : 1.820
## Mean :2193.8 AAMI : 1 Mean : 8.776
## 3rd Qu.:3445.0 Abbyson Living: 1 3rd Qu.: 5.795
## Max. :4980.0 ADA Collection: 1 Max. :421.480
## (Other) :197
## Revenue Industry Employees
## Min. :2.000e+06 Consumer Products & Services:203 Min. : 1
## 1st Qu.:4.000e+06 Advertising & Marketing : 0 1st Qu.: 17
## Median :9.400e+06 Business Products & Services: 0 Median : 34
## Mean :7.368e+07 Computer Hardware : 0 Mean : 224
## 3rd Qu.:2.080e+07 Construction : 0 3rd Qu.: 79
## Max. :4.600e+09 Education : 0 Max. :13200
## (Other) : 0
## City State
## Austin : 7 CA :38
## New York : 5 TX :20
## San Francisco: 5 NY :17
## Brooklyn : 4 FL :14
## Dallas : 4 IL :10
## Denver : 4 CO : 9
## (Other) :174 (Other):95
## --------------------------------------------------------
## inc$Industry: Education
## Rank Name Growth_Rate
## Min. : 35 3 Key Elements : 1 Min. : 0.360
## 1st Qu.:1448 Academix Direct : 1 1st Qu.: 0.700
## Median :2852 Achieve3000 : 1 Median : 1.200
## Mean :2758 AfterCollege : 1 Mean : 3.643
## 3rd Qu.:3940 All-Star Driver : 1 3rd Qu.: 2.780
## Max. :4948 Appleton Learning: 1 Max. :70.630
## (Other) :77
## Revenue Industry Employees
## Min. : 2000000 Education :83 Min. : 1.00
## 1st Qu.: 3850000 Advertising & Marketing : 0 1st Qu.: 22.00
## Median : 6800000 Business Products & Services: 0 Median : 44.00
## Mean : 13726506 Computer Hardware : 0 Mean : 92.59
## 3rd Qu.: 13250000 Construction : 0 3rd Qu.:100.00
## Max. :145700000 Consumer Products & Services: 0 Max. :850.00
## (Other) : 0
## City State
## New York : 8 NY :14
## Austin : 3 CA :13
## Chicago : 3 IL : 8
## Columbia : 2 PA : 6
## San Francisco: 2 TX : 6
## Walnut Creek : 2 FL : 3
## (Other) :63 (Other):33
## --------------------------------------------------------
## inc$Industry: Energy
## Rank Name Growth_Rate
## Min. : 4 3TIER : 1 Min. : 0.350
## 1st Qu.: 627 A&R Solar : 1 1st Qu.: 0.910
## Median :1843 ADI Energy : 1 Median : 2.080
## Mean :2038 Advanced BioEnergy: 1 Mean : 9.603
## 3rd Qu.:3407 AED Group : 1 3rd Qu.: 7.320
## Max. :4967 AK Environmental : 1 Max. :233.080
## (Other) :103
## Revenue Industry Employees
## Min. :2.100e+06 Energy :109 Min. : 2.0
## 1st Qu.:8.600e+06 Advertising & Marketing : 0 1st Qu.: 25.0
## Median :2.940e+07 Business Products & Services: 0 Median : 70.0
## Mean :1.263e+08 Computer Hardware : 0 Mean : 242.5
## 3rd Qu.:1.081e+08 Construction : 0 3rd Qu.: 245.0
## Max. :1.900e+09 Consumer Products & Services: 0 Max. :2501.0
## (Other) : 0
## City State
## Houston :15 TX :29
## Dallas : 3 CA :16
## Auburn : 2 MA : 6
## Bakersfield: 2 MN : 5
## Columbus : 2 NY : 5
## Minneapolis: 2 OH : 5
## (Other) :83 (Other):43
## --------------------------------------------------------
## inc$Industry: Engineering
## Rank Name Growth_Rate
## Min. : 533 Aaski Technology : 1 Min. :0.360
## 1st Qu.:1587 ACAI Associates : 1 1st Qu.:0.700
## Median :2606 Accutek Testing Laboratory: 1 Median :1.350
## Mean :2733 AE Works : 1 Mean :1.984
## 3rd Qu.:3940 Airetel Staffing : 1 3rd Qu.:2.500
## Max. :4935 Andromeda Systems : 1 Max. :8.540
## (Other) :68
## Revenue Industry Employees
## Min. : 2100000 Engineering :74 Min. : 11.0
## 1st Qu.: 5950000 Advertising & Marketing : 0 1st Qu.: 35.5
## Median : 12100000 Business Products & Services: 0 Median : 64.0
## Mean : 34222973 Computer Hardware : 0 Mean : 276.1
## 3rd Qu.: 27075000 Construction : 0 3rd Qu.: 146.8
## Max. :688700000 Consumer Products & Services: 0 Max. :10000.0
## (Other) : 0
## City State
## New York : 3 CA :10
## San Diego : 3 OH : 6
## Cincinnati: 2 TX : 5
## Houston : 2 MA : 4
## Pasadena : 2 NY : 4
## Pittsburgh: 2 PA : 4
## (Other) :60 (Other):41
## --------------------------------------------------------
## inc$Industry: Environmental Services
## Rank Name Growth_Rate
## Min. : 532 Aaron Oil Company : 1 Min. :0.340
## 1st Qu.:1690 Accent Wire : 1 1st Qu.:0.765
## Median :2517 Advanced Chemical Transport: 1 Median :1.410
## Mean :2699 Advanced Disposal : 1 Mean :2.068
## 3rd Qu.:3766 ALL4 : 1 3rd Qu.:2.300
## Max. :5000 Allied : 1 Max. :8.540
## (Other) :45
## Revenue Industry Employees
## Min. :2.100e+06 Environmental Services :51 Min. : 4.0
## 1st Qu.:5.150e+06 Advertising & Marketing : 0 1st Qu.: 30.0
## Median :1.250e+07 Business Products & Services: 0 Median : 67.0
## Mean :5.174e+07 Computer Hardware : 0 Mean : 199.1
## 3rd Qu.:3.500e+07 Construction : 0 3rd Qu.: 122.0
## Max. :1.400e+09 Consumer Products & Services: 0 Max. :5347.0
## (Other) : 0
## City State
## Oklahoma City: 2 TX : 7
## Austin : 1 CA : 6
## Baton Rouge : 1 PA : 4
## Beaumont : 1 MI : 3
## Boulder : 1 OH : 3
## Buford : 1 OK : 3
## (Other) :44 (Other):25
## --------------------------------------------------------
## inc$Industry: Financial Services
## Rank Name Growth_Rate
## Min. : 7.0 360 Mortgage Group : 1 Min. : 0.3400
## 1st Qu.: 994.5 Abacus Wealth Partners : 1 1st Qu.: 0.8575
## Median :2425.5 Abound Resources : 1 Median : 1.4850
## Mean :2352.8 Absolute Capital Management: 1 Mean : 5.4353
## 3rd Qu.:3552.2 Account Control Technology : 1 3rd Qu.: 4.3250
## Max. :4999.0 AccountNow : 1 Max. :174.0400
## (Other) :254
## Revenue Industry Employees
## Min. : 2000000 Financial Services :260 Min. : 5.0
## 1st Qu.: 6000000 Advertising & Marketing : 0 1st Qu.: 28.0
## Median : 15550000 Business Products & Services: 0 Median : 70.0
## Mean : 50580385 Computer Hardware : 0 Mean : 183.4
## 3rd Qu.: 47125000 Construction : 0 3rd Qu.: 190.5
## Max. :959100000 Consumer Products & Services: 0 Max. :1829.0
## (Other) : 0
## City State
## Atlanta : 8 CA : 44
## San Diego: 7 TX : 23
## Austin : 6 GA : 13
## New York : 6 NY : 13
## Charlotte: 5 IL : 12
## Chicago : 5 MI : 11
## (Other) :223 (Other):144
## --------------------------------------------------------
## inc$Industry: Food & Beverage
## Rank Name Growth_Rate
## Min. : 19 11thStreetCoffee.com : 1 Min. : 0.340
## 1st Qu.:1576 34 Degrees : 1 1st Qu.: 0.790
## Median :2666 3D Corporate Solutions : 1 Median : 1.320
## Mean :2649 AdvancePierre Foods : 1 Mean : 3.637
## 3rd Qu.:3719 Adventure in Food Trading: 1 3rd Qu.: 2.515
## Max. :4997 Bamboo Sushi : 1 Max. :100.100
## (Other) :125
## Revenue Industry Employees
## Min. :2.000e+06 Food & Beverage :131 Min. : 3.0
## 1st Qu.:7.400e+06 Advertising & Marketing : 0 1st Qu.: 24.0
## Median :1.860e+07 Business Products & Services: 0 Median : 65.0
## Mean :9.856e+07 Computer Hardware : 0 Mean : 510.9
## 3rd Qu.:5.595e+07 Construction : 0 3rd Qu.: 235.0
## Max. :4.500e+09 Consumer Products & Services: 0 Max. :7681.0
## (Other) : 0 NA's :2
## City State
## San Francisco: 4 CA :26
## Atlanta : 3 IL :13
## Austin : 3 CO :11
## Boulder : 3 NY : 9
## Brooklyn : 3 TX : 9
## Denver : 3 UT : 6
## (Other) :112 (Other):57
## --------------------------------------------------------
## inc$Industry: Government Services
## Rank Name
## Min. : 2.0 1st American Systems and Services: 1
## 1st Qu.: 890.8 22nd Century Technologies : 1
## Median :1815.0 2HB Software Designs : 1
## Mean :2087.9 A-T Solutions : 1
## 3rd Qu.:3214.2 AB Staffing Solutions : 1
## Max. :4995.0 Academy Solutions Group : 1
## (Other) :196
## Growth_Rate Revenue Industry
## Min. : 0.350 Min. :2.000e+06 Government Services :202
## 1st Qu.: 1.008 1st Qu.:6.325e+06 Advertising & Marketing : 0
## Median : 2.110 Median :1.145e+07 Business Products & Services: 0
## Mean : 7.238 Mean :2.975e+07 Computer Hardware : 0
## 3rd Qu.: 4.992 3rd Qu.:2.415e+07 Construction : 0
## Max. :248.310 Max. :1.400e+09 Consumer Products & Services: 0
## (Other) : 0
## Employees City State
## Min. : 6.00 Huntsville: 13 VA :83
## 1st Qu.: 34.25 Arlington : 12 MD :29
## Median : 71.00 Alexandria: 10 AL :14
## Mean : 129.63 Washington: 10 FL :12
## 3rd Qu.: 142.75 Fairfax : 8 DC :10
## Max. :1352.00 Reston : 8 CA : 7
## (Other) :141 (Other):47
## --------------------------------------------------------
## inc$Industry: Health
## Rank Name Growth_Rate
## Min. : 3 24hr HomeCare : 1 Min. : 0.350
## 1st Qu.:1128 A/R Allegiance Group : 1 1st Qu.: 0.805
## Median :2319 Abbeville Dental Health Management: 1 Median : 1.570
## Mean :2416 Acadian Companies : 1 Mean : 4.856
## 3rd Qu.:3681 Accelerated Claims : 1 3rd Qu.: 3.720
## Max. :4985 Accurate Home Care : 1 Max. :245.450
## (Other) :349
## Revenue Industry Employees
## Min. :2.000e+06 Health :355 Min. : 2.00
## 1st Qu.:5.550e+06 Advertising & Marketing : 0 1st Qu.: 32.25
## Median :1.140e+07 Business Products & Services: 0 Median : 71.00
## Mean :5.032e+07 Computer Hardware : 0 Mean : 232.85
## 3rd Qu.:3.365e+07 Construction : 0 3rd Qu.: 200.00
## Max. :2.700e+09 Consumer Products & Services: 0 Max. :4390.00
## (Other) : 0 NA's :1
## City State
## San Antonio: 7 CA : 33
## Scottsdale : 7 TX : 31
## Chicago : 6 FL : 26
## New York : 6 MA : 19
## Seattle : 6 PA : 19
## Alpharetta : 5 GA : 16
## (Other) :318 (Other):211
## --------------------------------------------------------
## inc$Industry: Human Resources
## Rank Name Growth_Rate
## Min. : 137 AAP : 1 Min. : 0.350
## 1st Qu.:1235 ABBA Staffing and Consulting: 1 1st Qu.: 0.830
## Median :2386 ACA Talent : 1 Median : 1.520
## Mean :2426 Accolo : 1 Mean : 3.300
## 3rd Qu.:3617 AccruePartners : 1 3rd Qu.: 3.342
## Max. :4994 Afterburner : 1 Max. :26.960
## (Other) :190
## Revenue Industry Employees
## Min. : 2000000 Human Resources :196 Min. : 4.0
## 1st Qu.: 5500000 Advertising & Marketing : 0 1st Qu.: 26.0
## Median : 11450000 Business Products & Services: 0 Median : 71.5
## Mean : 47173980 Computer Hardware : 0 Mean : 1158.1
## 3rd Qu.: 37225000 Construction : 0 3rd Qu.: 271.0
## Max. :537700000 Consumer Products & Services: 0 Max. :66803.0
## (Other) : 0
## City State
## New York : 9 CA : 23
## Atlanta : 6 GA : 18
## Houston : 6 NJ : 13
## Charlotte: 4 NC : 12
## Denver : 4 IL : 11
## Raleigh : 4 NY : 11
## (Other) :163 (Other):108
## --------------------------------------------------------
## inc$Industry: Insurance
## Rank Name Growth_Rate
## Min. : 553 All Web Leads : 1 Min. :0.3500
## 1st Qu.:1461 Alliant National Title Insurance: 1 1st Qu.:0.5425
## Median :2868 AmWins Group : 1 Median :1.2000
## Mean :2899 Astonish Results : 1 Mean :2.0084
## 3rd Qu.:4369 AutoClaims Direct : 1 3rd Qu.:2.7425
## Max. :4965 Baldwin Krystyn Sherman Partners: 1 Max. :8.1900
## (Other) :44
## Revenue Industry Employees
## Min. : 2700000 Insurance :50 Min. : 7.00
## 1st Qu.: 6900000 Advertising & Marketing : 0 1st Qu.: 32.75
## Median : 12650000 Business Products & Services: 0 Median : 53.00
## Mean : 46758000 Computer Hardware : 0 Mean : 146.78
## 3rd Qu.: 30425000 Construction : 0 3rd Qu.: 94.00
## Max. :535000000 Consumer Products & Services: 0 Max. :2877.00
## (Other) : 0
## City State
## Jacksonville : 2 CA : 8
## Allen : 1 FL : 6
## American Fork: 1 PA : 5
## Appleton : 1 NJ : 4
## Atlanta : 1 TX : 3
## Austin : 1 GA : 2
## (Other) :43 (Other):22
## --------------------------------------------------------
## inc$Industry: IT Services
## Rank Name Growth_Rate
## Min. : 21 110 Consulting : 1 Min. : 0.340
## 1st Qu.:1447 360 Vantage : 1 1st Qu.: 0.840
## Median :2569 360IT Partners : 1 Median : 1.380
## Mean :2541 3i People : 1 Mean : 3.332
## 3rd Qu.:3604 7Delta : 1 3rd Qu.: 2.800
## Max. :5000 A3 Communications: 1 Max. :90.440
## (Other) :727
## Revenue Industry Employees
## Min. :2.000e+06 IT Services :733 Min. : 2.0
## 1st Qu.:4.700e+06 Advertising & Marketing : 0 1st Qu.: 27.0
## Median :1.000e+07 Business Products & Services: 0 Median : 57.0
## Mean :2.821e+07 Computer Hardware : 0 Mean : 140.4
## 3rd Qu.:2.410e+07 Construction : 0 3rd Qu.: 130.0
## Max. :3.800e+09 Consumer Products & Services: 0 Max. :7000.0
## (Other) : 0 NA's :1
## City State
## New York : 21 CA : 82
## Chicago : 20 VA : 69
## Atlanta : 13 TX : 54
## Alpharetta: 12 IL : 48
## Reston : 12 GA : 44
## Houston : 11 NY : 43
## (Other) :644 (Other):393
## --------------------------------------------------------
## inc$Industry: Logistics & Transportation
## Rank Name Growth_Rate
## Min. : 17 24/7 Express Logistics : 1 Min. : 0.360
## 1st Qu.:1338 A.M. Transport Services : 1 1st Qu.: 0.755
## Median :2659 A1 Express Delivery Service: 1 Median : 1.320
## Mean :2580 Access America Transport : 1 Mean : 4.339
## 3rd Qu.:3810 Access Worldwide : 1 3rd Qu.: 3.035
## Max. :4952 Ace Van & Storage Company : 1 Max. :105.730
## (Other) :149
## Revenue Industry Employees
## Min. :2.300e+06 Logistics & Transportation :155 Min. : 1.0
## 1st Qu.:7.100e+06 Advertising & Marketing : 0 1st Qu.: 18.0
## Median :2.080e+07 Business Products & Services: 0 Median : 50.0
## Mean :9.575e+07 Computer Hardware : 0 Mean : 259.7
## 3rd Qu.:4.870e+07 Construction : 0 3rd Qu.: 178.0
## Max. :1.900e+09 Consumer Products & Services: 0 Max. :10800.0
## (Other) : 0 NA's :1
## City State
## Atlanta : 4 IL :16
## Austin : 3 OH :15
## Chicago : 3 CA :14
## Houston : 3 FL :11
## Indianapolis: 3 GA : 8
## Pittsburgh : 3 MO : 8
## (Other) :136 (Other):83
## --------------------------------------------------------
## inc$Industry: Manufacturing
## Rank Name
## Min. : 64 AAC Enterprises : 1
## 1st Qu.:2107 ABCO Automation : 1
## Median :3104 Access Display Group : 1
## Mean :3001 Accurate Lubricants & Metalworking Fluids: 1
## 3rd Qu.:4009 Adafruit : 1
## Max. :4992 ADMET : 1
## (Other) :250
## Growth_Rate Revenue Industry
## Min. : 0.350 Min. :2.100e+06 Manufacturing :256
## 1st Qu.: 0.670 1st Qu.:5.500e+06 Advertising & Marketing : 0
## Median : 1.070 Median :1.190e+07 Business Products & Services: 0
## Mean : 2.295 Mean :4.955e+07 Computer Hardware : 0
## 3rd Qu.: 1.768 3rd Qu.:3.530e+07 Construction : 0
## Max. :48.130 Max. :1.700e+09 Consumer Products & Services: 0
## (Other) : 0
## Employees City State
## Min. : 1.0 Austin : 5 IL : 22
## 1st Qu.: 24.0 Atlanta : 3 OH : 22
## Median : 49.0 New York : 3 CA : 21
## Mean : 172.3 Oklahoma City: 3 TX : 19
## 3rd Qu.: 124.5 Batavia : 2 WI : 15
## Max. :8500.0 Baton Rouge : 2 NC : 13
## NA's :1 (Other) :238 (Other):144
## --------------------------------------------------------
## inc$Industry: Media
## Rank Name Growth_Rate
## Min. : 174.0 Advantage Media Group : 1 Min. : 0.410
## 1st Qu.: 791.2 Akers Media Group : 1 1st Qu.: 0.835
## Median :1949.0 AudioMicro : 1 Median : 1.940
## Mean :2222.9 Benztown : 1 Mean : 4.374
## 3rd Qu.:3619.5 Blue Telescope : 1 3rd Qu.: 5.800
## Max. :4796.0 BlueWater Technologies: 1 Max. :23.010
## (Other) :48
## Revenue Industry Employees
## Min. : 2100000 Media :54 Min. : 2.00
## 1st Qu.: 4500000 Advertising & Marketing : 0 1st Qu.: 19.25
## Median : 7600000 Business Products & Services: 0 Median : 34.00
## Mean : 32266667 Computer Hardware : 0 Mean : 176.52
## 3rd Qu.: 18500000 Construction : 0 3rd Qu.: 89.00
## Max. :441500000 Consumer Products & Services: 0 Max. :3300.00
## (Other) : 0
## City State
## New York :11 CA :12
## Burbank : 3 NY :11
## Cleveland: 2 CT : 4
## Reston : 2 IL : 3
## Stamford : 2 OH : 3
## Atlanta : 1 FL : 2
## (Other) :33 (Other):19
## --------------------------------------------------------
## inc$Industry: Real Estate
## Rank Name Growth_Rate
## Min. : 6 @Properties : 1 Min. : 0.350
## 1st Qu.: 853 1st Equity : 1 1st Qu.: 1.105
## Median :1848 Accurate Group : 1 Median : 2.070
## Mean :2018 ACT Appraisal : 1 Mean : 7.747
## 3rd Qu.:3032 American Eagle Mortgage: 1 3rd Qu.: 5.235
## Max. :4987 American Reporting : 1 Max. :179.380
## (Other) :90
## Revenue Industry Employees
## Min. : 2100000 Real Estate :96 Min. : 2.0
## 1st Qu.: 6100000 Advertising & Marketing : 0 1st Qu.: 20.0
## Median : 12800000 Business Products & Services: 0 Median : 49.0
## Mean : 30892708 Computer Hardware : 0 Mean : 198.9
## 3rd Qu.: 33975000 Construction : 0 3rd Qu.: 126.5
## Max. :517000000 Consumer Products & Services: 0 Max. :2391.0
## (Other) : 0 NA's :1
## City State
## Austin : 4 CA :16
## Chicago : 4 TX :11
## Irvine : 3 FL : 7
## Plano : 3 IL : 6
## San Francisco: 3 GA : 5
## Frisco : 2 WA : 5
## (Other) :77 (Other):46
## --------------------------------------------------------
## inc$Industry: Retail
## Rank Name Growth_Rate Revenue
## Min. : 10 99Perfume.com: 1 Min. : 0.340 Min. :2.000e+06
## 1st Qu.: 806 A Wireless : 1 1st Qu.: 0.880 1st Qu.:4.300e+06
## Median :2119 Adora : 1 Median : 1.760 Median :8.200e+06
## Mean :2234 AirRattle : 1 Mean : 6.185 Mean :5.053e+07
## 3rd Qu.:3486 Aleva Stores : 1 3rd Qu.: 5.620 3rd Qu.:2.390e+07
## Max. :4998 Alex and Ani : 1 Max. :166.890 Max. :2.800e+09
## (Other) :197
## Industry Employees City
## Retail :203 Min. : 2.0 New York : 7
## Advertising & Marketing : 0 1st Qu.: 13.0 Austin : 6
## Business Products & Services: 0 Median : 31.0 San Francisco: 4
## Computer Hardware : 0 Mean : 182.6 Boston : 3
## Construction : 0 3rd Qu.: 83.5 Brooklyn : 3
## Consumer Products & Services: 0 Max. :5821.0 Richmond : 3
## (Other) : 0 (Other) :177
## State
## CA : 33
## FL : 17
## TX : 15
## NY : 14
## UT : 9
## AZ : 8
## (Other):107
## --------------------------------------------------------
## inc$Industry: Security
## Rank Name Growth_Rate
## Min. : 110 Accuvant : 1 Min. : 0.370
## 1st Qu.:1391 Alert Logic : 1 1st Qu.: 0.790
## Median :2350 Alliance Security: 1 Median : 1.540
## Mean :2494 AMP Security : 1 Mean : 3.389
## 3rd Qu.:3714 AnchorFree : 1 3rd Qu.: 2.900
## Max. :4925 Arrow Security : 1 Max. :31.160
## (Other) :67
## Revenue Industry Employees
## Min. : 2000000 Security :73 Min. : 7.0
## 1st Qu.: 5800000 Advertising & Marketing : 0 1st Qu.: 33.0
## Median : 12300000 Business Products & Services: 0 Median : 77.0
## Mean : 52230137 Computer Hardware : 0 Mean : 562.5
## 3rd Qu.: 28700000 Construction : 0 3rd Qu.: 226.0
## Max. :718100000 Consumer Products & Services: 0 Max. :20000.0
## (Other) : 0
## City State
## Chicago : 4 CA : 9
## Atlanta : 2 TX : 7
## Mountain View: 2 GA : 5
## Overland Park: 2 IL : 5
## Phoenix : 2 UT : 5
## Plano : 2 NY : 4
## (Other) :59 (Other):38
## --------------------------------------------------------
## inc$Industry: Software
## Rank Name Growth_Rate
## Min. : 14 360 Cloud Solutions: 1 Min. : 0.3500
## 1st Qu.:1045 3C Software : 1 1st Qu.: 0.8025
## Median :2156 3DCart : 1 Median : 1.7150
## Mean :2328 5AM Solutions : 1 Mean : 5.0206
## 3rd Qu.:3680 8th Light : 1 3rd Qu.: 4.0625
## Max. :4976 ABIS : 1 Max. :128.6300
## (Other) :336
## Revenue Industry Employees
## Min. : 2000000 Software :342 Min. : 1.0
## 1st Qu.: 4900000 Advertising & Marketing : 0 1st Qu.: 32.0
## Median : 8700000 Business Products & Services: 0 Median : 69.0
## Mean : 23802924 Computer Hardware : 0 Mean : 150.3
## 3rd Qu.: 22275000 Construction : 0 3rd Qu.: 153.0
## Max. :487500000 Consumer Products & Services: 0 Max. :3000.0
## (Other) : 0 NA's :1
## City State
## San Francisco: 10 CA : 65
## Houston : 8 TX : 24
## Chicago : 7 VA : 18
## Portland : 7 CO : 17
## Austin : 6 FL : 16
## Boulder : 6 MA : 16
## (Other) :298 (Other):186
## --------------------------------------------------------
## inc$Industry: Telecommunications
## Rank Name Growth_Rate
## Min. : 90 3S Network : 1 Min. : 0.380
## 1st Qu.:1409 5Linx Enterprises : 1 1st Qu.: 0.780
## Median :2731 Access Media 3 : 1 Median : 1.280
## Mean :2597 Actiontec Electronics : 1 Mean : 2.884
## 3rd Qu.:3742 Advantage Communications Group: 1 3rd Qu.: 2.850
## Max. :4893 AllConnex : 1 Max. :37.340
## (Other) :123
## Revenue Industry Employees
## Min. : 2100000 Telecommunications :129 Min. : 6.0
## 1st Qu.: 8200000 Advertising & Marketing : 0 1st Qu.: 25.5
## Median : 16600000 Business Products & Services: 0 Median : 65.0
## Mean : 56855814 Computer Hardware : 0 Mean : 242.8
## 3rd Qu.: 31700000 Construction : 0 3rd Qu.: 136.5
## Max. :846200000 Consumer Products & Services: 0 Max. :10000.0
## (Other) : 0 NA's :2
## City State
## New York: 6 CA :23
## Atlanta : 3 NY :17
## Portland: 3 TX :10
## Chicago : 2 FL : 8
## Columbia: 2 GA : 8
## Dublin : 2 IL : 6
## (Other) :111 (Other):57
## --------------------------------------------------------
## inc$Industry: Travel & Hospitality
## Rank Name Growth_Rate
## Min. : 153 21c Museum Hotels : 1 Min. : 0.3500
## 1st Qu.:1762 305 Degrees : 1 1st Qu.: 0.6325
## Median :2986 A1ALimo.com : 1 Median : 1.1350
## Mean :2906 Access Destination Services: 1 Mean : 2.3531
## 3rd Qu.:4121 Adventure Life : 1 3rd Qu.: 2.1975
## Max. :4990 All Star Vacation Homes : 1 Max. :25.1300
## (Other) :56
## Revenue Industry Employees
## Min. : 2200000 Travel & Hospitality :62 Min. : 3.0
## 1st Qu.: 4375000 Advertising & Marketing : 0 1st Qu.: 15.0
## Median : 8600000 Business Products & Services: 0 Median : 38.0
## Mean : 47283871 Computer Hardware : 0 Mean : 371.5
## 3rd Qu.: 17400000 Construction : 0 3rd Qu.: 205.2
## Max. :506600000 Consumer Products & Services: 0 Max. :4878.0
## (Other) : 0
## City State
## New York : 4 FL : 9
## Boston : 3 CA : 8
## San Diego : 3 TX : 8
## Coral Springs : 2 NY : 7
## Dallas : 2 MA : 6
## Greenwood Village: 2 CO : 3
## (Other) :46 (Other):21
# label each company "small" or "big" relative to the median revenue
inc$size <- ifelse(inc$Revenue < median(inc$Revenue),
"small", "big"
)
# total counts of big and small companies
table(inc$size)
##
## big small
## 2510 2491
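The two groups are not the same size (2510 versus 2491). A likely reason is that the rule uses a strict `<`, so every company whose revenue equals the median lands in "big". The sketch below shows the effect on a toy vector; the values are made up for illustration and are not taken from the Inc. data.

```r
# Toy revenues (made-up values) with several ties at the median.
revenue <- c(2, 5, 5, 5, 9)
med <- median(revenue)

# Same rule as above: strictly below the median is "small", everything else is "big".
size <- ifelse(revenue < med, "small", "big")

# Ties at the median are all counted as "big", so the split is uneven.
counts <- table(size)
print(counts)
```

If an even split matters, ranking the rows and cutting at the midpoint is one alternative, at the cost of separating companies that have identical revenue.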
# total counts of big and small companies by industry
table(inc$Industry, inc$size)
##
## big small
## Advertising & Marketing 176 295
## Business Products & Services 223 259
## Computer Hardware 35 9
## Construction 109 78
## Consumer Products & Services 92 111
## Education 26 57
## Energy 78 31
## Engineering 44 30
## Environmental Services 26 25
## Financial Services 150 110
## Food & Beverage 85 46
## Government Services 106 96
## Health 185 170
## Human Resources 100 96
## Insurance 27 23
## IT Services 349 384
## Logistics & Transportation 105 50
## Manufacturing 135 121
## Media 21 33
## Real Estate 51 45
## Retail 90 113
## Security 38 35
## Software 149 193
## Telecommunications 85 44
## Travel & Hospitality 25 37
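The `by()` output above is exhaustive but long. A more compact per-industry table can be built with `dplyr`, as in the following sketch. The toy data frame and its values are made up so the block runs without downloading anything; only the column names mirror those in `inc`.

```r
library(dplyr)

# Toy stand-in for `inc` (values are made up for illustration).
toy <- data.frame(
  Industry  = c("Energy", "Energy", "Energy", "Media", "Media"),
  Revenue   = c(10e6, 20e6, 900e6, 4e6, 6e6),
  Employees = c(20, 40, NA, 10, 30)
)

# One row per industry: company count plus robust and non-robust centers.
industry_summary <- toy %>%
  group_by(Industry) %>%
  summarise(
    companies        = n(),
    median_revenue   = median(Revenue),
    mean_revenue     = mean(Revenue),
    median_employees = median(Employees, na.rm = TRUE)
  ) %>%
  arrange(desc(median_revenue))

print(industry_summary)
```

Putting the mean next to the median makes skew easy to spot: in the toy "Energy" group a single large company pulls the mean far above the median, which is the same pattern the `summary()` output shows for `Revenue` and `Growth_Rate`.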
Create a graph that shows the distribution of companies in the dataset by State (i.e., how many are in each state). There are a lot of States, so consider which axis you should use. This visualization is ultimately going to be consumed on a 'portrait' oriented screen (i.e., taller than wide), which should further guide your layout choices.
# summarize the raw data by state
state <- inc %>%
group_by(State) %>%
count(State) %>%
arrange(desc(n))
state
## # A tibble: 52 x 2
## # Groups: State [52]
## State n
## <fct> <int>
## 1 CA 701
## 2 TX 387
## 3 NY 311
## 4 VA 283
## 5 FL 282
## 6 IL 273
## 7 GA 212
## 8 OH 186
## 9 MA 182
## 10 PA 164
## # … with 42 more rows
# Visualization: horizontal bars sorted by count, to suit a portrait screen
ggplot(data = state, aes(x = reorder(State, n), y = n)) +
  geom_col(fill = "gray40", width = 0.5) +
  # white reference lines every 100 companies, drawn over the bars
  geom_hline(yintercept = seq(100, 700, 100), col = "white", lwd = 1) +
  coord_flip() +
  labs(title = "Number of Fastest Growing Companies by State") +
  # axis titles are blanked, so separate xlab()/ylab() calls would have no effect
  theme(axis.title = element_blank(),
        axis.text = element_text(size = 6))
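The ordering in the chart comes from `reorder()`, which sorts the factor levels by count; `coord_flip()` then puts the categories on the vertical axis, where a long list of states has room on a portrait screen. A minimal sketch of that combination on toy counts follows. The state codes, numbers, file name and image size are all made up for illustration.

```r
library(ggplot2)

# Toy state counts (made-up codes and numbers, not the Inc. data).
toy_counts <- data.frame(
  State = c("AA", "BB", "CC", "DD"),
  n     = c(12, 40, 7, 25)
)

# reorder() sorts the factor levels by n (ascending), so the largest
# count ends up at the top of the chart once the axes are flipped.
toy_counts$State <- reorder(toy_counts$State, toy_counts$n)
print(levels(toy_counts$State))

p <- ggplot(toy_counts, aes(x = State, y = n)) +
  geom_col(fill = "gray40", width = 0.5) +
  coord_flip() +
  labs(title = "Toy example: companies by state", x = NULL, y = NULL)

# For a portrait screen, a taller-than-wide output size helps, e.g.
# ggsave("states.png", p, width = 5, height = 9)  # hypothetical file name and size
```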
Let's dig in on the state with the 3rd most companies in the data set. Imagine you work for the state and are interested in how many people are employed by companies in different industries. Create a plot that shows the average and/or median employment by industry for companies in this state (use only cases with full data; R's complete.cases() function helps here). In addition, your graph should show how variable the ranges are, and you should deal with outliers.
# the state with the 3rd most companies (state is already sorted by n, descending)
third_state <- state[3,"State"]
third_state
## # A tibble: 1 x 1
## # Groups: State [1]
## State
## <fct>
## 1 NY
# keep only the complete rows and filter NY
ny <- inc[complete.cases(inc),] %>%
filter(State == "NY")
# Visualization: boxplots on a log scale, with every company overlaid as a point
ggplot(ny, aes(reorder(Industry, Employees, mean), Employees)) +
  geom_boxplot(outlier.shape = NA, show.legend = FALSE) +
  geom_point(color = "darkblue", size = 0.5) +
  scale_y_log10() +
  coord_flip() +
  labs(x = "Industry", y = "Employees")
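The plot handles outliers by switching to a log scale rather than dropping rows. To see why a few very large employers matter, it can help to flag them numerically first. The sketch below uses base R's `boxplot.stats()`, which applies the usual 1.5 × IQR whisker rule; the employee counts are made up for illustration.

```r
# Toy employee counts for one industry (made-up values).
employees <- c(12, 18, 25, 30, 41, 55, 60, 4000)

# boxplot.stats() returns the five-number summary in $stats and the
# points lying beyond the whiskers in $out.
stats <- boxplot.stats(employees)
print(stats$out)

# A single extreme value moves the mean far more than the median.
print(median(employees))
print(mean(employees))

# On a log10 scale the extreme value is compressed rather than removed.
print(round(log10(employees), 2))
```

Because the median barely reacts to the extreme value, it is usually the safer "typical" figure to report for employment by industry, with the mean shown only alongside it.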
Now imagine you work for an investor and want to see which industries generate the most revenue per employee. Create a chart that makes this information clear. Once again, the distribution per industry should be shown.
# summarize revenue per employee by industry
rev_p_emp <- inc[complete.cases(inc),] %>%
group_by(Industry) %>%
summarise(rev = sum(Revenue), emp = sum(Employees)) %>%
mutate(RevPerEmp = (rev / emp))
rev_p_emp
## # A tibble: 25 x 4
## Industry rev emp RevPerEmp
## <fct> <dbl> <int> <dbl>
## 1 Advertising & Marketing 7785000000 39731 195943.
## 2 Business Products & Services 26345900000 117357 224494.
## 3 Computer Hardware 11885700000 9714 1223564.
## 4 Construction 13174300000 29099 452741.
## 5 Consumer Products & Services 14956400000 45464 328972.
## 6 Education 1139300000 7685 148250.
## 7 Energy 13771600000 26437 520921.
## 8 Engineering 2532500000 20435 123930.
## 9 Environmental Services 2638800000 10155 259852.
## 10 Financial Services 13150900000 47693 275741.
## # … with 15 more rows
# Visualization
ggplot(rev_p_emp, aes(x = reorder(Industry, RevPerEmp), y = RevPerEmp)) +
  geom_col() +
  coord_flip() +
  labs(x = "Industry", y = "Revenue per Employee")
# Visualization of the per-company distribution
rev_p_emp2 <- inc[complete.cases(inc),] %>%
  mutate(RevPerEmp = Revenue / Employees)
ggplot(rev_p_emp2, aes(x = reorder(Industry, RevPerEmp), y = RevPerEmp)) +
  geom_boxplot() +
  coord_flip() +
  scale_y_log10() +
  labs(x = "Industry", y = "Revenue per Employee")
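The two charts above measure slightly different things. The bar chart divides an industry's total revenue by its total employees, so the largest companies dominate; the boxplot computes the ratio for each company separately. A small sketch with made-up numbers shows how far the two can diverge.

```r
# Toy industry with one very large company (made-up values).
toy <- data.frame(
  Industry  = c("A", "A", "A"),
  Revenue   = c(1e6, 2e6, 90e6),
  Employees = c(10, 20, 100)
)

# Pooled ratio, as in the bar chart: total revenue over total employees.
pooled <- sum(toy$Revenue) / sum(toy$Employees)

# Per-company ratios, as in the boxplot, summarized by their median.
per_company <- toy$Revenue / toy$Employees
typical <- median(per_company)

print(pooled)
print(typical)
```

For an investor comparing industries, the pooled figure answers "how much revenue does the industry produce per worker overall", while the median of per-company ratios answers "what does a typical company in the industry look like".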