Description of the Final Project

Gather the latest data to bring up to date an interesting article by Andy Kiersz, “The most disproportionately popular job in every state”, published in April 2014.

Source

The main source of information will be the Bureau of Labor Statistics (BLS) Occupational Employment Statistics. This government site contains a wealth of employment information at the national and state levels that can be obtained through text files, tables, and an API.

This project will reference two annual reports:

  1. The latest May 2015 Occupational Employment Statistics report, which was released in March 2016.
  2. The May 2013 Occupational Employment Statistics report, which was released in March 2014.

Motivation

As a new citizen, I have a general interest in exploring and travelling to other states. Apart from the East Coast states and easily identifiable ones such as California, Nevada, and Florida, I would not know exactly where the other states lie, or their abbreviations! The article that inspired this final project is both uniquely interesting and an eye-opener to the great, wide United States and to the occupations that set each state apart.

Approach / Methodology

Some of these choices reflect my currently limited technical knowledge and/or limited time for further research.

  1. Use a snapshot of the data. The BLS employment reports will be saved to GitHub and read from there, instead of referencing the reports on the BLS site each time the code is run. This provides static data as of a given point in time, which helps ensure the integrity of the information on the map: all the transformation, tidying, and manual steps performed (e.g. shortening the occupation titles) are applied to the same data that produced the map.

  2. Include state abbreviations on the map to help with location identification.

  3. Embed the map image in R Markdown instead of direct code output. Since my ggplot code produces an image too small to contain the occupation titles, the resulting image will be saved as a PNG, uploaded to GitHub, and simply referenced from R Markdown. This method seems to provide a larger picture of the map.

  4. Separate map for the smaller states. This is an alternative to drawing points/lines that reference the smaller states, which would have been better for presentation but would take significant research time to figure out how to code. For a similar future project, implementing shapefiles could be researched.

  5. Provide detailed and major occupational groupings. The BLS employment statistics categorize occupations at different levels: detailed, broad, minor, and major groups. For this project I will gather both ends, the detailed and major groups, for 2013 and 2015.
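
Item 3 can be sketched roughly as follows. This is a minimal illustration rather than the code used for the project: the placeholder plot `p`, the file name `occupation_map.png`, and the image dimensions are all assumptions.

```r
library(ggplot2)

# Placeholder plot standing in for the real map (hypothetical data).
p <- ggplot(data.frame(x = 1:3, y = c(2, 4, 3)), aes(x, y)) +
  geom_point()

# Write the plot to a large PNG so that long labels stay legible.
# The file name and size are illustrative assumptions.
out_file <- file.path(tempdir(), "occupation_map.png")
ggsave(out_file, plot = p, width = 16, height = 10, dpi = 150)

# In an R Markdown chunk, the saved image can then be embedded with:
# knitr::include_graphics(out_file)
```

Saving at an explicit width and height decouples the image size from the default figure size of the R Markdown chunk.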

Challenges

  1. Getting the raw data used in the original article, for comparison against the 2015 data. The article was written in April 2014 and was based on the then newly released May 2013 report. However, BLS last updated that report in September 2014. An initial check showed that the top occupation had changed in a few states between the two versions. For example, in the article, Hawaii’s occupation with the highest location quotient was “Tour Guides”; in the September version of the report, it was “Dancers”. To avoid manually re-creating the raw article data, this project references the September version from the BLS site.

  2. Plotting the US map with Alaska and Hawaii. The base package only plots the contiguous (mainland) US states. After looking through web pages on how to plot all 50 states plus DC, I fortunately found that there is now a package that can do this. The next challenge is that, since Alaska and Hawaii are repositioned to be displayed at the bottom of the map, placing text on them has to account for the change in longitude and latitude. I used trial and error to position the text near those 2 states.

  3. Plotting occupations proved challenging due to the length of the titles and the size of each state on the map. Certain liberties were taken to manually shorten some of the titles; some look better on multiple lines, others on a single line. Relatedly, the smaller states had to be plotted separately to make room for their job labels. Overall, the resulting image looks good, but not perfect. For a similar future project, the angle aesthetic could be used, but this seems to be case by case, depending on the string length and the state’s dimensions.
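
For the multi-line labels mentioned in item 3, an automatic first pass is possible before any manual shortening. The sketch below uses base R's `strwrap()`; the helper name `wrap_title` and the width of 25 characters are illustrative assumptions, not what was used for the maps.

```r
# Break a long occupation title into lines shorter than `width` characters.
# `wrap_title` is a hypothetical helper written for this illustration.
wrap_title <- function(title, width = 25) {
  paste(strwrap(title, width = width), collapse = "\n")
}

titles <- c("Fashion Designers",
            "Forest Fire Inspectors and Prevention Specialists")
wrapped <- vapply(titles, wrap_title, character(1), USE.NAMES = FALSE)

# Show each wrapped label, separated by a divider line.
cat(wrapped, sep = "\n---\n")
```

Short titles pass through unchanged, while long ones are split at word boundaries, so the result can be handed straight to a text label layer.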

Processes

Below are the steps performed to generate the table and corresponding map showing the most disproportionately popular occupation in every state.


Attach Libraries

library(RColorBrewer) 
library(tidyr)
library(dplyr)
library(ggplot2)
library(data.table)
library(maps)
library(fiftystater)
library(DT)
library(knitr)


Download the reports

st2013data <- read.csv("https://raw.githubusercontent.com/L-Velasco/Fall16_IS607/master/state_M2013_dl.csv", stringsAsFactors = FALSE)
st2015data <- read.csv("https://raw.githubusercontent.com/L-Velasco/Fall16_IS607/master/state_M2015_dl.csv", stringsAsFactors = FALSE)


Tidy and Transform

# postal codes of US territories to exclude from the state-level analysis
excludes_terr <- c("AS","MP","PR","VI","GU")

### --- Major Occupations 

st2013major <- st2013data %>% 
  # compare LOC_Q as a number; non-numeric entries become NA and are dropped
  filter(OCC_GROUP == 'major',
         suppressWarnings(as.numeric(LOC_Q)) > 0,
         !(ST %in% excludes_terr)) %>% 
  select(OCC_TITLE, STATE, LOC_Q) %>% 
  arrange(OCC_TITLE, desc(as.numeric(LOC_Q))) %>% 
  rename(MAJOR_OCCUPATIONS = OCC_TITLE,
         LOC_Q_2013 = LOC_Q,
         STATE_2013 = STATE)

# removes duplicates, leaving only the row with the highest location quotient per major occupation
st2013major <- st2013major[ !duplicated(st2013major$MAJOR_OCCUPATIONS), ]

st2015major <- st2015data %>% 
  # compare LOC_Q as a number; non-numeric entries become NA and are dropped
  filter(OCC_GROUP == 'major',
         suppressWarnings(as.numeric(LOC_Q)) > 0,
         !(ST %in% excludes_terr)) %>% 
  select(OCC_TITLE, LOC_Q, STATE) %>% 
  arrange(OCC_TITLE, desc(as.numeric(LOC_Q))) %>% 
  rename(MAJOR_OCCUPATIONS = OCC_TITLE,
         LOC_Q_2015 = LOC_Q,
         STATE_2015 = STATE)

# removes duplicates, leaving only the row with the highest location quotient per major occupation
st2015major <- st2015major[ !duplicated(st2015major$MAJOR_OCCUPATIONS), ]

# cbind relies on both data frames being sorted by occupation and holding the same major groups
st1315major <- cbind(st2013major, st2015major[,(2:3)])
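
Combining by position works only while the two data frames stay in the same row order. A key-based join is a more defensive alternative; the sketch below uses base `merge()` on two-row toy data frames (the names `major13`, `major15`, and `major1315` are illustrative, and the values are copied from two rows of the Major Occupations table).

```r
# Toy stand-ins for st2013major and st2015major,
# deliberately stored in different row orders.
major13 <- data.frame(
  MAJOR_OCCUPATIONS = c("Legal Occupations", "Production Occupations"),
  STATE_2013 = c("District of Columbia", "Indiana"),
  LOC_Q_2013 = c(7.72, 1.83),
  stringsAsFactors = FALSE
)
major15 <- data.frame(
  MAJOR_OCCUPATIONS = c("Production Occupations", "Legal Occupations"),
  LOC_Q_2015 = c(1.95, 7.32),
  STATE_2015 = c("Indiana", "District of Columbia"),
  stringsAsFactors = FALSE
)

# Join on the occupation title instead of relying on row position.
major1315 <- merge(major13, major15, by = "MAJOR_OCCUPATIONS")
print(major1315)
```

With a join, a missing or extra occupation in one year shows up as a dropped row rather than as silently misaligned columns.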

### --- Detailed Occupations 

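st2013detail <- st2013data %>% 
  # compare LOC_Q as a number; non-numeric entries become NA and are dropped
  filter(OCC_GROUP == 'detailed',
         suppressWarnings(as.numeric(LOC_Q)) > 0,
         !(ST %in% excludes_terr)) %>% 
  select(ST, STATE, OCC_TITLE, LOC_Q) %>% 
  arrange(ST, desc(as.numeric(LOC_Q))) %>% 
  rename(OCC_TITLE_2013 = OCC_TITLE,
         LOC_Q_2013 = LOC_Q)
# lowercase the state names
st2013detail$STATE <- tolower(st2013detail$STATE)
# keeps only the row with the highest location quotient per state
st2013detail <- st2013detail[ !duplicated(st2013detail$ST), ]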

st2015detail <- st2015data %>% 
  # compare LOC_Q as a number; non-numeric entries become NA and are dropped
  filter(OCC_GROUP == 'detailed',
         suppressWarnings(as.numeric(LOC_Q)) > 0,
         !(ST %in% excludes_terr)) %>% 
  select(ST, STATE, OCC_TITLE, LOC_Q) %>% 
  arrange(ST, desc(as.numeric(LOC_Q))) %>% 
  rename(OCC_TITLE_2015 = OCC_TITLE,
         LOC_Q_2015 = LOC_Q) %>% 
  mutate(loc_q_2015 = as.numeric(LOC_Q_2015) / 10) # helps with color fill aesthetics
# keeps only the row with the highest location quotient per state
st2015detail <- st2015detail[ !duplicated(st2015detail$ST), ]

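# 2013 and 2015 side by side, including the scaled loc_q_2015 column
# (cbind relies on both data frames being sorted by ST)
st1315detail <- cbind(st2013detail[,(1:4)], st2015detail[,(3:5)])

# same combination without the scaled column, for display as a table
st1315detail_tbl <- cbind(st2013detail[,(1:4)], st2015detail[,(3:4)])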
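
The sort-then-`duplicated()` idiom used in this section can also be written as a grouped selection. The sketch below assumes dplyr 1.0.0 or later (for `slice_max()`) and a small, partly made-up data frame; using `"**"` as the placeholder for an unavailable value is an assumption about the raw file, not something shown in this report.

```r
library(dplyr)

# Toy rows; LOC_Q is character, as it would be if the raw column
# contained non-numeric placeholders (assumed here to be "**").
toy <- data.frame(
  ST = c("NY", "NY", "NY", "TX", "TX"),
  OCC_TITLE = c("Fashion Designers", "Actors", "Clerks",
                "Petroleum Engineers", "Clerks"),
  LOC_Q = c("6.00", "3.10", "**", "6.17", "0.98"),
  stringsAsFactors = FALSE
)

top_per_state <- toy %>%
  mutate(LOC_Q = suppressWarnings(as.numeric(LOC_Q))) %>%  # "**" becomes NA
  filter(!is.na(LOC_Q), LOC_Q > 0) %>%
  group_by(ST) %>%
  slice_max(LOC_Q, n = 1, with_ties = FALSE) %>%            # highest LQ per state
  ungroup()

print(top_per_state)
```

Converting `LOC_Q` to numeric once, up front, also avoids comparing character strings against `0`.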


Display the Major Occupations

The table below shows the 22 Major Occupations included in the annual Occupational Employment Statistics survey, along with the specific state where each occupation is most concentrated relative to the country as a whole.

At a quick glance, DC accounts for almost a third of the list. It makes sense that Protective Service, Legal, and Business and Financial Operations occupations would be more concentrated there than in the states, since DC functions as the center of government.

The only notable changes in the list are that Computer and Mathematical occupations moved from Virginia to DC, and that Office and Administrative Support and Sales, both of which were most concentrated in Florida 2 years earlier, are now most concentrated in Utah and New Hampshire, respectively.


kable(st1315major)
|      | MAJOR_OCCUPATIONS | STATE_2013 | LOC_Q_2013 | LOC_Q_2015 | STATE_2015 |
|:-----|:------------------|:-----------|-----------:|-----------:|:-----------|
| 1    | Architecture and Engineering Occupations | Michigan | 1.67 | 1.75 | Michigan |
| 52   | Arts, Design, Entertainment, Sports, and Media Occupations | District of Columbia | 3.19 | 3.57 | District of Columbia |
| 103  | Building and Grounds Cleaning and Maintenance Occupations | Hawaii | 1.91 | 1.87 | Hawaii |
| 154  | Business and Financial Operations Occupations | District of Columbia | 3.19 | 2.87 | District of Columbia |
| 205  | Community and Social Service Occupations | Vermont | 1.84 | 1.87 | Vermont |
| 256  | Computer and Mathematical Occupations | Virginia | 1.96 | 1.99 | District of Columbia |
| 307  | Construction and Extraction Occupations | Wyoming | 3.09 | 2.81 | Wyoming |
| 358  | Education, Training, and Library Occupations | Vermont | 1.42 | 1.44 | Vermont |
| 409  | Farming, Fishing, and Forestry Occupations | California | 4.21 | 4.09 | California |
| 460  | Food Preparation and Serving Related Occupations | Nevada | 1.61 | 1.58 | Nevada |
| 511  | Healthcare Practitioners and Technical Occupations | West Virginia | 1.31 | 1.38 | West Virginia |
| 562  | Healthcare Support Occupations | Rhode Island | 1.44 | 1.52 | Rhode Island |
| 613  | Installation, Maintenance, and Repair Occupations | Wyoming | 1.68 | 1.69 | Wyoming |
| 664  | Legal Occupations | District of Columbia | 7.72 | 7.32 | District of Columbia |
| 715  | Life, Physical, and Social Science Occupations | District of Columbia | 3.62 | 4.08 | District of Columbia |
| 766  | Management Occupations | District of Columbia | 2.31 | 2.36 | District of Columbia |
| 817  | Office and Administrative Support Occupations | Florida | 1.12 | 1.13 | Utah |
| 868  | Personal Care and Service Occupations | Nevada | 2.01 | 1.90 | Nevada |
| 919  | Production Occupations | Indiana | 1.83 | 1.95 | Indiana |
| 970  | Protective Service Occupations | District of Columbia | 1.73 | 1.67 | District of Columbia |
| 1021 | Sales and Related Occupations | Florida | 1.25 | 1.26 | New Hampshire |
| 1072 | Transportation and Material Moving Occupations | North Dakota | 1.42 | 1.48 | North Dakota |


Display the Detailed Occupations

The table below shows the job in each state with the highest location quotient, making it the most over-represented job in that state. These jobs exist at much higher rates in each state than in the country as a whole.

For example, in 2015 Fashion Designers accounted for 6 times as large a share of employment in New York as in the U.S. as a whole.
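
For context, a location quotient is the occupation's share of employment in the state divided by its share of employment nationally. The sketch below computes it for invented employment counts; the function name `location_quotient` and all figures are purely illustrative, not BLS data.

```r
# Location quotient: (state share of the occupation) / (national share of the occupation)
location_quotient <- function(state_occ, state_total, us_occ, us_total) {
  (state_occ / state_total) / (us_occ / us_total)
}

# Invented counts: 600 of 100,000 state jobs vs. 1,000 of 1,000,000 national jobs.
lq <- location_quotient(state_occ = 600, state_total = 1e5,
                        us_occ = 1000, us_total = 1e6)
print(lq)
```

A quotient above 1 means the occupation is more concentrated in the state than nationally; it says nothing about the absolute number of jobs.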

At a quick glance between the 2013 and 2015 lists, the majority of the states (34 of the 51 rows, counting DC) show a change in occupation.
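
The comparison can also be counted directly rather than eyeballed. A minimal sketch, using a three-row stand-in for `st1315detail_tbl` (the name `tbl` is illustrative; the titles are copied from the table below):

```r
# Toy stand-in for st1315detail_tbl with three rows copied from the table.
tbl <- data.frame(
  ST = c("HI", "MD", "NY"),
  OCC_TITLE_2013 = c("Dancers", "Astronomers", "Fashion Designers"),
  OCC_TITLE_2015 = c("Dancers", "Court Reporters", "Fashion Designers"),
  stringsAsFactors = FALSE
)

# TRUE where the top occupation differs between the two years.
changed <- tbl$OCC_TITLE_2013 != tbl$OCC_TITLE_2015

print(sum(changed))      # number of states whose top occupation changed
print(tbl$ST[changed])   # which states changed
```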


kable(st1315detail_tbl)
|       | ST | STATE | OCC_TITLE_2013 | LOC_Q_2013 | OCC_TITLE_2015 | LOC_Q_2015 |
|:------|:---|:------|:---------------|-----------:|:---------------|-----------:|
| 1     | AK | alaska | Mining Machine Operators, All Other | 58.18 | Mining Machine Operators, All Other | 82.99 |
| 516   | AL | alabama | Layout Workers, Metal and Plastic | 9.05 | Textile Winding, Twisting, and Drawing Out Machine Setters, Operators, and Tenders | 6.55 |
| 1228  | AR | arkansas | Shoe Machine Operators and Tenders | 13.35 | Shoe Machine Operators and Tenders | 19.65 |
| 1913  | AZ | arizona | Mining Machine Operators, All Other | 6.25 | Plasterers and Stucco Masons | 3.62 |
| 2621  | CA | california | Farmworkers and Laborers, Crop, Nursery, and Greenhouse | 5.89 | Farmworkers and Laborers, Crop, Nursery, and Greenhouse | 5.82 |
| 3410  | CO | colorado | Atmospheric and Space Scientists | 9.65 | Atmospheric and Space Scientists | 9.20 |
| 4129  | CT | connecticut | Actuaries | 5.17 | Area, Ethnic, and Cultural Studies Teachers, Postsecondary | 4.77 |
| 4790  | DC | district of columbia | Political Scientists | 120.46 | Economists | 78.00 |
| 5276  | DE | delaware | Chemists | 12.70 | Paperhangers | 8.23 |
| 5767  | FL | florida | Motorboat Operators | 8.30 | Motorboat Operators | 6.13 |
| 6534  | GA | georgia | Textile Winding, Twisting, and Drawing Out Machine Setters, Operators, and Tenders | 10.49 | Textile Winding, Twisting, and Drawing Out Machine Setters, Operators, and Tenders | 11.47 |
| 7270  | HI | hawaii | Dancers | 12.83 | Dancers | 13.17 |
| 7831  | IA | iowa | Soil and Plant Scientists | 12.81 | Food Processing Workers, All Other | 12.33 |
| 8524  | ID | idaho | Forest and Conservation Technicians | 15.58 | Forest and Conservation Technicians | 16.32 |
| 9147  | IL | illinois | Rail Transportation Workers, All Other | 5.67 | Loading Machine Operators, Underground Mining | 4.92 |
| 9906  | IN | indiana | Boilermakers | 6.39 | Rolling Machine Setters, Operators, and Tenders, Metal and Plastic | 8.19 |
| 10650 | KS | kansas | Home Economics Teachers, Postsecondary | 6.14 | Layout Workers, Metal and Plastic | 6.99 |
| 11316 | KY | kentucky | Roof Bolters, Mining | 14.14 | Mine Shuttle Car Operators | 16.45 |
| 12018 | LA | louisiana | Riggers | 19.95 | Captains, Mates, and Pilots of Water Vessels | 19.17 |
| 12731 | MA | massachusetts | Industrial-Organizational Psychologists | 8.18 | Biomedical Engineers | 4.91 |
| 13457 | MD | maryland | Astronomers | 14.53 | Court Reporters | 7.48 |
| 14176 | ME | maine | Logging Equipment Operators | 11.15 | Logging Equipment Operators | 10.56 |
| 14794 | MI | michigan | Model Makers, Metal and Plastic | 6.12 | Commercial and Industrial Designers | 6.40 |
| 15542 | MN | minnesota | Mathematical Technicians | 8.92 | Radio Operators | 6.99 |
| 16259 | MO | missouri | Rock Splitters, Quarry | 3.95 | Artists and Related Workers, All Other | 6.09 |
| 16989 | MS | mississippi | Upholsterers | 16.76 | Forest Fire Inspectors and Prevention Specialists | 17.76 |
| 17660 | MT | montana | Loading Machine Operators, Underground Mining | 21.91 | Forest and Conservation Technicians | 21.41 |
| 18244 | NC | north carolina | Textile Winding, Twisting, and Drawing Out Machine Setters, Operators, and Tenders | 8.88 | Textile Winding, Twisting, and Drawing Out Machine Setters, Operators, and Tenders | 8.53 |
| 18986 | ND | north dakota | Extraction Workers, All Other | 35.75 | Extraction Workers, All Other | 41.10 |
| 19534 | NE | nebraska | Dredge Operators | 13.99 | Meat, Poultry, and Fish Cutters and Trimmers | 10.26 |
| 20172 | NH | new hampshire | Forest Fire Inspectors and Prevention Specialists | 21.53 | Metal Workers and Plastic Workers, All Other | 8.22 |
| 20772 | NJ | new jersey | Mathematical Science Occupations, All Other | 5.03 | Shampooers | 4.87 |
| 21504 | NM | new mexico | Physical Scientists, All Other | 12.03 | Derrick Operators, Oil and Gas | 12.67 |
| 22139 | NV | nevada | Gaming Supervisors | 31.98 | Gaming Service Workers, All Other | 31.66 |
| 22761 | NY | new york | Fashion Designers | 6.34 | Fashion Designers | 6.00 |
| 23525 | OH | ohio | Foundry Mold and Coremakers | 3.54 | Engine and Other Machine Assemblers | 4.01 |
| 24273 | OK | oklahoma | Gaming Managers | 12.74 | Rotary Drill Operators, Oil and Gas | 10.95 |
| 24978 | OR | oregon | Logging Workers, All Other | 40.15 | Logging Workers, All Other | 44.24 |
| 25690 | PA | pennsylvania | Gas Compressor and Gas Pumping Station Operators | 4.66 | Nuclear Technicians | 3.79 |
| 26461 | RI | rhode island | Textile Bleaching and Dyeing Machine Operators and Tenders | 7.68 | Jewelers and Precious Stone and Metal Workers | 8.35 |
| 26983 | SC | south carolina | Tire Builders | 11.80 | Tire Builders | 12.77 |
| 27673 | SD | south dakota | Forest and Conservation Workers | 20.30 | Forest and Conservation Workers | 21.59 |
| 28228 | TN | tennessee | Patternmakers, Wood | 5.08 | Model Makers, Wood | 6.62 |
| 28969 | TX | texas | Petroleum Engineers | 6.84 | Petroleum Engineers | 6.17 |
| 29751 | UT | utah | Mine Cutting and Channeling Machine Operators | 8.36 | Telephone Operators | 13.95 |
| 30416 | VA | virginia | Mathematical Science Occupations, All Other | 8.07 | Mathematical Science Occupations, All Other | 8.59 |
| 31159 | VT | vermont | Model Makers, Wood | 20.16 | Floor Sanders and Finishers | 9.51 |
| 31671 | WA | washington | Aircraft Structure, Surfaces, Rigging, and Systems Assemblers | 15.68 | Aerospace Engineers | 5.95 |
| 32410 | WI | wisconsin | Animal Breeders | 10.02 | Floor Sanders and Finishers | 6.72 |
| 33151 | WV | west virginia | Mine Shuttle Car Operators | 76.87 | Roof Bolters, Mining | 79.86 |
| 33767 | WY | wyoming | Extraction Workers, All Other | 29.16 | Wellhead Pumpers | 31.78 |

Plot the Detail Occupations on the Map
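
The plotting code is not reproduced here. A minimal sketch of the general approach, assuming the `fifty_states` data shipped with `fiftystater` and toy fill values in place of the real location quotients, might look like this (the names `toy`, `label_pos`, and `p` are illustrative):

```r
library(ggplot2)
library(fiftystater)

data("fifty_states")  # polygons with Alaska and Hawaii moved below the mainland

# Toy fill values for every state id in the map data (not BLS figures).
state_ids <- unique(fifty_states$id)
toy <- data.frame(state = state_ids,
                  value = seq_along(state_ids),
                  stringsAsFactors = FALSE)

# Rough label anchors: the mean of each state's polygon vertices. Because they
# are computed from the repositioned coordinates, labels for Alaska and Hawaii
# follow the insets; manual nudging is still likely for irregular shapes.
label_pos <- aggregate(cbind(long, lat) ~ id, data = fifty_states, FUN = mean)

p <- ggplot(toy, aes(map_id = state)) +
  geom_map(aes(fill = value), map = fifty_states, colour = "white") +
  expand_limits(x = fifty_states$long, y = fifty_states$lat) +
  geom_text(data = label_pos, aes(x = long, y = lat, label = id),
            size = 2, inherit.aes = FALSE) +
  coord_quickmap() +
  theme_void()
```

Here the labels are simply the state names from the map data; in practice the label column would hold the state abbreviation and the shortened occupation title.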

Map 1: United States excluding Vermont, New Hampshire, Massachusetts, Rhode Island, Connecticut, New Jersey, Maryland, Delaware, District of Columbia (Bureau of Labor Statistics, U.S. Department of Labor, Occupational Employment Statistics - December 2016)


Map 2: Vermont, New Hampshire, Massachusetts, Rhode Island, Connecticut, New Jersey, Maryland, Delaware, District of Columbia (Bureau of Labor Statistics, U.S. Department of Labor, Occupational Employment Statistics - December 2016)
