Median annual wage for early-, mid-, and late-career workers by gender
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey
occupation category1 Under 352 35 to 502 50 plus2 total jobs3 n4 obs5
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.

1 IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''

2 Darker colors indicated a larger wage gap between men and women.

3 Purple = more men than woman. Green = more women than men.

4 The weighted estimate of total full-time workers in the United States

5 The unweighted sample count

 

Public opinion researchers (and others) often want to display the interaction of 3 variables in a 3-way crosstab table. The gt package (along with tidyr::pivot_wider) make this once-tedious task easy. Here is an example of how to build one of these tables using U.S. Census microdata downloaded from IPUMS USA.

Preparing the data

I’ve already summarized the individual-level microdata into two summary tables. Click here to see this process in detail.

The first table contains the median wages for workers by sex and age.

library(tidyverse)
library(gt)

income.sex.age <- read_csv("https://raw.githubusercontent.com/jdjohn215/2020-RStudio-Table-Contest/master/data/income_by_sex_and_age.csv")
## Parsed with column specification:
## cols(
##   OCC2010 = col_character(),
##   SEX = col_character(),
##   age_category = col_character(),
##   median_wage = col_double()
## )
head(income.sex.age)
## # A tibble: 6 x 4
##   OCC2010                  SEX    age_category median_wage
##   <chr>                    <chr>  <chr>              <dbl>
## 1 Accountants and Auditors Male   Under 35           58218
## 2 Accountants and Auditors Male   35 to 50           84681
## 3 Accountants and Auditors Male   50 plus            84638
## 4 Accountants and Auditors Female Under 35           51232
## 5 Accountants and Auditors Female 35 to 50           61000
## 6 Accountants and Auditors Female 50 plus            60000

The second shows the total share of jobs by sex, as well as the total job count (weighted and unweighted).

total.jobs.by.sex <- read_csv("https://raw.githubusercontent.com/jdjohn215/2020-RStudio-Table-Contest/master/data/total_jobs_by_sex.csv")
## Parsed with column specification:
## cols(
##   OCC2010 = col_character(),
##   SEX = col_character(),
##   pct = col_double(),
##   n = col_double(),
##   unweighted_n = col_double(),
##   age_category = col_character()
## )
head(total.jobs.by.sex)
## # A tibble: 6 x 6
##   OCC2010                          SEX      pct      n unweighted_n age_category
##   <chr>                            <chr>  <dbl>  <dbl>        <dbl> <chr>       
## 1 Accountants and Auditors         Male    40.6 1.70e6        85524 total       
## 2 Accountants and Auditors         Female  59.4 1.70e6        85524 total       
## 3 Chefs and Cooks                  Male    67.9 1.54e6        60069 total       
## 4 Chefs and Cooks                  Female  32.1 1.54e6        60069 total       
## 5 Chief executives and legislator… Male    76.1 1.11e6        57858 total       
## 6 Chief executives and legislator… Female  23.9 1.11e6        57858 total

The rows in the table are ordered from most jobs to fewest. This correctly-ordered vector of occupation names will be useful later on.

# this is a vector of the occupation names in order of largest to smallest
top.jobs <- total.jobs.by.sex %>%
  group_by(OCC2010, n) %>% 
  summarise() %>%
  arrange(desc(n)) %>%
  pull(OCC2010)

The first step is widening and combining the tables into a tibble which will then be passed to gt.

wage.tibble <- inner_join(
  # pivot wider
  income.sex.age %>%
    pivot_wider(names_from = c(age_category, SEX), values_from = median_wage),
  # pivot wider
  total.jobs.by.sex %>%
    pivot_wider(names_from = c(age_category, SEX), values_from = pct)
) %>%
  # arrange columns in the appropriate order
  select(OCC2010, starts_with("Under"), starts_with("35"), 
         starts_with("50"), `total jobs_Male` = total_Male, 
         `total jobs_Female` = total_Female, n, obs = unweighted_n) %>%
  # make OCC2010 a factor and order from most jobs to fewest
  mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
  arrange(OCC2010)

wage.tibble
## # A tibble: 20 x 11
##    OCC2010 `Under 35_Male` `Under 35_Femal… `35 to 50_Male` `35 to 50_Femal…
##    <fct>             <dbl>            <dbl>           <dbl>            <dbl>
##  1 Manage…           55000            50000           87094            74058
##  2 First-…           38087            31739           55330            42319
##  3 Driver…           31739            25000           42340            31421
##  4 Elemen…           44674            42319           58218            52899
##  5 Secret…           32797            30739           47132            37048
##  6 Regist…           60000            55043           79000            68769
##  7 Custom…           30000            27825           43035            35500
##  8 Retail…           29510            23400           45000            30739
##  9 Accoun…           58218            51232           84681            61000
## 10 Janito…           24090            19044           31421            21517
## 11 Labore…           25000            20947           33855            26184
## 12 Chefs …           22000            20000           28000            21000
## 13 Comput…           57605            52926           84681            72269
## 14 Constr…           26449            25616           31421            28580
## 15 Sales …           49750            45561           76213            63000
## 16 Softwa…           84000            75823          114163            94266
## 17 First-…           40985            36887           60000            47633
## 18 Nursin…           25404            23076           31739            26641
## 19 Chief …           79388            71725          136158           105798
## 20 Financ…           60747            44457          109974            69126
## # … with 6 more variables: `50 plus_Male` <dbl>, `50 plus_Female` <dbl>, `total
## #   jobs_Male` <dbl>, `total jobs_Female` <dbl>, n <dbl>, obs <dbl>

Using {gt}

Now that the tibble is organized, we can turn it into a gt object, specifying that the occupation name column should become the “stub.”

wage.gt.step1 <- gt(wage.tibble, rowname_col = "OCC2010")
wage.gt.step1 %>% 
  # this is just for rmarkdown display purposes
  tab_options(container.height = px(500))

Under 35_Male Under 35_Female 35 to 50_Male 35 to 50_Female 50 plus_Male 50 plus_Female total jobs_Male total jobs_Female n obs
Managers, nec (including Postmasters) 55000 50000 87094 74058 93102 74096 65.943157 34.056843 4055972 201003
First-Line Supervisors of Sales Workers 38087 31739 55330 42319 56355 40985 61.257642 38.742358 3803101 178512
Driver/Sales Workers and Truck Drivers 31739 25000 42340 31421 43377 31739 95.190974 4.809026 2986946 137864
Elementary and Middle School Teachers 44674 42319 58218 52899 63511 57605 21.905729 78.094271 2978262 158680
Secretaries and Administrative Assistants 32797 30739 47132 37048 50000 40541 5.746597 94.253403 2243032 118095
Registered Nurses 60000 55043 79000 68769 81971 75154 12.989853 87.010147 2084173 102224
Customer Service Representatives 30000 27825 43035 35500 46109 37297 35.246122 64.753878 1917096 84546
Retail Salespersons 29510 23400 45000 30739 42319 29400 62.611065 37.388935 1742117 76833
Accountants and Auditors 58218 51232 84681 61000 84638 60000 40.605670 59.394330 1698450 85524
Janitors and Building Cleaners 24090 19044 31421 21517 32678 24333 74.464430 25.535570 1651418 74742
Laborers and Freight, Stock, and Material Movers, Hand 25000 20947 33855 26184 37029 29638 83.063071 16.936929 1549053 69348
Chefs and Cooks 22000 20000 28000 21000 28690 21594 67.916571 32.083429 1537931 60069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers 57605 52926 84681 72269 92217 75000 71.162270 28.837730 1505167 71645
Construction Laborers 26449 25616 31421 28580 31739 31739 97.343698 2.656302 1446560 58108
Sales Representatives, Wholesale and Manufacturing 49750 45561 76213 63000 76174 57605 73.881937 26.118063 1239644 61087
Software Developers, Applications and Systems Software 84000 75823 114163 94266 115211 95218 80.966669 19.033331 1230100 58938
First-Line Supervisors of Office and Administrative Support Workers 40985 36887 60000 47633 67626 50783 40.247629 59.752371 1197273 59351
Nursing, Psychiatric, and Home Health Aides 25404 23076 31739 26641 32814 28690 14.315815 85.684185 1176992 48667
Chief executives and legislators/public administration 79388 71725 136158 105798 147059 107587 76.092997 23.907003 1110708 57858
Financial Managers 60747 44457 109974 69126 117833 70920 47.491407 52.508593 1098026 53799
 

Notice the the underscores (_) added to the column names when we used pivot_wider in step 1. These underscores indicate the column spanners. In this step, we’ll also format the data presentation in each column.

wage.gt.step2 <- wage.gt.step1 %>%
  # create column spanners as already indicated in the column names by "_"
  tab_spanner_delim(delim = "_") %>%
  # format currency colums
  fmt_currency(columns = c(2:7), decimals = 0, suffixing = T) %>%
  # format percentage columns
  fmt_percent(columns = 8:9, decimals = 0, scale_values = F) %>%
  # format the weighted count as a rounded, and suffixed number
  fmt_number(columns = 10, decimals = 1, suffixing = T) %>%
  # format the actual sample count without rounding
  fmt_number(columns = 11, decimals = 0)
wage.gt.step2 %>% tab_options(container.height = px(500))

Under 35 35 to 50 50 plus total jobs n obs
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
 

Add explanatory text. The gt package makes it easy to add titles, subtitles, footnotes, and data notes. We don’t need to specify footnote numbers, because gt assigns them automatically based on their first appearance in the table. The md function returns markdown formatted text.

wage.gt.step3 <- wage.gt.step2 %>%
  # add explanatory text
  tab_header(title = "Median annual wage for early-, mid-, and late-career workers by gender",
             subtitle = md("includes all full-time workers in the top 20 most common occupational categories from the 2014-18 [American Community Survey](https://www.census.gov/programs-surveys/acs)")) %>%
  tab_stubhead("occupation category") %>%
  tab_source_note(md("**Data source:** Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas
                     and Matthew Sobek. *IPUMS USA: Version 10.0* [dataset]. Minneapolis, MN: IPUMS 2020. [https://doi.org/10.18128/D010.V10.0](https://doi.org/10.18128/D010.V10.0).
                     Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.")) %>%
  tab_footnote(footnote = md("IPUMS USA variable [*OCC2010*](https://usa.ipums.org/usa-action/variables/OCC2010#description_section), which is ''a harmonized occupation coding
                  scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''"),
               locations = cells_stubhead()) %>%
  tab_footnote(footnote = "The weighted estimate of total full-time workers in the United States",
               locations = cells_column_labels("n")) %>%
  tab_footnote(footnote = "The unweighted sample count",
               locations = cells_column_labels("obs")) %>%
  tab_options(heading.title.font.size = 24,
              heading.title.font.weight = "bolder",
              column_labels.font.weight = "bold")
wage.gt.step3 %>% tab_options(container.height = px(500))

Median annual wage for early-, mid-, and late-career workers by gender
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey
occupation category1 Under 35 35 to 50 50 plus total jobs n2 obs3
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.

1 IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''

2 The weighted estimate of total full-time workers in the United States

3 The unweighted sample count

 

Formatting colors

Out of the box, gt provides the ability to target a column by name and rows by a logical condition for formatting. It doesn’t currently have the built-in ability to simply format a cell by applying a function the value contained in that cell. In this table, I want to do some something more complicated yet.

One interesting aspect of the table is the size of the gender wage gap in each age/occupation pair. This requires the viewer to do some mental math. To help the viewer compare wage gaps across the table, I want to shade each cell by the size of the wage gap between it and its partner cell.

Doing this takes several steps. First, make a tibble with the same rows and columns that we want to format in our gt object. Except, the values in the cells of this tibble will be the wage gaps–the values we want to color the gt cells by–not the median wages. Then, write a function which uses a for loop to target each cell in the gt object individually, coloring its background based on the value in the wage gap tibble.

step 1: calculate wage gaps

# first, calculate the size of the wage gap for each gender/age pair
wage.gap.values <- income.sex.age %>%
  pivot_wider(names_from = SEX, values_from = median_wage) %>%
  mutate(wage_gap = Male - Female)

# create a table with same column names and row order as the gt object
wage.gap.table <- wage.gap.values %>%
  pivot_longer(cols = c("Male", "Female")) %>%
  select(OCC2010, age_category, name, wage_gap) %>%
  pivot_wider(names_from = c(age_category, name), values_from = wage_gap) %>%
  # arrange from most jobs to fewest
  mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
  arrange(OCC2010)

step 2: write functions to define the cell colors

First, I make a function which returns a color name in my chosen palette for each wage gap value. I create this function using the useful colorNumeric function from the leaflet package. Because all of the wage gaps have the same sign (men make more than women) a sequential color palette is appropriate. I’m using the “Purples” palette from ColorBrewer.

# borrow the colorNumeric function from {leaflet}
# this new function, purple_pal, returns a color for each value
purple_pal <- leaflet::colorNumeric(palette = "Purples", 
                                    domain = wage.gap.values$wage_gap)
purple_pal(wage.gap.values$wage_gap[10]) # for example
## [1] "#F2F0F7"

Some of the background colors are quite dark, so in these cases I need to switch the font color from black to white. To do this, I create two helper functions–luminance calculates the brightness of each color and text_color chooses between “black” or “white” based on that value. My luminance function is based on the helpful discussion beneath this StackOverflow question.

luminance <- function(col){
  # RGB values of the color
  rgb.values <- col2rgb(col)
  
  # calculate luminance
  red.value <- rgb.values[1] * 0.299
  green.value <- rgb.values[2] * 0.587
  blue.value <- rgb.values[3] * 0.114
  
  # add the values together and divide by 255 (the max value)
  (red.value + green.value + blue.value)/255
}
luminance("orange") # for example
## [1] 0.6788235
# this function returns a text color based on luminance of background color
text_color <- function(color){
  if(luminance(color) < 0.6){
    "white"
  } else {
    "black"
  }
}
text_color("orange") # for example
## [1] "black"

Finally, I combine all these helper functions into one function we can be applied to each relevant column in the gt object. Thanks to Arun Kirshna for a helpful suggestion on this step.

# create a function which formats each cell individually
format_wage_gap <- function(gtobj, column){
  # generate color for each cell in the column
  color_value <- purple_pal(wage.gap.table %>% pull(sym(column)))
  
  # for each row in the column, perform these formatting steps
  for(i in seq_along(wage.tibble %>% pull(sym(column)))){
    gtobj <- gtobj %>%
      tab_style(style = list(
        # fill the cell background
        cell_fill(color = color_value[i]),
        # color the text black or white depending on the luminence of the background color
        cell_text(color = text_color(color_value[i]))
      ),
      locations = cells_body(columns = column, rows = i))
  }
  gtobj
}
wage.gt.step4 <- wage.gt.step3 %>%
  format_wage_gap("Under 35_Male") %>%
  format_wage_gap("Under 35_Female") %>%
  format_wage_gap("35 to 50_Male") %>%
  format_wage_gap("35 to 50_Female") %>%
  format_wage_gap("50 plus_Male") %>%
  format_wage_gap("50 plus_Female")
wage.gt.step4 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey
occupation category1 Under 35 35 to 50 50 plus total jobs n2 obs3
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.

1 IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''

2 The weighted estimate of total full-time workers in the United States

3 The unweighted sample count

step 3: repeat for job gaps

I also want to color the cell backgrounds of the job gap columns, but this time a diverging color palette is appropriate because some professions have more women than men and others have more men than women. I use another ColorBrewer palette, “PRGn,” because it is colorblind-safe.

# build the tibble of job gap values
job.gap.table <- total.jobs.by.sex %>%
  pivot_wider(names_from = SEX, values_from = pct) %>%
  mutate(job_gap = Male - Female) %>%
  select(OCC2010, `total jobs_Male` = Male, `total jobs_Female` = Female, job_gap) %>%
  mutate(`total jobs_Male` = job_gap, `total jobs_Female` = job_gap) %>%
  select(-job_gap) %>%
  mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
  arrange(OCC2010)

# use the PRGn diverging palette
purple_green_pal <- leaflet::colorNumeric(palette = "PRGn", domain = job.gap.table$`total jobs_Male`,
                                          reverse = TRUE)

format_job_gap <- function(gtobj, column){
  # generate color for each cell in the column
  color_value <- purple_green_pal(job.gap.table %>% pull(sym(column)))
  
  # for each row in the column, perform these formatting steps
  for(i in seq_along(wage.tibble %>% pull(sym(column)))){
    gtobj <- gtobj %>%
      tab_style(style = list(
        # fill the cell background
        cell_fill(color = color_value[i]),
        # color the text black or white depending on the luminence of the background color
        cell_text(color = text_color(color_value[i]))
      ),
      locations = cells_body(columns = column, rows = i))
  }
  gtobj
}

wage.gt.step5 <- wage.gt.step4 %>%
  format_job_gap("total jobs_Male") %>%
  format_job_gap("total jobs_Female")
wage.gt.step5 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey
occupation category1 Under 35 35 to 50 50 plus total jobs n2 obs3
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.

1 IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''

2 The weighted estimate of total full-time workers in the United States

3 The unweighted sample count

Final output

As a final step, I add some additional footnotes explaining the cell background colors. Notice how gt automatically renumbers all the footnotes appropriately.

# add footnotes explaining the coloring of the cells
wage.gt.step6 <- wage.gt.step5 %>%
  tab_footnote(footnote = "Darker colors indicated a larger wage gap between men and women.",
               locations = cells_column_spanners(c("Under 35", "35 to 50", "50 plus"))) %>%
  tab_footnote(footnote = "Purple = more men than woman. Green = more women than men.",
               locations = cells_column_spanners("total jobs"))
wage.gt.step6 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey
occupation category1 Under 352 35 to 502 50 plus2 total jobs3 n4 obs5
Male Female Male Female Male Female Male Female
Managers, nec (including Postmasters) $55K $50K $87K $74K $93K $74K 66% 34% 4.1M 201,003
First-Line Supervisors of Sales Workers $38K $32K $55K $42K $56K $41K 61% 39% 3.8M 178,512
Driver/Sales Workers and Truck Drivers $32K $25K $42K $31K $43K $32K 95% 5% 3.0M 137,864
Elementary and Middle School Teachers $45K $42K $58K $53K $64K $58K 22% 78% 3.0M 158,680
Secretaries and Administrative Assistants $33K $31K $47K $37K $50K $41K 6% 94% 2.2M 118,095
Registered Nurses $60K $55K $79K $69K $82K $75K 13% 87% 2.1M 102,224
Customer Service Representatives $30K $28K $43K $36K $46K $37K 35% 65% 1.9M 84,546
Retail Salespersons $30K $23K $45K $31K $42K $29K 63% 37% 1.7M 76,833
Accountants and Auditors $58K $51K $85K $61K $85K $60K 41% 59% 1.7M 85,524
Janitors and Building Cleaners $24K $19K $31K $22K $33K $24K 74% 26% 1.7M 74,742
Laborers and Freight, Stock, and Material Movers, Hand $25K $21K $34K $26K $37K $30K 83% 17% 1.5M 69,348
Chefs and Cooks $22K $20K $28K $21K $29K $22K 68% 32% 1.5M 60,069
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers $58K $53K $85K $72K $92K $75K 71% 29% 1.5M 71,645
Construction Laborers $26K $26K $31K $29K $32K $32K 97% 3% 1.4M 58,108
Sales Representatives, Wholesale and Manufacturing $50K $46K $76K $63K $76K $58K 74% 26% 1.2M 61,087
Software Developers, Applications and Systems Software $84K $76K $114K $94K $115K $95K 81% 19% 1.2M 58,938
First-Line Supervisors of Office and Administrative Support Workers $41K $37K $60K $48K $68K $51K 40% 60% 1.2M 59,351
Nursing, Psychiatric, and Home Health Aides $25K $23K $32K $27K $33K $29K 14% 86% 1.2M 48,667
Chief executives and legislators/public administration $79K $72K $136K $106K $147K $108K 76% 24% 1.1M 57,858
Financial Managers $61K $44K $110K $69K $118K $71K 47% 53% 1.1M 53,799
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.

1 IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''

2 Darker colors indicated a larger wage gap between men and women.

3 Purple = more men than woman. Green = more women than men.

4 The weighted estimate of total full-time workers in the United States

5 The unweighted sample count

saving the table

The function gtsave function offers several ways to save your table, including as a png or pdf. In this example, I save the output as an html file.

gtsave(wage.gt.step4, "WageAgeGenderTable.html", inline_css = TRUE,
       path = "/your/file/directory/")