Median annual wage for early-, mid-, and late-career workers by gender | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey | ||||||||||
occupation category1 | Under 352 | 35 to 502 | 50 plus2 | total jobs3 | n4 | obs5 | ||||
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week. | ||||||||||
1
IPUMS USA variable OCC2010, which is ''a harmonized occupation coding scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''
2
Darker colors indicated a larger wage gap between men and women.
3
Purple = more men than woman. Green = more women than men.
4
The weighted estimate of total full-time workers in the United States
5
The unweighted sample count
|
Public opinion researchers (and others) often want to display the interaction of 3 variables in a 3-way crosstab table. The gt
package (along with tidyr::pivot_wider
) make this once-tedious task easy. Here is an example of how to build one of these tables using U.S. Census microdata downloaded from IPUMS USA.
I’ve already summarized the individual-level microdata into two summary tables. Click here to see this process in detail.
The first table contains the median wages for workers by sex and age.
library(tidyverse)
library(gt)
income.sex.age <- read_csv("https://raw.githubusercontent.com/jdjohn215/2020-RStudio-Table-Contest/master/data/income_by_sex_and_age.csv")
## Parsed with column specification:
## cols(
## OCC2010 = col_character(),
## SEX = col_character(),
## age_category = col_character(),
## median_wage = col_double()
## )
head(income.sex.age)
## # A tibble: 6 x 4
## OCC2010 SEX age_category median_wage
## <chr> <chr> <chr> <dbl>
## 1 Accountants and Auditors Male Under 35 58218
## 2 Accountants and Auditors Male 35 to 50 84681
## 3 Accountants and Auditors Male 50 plus 84638
## 4 Accountants and Auditors Female Under 35 51232
## 5 Accountants and Auditors Female 35 to 50 61000
## 6 Accountants and Auditors Female 50 plus 60000
The second shows the total share of jobs by sex, as well as the total job count (weighted and unweighted).
total.jobs.by.sex <- read_csv("https://raw.githubusercontent.com/jdjohn215/2020-RStudio-Table-Contest/master/data/total_jobs_by_sex.csv")
## Parsed with column specification:
## cols(
## OCC2010 = col_character(),
## SEX = col_character(),
## pct = col_double(),
## n = col_double(),
## unweighted_n = col_double(),
## age_category = col_character()
## )
head(total.jobs.by.sex)
## # A tibble: 6 x 6
## OCC2010 SEX pct n unweighted_n age_category
## <chr> <chr> <dbl> <dbl> <dbl> <chr>
## 1 Accountants and Auditors Male 40.6 1.70e6 85524 total
## 2 Accountants and Auditors Female 59.4 1.70e6 85524 total
## 3 Chefs and Cooks Male 67.9 1.54e6 60069 total
## 4 Chefs and Cooks Female 32.1 1.54e6 60069 total
## 5 Chief executives and legislator… Male 76.1 1.11e6 57858 total
## 6 Chief executives and legislator… Female 23.9 1.11e6 57858 total
The rows in the table are ordered from most jobs to fewest. This correctly-ordered vector of occupation names will be useful later on.
# this is a vector of the occupation names in order of largest to smallest
top.jobs <- total.jobs.by.sex %>%
group_by(OCC2010, n) %>%
summarise() %>%
arrange(desc(n)) %>%
pull(OCC2010)
The first step is widening and combining the tables into a tibble
which will then be passed to gt
.
wage.tibble <- inner_join(
# pivot wider
income.sex.age %>%
pivot_wider(names_from = c(age_category, SEX), values_from = median_wage),
# pivot wider
total.jobs.by.sex %>%
pivot_wider(names_from = c(age_category, SEX), values_from = pct)
) %>%
# arrange columns in the appropriate order
select(OCC2010, starts_with("Under"), starts_with("35"),
starts_with("50"), `total jobs_Male` = total_Male,
`total jobs_Female` = total_Female, n, obs = unweighted_n) %>%
# make OCC2010 a factor and order from most jobs to fewest
mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
arrange(OCC2010)
wage.tibble
## # A tibble: 20 x 11
## OCC2010 `Under 35_Male` `Under 35_Femal… `35 to 50_Male` `35 to 50_Femal…
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Manage… 55000 50000 87094 74058
## 2 First-… 38087 31739 55330 42319
## 3 Driver… 31739 25000 42340 31421
## 4 Elemen… 44674 42319 58218 52899
## 5 Secret… 32797 30739 47132 37048
## 6 Regist… 60000 55043 79000 68769
## 7 Custom… 30000 27825 43035 35500
## 8 Retail… 29510 23400 45000 30739
## 9 Accoun… 58218 51232 84681 61000
## 10 Janito… 24090 19044 31421 21517
## 11 Labore… 25000 20947 33855 26184
## 12 Chefs … 22000 20000 28000 21000
## 13 Comput… 57605 52926 84681 72269
## 14 Constr… 26449 25616 31421 28580
## 15 Sales … 49750 45561 76213 63000
## 16 Softwa… 84000 75823 114163 94266
## 17 First-… 40985 36887 60000 47633
## 18 Nursin… 25404 23076 31739 26641
## 19 Chief … 79388 71725 136158 105798
## 20 Financ… 60747 44457 109974 69126
## # … with 6 more variables: `50 plus_Male` <dbl>, `50 plus_Female` <dbl>, `total
## # jobs_Male` <dbl>, `total jobs_Female` <dbl>, n <dbl>, obs <dbl>
Now that the tibble
is organized, we can turn it into a gt
object, specifying that the occupation name column should become the “stub.”
wage.gt.step1 <- gt(wage.tibble, rowname_col = "OCC2010")
wage.gt.step1 %>%
# this is just for rmarkdown display purposes
tab_options(container.height = px(500))
Under 35_Male | Under 35_Female | 35 to 50_Male | 35 to 50_Female | 50 plus_Male | 50 plus_Female | total jobs_Male | total jobs_Female | n | obs | |
---|---|---|---|---|---|---|---|---|---|---|
Managers, nec (including Postmasters) | 55000 | 50000 | 87094 | 74058 | 93102 | 74096 | 65.943157 | 34.056843 | 4055972 | 201003 |
First-Line Supervisors of Sales Workers | 38087 | 31739 | 55330 | 42319 | 56355 | 40985 | 61.257642 | 38.742358 | 3803101 | 178512 |
Driver/Sales Workers and Truck Drivers | 31739 | 25000 | 42340 | 31421 | 43377 | 31739 | 95.190974 | 4.809026 | 2986946 | 137864 |
Elementary and Middle School Teachers | 44674 | 42319 | 58218 | 52899 | 63511 | 57605 | 21.905729 | 78.094271 | 2978262 | 158680 |
Secretaries and Administrative Assistants | 32797 | 30739 | 47132 | 37048 | 50000 | 40541 | 5.746597 | 94.253403 | 2243032 | 118095 |
Registered Nurses | 60000 | 55043 | 79000 | 68769 | 81971 | 75154 | 12.989853 | 87.010147 | 2084173 | 102224 |
Customer Service Representatives | 30000 | 27825 | 43035 | 35500 | 46109 | 37297 | 35.246122 | 64.753878 | 1917096 | 84546 |
Retail Salespersons | 29510 | 23400 | 45000 | 30739 | 42319 | 29400 | 62.611065 | 37.388935 | 1742117 | 76833 |
Accountants and Auditors | 58218 | 51232 | 84681 | 61000 | 84638 | 60000 | 40.605670 | 59.394330 | 1698450 | 85524 |
Janitors and Building Cleaners | 24090 | 19044 | 31421 | 21517 | 32678 | 24333 | 74.464430 | 25.535570 | 1651418 | 74742 |
Laborers and Freight, Stock, and Material Movers, Hand | 25000 | 20947 | 33855 | 26184 | 37029 | 29638 | 83.063071 | 16.936929 | 1549053 | 69348 |
Chefs and Cooks | 22000 | 20000 | 28000 | 21000 | 28690 | 21594 | 67.916571 | 32.083429 | 1537931 | 60069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | 57605 | 52926 | 84681 | 72269 | 92217 | 75000 | 71.162270 | 28.837730 | 1505167 | 71645 |
Construction Laborers | 26449 | 25616 | 31421 | 28580 | 31739 | 31739 | 97.343698 | 2.656302 | 1446560 | 58108 |
Sales Representatives, Wholesale and Manufacturing | 49750 | 45561 | 76213 | 63000 | 76174 | 57605 | 73.881937 | 26.118063 | 1239644 | 61087 |
Software Developers, Applications and Systems Software | 84000 | 75823 | 114163 | 94266 | 115211 | 95218 | 80.966669 | 19.033331 | 1230100 | 58938 |
First-Line Supervisors of Office and Administrative Support Workers | 40985 | 36887 | 60000 | 47633 | 67626 | 50783 | 40.247629 | 59.752371 | 1197273 | 59351 |
Nursing, Psychiatric, and Home Health Aides | 25404 | 23076 | 31739 | 26641 | 32814 | 28690 | 14.315815 | 85.684185 | 1176992 | 48667 |
Chief executives and legislators/public administration | 79388 | 71725 | 136158 | 105798 | 147059 | 107587 | 76.092997 | 23.907003 | 1110708 | 57858 |
Financial Managers | 60747 | 44457 | 109974 | 69126 | 117833 | 70920 | 47.491407 | 52.508593 | 1098026 | 53799 |
Notice the the underscores (_
) added to the column names when we used pivot_wider
in step 1. These underscores indicate the column spanners. In this step, we’ll also format the data presentation in each column.
wage.gt.step2 <- wage.gt.step1 %>%
# create column spanners as already indicated in the column names by "_"
tab_spanner_delim(delim = "_") %>%
# format currency colums
fmt_currency(columns = c(2:7), decimals = 0, suffixing = T) %>%
# format percentage columns
fmt_percent(columns = 8:9, decimals = 0, scale_values = F) %>%
# format the weighted count as a rounded, and suffixed number
fmt_number(columns = 10, decimals = 1, suffixing = T) %>%
# format the actual sample count without rounding
fmt_number(columns = 11, decimals = 0)
wage.gt.step2 %>% tab_options(container.height = px(500))
Under 35 | 35 to 50 | 50 plus | total jobs | n | obs | |||||
---|---|---|---|---|---|---|---|---|---|---|
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Add explanatory text. The gt
package makes it easy to add titles, subtitles, footnotes, and data notes. We don’t need to specify footnote numbers, because gt
assigns them automatically based on their first appearance in the table. The md
function returns markdown formatted text.
wage.gt.step3 <- wage.gt.step2 %>%
# add explanatory text
tab_header(title = "Median annual wage for early-, mid-, and late-career workers by gender",
subtitle = md("includes all full-time workers in the top 20 most common occupational categories from the 2014-18 [American Community Survey](https://www.census.gov/programs-surveys/acs)")) %>%
tab_stubhead("occupation category") %>%
tab_source_note(md("**Data source:** Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas
and Matthew Sobek. *IPUMS USA: Version 10.0* [dataset]. Minneapolis, MN: IPUMS 2020. [https://doi.org/10.18128/D010.V10.0](https://doi.org/10.18128/D010.V10.0).
Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week.")) %>%
tab_footnote(footnote = md("IPUMS USA variable [*OCC2010*](https://usa.ipums.org/usa-action/variables/OCC2010#description_section), which is ''a harmonized occupation coding
scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''"),
locations = cells_stubhead()) %>%
tab_footnote(footnote = "The weighted estimate of total full-time workers in the United States",
locations = cells_column_labels("n")) %>%
tab_footnote(footnote = "The unweighted sample count",
locations = cells_column_labels("obs")) %>%
tab_options(heading.title.font.size = 24,
heading.title.font.weight = "bolder",
column_labels.font.weight = "bold")
wage.gt.step3 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey | ||||||||||
occupation category1 | Under 35 | 35 to 50 | 50 plus | total jobs | n2 | obs3 | ||||
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week. | ||||||||||
1
IPUMS USA variable OCC2010, which is ''a harmonized occupation coding
scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''
2
The weighted estimate of total full-time workers in the United States
3
The unweighted sample count
|
Out of the box, gt
provides the ability to target a column by name and rows by a logical condition for formatting. It doesn’t currently have the built-in ability to simply format a cell by applying a function the value contained in that cell. In this table, I want to do some something more complicated yet.
One interesting aspect of the table is the size of the gender wage gap in each age/occupation pair. This requires the viewer to do some mental math. To help the viewer compare wage gaps across the table, I want to shade each cell by the size of the wage gap between it and its partner cell.
Doing this takes several steps. First, make a tibble
with the same rows and columns that we want to format in our gt
object. Except, the values in the cells of this tibble
will be the wage gaps–the values we want to color the gt
cells by–not the median wages. Then, write a function which uses a for
loop to target each cell in the gt
object individually, coloring its background based on the value in the wage gap tibble
.
# first, calculate the size of the wage gap for each gender/age pair
wage.gap.values <- income.sex.age %>%
pivot_wider(names_from = SEX, values_from = median_wage) %>%
mutate(wage_gap = Male - Female)
# create a table with same column names and row order as the gt object
wage.gap.table <- wage.gap.values %>%
pivot_longer(cols = c("Male", "Female")) %>%
select(OCC2010, age_category, name, wage_gap) %>%
pivot_wider(names_from = c(age_category, name), values_from = wage_gap) %>%
# arrange from most jobs to fewest
mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
arrange(OCC2010)
First, I make a function which returns a color name in my chosen palette for each wage gap value. I create this function using the useful colorNumeric
function from the leaflet
package. Because all of the wage gaps have the same sign (men make more than women) a sequential color palette is appropriate. I’m using the “Purples” palette from ColorBrewer.
# borrow the colorNumeric function from {leaflet}
# this new function, purple_pal, returns a color for each value
purple_pal <- leaflet::colorNumeric(palette = "Purples",
domain = wage.gap.values$wage_gap)
purple_pal(wage.gap.values$wage_gap[10]) # for example
## [1] "#F2F0F7"
Some of the background colors are quite dark, so in these cases I need to switch the font color from black to white. To do this, I create two helper functions–luminance
calculates the brightness of each color and text_color
chooses between “black” or “white” based on that value. My luminance
function is based on the helpful discussion beneath this StackOverflow question.
luminance <- function(col){
# RGB values of the color
rgb.values <- col2rgb(col)
# calculate luminance
red.value <- rgb.values[1] * 0.299
green.value <- rgb.values[2] * 0.587
blue.value <- rgb.values[3] * 0.114
# add the values together and divide by 255 (the max value)
(red.value + green.value + blue.value)/255
}
luminance("orange") # for example
## [1] 0.6788235
# this function returns a text color based on luminance of background color
text_color <- function(color){
if(luminance(color) < 0.6){
"white"
} else {
"black"
}
}
text_color("orange") # for example
## [1] "black"
Finally, I combine all these helper functions into one function we can be applied to each relevant column in the gt
object. Thanks to Arun Kirshna for a helpful suggestion on this step.
# create a function which formats each cell individually
format_wage_gap <- function(gtobj, column){
# generate color for each cell in the column
color_value <- purple_pal(wage.gap.table %>% pull(sym(column)))
# for each row in the column, perform these formatting steps
for(i in seq_along(wage.tibble %>% pull(sym(column)))){
gtobj <- gtobj %>%
tab_style(style = list(
# fill the cell background
cell_fill(color = color_value[i]),
# color the text black or white depending on the luminence of the background color
cell_text(color = text_color(color_value[i]))
),
locations = cells_body(columns = column, rows = i))
}
gtobj
}
wage.gt.step4 <- wage.gt.step3 %>%
format_wage_gap("Under 35_Male") %>%
format_wage_gap("Under 35_Female") %>%
format_wage_gap("35 to 50_Male") %>%
format_wage_gap("35 to 50_Female") %>%
format_wage_gap("50 plus_Male") %>%
format_wage_gap("50 plus_Female")
wage.gt.step4 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey | ||||||||||
occupation category1 | Under 35 | 35 to 50 | 50 plus | total jobs | n2 | obs3 | ||||
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week. | ||||||||||
1
IPUMS USA variable OCC2010, which is ''a harmonized occupation coding
scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''
2
The weighted estimate of total full-time workers in the United States
3
The unweighted sample count
|
I also want to color the cell backgrounds of the job gap columns, but this time a diverging color palette is appropriate because some professions have more women than men and others have more men than women. I use another ColorBrewer palette, “PRGn,” because it is colorblind-safe.
# build the tibble of job gap values
job.gap.table <- total.jobs.by.sex %>%
pivot_wider(names_from = SEX, values_from = pct) %>%
mutate(job_gap = Male - Female) %>%
select(OCC2010, `total jobs_Male` = Male, `total jobs_Female` = Female, job_gap) %>%
mutate(`total jobs_Male` = job_gap, `total jobs_Female` = job_gap) %>%
select(-job_gap) %>%
mutate(OCC2010 = factor(OCC2010, levels = top.jobs)) %>%
arrange(OCC2010)
# use the PRGn diverging palette
purple_green_pal <- leaflet::colorNumeric(palette = "PRGn", domain = job.gap.table$`total jobs_Male`,
reverse = TRUE)
format_job_gap <- function(gtobj, column){
# generate color for each cell in the column
color_value <- purple_green_pal(job.gap.table %>% pull(sym(column)))
# for each row in the column, perform these formatting steps
for(i in seq_along(wage.tibble %>% pull(sym(column)))){
gtobj <- gtobj %>%
tab_style(style = list(
# fill the cell background
cell_fill(color = color_value[i]),
# color the text black or white depending on the luminence of the background color
cell_text(color = text_color(color_value[i]))
),
locations = cells_body(columns = column, rows = i))
}
gtobj
}
wage.gt.step5 <- wage.gt.step4 %>%
format_job_gap("total jobs_Male") %>%
format_job_gap("total jobs_Female")
wage.gt.step5 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey | ||||||||||
occupation category1 | Under 35 | 35 to 50 | 50 plus | total jobs | n2 | obs3 | ||||
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week. | ||||||||||
1
IPUMS USA variable OCC2010, which is ''a harmonized occupation coding
scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''
2
The weighted estimate of total full-time workers in the United States
3
The unweighted sample count
|
As a final step, I add some additional footnotes explaining the cell background colors. Notice how gt
automatically renumbers all the footnotes appropriately.
# add footnotes explaining the coloring of the cells
wage.gt.step6 <- wage.gt.step5 %>%
tab_footnote(footnote = "Darker colors indicated a larger wage gap between men and women.",
locations = cells_column_spanners(c("Under 35", "35 to 50", "50 plus"))) %>%
tab_footnote(footnote = "Purple = more men than woman. Green = more women than men.",
locations = cells_column_spanners("total jobs"))
wage.gt.step6 %>% tab_options(container.height = px(500))
Median annual wage for early-, mid-, and late-career workers by gender | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
includes all full-time workers in the top 20 most common occupational categories from the 2014-18 American Community Survey | ||||||||||
occupation category1 | Under 352 | 35 to 502 | 50 plus2 | total jobs3 | n4 | obs5 | ||||
Male | Female | Male | Female | Male | Female | Male | Female | |||
Managers, nec (including Postmasters) | $55K | $50K | $87K | $74K | $93K | $74K | 66% | 34% | 4.1M | 201,003 |
First-Line Supervisors of Sales Workers | $38K | $32K | $55K | $42K | $56K | $41K | 61% | 39% | 3.8M | 178,512 |
Driver/Sales Workers and Truck Drivers | $32K | $25K | $42K | $31K | $43K | $32K | 95% | 5% | 3.0M | 137,864 |
Elementary and Middle School Teachers | $45K | $42K | $58K | $53K | $64K | $58K | 22% | 78% | 3.0M | 158,680 |
Secretaries and Administrative Assistants | $33K | $31K | $47K | $37K | $50K | $41K | 6% | 94% | 2.2M | 118,095 |
Registered Nurses | $60K | $55K | $79K | $69K | $82K | $75K | 13% | 87% | 2.1M | 102,224 |
Customer Service Representatives | $30K | $28K | $43K | $36K | $46K | $37K | 35% | 65% | 1.9M | 84,546 |
Retail Salespersons | $30K | $23K | $45K | $31K | $42K | $29K | 63% | 37% | 1.7M | 76,833 |
Accountants and Auditors | $58K | $51K | $85K | $61K | $85K | $60K | 41% | 59% | 1.7M | 85,524 |
Janitors and Building Cleaners | $24K | $19K | $31K | $22K | $33K | $24K | 74% | 26% | 1.7M | 74,742 |
Laborers and Freight, Stock, and Material Movers, Hand | $25K | $21K | $34K | $26K | $37K | $30K | 83% | 17% | 1.5M | 69,348 |
Chefs and Cooks | $22K | $20K | $28K | $21K | $29K | $22K | 68% | 32% | 1.5M | 60,069 |
Computer Scientists and Systems Analysts/Network systems Analysts/Web Developers | $58K | $53K | $85K | $72K | $92K | $75K | 71% | 29% | 1.5M | 71,645 |
Construction Laborers | $26K | $26K | $31K | $29K | $32K | $32K | 97% | 3% | 1.4M | 58,108 |
Sales Representatives, Wholesale and Manufacturing | $50K | $46K | $76K | $63K | $76K | $58K | 74% | 26% | 1.2M | 61,087 |
Software Developers, Applications and Systems Software | $84K | $76K | $114K | $94K | $115K | $95K | 81% | 19% | 1.2M | 58,938 |
First-Line Supervisors of Office and Administrative Support Workers | $41K | $37K | $60K | $48K | $68K | $51K | 40% | 60% | 1.2M | 59,351 |
Nursing, Psychiatric, and Home Health Aides | $25K | $23K | $32K | $27K | $33K | $29K | 14% | 86% | 1.2M | 48,667 |
Chief executives and legislators/public administration | $79K | $72K | $136K | $106K | $147K | $108K | 76% | 24% | 1.1M | 57,858 |
Financial Managers | $61K | $44K | $110K | $69K | $118K | $71K | 47% | 53% | 1.1M | 53,799 |
Data source: Steven Ruggles, Sarah Flood, Ronald Goeken, Josia Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS 2020. https://doi.org/10.18128/D010.V10.0. Dollars are all in 2018 values. Full-time employment is defined as at least 37 hours per typical week. | ||||||||||
1
IPUMS USA variable OCC2010, which is ''a harmonized occupation coding
scheme based on the Census Bureau's 2010 ACS occupation classification scheme.''
2
Darker colors indicated a larger wage gap between men and women.
3
Purple = more men than woman. Green = more women than men.
4
The weighted estimate of total full-time workers in the United States
5
The unweighted sample count
|
The function gtsave
function offers several ways to save your table, including as a png
or pdf
. In this example, I save the output as an html
file.
gtsave(wage.gt.step4, "WageAgeGenderTable.html", inline_css = TRUE,
path = "/your/file/directory/")