Tables are often treated as lesser forms of data visualization, but that is unfair: well-designed tables can support data-driven narratives, and look dapper in the process. Smarter people than me have laid out some principles for creating effective tables - see Show Me the Numbers by Stephen Few, and this 2020 paper from Jonathan A. Schwabish (summarized on this thread), which include:
The {gt} package makes it easy to implement all of this in R, using the familiar syntax of the tidyverse. For a nice tutorial, I recommend the checking out The Mockup Blog.
I like working with Chicago 311 service request data, so this document makes some tables using those data, which all available from the Chicago Data Portal. The data in these tables includes service requests created on or between 31 Oct 2019 and 31 Oct 2020, and excludes the two most common requests, 311 information requests and aircraft noise complaints, because they are not associated with specific geographic locations in the available 311 request data: all information requests are associated with the main 311 office location, and all aircraft noise complaints are associated with airport locations.
library(tidyverse)
library(xml2)
library(rvest)
library(gt)
# read 311 data
raw_311 <- try(readRDS('./data/lastyear_311_requests.rds'), silent = TRUE)
if (class(raw_311) == 'try-error') {
message('no local version of 311 data, grabbing from github...')
raw_311 <- readRDS(
url("https://github.com/jimtheflash/rstudio.tablecontest.2020/raw/main/data/lastyear_311_requests.rds","rb")
)
}
# get CA lookup for merging
ca_lu <- read_html('https://en.wikipedia.org/wiki/Community_areas_in_Chicago') %>%
html_table() %>%
`[[`(1) %>%
transmute(community_area = as.numeric(`Number[8]`),
ca_name = `Name[8]`,
ca_sq_mi = as.numeric(`Area (sq mi.)[10]`))
# read latest CA population data
raw_ca <- read_csv("https://github.com/jimtheflash/rstudio.tablecontest.2020/raw/main/data/ReferenceCCAProfiles20142018.csv") %>%
transmute(ca_name = GEOG,
ca_est_pop = floor(as.numeric(TOT_POP)))
# tidy and merge for tables
tidy_311 <- raw_311 %>%
# get rid of the top 2 types of service requests
filter(sr_type != '311 INFORMATION ONLY CALL',
sr_type != 'Aircraft Noise Complaint') %>%
mutate(grouping_sr_number = if_else(parent_sr_number == "", sr_number, parent_sr_number)) %>%
select(grouping_sr_number,
sr_number,
sr_type,
created_date,
community_area) %>%
group_by(grouping_sr_number) %>%
mutate(distinct_requests_by_grouping_sr_number = n_distinct(sr_number)) %>%
ungroup() %>%
inner_join(ca_lu, by = 'community_area') %>%
inner_join(raw_ca, by = "ca_name")
This table summarizes the most frequent service requests by type, so we can identify the most frequent types of 311 service requests. As explained in the table, a service request record arises when an individual contacts 311 to report an issue; complaints that reference the same issue are assigned the same parent issue. For instance, if there are three service requests about the same abandoned vehicle, that would be reflected in the tables below as three service requests, and one parent issue.
tidy_311 %>%
group_by(`Service Request Type` = sr_type) %>%
summarise(`Total Requests` = n_distinct(sr_number),
`Parent Issues` = n_distinct(grouping_sr_number)) %>%
mutate(`Requests Per Parent Issue` = `Total Requests` / `Parent Issues`) %>%
filter(`Total Requests` >= 10000) %>%
arrange(desc(`Total Requests`)) %>%
gt() %>%
tab_style(
style = cell_text(weight = "bold"),
locations = list(cells_title(),
cells_column_labels(columns = gt::everything())
)) %>%
tab_header(title = "Most Frequent 311 Service Requests in Chicago",
subtitle = "Includes request types with at least 10,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
opt_align_table_header("left") %>%
tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
locations = cells_title(groups = "subtitle")) %>%
tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
locations = cells_column_labels(vars(`Total Requests`))) %>%
tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
locations = cells_column_labels(vars(`Parent Issues`))) %>%
tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
fmt_number(columns = vars(`Total Requests`, `Parent Issues`),
decimals = 0) %>%
fmt_number(columns = vars(`Requests Per Parent Issue`),
decimals = 3) %>%
cols_align(align = 'right', columns = vars(`Total Requests`, `Parent Issues`, `Requests Per Parent Issue`)) %>%
opt_row_striping()
| Most Frequent 311 Service Requests in Chicago | |||
|---|---|---|---|
| Includes request types with at least 10,000 total requests from Oct 31 2019 through Oct 31 20201 | |||
| Service Request Type | Total Requests2 | Parent Issues3 | Requests Per Parent Issue |
| Graffiti Removal Request | 99,725 | 97,064 | 1.027 |
| Street Light Out Complaint | 77,764 | 35,390 | 2.197 |
| Garbage Cart Maintenance | 55,997 | 51,182 | 1.094 |
| Weed Removal Request | 53,376 | 51,353 | 1.039 |
| Rodent Baiting/Rat Complaint | 51,807 | 49,534 | 1.046 |
| Pothole in Street Complaint | 41,493 | 32,862 | 1.263 |
| Tree Trim Request | 38,547 | 33,688 | 1.144 |
| Tree Debris Clean-Up Request | 34,085 | 31,479 | 1.083 |
| Sign Repair Request - All Other Signs | 32,930 | 31,370 | 1.050 |
| Abandoned Vehicle Complaint | 31,865 | 31,865 | 1.000 |
| Traffic Signal Out Complaint | 21,338 | 16,449 | 1.297 |
| Alley Light Out Complaint | 19,385 | 11,198 | 1.731 |
| Building Violation | 18,563 | 16,223 | 1.144 |
| Tree Removal Request | 15,758 | 14,272 | 1.104 |
| Sanitation Code Violation | 14,464 | 13,867 | 1.043 |
| Sewer Cleaning Inspection Request | 12,399 | 11,405 | 1.087 |
| Dead Animal Pick-Up Request | 12,295 | 10,170 | 1.209 |
| Fly Dumping Complaint | 10,796 | 10,057 | 1.073 |
| Business Complaints / Reopening Issue | 10,779 | 9,449 | 1.141 |
| Stray Animal Complaint | 10,027 | 8,988 | 1.116 |
| 311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy | |||
|
1
311 information requests and complaints about aircraft noise were excluded
2
Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)
3
Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)
|
|||
\(~\)
Nothing fancy here, or necessary. The table header, footnotes, and source information are easy to read and don’t distract from the rest of the data.
This table summarizes 311 requests by community areas, so we can identify which community areas are submitting the most service requests. I thought total requests per person would be interesting as a way to gauge how frequently residents of an area were engaging with the 311 service, and I thought that parent requests per square mile might be an interesting way of estimating how many issues need addressing across community areas. This table is sorted by the Total Requests column.
tidy_311 %>%
group_by(`Community Area` = ca_name) %>%
summarise(`Est. Population` = max(ca_est_pop),
`Area (Square Miles)` = max(ca_sq_mi),
`Total Requests` = n_distinct(sr_number),
`Parent Issues` = n_distinct(grouping_sr_number)) %>%
mutate(`Requests Per Person` = `Total Requests` / `Est. Population`,
`Parent Issues Per Sq. Mi.` = `Parent Issues` / `Area (Square Miles)`) %>%
filter(`Total Requests` >= 15000) %>%
arrange(desc(`Total Requests`)) %>%
gt() %>%
tab_style(
style = cell_text(size = 'small'),
locations = list(cells_column_labels(columns = gt::everything()),
cells_body())
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = list(cells_title(),
cells_column_labels(columns = gt::everything())
)) %>%
tab_header(title = "311 Service Requests By Chicago Community Area",
subtitle = "Includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
opt_align_table_header("left") %>%
tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
locations = cells_title(groups = "subtitle")) %>%
tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
locations = cells_column_labels(vars(`Total Requests`))) %>%
tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
locations = cells_column_labels(vars(`Parent Issues`))) %>%
tab_source_note("Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e") %>%
tab_source_note("Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago") %>%
tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
tab_options(footnotes.font.size = "small",
source_notes.font.size = "small") %>%
fmt_number(columns = vars(`Est. Population`, `Total Requests`, `Parent Issues`, `Parent Issues Per Sq. Mi.`),
decimals = 0) %>%
fmt_number(columns = vars(`Area (Square Miles)`),
decimals = 2) %>%
fmt_number(columns = vars(`Requests Per Person`),
decimals = 3) %>%
cols_align(align = "right",
columns = vars(`Area (Square Miles)`, `Est. Population`, `Total Requests`, `Parent Issues`, `Requests Per Person`,`Parent Issues Per Sq. Mi.`)) %>%
opt_row_striping()
| 311 Service Requests By Chicago Community Area | ||||||
|---|---|---|---|---|---|---|
| Includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 20201 | ||||||
| Community Area | Est. Population | Area (Square Miles) | Total Requests2 | Parent Issues3 | Requests Per Person | Parent Issues Per Sq. Mi. |
| West Town | 84,255 | 4.58 | 29,211 | 25,653 | 0.347 | 5,601 |
| Austin | 94,762 | 7.15 | 28,906 | 24,917 | 0.305 | 3,485 |
| Logan Square | 72,724 | 3.59 | 25,321 | 22,715 | 0.348 | 6,327 |
| West Englewood | 28,236 | 3.15 | 20,996 | 19,158 | 0.744 | 6,082 |
| Portage Park | 64,954 | 3.95 | 20,374 | 17,316 | 0.314 | 4,384 |
| Belmont Cragin | 80,648 | 3.91 | 20,143 | 17,519 | 0.250 | 4,481 |
| Irving Park | 53,665 | 3.21 | 20,025 | 17,086 | 0.373 | 5,323 |
| Roseland | 41,106 | 4.82 | 19,681 | 16,252 | 0.479 | 3,372 |
| South Lawndale | 74,943 | 4.59 | 18,960 | 16,095 | 0.253 | 3,507 |
| New City | 39,463 | 4.83 | 18,925 | 16,552 | 0.480 | 3,427 |
| Auburn Gresham | 45,271 | 3.77 | 18,120 | 15,609 | 0.400 | 4,140 |
| Lake View | 100,547 | 3.12 | 17,696 | 15,990 | 0.176 | 5,125 |
| Englewood | 23,792 | 3.07 | 17,453 | 16,040 | 0.734 | 5,225 |
| Humboldt Park | 56,161 | 3.60 | 16,917 | 14,855 | 0.301 | 4,126 |
| Near West Side | 62,733 | 5.69 | 15,986 | 14,059 | 0.255 | 2,471 |
| West Ridge | 77,212 | 3.53 | 15,838 | 13,791 | 0.205 | 3,907 |
| Chicago Lawn | 52,003 | 3.53 | 15,717 | 13,504 | 0.302 | 3,825 |
| Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e | ||||||
| Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago | ||||||
| 311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy | ||||||
|
1
311 information requests and complaints about aircraft noise were excluded
2
Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)
3
Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)
|
||||||
\(~\)
This table, despite being fairly long, is still pretty digestible. While I think the default sorting (Total Requests descending) is sensible, it is also feasible that readers would want to sort by other criteria. In these cases, a table made with the {reactable} or {DT} libraries, which have sortable columns by default, may be a better option. And obviously an interactive map would be a reasonable way to visualize these data!
This table summarizes the top five service requests types by community area, sorted by community area population. This makes it possible to identify community areas with unique mixes of service request types.
tidy_311 %>%
group_by(ca_name) %>%
filter(n_distinct(sr_number) >= 15000) %>%
ungroup() %>%
group_by(`Community Area` = ca_name, `Service Request Type` = sr_type) %>%
summarise(pop = max(ca_est_pop),
`Total Requests` = n_distinct(sr_number),
`Parent Issues` = n_distinct(grouping_sr_number)) %>%
arrange(desc(pop), desc(`Total Requests`)) %>%
mutate(rank = row_number()) %>%
filter(rank <= 5) %>%
select(-pop, -rank) %>%
gt() %>%
tab_style(
style = cell_text(size = "medium"),
locations = cells_title()
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = list(cells_title(),
cells_column_labels(columns = gt::everything()),
cells_row_groups()
)) %>%
tab_style(
style = cell_fill(alpha = 0.5),
locations = cells_row_groups()
) %>%
tab_header(title = "Top 311 Service Request Types by Chicago Community Area",
subtitle = "Sorted by Community Area population; includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
opt_align_table_header("left") %>%
tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
locations = cells_title(groups = "subtitle")) %>%
tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
locations = cells_column_labels(vars(`Total Requests`))) %>%
tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
locations = cells_column_labels(vars(`Parent Issues`))) %>%
tab_source_note("Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e") %>%
tab_source_note("Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago") %>%
tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
tab_options(footnotes.font.size = "small",
source_notes.font.size = "small") %>%
fmt_number(columns = vars(`Total Requests`, `Parent Issues`),
decimals = 0) %>%
cols_align(align = "right", columns = vars(`Total Requests`, `Parent Issues`))
| Top 311 Service Request Types by Chicago Community Area | ||
|---|---|---|
| Sorted by Community Area population; includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 20201 | ||
| Service Request Type | Total Requests2 | Parent Issues3 |
| Lake View | ||
| Graffiti Removal Request | 3,343 | 3,275 |
| Rodent Baiting/Rat Complaint | 2,097 | 1,997 |
| Sign Repair Request - All Other Signs | 840 | 775 |
| Tree Trim Request | 831 | 721 |
| Tree Debris Clean-Up Request | 725 | 680 |
| Austin | ||
| Garbage Cart Maintenance | 2,942 | 2,638 |
| Street Light Out Complaint | 2,578 | 1,290 |
| Weed Removal Request | 2,098 | 2,066 |
| Abandoned Vehicle Complaint | 1,691 | 1,691 |
| Tree Debris Clean-Up Request | 1,555 | 1,427 |
| West Town | ||
| Graffiti Removal Request | 7,494 | 7,298 |
| Rodent Baiting/Rat Complaint | 2,907 | 2,763 |
| Street Light Out Complaint | 2,174 | 1,035 |
| Garbage Cart Maintenance | 1,375 | 1,256 |
| Tree Trim Request | 1,320 | 1,156 |
| Belmont Cragin | ||
| Graffiti Removal Request | 2,613 | 2,515 |
| Garbage Cart Maintenance | 1,984 | 1,829 |
| Tree Trim Request | 1,862 | 1,684 |
| Rodent Baiting/Rat Complaint | 1,763 | 1,714 |
| Street Light Out Complaint | 1,729 | 990 |
| West Ridge | ||
| Rodent Baiting/Rat Complaint | 1,539 | 1,477 |
| Graffiti Removal Request | 1,411 | 1,381 |
| Street Light Out Complaint | 1,125 | 604 |
| Pothole in Street Complaint | 990 | 744 |
| Garbage Cart Maintenance | 935 | 857 |
| South Lawndale | ||
| Graffiti Removal Request | 4,860 | 4,741 |
| Street Light Out Complaint | 1,701 | 858 |
| Open Fire Hydrant Complaint | 1,252 | 598 |
| Garbage Cart Maintenance | 1,223 | 1,129 |
| Rodent Baiting/Rat Complaint | 1,099 | 1,066 |
| Logan Square | ||
| Graffiti Removal Request | 7,134 | 6,960 |
| Rodent Baiting/Rat Complaint | 2,140 | 2,068 |
| Street Light Out Complaint | 1,584 | 857 |
| Garbage Cart Maintenance | 1,190 | 1,094 |
| Pothole in Street Complaint | 1,050 | 783 |
| Portage Park | ||
| Street Light Out Complaint | 2,413 | 1,305 |
| Rodent Baiting/Rat Complaint | 2,122 | 2,027 |
| Graffiti Removal Request | 1,583 | 1,541 |
| Garbage Cart Maintenance | 1,484 | 1,367 |
| Tree Trim Request | 1,290 | 1,134 |
| Near West Side | ||
| Graffiti Removal Request | 2,594 | 2,559 |
| Pothole in Street Complaint | 1,254 | 940 |
| Traffic Signal Out Complaint | 1,191 | 981 |
| Street Light Out Complaint | 1,128 | 628 |
| Sign Repair Request - All Other Signs | 1,116 | 1,021 |
| Humboldt Park | ||
| Graffiti Removal Request | 2,366 | 2,274 |
| Garbage Cart Maintenance | 1,605 | 1,456 |
| Weed Removal Request | 1,447 | 1,408 |
| Rodent Baiting/Rat Complaint | 1,238 | 1,188 |
| Street Light Out Complaint | 1,066 | 563 |
| Irving Park | ||
| Graffiti Removal Request | 3,450 | 3,289 |
| Street Light Out Complaint | 1,684 | 820 |
| Rodent Baiting/Rat Complaint | 1,595 | 1,500 |
| Pothole in Street Complaint | 1,206 | 755 |
| Tree Debris Clean-Up Request | 1,149 | 1,059 |
| Chicago Lawn | ||
| Graffiti Removal Request | 1,759 | 1,703 |
| Garbage Cart Maintenance | 1,745 | 1,577 |
| Street Light Out Complaint | 1,470 | 649 |
| Pothole in Street Complaint | 838 | 702 |
| Rodent Baiting/Rat Complaint | 829 | 777 |
| Auburn Gresham | ||
| Weed Removal Request | 1,926 | 1,853 |
| Garbage Cart Maintenance | 1,832 | 1,653 |
| Street Light Out Complaint | 1,807 | 791 |
| Rodent Baiting/Rat Complaint | 1,185 | 1,140 |
| Sign Repair Request - All Other Signs | 891 | 881 |
| Roseland | ||
| Street Light Out Complaint | 2,512 | 1,020 |
| Weed Removal Request | 1,932 | 1,629 |
| Garbage Cart Maintenance | 1,896 | 1,700 |
| Tree Debris Clean-Up Request | 1,312 | 1,191 |
| Rodent Baiting/Rat Complaint | 1,044 | 962 |
| New City | ||
| Weed Removal Request | 3,572 | 3,524 |
| Graffiti Removal Request | 2,580 | 2,505 |
| Street Light Out Complaint | 1,631 | 726 |
| Open Fire Hydrant Complaint | 1,276 | 764 |
| Garbage Cart Maintenance | 1,025 | 918 |
| West Englewood | ||
| Weed Removal Request | 7,634 | 7,390 |
| Garbage Cart Maintenance | 1,489 | 1,347 |
| Rodent Baiting/Rat Complaint | 1,275 | 1,227 |
| Street Light Out Complaint | 1,121 | 563 |
| Tree Debris Clean-Up Request | 805 | 761 |
| Englewood | ||
| Weed Removal Request | 7,671 | 7,514 |
| Street Light Out Complaint | 873 | 457 |
| Rodent Baiting/Rat Complaint | 805 | 768 |
| Garbage Cart Maintenance | 794 | 723 |
| Sign Repair Request - All Other Signs | 519 | 515 |
| Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e | ||
| Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago | ||
| 311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy | ||
|
1
311 information requests and complaints about aircraft noise were excluded
2
Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)
3
Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)
|
||
\(~\)
This is a lot cleaner than I expected it to be! This table is very long, even after filtering to a subset of community areas; it would be nice to be able to page between community area lists (which may be feasible with {reactable} or {DT}), but for a website or blog this formatting is still digestible, I think.
Building well-designed tables in R is easy with the tidyverse and {gt} library. These tables had clear titles, footnotes, and source information; used striping, column alignments, and precision to facilitate readability; and minimized non-data ink to reduce distraction and focus attention on the data.
These tables were certainly not perfect, however. As noted above, these tables are pretty long; making them sortable or filterable would certainly enable readers to access information more quickly and easily. In addition, some of these data would be better presented in a map. But the goal here was to use {gt} to make tables that wouldn’t look out of place in a professional publication, and I think mission accomplished here.