Overview

Tables are often treated as lesser forms of data visualization, but that is unfair: well-designed tables can support data-driven narratives, and look dapper in the process. Smarter people than me have laid out some principles for creating effective tables - see Show Me the Numbers by Stephen Few, and this 2020 paper from Jonathan A. Schwabish (summarized on this thread), which include:

The {gt} package makes it easy to implement all of this in R, using the familiar syntax of the tidyverse. For a nice tutorial, I recommend the checking out The Mockup Blog.

I like working with Chicago 311 service request data, so this document makes some tables using those data, which all available from the Chicago Data Portal. The data in these tables includes service requests created on or between 31 Oct 2019 and 31 Oct 2020, and excludes the two most common requests, 311 information requests and aircraft noise complaints, because they are not associated with specific geographic locations in the available 311 request data: all information requests are associated with the main 311 office location, and all aircraft noise complaints are associated with airport locations.

library(tidyverse)
library(xml2)
library(rvest)
library(gt)

# read 311 data

raw_311 <- try(readRDS('./data/lastyear_311_requests.rds'), silent = TRUE)

if (class(raw_311) == 'try-error') {
  message('no local version of 311 data, grabbing from github...')
  raw_311 <- readRDS(
  url("https://github.com/jimtheflash/rstudio.tablecontest.2020/raw/main/data/lastyear_311_requests.rds","rb")
  )
}

# get CA lookup for merging
ca_lu <- read_html('https://en.wikipedia.org/wiki/Community_areas_in_Chicago') %>%
  html_table() %>%
  `[[`(1) %>%
  transmute(community_area = as.numeric(`Number[8]`),
            ca_name = `Name[8]`,
            ca_sq_mi = as.numeric(`Area (sq mi.)[10]`))

# read latest CA population data
raw_ca <- read_csv("https://github.com/jimtheflash/rstudio.tablecontest.2020/raw/main/data/ReferenceCCAProfiles20142018.csv") %>%
  transmute(ca_name = GEOG,
            ca_est_pop = floor(as.numeric(TOT_POP)))

# tidy and merge for tables
tidy_311 <- raw_311 %>%
  # get rid of the top 2 types of service requests
  filter(sr_type != '311 INFORMATION ONLY CALL',
         sr_type != 'Aircraft Noise Complaint') %>%
  mutate(grouping_sr_number = if_else(parent_sr_number == "", sr_number, parent_sr_number)) %>%
  select(grouping_sr_number,
         sr_number,
         sr_type,
         created_date,
         community_area) %>%
  group_by(grouping_sr_number) %>%
  mutate(distinct_requests_by_grouping_sr_number = n_distinct(sr_number)) %>%
  ungroup() %>%
  inner_join(ca_lu, by = 'community_area') %>%
  inner_join(raw_ca, by = "ca_name")

Most Frequent Service Requests

This table summarizes the most frequent service requests by type, so we can identify the most frequent types of 311 service requests. As explained in the table, a service request record arises when an individual contacts 311 to report an issue; complaints that reference the same issue are assigned the same parent issue. For instance, if there are three service requests about the same abandoned vehicle, that would be reflected in the tables below as three service requests, and one parent issue.

tidy_311 %>%
  group_by(`Service Request Type` = sr_type) %>%
  summarise(`Total Requests` = n_distinct(sr_number),
            `Parent Issues` = n_distinct(grouping_sr_number)) %>%
  mutate(`Requests Per Parent Issue` = `Total Requests` / `Parent Issues`) %>%
  filter(`Total Requests` >= 10000) %>%
  arrange(desc(`Total Requests`)) %>%
  gt() %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = list(cells_title(),
                     cells_column_labels(columns = gt::everything())
  )) %>%
  tab_header(title = "Most Frequent 311 Service Requests in Chicago",
             subtitle = "Includes request types with at least 10,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
  opt_align_table_header("left") %>%
  tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
               locations = cells_title(groups = "subtitle")) %>%
  tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
               locations = cells_column_labels(vars(`Total Requests`))) %>%
  tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
               locations = cells_column_labels(vars(`Parent Issues`))) %>%
  tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
  fmt_number(columns = vars(`Total Requests`, `Parent Issues`),
             decimals = 0) %>%
  fmt_number(columns = vars(`Requests Per Parent Issue`),
             decimals = 3) %>%
  cols_align(align = 'right', columns = vars(`Total Requests`, `Parent Issues`, `Requests Per Parent Issue`)) %>%
  opt_row_striping()
Most Frequent 311 Service Requests in Chicago
Includes request types with at least 10,000 total requests from Oct 31 2019 through Oct 31 20201
Service Request Type Total Requests2 Parent Issues3 Requests Per Parent Issue
Graffiti Removal Request 99,725 97,064 1.027
Street Light Out Complaint 77,764 35,390 2.197
Garbage Cart Maintenance 55,997 51,182 1.094
Weed Removal Request 53,376 51,353 1.039
Rodent Baiting/Rat Complaint 51,807 49,534 1.046
Pothole in Street Complaint 41,493 32,862 1.263
Tree Trim Request 38,547 33,688 1.144
Tree Debris Clean-Up Request 34,085 31,479 1.083
Sign Repair Request - All Other Signs 32,930 31,370 1.050
Abandoned Vehicle Complaint 31,865 31,865 1.000
Traffic Signal Out Complaint 21,338 16,449 1.297
Alley Light Out Complaint 19,385 11,198 1.731
Building Violation 18,563 16,223 1.144
Tree Removal Request 15,758 14,272 1.104
Sanitation Code Violation 14,464 13,867 1.043
Sewer Cleaning Inspection Request 12,399 11,405 1.087
Dead Animal Pick-Up Request 12,295 10,170 1.209
Fly Dumping Complaint 10,796 10,057 1.073
Business Complaints / Reopening Issue 10,779 9,449 1.141
Stray Animal Complaint 10,027 8,988 1.116
311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy

1 311 information requests and complaints about aircraft noise were excluded

2 Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)

3 Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)

\(~\)

Nothing fancy here, or necessary. The table header, footnotes, and source information are easy to read and don’t distract from the rest of the data.

Service Requests And Community Areas

This table summarizes 311 requests by community areas, so we can identify which community areas are submitting the most service requests. I thought total requests per person would be interesting as a way to gauge how frequently residents of an area were engaging with the 311 service, and I thought that parent requests per square mile might be an interesting way of estimating how many issues need addressing across community areas. This table is sorted by the Total Requests column.

tidy_311 %>%
  group_by(`Community Area` = ca_name) %>%
  summarise(`Est. Population` = max(ca_est_pop),
            `Area (Square Miles)` = max(ca_sq_mi),
            `Total Requests` = n_distinct(sr_number),
            `Parent Issues` = n_distinct(grouping_sr_number)) %>%
  mutate(`Requests Per Person` = `Total Requests` / `Est. Population`,
         `Parent Issues Per Sq. Mi.` = `Parent Issues` / `Area (Square Miles)`) %>%
  filter(`Total Requests` >= 15000) %>%
  arrange(desc(`Total Requests`)) %>%
  gt() %>%
  tab_style(
    style = cell_text(size = 'small'),
    locations = list(cells_column_labels(columns = gt::everything()),
                     cells_body())
  ) %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = list(cells_title(),
                     cells_column_labels(columns = gt::everything())
  )) %>%
  tab_header(title = "311 Service Requests By Chicago Community Area",
             subtitle = "Includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
  opt_align_table_header("left") %>%
  tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
               locations = cells_title(groups = "subtitle")) %>%
  tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
               locations = cells_column_labels(vars(`Total Requests`))) %>%
  tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
               locations = cells_column_labels(vars(`Parent Issues`))) %>%
  tab_source_note("Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e") %>%
  tab_source_note("Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago") %>%
  tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
  tab_options(footnotes.font.size = "small",
              source_notes.font.size = "small") %>%
  fmt_number(columns = vars(`Est. Population`, `Total Requests`, `Parent Issues`, `Parent Issues Per Sq. Mi.`),
             decimals = 0) %>%
  fmt_number(columns = vars(`Area (Square Miles)`),
             decimals = 2) %>%
  fmt_number(columns = vars(`Requests Per Person`),
             decimals = 3) %>%
  cols_align(align = "right", 
             columns = vars(`Area (Square Miles)`, `Est. Population`, `Total Requests`, `Parent Issues`, `Requests Per Person`,`Parent Issues Per Sq. Mi.`)) %>%
  opt_row_striping()
311 Service Requests By Chicago Community Area
Includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 20201
Community Area Est. Population Area (Square Miles) Total Requests2 Parent Issues3 Requests Per Person Parent Issues Per Sq. Mi.
West Town 84,255 4.58 29,211 25,653 0.347 5,601
Austin 94,762 7.15 28,906 24,917 0.305 3,485
Logan Square 72,724 3.59 25,321 22,715 0.348 6,327
West Englewood 28,236 3.15 20,996 19,158 0.744 6,082
Portage Park 64,954 3.95 20,374 17,316 0.314 4,384
Belmont Cragin 80,648 3.91 20,143 17,519 0.250 4,481
Irving Park 53,665 3.21 20,025 17,086 0.373 5,323
Roseland 41,106 4.82 19,681 16,252 0.479 3,372
South Lawndale 74,943 4.59 18,960 16,095 0.253 3,507
New City 39,463 4.83 18,925 16,552 0.480 3,427
Auburn Gresham 45,271 3.77 18,120 15,609 0.400 4,140
Lake View 100,547 3.12 17,696 15,990 0.176 5,125
Englewood 23,792 3.07 17,453 16,040 0.734 5,225
Humboldt Park 56,161 3.60 16,917 14,855 0.301 4,126
Near West Side 62,733 5.69 15,986 14,059 0.255 2,471
West Ridge 77,212 3.53 15,838 13,791 0.205 3,907
Chicago Lawn 52,003 3.53 15,717 13,504 0.302 3,825
Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e
Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago
311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy

1 311 information requests and complaints about aircraft noise were excluded

2 Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)

3 Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)

\(~\)

This table, despite being fairly long, is still pretty digestible. While I think the default sorting (Total Requests descending) is sensible, it is also feasible that readers would want to sort by other criteria. In these cases, a table made with the {reactable} or {DT} libraries, which have sortable columns by default, may be a better option. And obviously an interactive map would be a reasonable way to visualize these data!

Top Service Requests by Community Area

This table summarizes the top five service requests types by community area, sorted by community area population. This makes it possible to identify community areas with unique mixes of service request types.

tidy_311 %>%
  group_by(ca_name) %>%
  filter(n_distinct(sr_number) >= 15000) %>%
  ungroup() %>%
  group_by(`Community Area` = ca_name, `Service Request Type` = sr_type) %>%
  summarise(pop = max(ca_est_pop),
            `Total Requests` = n_distinct(sr_number),
            `Parent Issues` = n_distinct(grouping_sr_number)) %>%
  arrange(desc(pop), desc(`Total Requests`)) %>%
  mutate(rank = row_number()) %>%
  filter(rank <= 5) %>%
  select(-pop, -rank) %>%
  gt() %>%
  tab_style(
    style = cell_text(size = "medium"),
    locations = cells_title()
  ) %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = list(cells_title(),
                     cells_column_labels(columns = gt::everything()),
                     cells_row_groups()
  )) %>%
  tab_style(
    style = cell_fill(alpha = 0.5),
    locations = cells_row_groups()
  ) %>%
  tab_header(title = "Top 311 Service Request Types by Chicago Community Area",
             subtitle = "Sorted by Community Area population; includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 2020") %>%
  opt_align_table_header("left") %>%
  tab_footnote(footnote = "311 information requests and complaints about aircraft noise were excluded",
               locations = cells_title(groups = "subtitle")) %>%
  tab_footnote(footnote = "Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)",
               locations = cells_column_labels(vars(`Total Requests`))) %>%
  tab_footnote(footnote = "Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)",
               locations = cells_column_labels(vars(`Parent Issues`))) %>%
  tab_source_note("Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e") %>%
  tab_source_note("Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago") %>%
  tab_source_note("311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy") %>%
  tab_options(footnotes.font.size = "small",
              source_notes.font.size = "small") %>%
  fmt_number(columns = vars(`Total Requests`, `Parent Issues`),
             decimals = 0) %>%
  cols_align(align = "right", columns = vars(`Total Requests`, `Parent Issues`))
Top 311 Service Request Types by Chicago Community Area
Sorted by Community Area population; includes community areas with at least 15,000 total requests from Oct 31 2019 through Oct 31 20201
Service Request Type Total Requests2 Parent Issues3
Lake View
Graffiti Removal Request 3,343 3,275
Rodent Baiting/Rat Complaint 2,097 1,997
Sign Repair Request - All Other Signs 840 775
Tree Trim Request 831 721
Tree Debris Clean-Up Request 725 680
Austin
Garbage Cart Maintenance 2,942 2,638
Street Light Out Complaint 2,578 1,290
Weed Removal Request 2,098 2,066
Abandoned Vehicle Complaint 1,691 1,691
Tree Debris Clean-Up Request 1,555 1,427
West Town
Graffiti Removal Request 7,494 7,298
Rodent Baiting/Rat Complaint 2,907 2,763
Street Light Out Complaint 2,174 1,035
Garbage Cart Maintenance 1,375 1,256
Tree Trim Request 1,320 1,156
Belmont Cragin
Graffiti Removal Request 2,613 2,515
Garbage Cart Maintenance 1,984 1,829
Tree Trim Request 1,862 1,684
Rodent Baiting/Rat Complaint 1,763 1,714
Street Light Out Complaint 1,729 990
West Ridge
Rodent Baiting/Rat Complaint 1,539 1,477
Graffiti Removal Request 1,411 1,381
Street Light Out Complaint 1,125 604
Pothole in Street Complaint 990 744
Garbage Cart Maintenance 935 857
South Lawndale
Graffiti Removal Request 4,860 4,741
Street Light Out Complaint 1,701 858
Open Fire Hydrant Complaint 1,252 598
Garbage Cart Maintenance 1,223 1,129
Rodent Baiting/Rat Complaint 1,099 1,066
Logan Square
Graffiti Removal Request 7,134 6,960
Rodent Baiting/Rat Complaint 2,140 2,068
Street Light Out Complaint 1,584 857
Garbage Cart Maintenance 1,190 1,094
Pothole in Street Complaint 1,050 783
Portage Park
Street Light Out Complaint 2,413 1,305
Rodent Baiting/Rat Complaint 2,122 2,027
Graffiti Removal Request 1,583 1,541
Garbage Cart Maintenance 1,484 1,367
Tree Trim Request 1,290 1,134
Near West Side
Graffiti Removal Request 2,594 2,559
Pothole in Street Complaint 1,254 940
Traffic Signal Out Complaint 1,191 981
Street Light Out Complaint 1,128 628
Sign Repair Request - All Other Signs 1,116 1,021
Humboldt Park
Graffiti Removal Request 2,366 2,274
Garbage Cart Maintenance 1,605 1,456
Weed Removal Request 1,447 1,408
Rodent Baiting/Rat Complaint 1,238 1,188
Street Light Out Complaint 1,066 563
Irving Park
Graffiti Removal Request 3,450 3,289
Street Light Out Complaint 1,684 820
Rodent Baiting/Rat Complaint 1,595 1,500
Pothole in Street Complaint 1,206 755
Tree Debris Clean-Up Request 1,149 1,059
Chicago Lawn
Graffiti Removal Request 1,759 1,703
Garbage Cart Maintenance 1,745 1,577
Street Light Out Complaint 1,470 649
Pothole in Street Complaint 838 702
Rodent Baiting/Rat Complaint 829 777
Auburn Gresham
Weed Removal Request 1,926 1,853
Garbage Cart Maintenance 1,832 1,653
Street Light Out Complaint 1,807 791
Rodent Baiting/Rat Complaint 1,185 1,140
Sign Repair Request - All Other Signs 891 881
Roseland
Street Light Out Complaint 2,512 1,020
Weed Removal Request 1,932 1,629
Garbage Cart Maintenance 1,896 1,700
Tree Debris Clean-Up Request 1,312 1,191
Rodent Baiting/Rat Complaint 1,044 962
New City
Weed Removal Request 3,572 3,524
Graffiti Removal Request 2,580 2,505
Street Light Out Complaint 1,631 726
Open Fire Hydrant Complaint 1,276 764
Garbage Cart Maintenance 1,025 918
West Englewood
Weed Removal Request 7,634 7,390
Garbage Cart Maintenance 1,489 1,347
Rodent Baiting/Rat Complaint 1,275 1,227
Street Light Out Complaint 1,121 563
Tree Debris Clean-Up Request 805 761
Englewood
Weed Removal Request 7,671 7,514
Street Light Out Complaint 873 457
Rodent Baiting/Rat Complaint 805 768
Garbage Cart Maintenance 794 723
Sign Repair Request - All Other Signs 519 515
Estimated community area population taken from American Community Survey 2014-2018 via https://datahub.cmap.illinois.gov/dataset/community-data-snapshots-raw-data/resource/8c4e096e-c90c-4bef-9cf1-9028d094296e
Community area square mileage taken from https://en.wikipedia.org/wiki/Community_areas_in_Chicago
311 service request data from https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy

1 311 information requests and complaints about aircraft noise were excluded

2 Total Requests is the number of requests filed, which can include requests about the same parent issue (e.g. multiple requests about a particular street light malfunctioning)

3 Parent Issues is the number of distinct issues of a given type (e.g. if there were 20 requests about 2 street lights malfunctioning, Parent Issues is 2)

\(~\)

This is a lot cleaner than I expected it to be! This table is very long, even after filtering to a subset of community areas; it would be nice to be able to page between community area lists (which may be feasible with {reactable} or {DT}), but for a website or blog this formatting is still digestible, I think.

Summary

Building well-designed tables in R is easy with the tidyverse and {gt} library. These tables had clear titles, footnotes, and source information; used striping, column alignments, and precision to facilitate readability; and minimized non-data ink to reduce distraction and focus attention on the data.

These tables were certainly not perfect, however. As noted above, these tables are pretty long; making them sortable or filterable would certainly enable readers to access information more quickly and easily. In addition, some of these data would be better presented in a map. But the goal here was to use {gt} to make tables that wouldn’t look out of place in a professional publication, and I think mission accomplished here.