In data science, a well-structured table can often convey deeper insights than a standard chart. This tutorial demonstrates how to leverage the gt and gtExtras packages to transform raw data frames into publication-quality tables.
What will we learn?
Automated Summaries: Generating instant visual overviews of any dataset.
Inline Graphics: Inserting distribution plots (sparklines) and bar charts directly into table cells.
Professional Theming: Applying styles from world-class publications like ESPN, NY Times, and The Guardian.
1. Environment Setup and Instant Summaries
The gt_plt_summary() function provides a high-level visual audit of your data: for each column it shows a distribution plot, the share of missing values, and the mean, median, and standard deviation in one grid.
Code
# Load required libraries
library(svglite)
library(gtExtras)
library(tidyverse)
library(RColorBrewer)
library(gt)
library(gapminder)

# Create an instant visual summary of the Iris dataset
iris %>%
  gt_plt_summary(title = "Iris Dataset Summary")
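Iris Dataset Summary
150 rows x 5 cols

| Column | Plot Overview | Missing | Mean | Median | SD |
|---|---|---|---|---|---|
| Sepal.Length | (plot) | 0.0% | 5.8 | 5.8 | 0.8 |
| Sepal.Width | (plot) | 0.0% | 3.1 | 3.0 | 0.4 |
| Petal.Length | (plot) | 0.0% | 3.8 | 4.3 | 1.8 |
| Petal.Width | (plot) | 0.0% | 1.2 | 1.3 | 0.8 |
| Species | setosa, versicolor and virginica | 0.0% | — | — | — |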
2. Inserting Graphics into Tables
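Beyond raw numbers, we can embed small inline plots, often loosely called “sparklines”, to show the shape of the data. gt_plt_dist() reads its values from a list column, so each group's observations must be collected with list() before the data is passed to gt().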
Code
# Prepare data with a list-column for distribution
mtcars_summary <- mtcars %>%
  group_by(cyl) %>%
  summarize(
    Median = round(median(mpg), 1),
    Mean = round(mean(mpg), 1),
    Distribution = list(mpg)
  )

# Visualize each distribution inline (gt_plt_dist draws a density plot by default)
mtcars_summary %>%
  gt() %>%
  gt_plt_dist(Distribution) %>%
  gt_theme_guardian() %>%
  tab_header(
    title = "Miles Per Gallon Statistics",
    subtitle = "Comparing performance by cylinder count"
  )
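Miles Per Gallon Statistics
Comparing performance by cylinder count

| cyl | Median | Mean | Distribution |
|---|---|---|---|
| 4 | 26.0 | 26.7 | (density plot) |
| 6 | 19.7 | 19.7 | (density plot) |
| 8 | 15.2 | 15.1 | (density plot) |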
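To make the list-column step concrete, the following minimal sketch rebuilds only the Distribution column and inspects it. It uses dplyr directly rather than the full tidyverse, which is an assumption made to keep the example small.

```r
library(dplyr)

# Same grouping as in the tutorial: one row per cylinder count,
# with every mpg value of that group stored in a list column.
mtcars_summary <- mtcars %>%
  group_by(cyl) %>%
  summarize(Distribution = list(mpg))

# Each cell of Distribution holds a whole numeric vector, not a single number.
is.list(mtcars_summary$Distribution)

# Number of cars behind each inline plot (rows ordered cyl = 4, 6, 8).
lengths(mtcars_summary$Distribution)  # 11 7 14
```

Because each cell carries the raw observations, the plotting function can draw a separate distribution for every row.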
3. Advanced Country Analysis (Gapminder)
Step 1: Data Preparation
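We filter the Gapminder data to Asian countries, average each country's GDP per capita and population over all recorded years, and keep the 10 countries with the highest mean GDP per capita. The life expectancy values are collected into a list column for plotting, and the result is stored as a base table object.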
Code
# Data Preparation
raw_data <- gapminder %>%
  rename(Country = country) %>%
  filter(continent == "Asia") %>%
  group_by(Country) %>%
  summarise(
    "GDP per capita" = round(mean(gdpPercap)),
    "Population size" = round(mean(pop)),
    "Life expectancy" = list(lifeExp)
  ) %>%
  arrange(desc(`GDP per capita`)) %>%
  head(10)

# Create base gt table object
# Note: We store the gt object to add more layers later
base_table <- raw_data %>%
  gt() %>%
  gt_plt_dist("Life expectancy") %>%
  tab_header(title = "The GDP and Population Size of Asia") %>%
  cols_align(align = "left")

base_table %>%
  gt_theme_espn()
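The GDP and Population Size of Asia

| Country | GDP per capita | Population size | Life expectancy |
|---|---|---|---|
| Kuwait | 65333 | 1206496 | (density plot) |
| Saudi Arabia | 20262 | 12478368 | (density plot) |
| Bahrain | 18078 | 373913 | (density plot) |
| Japan | 17751 | 111758808 | (density plot) |
| Singapore | 17425 | 2667817 | (density plot) |
| Hong Kong, China | 16229 | 4792259 | (density plot) |
| Israel | 14161 | 3845611 | (density plot) |
| Oman | 12139 | 1438205 | (density plot) |
| Taiwan | 10225 | 16874724 | (density plot) |
| Korea, Rep. | 8217 | 36499386 | (density plot) |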
Step 2: Adding Percentage Bars and Heatmaps
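Using gt_plt_bar_pct(), we can represent numerical values as horizontal bars within the cells. In the output below, the GDP per capita column is rendered as bars instead of numbers.
To draw the reader’s eye to specific outliers or points of interest, we use gt_highlight_rows(). Its rows condition can only match countries that are present in the table: Bangladesh and China are not among the top 10 by mean GDP per capita, so the condition used below matches no row in this particular table.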
Code
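# Assumed definition (enhanced_plot is not shown in the text)
enhanced_plot <- base_table %>%
  gt_plt_bar_pct(column = `GDP per capita`, scaled = FALSE)

# Highlighting specific countries (Bangladesh and China)
asian_table <- enhanced_plot %>%
  gt_highlight_rows(
    rows = Country %in% c("Bangladesh", "China"),
    fill = "#f2e6ff",
    alpha = 0.8
  )
asian_table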
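The GDP and Population Size of Asia

| Country | GDP per capita | Population size | Life expectancy |
|---|---|---|---|
| Kuwait | (bar) | 1206496 | (density plot) |
| Saudi Arabia | (bar) | 12478368 | (density plot) |
| Bahrain | (bar) | 373913 | (density plot) |
| Japan | (bar) | 111758808 | (density plot) |
| Singapore | (bar) | 2667817 | (density plot) |
| Hong Kong, China | (bar) | 4792259 | (density plot) |
| Israel | (bar) | 3845611 | (density plot) |
| Oman | (bar) | 1438205 | (density plot) |
| Taiwan | (bar) | 16874724 | (density plot) |
| Korea, Rep. | (bar) | 36499386 | (density plot) |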
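Before highlighting, it can help to check which countries are actually rows in the table. The sketch below rebuilds a reduced version of the top-10 data (without the list column) and tests the requested names against it. The names top10, wanted, present and picks are hypothetical, and Japan and Singapore are only illustrative choices taken from the table above, not the author's selection.

```r
library(dplyr)
library(gapminder)
library(gt)
library(gtExtras)

# Reduced version of the tutorial's data: top 10 Asian countries by mean GDP per capita
top10 <- gapminder %>%
  filter(continent == "Asia") %>%
  group_by(Country = country) %>%
  summarise(`GDP per capita` = round(mean(gdpPercap))) %>%
  arrange(desc(`GDP per capita`)) %>%
  head(10)

# Which of the requested countries are rows in this table?
wanted <- c("Bangladesh", "China")
present <- wanted[wanted %in% top10$Country]
present

# Illustrative alternative: countries that do appear in the top 10
picks <- c("Japan", "Singapore")

highlighted <- top10 %>%
  gt() %>%
  gt_highlight_rows(
    rows = Country %in% picks,
    fill = "#f2e6ff",
    alpha = 0.8
  )
highlighted
```

An empty result for present signals that the highlight condition would match nothing, which is easy to miss because the table still renders.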
4. Exploring Professional Themes
Using Quarto’s panel-tabset, we can compare different aesthetic styles easily.
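As a minimal sketch of such a comparison, the same gt object can be passed through several gtExtras theme functions. The demo table built from mtcars and the names demo_tbl, themes and themed_tables are assumptions for illustration; in the tutorial the stored base_table would take the place of demo_tbl.

```r
library(gt)
library(gtExtras)

# Small demo table; any gt object would work here
demo_tbl <- gt(head(mtcars[, c("mpg", "cyl", "hp")], 5))

# Theme functions from gtExtras; each takes a gt object and returns a styled one
themes <- list(
  ESPN = gt_theme_espn,
  `NY Times` = gt_theme_nytimes,
  Guardian = gt_theme_guardian
)

# Apply every theme to the same table so the styles can be compared side by side
themed_tables <- lapply(themes, function(apply_theme) apply_theme(demo_tbl))

# Print one of them, for example the ESPN version
themed_tables$ESPN
```

In a Quarto document, each themed table could be placed under its own heading inside a `::: {.panel-tabset}` block so that readers can switch between the styles.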