1. Introduction

In spreadsheet software like Excel, Pivot Tables are the go-to tool for summarizing large datasets. In R, we achieve the same (and much more) using the tidyverse ecosystem, specifically the dplyr package for aggregation and the tidyr package for reshaping.

Why Pivot?

Summarization: Turning thousands of rows into a meaningful summary table (e.g., Average sales per region).
Tidying Data: Converting data into a format suitable for analysis or plotting.
Reporting: Creating wide-format tables that are easier for humans to read.

2. The Dataset: Palmer Penguins

We will use the penguins dataset. It contains measurements for penguin species, islands in the Palmer Archipelago, size (flipper length, body mass), and sex.

# Load and view the first few rows
data("penguins")
head(penguins) %>% 
  kable() %>% 
  kable_styling(bootstrap_options = c("striped", "hover"))

species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
Adelie	Torgersen	39.1	18.7	181	3750	male	2007
Adelie	Torgersen	39.5	17.4	186	3800	female	2007
Adelie	Torgersen	40.3	18.0	195	3250	female	2007
Adelie	Torgersen	NA	NA	NA	NA	NA	2007
Adelie	Torgersen	36.7	19.3	193	3450	female	2007
Adelie	Torgersen	39.3	20.6	190	3650	male	2007

3. Part A: Summarizing Data (The “Excel” Pivot)

In R, a pivot table summary is created by combining group_by() and summarize().

Example 1: Basic Summary

Let’s find the average body mass and count of penguins per species.

pivot_summary <- penguins %>%
  filter(!is.na(sex)) %>% # Remove rows with missing sex
  group_by(species, island) %>%
  summarize(
    avg_mass_g = mean(body_mass_g, na.rm = TRUE),
    count = n()
  ) %>%
  ungroup()

pivot_summary %>% 
  kable(caption = "Summary Pivot Table") %>% 
  kable_styling(full_width = F)

Summary Pivot Table
species	island	avg_mass_g	count
Adelie	Biscoe	3709.659	44
Adelie	Dream	3701.364	55
Adelie	Torgersen	3708.511	47
Chinstrap	Dream	3733.088	68
Gentoo	Biscoe	5092.437	119

Visualizing the Summary

Pivot tables are often precursors to visualizations.

ggplot(pivot_summary, aes(x = species, y = avg_mass_g, fill = island)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme_minimal() +
  labs(title = "Average Penguin Body Mass", y = "Mass (g)", x = "Species")

Figure 1: Average Body Mass by Species and Island

4. Part B: Reshaping Data

Often, “pivoting” refers to changing the layout of the data from Long to Wide or vice-versa.

4.1 `pivot_wider()`: Making it Readable

Use pivot_wider() when you want to transform categories into columns. This is common for creating comparison reports.

Task: Create a table where species are rows, islands are columns, and the values are the average body mass.

wide_table <- pivot_summary %>%
  select(species, island, avg_mass_g) %>%
  pivot_wider(
    names_from = island, 
    values_from = avg_mass_g
  )

wide_table %>% 
  kable(caption = "Wide Format Table (Human Readable)") %>% 
  kable_styling(full_width = F)

Wide Format Table (Human Readable)
species	Biscoe	Dream	Torgersen
Adelie	3709.659	3701.364	3708.511
Chinstrap	NA	3733.088	NA
Gentoo	5092.437	NA	NA

4.2 `pivot_longer()`: Making it Analysis-Ready

Most R functions (like ggplot2) require Long Format data. Let’s assume we have data where “Year” is spread across columns and we need to collapse it.

Example:

# Creating a dummy 'wide' dataset
untidy_data <- data.frame(
  Country = c("USA", "Canada", "Mexico"),
  `2020` = c(100, 150, 120),
  `2021` = c(120, 160, 130),
  check.names = FALSE
)

# Pivoting Longer
tidy_data <- untidy_data %>%
  pivot_longer(
    cols = c(`2020`, `2021`), 
    names_to = "Year", 
    values_to = "GDP_Growth"
  )

tidy_data %>% kable() %>% kable_styling(full_width = F)

Country	Year	GDP_Growth
USA	2020	100
USA	2021	120
Canada	2020	150
Canada	2021	160
Mexico	2020	120
Mexico	2021	130

5. Advanced Pivot: Multiple Values

You can pivot multiple value columns at once. Let’s get both the mean and the max body mass into a wide format.

complex_pivot <- penguins %>%
  filter(!is.na(sex)) %>%
  group_by(species, sex) %>%
  summarize(
    avg_mass = mean(body_mass_g, na.rm = TRUE),
    max_mass = max(body_mass_g, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from = sex,
    values_from = c(avg_mass, max_mass)
  )

complex_pivot %>% kable() %>% kable_styling(bootstrap_options = "condensed")

species	avg_mass_female	avg_mass_male	max_mass_female	max_mass_male
Adelie	3368.836	4043.493	3900	4775
Chinstrap	3527.206	3938.971	4150	4800
Gentoo	4679.741	5484.836	5200	6300

6. Summary Cheat Sheet

Goal	Function
Group Data	`group_by(column_name)`
Calculate Stats	`summarize(new_var = mean(value))`
Rows to Columns	`pivot_wider(names_from = ..., values_from = ...)`
Columns to Rows	`pivot_longer(cols = ..., names_to = ..., values_to = ...)`

7. Exercise for Students

Using the penguins dataset, create a pivot table showing the median flipper length for each species and sex.
Filter out NA values.
Pivot the result so that sex (male/female) are the column headers.

Solution Template

# Try this yourself!
penguins %>%
  filter(!is.na(sex)) %>%
  group_by(species, sex) %>%
  summarize(med_flipper = median(flipper_length_mm)) %>%
  pivot_wider(names_from = sex, values_from = med_flipper)

End of Lecture Notes. ```

How to use this in RStudio:

Open RStudio.
Go to File > New File > R Markdown…
Delete all the default text in the new window.
Paste the code provided above.
Save the file as Pivot_Table_Lecture.Rmd.
Click the Knit button (the ball of yarn icon) at the top of the editor.

Key Features of this Script:

Real Data: It uses the palmerpenguins dataset, which is a modern replacement for the Iris dataset.
Visuals: It includes a ggplot2 bar chart to show how pivot results are used in plotting.
Formatting: It uses kable and kableExtra to make the tables look like professional lecture handouts.
Comprehensive: It covers the two meanings of “pivoting” (aggregation and reshaping).

Lecture: Mastering Pivot Tables and Data Reshaping in R

fadumo Abdi Faarah

2026-04-16

1. Introduction

Why Pivot?

2. The Dataset: Palmer Penguins

3. Part A: Summarizing Data (The “Excel” Pivot)

Example 1: Basic Summary

Visualizing the Summary

4. Part B: Reshaping Data

4.1 `pivot_wider()`: Making it Readable

4.2 `pivot_longer()`: Making it Analysis-Ready

5. Advanced Pivot: Multiple Values

6. Summary Cheat Sheet

7. Exercise for Students

Solution Template

How to use this in RStudio:

Key Features of this Script:

Lecture: Mastering Pivot Tables and Data Reshaping in R

fadumo Abdi Faarah

2026-04-16

1. Introduction

Why Pivot?

2. The Dataset: Palmer Penguins

3. Part A: Summarizing Data (The “Excel” Pivot)

Example 1: Basic Summary

Visualizing the Summary

4. Part B: Reshaping Data

4.1 pivot_wider(): Making it Readable

4.2 pivot_longer(): Making it Analysis-Ready

5. Advanced Pivot: Multiple Values

6. Summary Cheat Sheet

7. Exercise for Students

Solution Template

How to use this in RStudio:

Key Features of this Script:

4.1 `pivot_wider()`: Making it Readable

4.2 `pivot_longer()`: Making it Analysis-Ready