1. Introduction

In spreadsheet software like Excel, a Pivot Table is a tool used to summarize, sort, reorganize, group, count, total, or average data stored in a table.

In R, we achieve “Pivot Table” functionality using two primary packages from the Tidyverse: 1. dplyr: For summarizing and aggregating data (The “Summary” part). 2. tidyr: For reshaping data between “long” and “wide” formats (The “Layout” part).

2. The Dataset

We will use the palmerpenguins dataset, which contains observations of three penguin species in Antarctica.

# Load data
data("penguins")

# View the first few rows
head(penguins) %>% kable()

species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
Adelie	Torgersen	39.1	18.7	181	3750	male	2007
Adelie	Torgersen	39.5	17.4	186	3800	female	2007
Adelie	Torgersen	40.3	18.0	195	3250	female	2007
Adelie	Torgersen	NA	NA	NA	NA	NA	2007
Adelie	Torgersen	36.7	19.3	193	3450	female	2007
Adelie	Torgersen	39.3	20.6	190	3650	male	2007

3. Part 1: Aggregation (Summarizing Data)

The most common use of a pivot table is to group data and calculate a metric (Sum, Mean, Count).

3.1 Basic Pivot: Mean Body Mass by Species

In Excel, you would drag “Species” to Rows and “Body Mass” to Values. In R:

pivot_summary <- penguins %>%
  group_by(species) %>%
  summarize(
    avg_mass_g = mean(body_mass_g, na.rm = TRUE),
    count = n()
  )

pivot_summary %>% kable()

species	avg_mass_g	count
Adelie	3700.662	152
Chinstrap	3733.088	68
Gentoo	5076.016	124

3.2 Multi-Level Pivot: Species and Island

To see the average mass across both species and the island they inhabit:

multi_pivot <- penguins %>%
  group_by(species, island) %>%
  summarize(
    avg_mass = mean(body_mass_g, na.rm = TRUE)
  ) %>%
  ungroup()

multi_pivot %>% kable()

species	island	avg_mass
Adelie	Biscoe	3709.659
Adelie	Dream	3688.393
Adelie	Torgersen	3706.373
Chinstrap	Dream	3733.088
Gentoo	Biscoe	5076.016

4. Part 2: Reshaping Data (`pivot_wider`)

The output above is in Long Format. However, a traditional Pivot Table often puts one variable in the columns. We use pivot_wider() to achieve this.

4.1 Creating a Cross-Tabulation

Let’s turn the islands into columns to compare species mass side-by-side.

wide_pivot <- multi_pivot %>%
  pivot_wider(
    names_from = island, 
    values_from = avg_mass
  )

wide_pivot %>% kable(caption = "Average Body Mass (g) by Island")

Average Body Mass (g) by Island
species	Biscoe	Dream	Torgersen
Adelie	3709.659	3688.393	3706.373
Chinstrap	NA	3733.088	NA
Gentoo	5076.016	NA	NA

names_from: The column whose values will become the new column headers.
values_from: The column whose values will fill the cells.

5. Part 3: Reshaping Data (`pivot_longer`)

Often, data arrives in a “Wide” format (like an Excel sheet) that is hard for R to plot. We use pivot_longer() to make it “Tidy.”

5.1 The Scenario

Imagine we have a dataset of yearly sales:

sales_data <- data.frame(
  Region = c("North", "South"),
  Year_2021 = c(100, 150),
  Year_2022 = c(120, 180),
  Year_2023 = c(130, 200)
)

sales_data %>% kable()

Region	Year_2021	Year_2022	Year_2023
North	100	120	130
South	150	180	200

5.2 Converting to Long Format

To analyze this properly, we want “Year” in one column and “Revenue” in another.

long_sales <- sales_data %>%
  pivot_longer(
    cols = starts_with("Year"), 
    names_to = "Year", 
    values_to = "Revenue",
    names_prefix = "Year_"
  )

long_sales %>% kable()

Region	Year	Revenue
North	2021	100
North	2022	120
North	2023	130
South	2021	150
South	2022	180
South	2023	200

6. Visualizing Pivot Tables

A visual pivot table is often represented as a Heatmap or a Grouped Bar Chart.

6.1 Heatmap (The Visual Pivot Table)

Using our multi_pivot data (Species vs. Island):

ggplot(multi_pivot, aes(x = island, y = species, fill = avg_mass)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(title = "Heatmap of Penguin Body Mass",
       subtitle = "Visualizing the aggregated pivot table",
       x = "Island",
       y = "Species",
       fill = "Avg Mass (g)") +
  theme_minimal()

7. Summary Table: Excel vs. R

Excel Action	R Function (Tidyverse)
Rows / Columns	`group_by()`
Values (Sum, Avg, Count)	`summarize()`
Move Field to Columns	`pivot_wider()`
Clean messy wide data	`pivot_longer()`
Filters	`filter()`

8. Lab Exercise

Task: Using the penguins dataset: 1. Filter the data for female penguins only. 2. Create a pivot table showing the maximum flipper_length_mm grouped by species and sex. 3. Use pivot_wider to make the sex the columns.

# Try it yourself!
exercise_result <- penguins %>%
  filter(sex == "female") %>%
  group_by(species, island) %>%
  summarize(max_flipper = max(flipper_length_mm, na.rm = TRUE)) %>%
  pivot_wider(names_from = island, values_from = max_flipper)

exercise_result %>% kable()

species	Biscoe	Dream	Torgersen
Adelie	199	202	196
Chinstrap	NA	202	NA
Gentoo	222	NA	NA

```

How to use this:

Open RStudio.
Go to File -> New File -> R Markdown....
Clear all the default text.
Paste the code above.
Save the file as .Rmd.
Click the Knit button (top of the editor).

Key Features of this Note:

Dynamic Data: It uses the palmerpenguins library, which is the modern standard for data science teaching.
Tidyverse Syntax: It uses pivot_longer and pivot_wider (the modern replacements for melt and cast).
Visuals: It includes a ggplot2 heatmap to help students bridge the gap between tables and charts.
Comparison: A handy table helps Excel users translate their knowledge to R code.

Data Science Lecture: Mastering Pivot Tables in R

Hamse Faisal Muhumed

2026-04-16

1. Introduction

2. The Dataset

3. Part 1: Aggregation (Summarizing Data)

3.1 Basic Pivot: Mean Body Mass by Species

3.2 Multi-Level Pivot: Species and Island

4. Part 2: Reshaping Data (`pivot_wider`)

4.1 Creating a Cross-Tabulation

5. Part 3: Reshaping Data (`pivot_longer`)

5.1 The Scenario

5.2 Converting to Long Format

6. Visualizing Pivot Tables

6.1 Heatmap (The Visual Pivot Table)

7. Summary Table: Excel vs. R

8. Lab Exercise

How to use this:

Key Features of this Note:

Data Science Lecture: Mastering Pivot Tables in R

Hamse Faisal Muhumed

2026-04-16

1. Introduction

2. The Dataset

3. Part 1: Aggregation (Summarizing Data)

3.1 Basic Pivot: Mean Body Mass by Species

3.2 Multi-Level Pivot: Species and Island

4. Part 2: Reshaping Data (pivot_wider)

4.1 Creating a Cross-Tabulation

5. Part 3: Reshaping Data (pivot_longer)

5.1 The Scenario

5.2 Converting to Long Format

6. Visualizing Pivot Tables

6.1 Heatmap (The Visual Pivot Table)

7. Summary Table: Excel vs. R

8. Lab Exercise

How to use this:

Key Features of this Note:

4. Part 2: Reshaping Data (`pivot_wider`)

5. Part 3: Reshaping Data (`pivot_longer`)