Creating Heatmaps with ggplot2

Author

R Programming 101

1 Introduction to Heatmaps

A heatmap is a data visualization technique that displays values in a matrix format using color to represent magnitude. Heatmaps are particularly useful when you want to:

  • Visualize patterns across two dimensions
  • Identify clusters or hotspots in your data
  • Display correlation matrices or spatial data
  • Show how a value changes across combinations of two variables

In ggplot2, we create heatmaps using the geom_tile() function, which draws rectangular tiles and fills them with color based on a third variable.

2 Key Functions

2.1 geom_tile()

The geom_tile() function is the core of heatmap creation in ggplot2. Only x and y are strictly required, but a heatmap also needs fill, so in practice you map three aesthetics:

Aesthetic   Description
x           The variable for the horizontal axis
y           The variable for the vertical axis
fill        The variable that determines the tile color

Optional arguments include:

  • color - the border color of each tile
  • linewidth - the thickness of the tile borders
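
As a minimal sketch of these aesthetics and arguments, the example below draws a small grid with white tile borders. The toy data frame and its column names are hypothetical and exist only for illustration; the linewidth argument assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# Hypothetical toy data: a 3 x 3 grid with one value per cell
toy <- expand.grid(x = 1:3, y = 1:3)
toy$value <- seq_len(nrow(toy))

# x and y place each tile, fill colors it,
# color and linewidth control the tile borders
p_tile <- ggplot(toy, aes(x = x, y = y, fill = value)) +
  geom_tile(color = "white", linewidth = 0.5)

p_tile
```

Because color and linewidth are set outside aes(), they apply the same border to every tile rather than varying with the data.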

2.2 scale_fill_viridis_c()

This function applies the viridis color palette to continuous data. Viridis is popular because it is:

  • Perceptually uniform (equal steps in data look like equal steps in color)
  • Accessible to people with color blindness
  • Readable when printed in black and white

Key arguments:

Argument    Description
option      The color palette variant: “magma”, “inferno”, “plasma”, “viridis”, “cividis”, “rocket”, “mako”, or “turbo”
name        The title for the legend
direction   Set to -1 to reverse the color scale
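
The three arguments can be combined in one call. The sketch below uses hypothetical toy data and an arbitrary choice of palette ("cividis") to show the effect of direction = -1.

```r
library(ggplot2)

# Hypothetical toy data: a 4 x 4 grid whose value is the product of x and y
toy <- expand.grid(x = 1:4, y = 1:4)
toy$value <- toy$x * toy$y

p_scale <- ggplot(toy, aes(x = x, y = y, fill = value)) +
  geom_tile() +
  scale_fill_viridis_c(
    option = "cividis",  # palette variant
    name = "Product",    # legend title
    direction = -1       # reversed: high values get the dark end of the palette
  )

p_scale
```

Leaving direction at its default of 1 maps low values to the dark end of the palette instead.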

2.3 coord_fixed()

This function fixes the aspect ratio of the plot, ensuring that one unit on the x-axis is the same length as one unit on the y-axis. This is important for spatial data where the x and y axes represent real-world coordinates.
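
A short sketch of the ratio argument, again on hypothetical toy data: the default ratio of 1 gives square tiles, while other values stretch the y-axis relative to the x-axis.

```r
library(ggplot2)

# Hypothetical toy data: a grid that is wider (6 columns) than it is tall (3 rows)
toy <- expand.grid(x = 1:6, y = 1:3)
toy$value <- toy$x + toy$y

base_plot <- ggplot(toy, aes(x = x, y = y, fill = value)) +
  geom_tile()

# Default ratio = 1: one x unit and one y unit have the same length,
# so every tile is drawn as a square
p_square <- base_plot + coord_fixed()

# ratio is expressed as y / x: here one y unit is drawn
# twice as long as one x unit
p_tall <- base_plot + coord_fixed(ratio = 2)

p_square
```

For gridded spatial data where both axes use the same units, the default ratio is usually the one you want.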

3 The Data

We will use the volcano dataset, which is built into R. This dataset contains elevation data for Maunga Whau, a volcanic cone in Auckland, New Zealand. The data is stored as an 87 × 61 matrix, where each cell represents the elevation (in meters) at a specific grid location.

# View the structure of the volcano data
dim(volcano)
#> [1] 87 61

The volcano data is a matrix, but ggplot2 requires data in a “long” format (a data frame with one row per observation). We need to convert it.
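
To make the idea of “long” format concrete, here is a small base R sketch on a hypothetical 2 × 3 matrix. It is an illustration of the target shape only, not the tidyverse approach used in the next section.

```r
# Hypothetical 2 x 3 matrix standing in for a larger grid
m <- matrix(1:6, nrow = 2)

# One row per cell: the row index, the column index, and the cell value.
# row(m), col(m), and m are all flattened in the same column-major order,
# so the three vectors line up cell by cell.
long <- data.frame(
  row = as.vector(row(m)),
  col = as.vector(col(m)),
  value = as.vector(m)
)

long
```

The matrix has 6 cells, so the long data frame has 6 rows, each describing exactly one cell.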

4 Creating the Heatmap

4.1 Step 1: Prepare the data

library(tidyverse)

volcano_df <- as.data.frame(volcano) %>%
  mutate(X = row_number()) %>%
  pivot_longer(
    cols = starts_with("V"),
    names_to = "Y",
    names_prefix = "V",
    values_to = "Elevation"
  ) %>%
  mutate(
    X = as.numeric(X),
    Y = as.numeric(Y)
  )
1. as.data.frame(volcano) converts the volcano matrix into a standard data frame so it can be used with dplyr and tidyr functions.
2. mutate(X = row_number()) creates a new column X from the row index, so each value keeps its matrix row after pivoting.
3. cols = starts_with("V") selects all columns whose names start with “V” (the default names given to columns created from an unnamed matrix) to be pivoted.
4. names_to = "Y" defines “Y” as the name of the new column that will store the original column headers.
5. names_prefix = "V" removes the “V” prefix from the column names (e.g., “V1” becomes “1”) so they can be treated as numbers later.
6. values_to = "Elevation" defines “Elevation” as the name of the new column that will store the height values from the matrix cells.
7. The final mutate() converts X (an integer) and Y (a character string) into numeric doubles for plotting.
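
It can be worth confirming that the reshaping kept every cell. The sketch below rebuilds the data frame so that it runs on its own, then applies two checks; the spot-checked cell (row 10, column 20) is an arbitrary choice.

```r
library(dplyr)
library(tidyr)

# Rebuild the long data frame so this snippet runs on its own
volcano_df <- as.data.frame(volcano) %>%
  mutate(X = row_number()) %>%
  pivot_longer(
    cols = starts_with("V"),
    names_to = "Y",
    names_prefix = "V",
    values_to = "Elevation"
  ) %>%
  mutate(X = as.numeric(X), Y = as.numeric(Y))

# Check 1: one row per matrix cell (87 * 61 = 5307)
nrow(volcano_df) == nrow(volcano) * ncol(volcano)

# Check 2: an arbitrary cell holds the same value in both structures
cell <- filter(volcano_df, X == 10, Y == 20)
cell$Elevation == volcano[10, 20]
```

Both comparisons should return TRUE if X tracks the matrix row and Y tracks the matrix column.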

4.2 Step 2: Create a basic heatmap

ggplot(volcano_df, aes(x = Y, y = X, fill = Elevation)) +
  geom_tile()
1. We set up the plot with ggplot() and define the aesthetics using aes(). The x aesthetic maps to our Y column (east-west position), the y aesthetic maps to our X column (north-south position), and the fill aesthetic maps to Elevation. The fill aesthetic tells ggplot2 which variable should control the color of each tile.
2. The geom_tile() function draws the heatmap. Each tile represents one cell from our original matrix. The position of the tile is determined by the x and y aesthetics, and the color is determined by the fill aesthetic.

4.3 Step 3: Improve the color scale

ggplot(volcano_df, aes(x = Y, y = X, fill = Elevation)) +
  geom_tile() +
  scale_fill_viridis_c(option = "magma")
1. The scale_fill_viridis_c() function replaces the default blue gradient with a viridis color palette. The “c” in the function name stands for “continuous”, meaning it works with numeric data. We use option = "magma" to select a palette that runs from near-black and dark purple (low values) through pink and orange to pale yellow (high values). This makes the elevation differences easier to see.

4.4 Step 4: Add labels and customize the legend

ggplot(volcano_df, aes(x = Y, y = X, fill = Elevation)) +
  geom_tile() +
  scale_fill_viridis_c(
    option = "magma",
    name = "Elevation\n(meters)" 
  ) +
  labs(
    title = "Maunga Whau Volcano Topography",
    x = "East-West Grid Line",
    y = "North-South Grid Line"
  )
1. The name argument inside scale_fill_viridis_c() sets the legend title. The \n creates a line break, so “Elevation” appears on one line and “(meters)” appears below it. This keeps the legend compact while providing complete information about what the colors represent.
2. The labs() function adds labels to the plot. We provide a descriptive title and clearer axis labels that explain what the grid lines represent.

4.5 Step 5: Final polish with theme and aspect ratio

ggplot(volcano_df, aes(x = Y, y = X, fill = Elevation)) +
  geom_tile() +
  scale_fill_viridis_c(
    option = "magma",
    name = "Elevation\n(meters)"
  ) +
  labs(
    title = "Maunga Whau Volcano Topography",
    x = "East-West Grid Line",
    y = "North-South Grid Line"
  ) +
  theme_minimal() +
  theme(panel.grid = element_blank()) +
  coord_fixed()
1. ggplot() initializes the plot using the long-format data, mapping the coordinates to the axes and elevation to the fill color.
2. geom_tile() adds a tiling layer that colors each X,Y coordinate pair based on its elevation value.
3. scale_fill_viridis_c() applies a perceptually uniform viridis color scale, which is readable for colorblind viewers and prints well in black and white.
4. option = "magma" selects the “magma” palette, which runs from black through purple and orange to yellow.
5. name = "Elevation\n(meters)" sets the legend title and uses \n to create a line break for better formatting.
6. labs() adds the main title and descriptive labels for the horizontal and vertical axes.
7. theme_minimal() applies a clean, minimal theme with no background annotations.
8. theme(panel.grid = element_blank()) removes the background grid lines to keep the focus entirely on the topographic heatmap.
9. coord_fixed() ensures a 1:1 aspect ratio so that the spatial units are equal on both axes, preventing the volcano from looking stretched.