The Art of Data Visualization: Creating Professional Heatmaps in R

Authors	Rasouli, H., Moein, F.
Date	December 20, 2025
Affiliation	ABSA, Tarbiat Modares University, Tehran, Iran
Workshop	R club for researchers

Quick start

Master clustered heatmap visualization in under 60 minutes! This tutorial provides complete, ready-to-use code examples that you can immediately apply to your own datasets. Whether you’re analyzing biological data, financial metrics, survey responses, or any multivariate dataset, these techniques will accelerate your data exploration workflow.

1. Installation and Setup

# Installation
install.packages("heatmaply")

2. Loading the library

# Load the package
library(heatmaply)

3. Using built-in data as input

data <- mtcars
head(data, 4)
print(data)
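
To try the same steps on your own data rather than `mtcars`, a minimal sketch might look like the following. The CSV file is hypothetical: it is written to a temporary path from `mtcars` only so that the example runs on its own, and the names `csv_path`, `my_data`, and `numeric_data` are illustrative. Keeping only numeric columns and complete rows is a simplifying assumption of this sketch.

```r
# Hypothetical input file: created from mtcars so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path)

# Read the file, using the first column as row names
my_data <- read.csv(csv_path, row.names = 1)

# Keep numeric columns only, then drop rows with missing values
is_num <- vapply(my_data, is.numeric, logical(1))
numeric_data <- my_data[, is_num, drop = FALSE]
numeric_data <- numeric_data[complete.cases(numeric_data), , drop = FALSE]

dim(numeric_data)
```

The resulting data frame can then be used in place of `data` in the examples that follow.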

4. `head()` function output

Output:

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

5. `print()` function output

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

2. Simplest Usage - Basic Heatmap

# Basic heatmap
heatmaply(data)

Output: An interactive heatmap with default colors showing values of each variable.

3. Correlation Heatmap with `heatmaply_cor()`

# Plot the pairwise correlations among the variables of the input dataset
heatmaply_cor(
  cor(mtcars),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2
)

Code Components Explained:

cor(mtcars): Computes the correlation matrix that is used as input
xlab = "Features": Title of the X axis (character string)
ylab = "Features": Title of the Y axis (character string)
k_col = 2 and k_row = 2: Number of clusters used to color the branches of the column and row dendrograms

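Output: An interactive correlation heatmap on a diverging color scale fixed to the range -1 to +1, with the column and row dendrograms each colored as two clusters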
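
To see what "two clusters" means in practice, the grouping can be approximated in base R. This is a sketch only: it assumes Euclidean distance and complete linkage (the defaults of `dist()` and `hclust()`), which may not match the settings heatmaply uses internally in every version.

```r
r <- cor(mtcars)

# Assumption: Euclidean distance and complete linkage (the base R defaults)
hc <- hclust(dist(r))

# Cut the dendrogram into k = 2 groups, as k_col = 2 / k_row = 2 request
groups <- cutree(hc, k = 2)

# List the variables that fall into each group
split(names(groups), groups)
```

Changing `k` here shows how the same tree is split into more or fewer groups without re-running the clustering.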

Clustering Guidelines:

Number of Variables	Clustering Type	Recommended k
10-12 variables	Simple clustering	k = 2 or 3
12-20 variables	Balanced detail/simplicity	k = 3 or 4
> 20 variables	More detailed grouping	k = 4 to 6
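
The guideline table can be written as a small lookup function. `suggest_k()` is a hypothetical helper introduced only for this illustration; it is not part of heatmaply. The table gives no recommendation below 10 variables, and 12 appears in two rows, so this sketch returns `NA` for fewer than 10 variables and assigns 12 to the first row.

```r
# Hypothetical helper that encodes the guideline table above
suggest_k <- function(n_vars) {
  if (n_vars < 10) return(NA_character_)  # the table gives no recommendation here
  if (n_vars <= 12) return("2 or 3")
  if (n_vars <= 20) return("3 or 4")
  "4 to 6"
}

suggest_k(ncol(mtcars))  # mtcars has 11 variables
```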

4. Output Interpretation

A colorful and interactive plot ranging from +1 to -1
- +1: Perfect positive correlation
- 0: No correlation
- -1: Perfect negative correlation
Two distinct colors:
- Red: Complete positive correlation
- Blue: Complete negative correlation
Interactive feature: Hover the mouse cursor over any cell of the heatmap to see its numerical value
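
The same reading can be done numerically. The sketch below lists the variable pairs with the strongest correlations in `mtcars`, which are the cells that appear in the most saturated colors. The name `pair_table` is illustrative.

```r
r <- cor(mtcars)

# Keep each pair once by blanking the upper triangle and the diagonal
r_lower <- r
r_lower[upper.tri(r_lower, diag = TRUE)] <- NA

# Convert the remaining cells to a long table of variable pairs
idx <- which(!is.na(r_lower), arr.ind = TRUE)
pair_table <- data.frame(
  var1 = rownames(r_lower)[idx[, "row"]],
  var2 = colnames(r_lower)[idx[, "col"]],
  r    = r_lower[idx],
  stringsAsFactors = FALSE
)

# Sort by absolute correlation, strongest first
pair_table <- pair_table[order(-abs(pair_table$r)), ]
head(pair_table, 5)
```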

5. Practical Applications

Identifying related variables
Detecting multicollinearity for modeling
Exploratory data analysis
Feature selection
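
As a rough screen for multicollinearity, one option is to count how many strong correlations each variable takes part in. The cutoff of 0.8 is an assumption made for this sketch, not a universal rule.

```r
r <- cor(mtcars)

# Assumed cutoff for a "high" correlation; adjust to your field's conventions
cutoff <- 0.8

# Count, for each variable, how many OTHER variables exceed the cutoff
# (subtract 1 to ignore the diagonal, where each variable correlates 1 with itself)
n_high <- rowSums(abs(r) > cutoff) - 1

sort(n_high, decreasing = TRUE)
```

Variables with several strong partners are natural candidates to inspect before fitting a regression model.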

6. Customizing Gradient Colors

Before customizing colors, it helps to know a few basics about choosing them.

Using color picker tools to choose eye-catching colors for R plots

You can use this online color picker to get a wide range of beautiful colors.

👁️ Understanding Color Vision Deficiency

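Color vision deficiency, commonly called color blindness, is the decreased ability to see color differences. Most types are inherited, affect more men than women, and involve difficulties distinguishing between specific colors, most commonly reds and greens. The main types are green-blindness (deuteranopia), red-blindness (protanopia), blue-blindness (tritanopia), and complete loss of color vision (achromatopsia / monochromacy).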

⚫ Desaturated (Achromatopsia / Monochromacy)

Not seeing any color at all; vision is in shades of black, white, and gray.
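
One way to get a feel for how a palette survives without color is to look at its brightness alone. The sketch below uses the Rec. 601 luma weights as an approximation of perceived brightness; this is an assumption of the example and not a full simulation of color vision deficiency. `luma()` is a hypothetical helper, and `hcl.colors()` requires R 3.6.0 or later.

```r
# Approximate perceived brightness (0-255) using Rec. 601 luma weights
# (an approximation chosen for this sketch, not a full vision simulation)
luma <- function(cols) {
  rgb_vals <- col2rgb(cols)
  as.vector(c(0.299, 0.587, 0.114) %*% rgb_vals)
}

# A blue-purple-red gradient: brightness does not change monotonically
round(luma(c("blue", "purple", "red")))

# A viridis-like palette from base R: brightness increases along the palette
round(luma(hcl.colors(5, "viridis")))
```

A palette whose brightness changes steadily from one end to the other stays readable in grayscale, which is one reason the viridis family is often recommended.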

Basic Color Gradient:

# Three-color gradient
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = c("blue", "purple", "red")  # blue → purple → red
)

Output: A heatmap with blue-purple-red gradient

# Alternative color combination
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = c("green", "pink", "red")
)

Output: A heatmap with green-pink-red gradient
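
A short vector of colors is interpolated into a smooth gradient. The sketch below shows the same idea in base R with `colorRampPalette()`. Using white as the midpoint, so that correlations near zero look neutral, is a design choice assumed here, and `blue_white_red` is an illustrative name.

```r
# Build a 256-step gradient from three anchor colors
blue_white_red <- colorRampPalette(c("blue", "white", "red"))(256)

length(blue_white_red)     # number of colors generated
blue_white_red[c(1, 256)]  # end points of the gradient
```

The resulting vector can be supplied through the `colors` argument in the same way as the three-color vectors above.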

Built-In Continuous Color Scales:

# Example using RdYlBu
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = RdYlBu(256)  # Red-Yellow-Blue palette function exported by heatmaply
)

Output: A heatmap with RdYlBu palette

7. Advanced Colors with RColorBrewer

# Install and load RColorBrewer
install.packages("RColorBrewer")
library(RColorBrewer)

# Example 1: Spectral palette
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = brewer.pal(11, "Spectral")
)

Output: A heatmap with Spectral palette

# Example 2: YlOrRd palette
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = brewer.pal(9, "YlOrRd")
)

Output: A heatmap with YlOrRd palette

# Example 3: YlGn palette
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = brewer.pal(5, "YlGn")
)

Output: A heatmap with YlGn palette

📋 RColorBrewer Palettes Reference:

Palette	Max Colors	Category	Colorblind Safe
BrBG	11	div	TRUE
PiYG	11	div	TRUE
PRGn	11	div	TRUE
PuOr	11	div	TRUE
RdBu	11	div	TRUE
RdGy	11	div	FALSE
RdYlBu	11	div	TRUE
RdYlGn	11	div	FALSE
Spectral	11	div	FALSE
Blues	9	seq	TRUE
BuGn	9	seq	TRUE
YlGn	9	seq	TRUE
YlOrBr	9	seq	TRUE
YlOrRd	9	seq	TRUE
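
The information in this table comes with the package itself, in the data frame `brewer.pal.info`. The sketch below queries it and also shows one way around the maximum-color limit by interpolating with `colorRampPalette()`. The code runs only if RColorBrewer is installed; `spectral_100` is an illustrative name.

```r
if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  info <- RColorBrewer::brewer.pal.info

  # Diverging palettes flagged as colorblind safe
  print(rownames(info)[info$category == "div" & info$colorblind])

  # brewer.pal() is capped at maxcolors; interpolate for a smoother gradient
  spectral_100 <- colorRampPalette(RColorBrewer::brewer.pal(11, "Spectral"))(100)
  print(length(spectral_100))
}
```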

8. Viridis Color Scales

# Install and load viridis
install.packages("viridis")
library(viridis)

Available viridis palettes:

viridis - Default viridis palette
magma - Black-purple-red-yellow
plasma - Blue-purple-orange-yellow
inferno - Black-red-orange-yellow
cividis - Blue-yellow, designed to remain readable with color vision deficiency
mako - Black-blue-green
turbo - Rainbow-like; smoother than the classic rainbow, but not perceptually uniform
rocket - Dark purple-red-orange
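
Each palette in the list can be called through its own function or selected by name. The sketch below assumes the viridis package is installed and runs only in that case; `pal_a` and `pal_b` are illustrative names.

```r
if (requireNamespace("viridis", quietly = TRUE)) {
  # Each palette has its own function...
  pal_a <- viridis::plasma(5)

  # ...and can also be selected by name through the `option` argument
  pal_b <- viridis::viridis(5, option = "plasma")
  print(identical(pal_a, pal_b))

  # direction = -1 reverses the order of the colors
  print(viridis::viridis(5, direction = -1))
}
```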

# Example with plasma
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = plasma(100)
)

Output: A heatmap with plasma palette

# Example with turbo
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = turbo(100)
)

Output: A heatmap with turbo palette

# Example with inferno
heatmaply_cor(
  cor(data),
  xlab = "Features",
  ylab = "Features",
  k_col = 2,
  k_row = 2,
  colors = inferno(100)
)

Output: A heatmap with inferno palette

9. Advanced Correlation with P-Values

Step 1: Calculate Correlation Matrix

# Calculates Pearson correlation coefficients between all variables
# r is an 11×11 matrix of correlation values (-1 to 1)
r <- cor(mtcars)
print(round(r, 3))

Output:

        mpg    cyl   disp     hp   drat     wt   qsec     vs     am   gear   carb
mpg   1.000 -0.852 -0.848 -0.776  0.681 -0.868  0.419  0.664  0.600  0.480 -0.551
cyl  -0.852  1.000  0.902  0.832 -0.700  0.782 -0.591 -0.811 -0.523 -0.493  0.527
disp -0.848  0.902  1.000  0.791 -0.710  0.888 -0.434 -0.710 -0.591 -0.556  0.395
hp   -0.776  0.832  0.791  1.000 -0.449  0.659 -0.708 -0.723 -0.243 -0.126  0.750
drat  0.681 -0.700 -0.710 -0.449  1.000 -0.712  0.091  0.440  0.713  0.700 -0.091
wt   -0.868  0.782  0.888  0.659 -0.712  1.000 -0.175 -0.555 -0.692 -0.583  0.428
qsec  0.419 -0.591 -0.434 -0.708  0.091 -0.175  1.000  0.745 -0.230 -0.213 -0.656
vs    0.664 -0.811 -0.710 -0.723  0.440 -0.555  0.745  1.000  0.168  0.206 -0.570
am    0.600 -0.523 -0.591 -0.243  0.713 -0.692 -0.230  0.168  1.000  0.794  0.058
gear  0.480 -0.493 -0.556 -0.126  0.700 -0.583 -0.213  0.206  0.794  1.000  0.274
carb -0.551  0.527  0.395  0.750 -0.091  0.428 -0.656 -0.570  0.058  0.274  1.000

Step 2: Create P-value Matrix Function

# Creates a matrix of p-values from correlation tests
# Uses cor.test() to test if each correlation is statistically significant
# outer() applies the test to all variable pairs
# Returns a matrix where each cell is the p-value for that correlation pair
cor.test.p <- function(x){
  FUN <- function(x, y) cor.test(x, y)[["p.value"]]
  z <- outer(
    colnames(x), 
    colnames(x), 
    Vectorize(function(i,j) FUN(x[,i], x[,j]))
  )
  dimnames(z) <- list(colnames(x), colnames(x))
  z
}

Step 3: Calculate P-value Matrix

# Applies the function to mtcars
# p is an 11×11 matrix of p-values (0 to 1)
# Small p-value (< 0.05) means correlation is statistically significant
p <- cor.test.p(mtcars)
print(round(p, 4))

Sample output (illustrative; the values you obtain may differ slightly in the last digits):

        mpg    cyl   disp     hp   drat     wt   qsec     vs     am   gear   carb
mpg  0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0171 0.0000 0.0002 0.0054 0.0011
cyl  0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0004 0.0000 0.0022 0.0042 0.0019
disp 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0131 0.0000 0.0004 0.0010 0.0250
hp   0.0000 0.0000 0.0000 0.0000 0.0090 0.0000 0.0000 0.0000 0.1800 0.4930 0.0000
drat 0.0000 0.0000 0.0000 0.0090 0.0000 0.0000 0.6190 0.0134 0.0000 0.0000 0.6220
wt   0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.3380 0.0009 0.0000 0.0003 0.0134
qsec 0.0171 0.0004 0.0131 0.0000 0.6190 0.3380 0.0000 0.0000 0.2050 0.2420 0.0000
vs   0.0000 0.0000 0.0000 0.0000 0.0134 0.0009 0.0000 0.0000 0.3380 0.2570 0.0010
am   0.0002 0.0022 0.0004 0.1800 0.0000 0.0000 0.2050 0.3380 0.0000 0.0000 0.7540
gear 0.0054 0.0042 0.0010 0.4930 0.0000 0.0003 0.2420 0.2570 0.0000 0.0000 0.1290
carb 0.0011 0.0019 0.0250 0.0000 0.6220 0.0134 0.0000 0.0010 0.7540 0.1290 0.0000
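
For Pearson correlations, the p-value reported by `cor.test()` is based on the statistic t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The p-value matrix can therefore be cross-checked without looping over pairs. This is a sketch that assumes complete data (no missing values); `t_stat` and `p_fast` are illustrative names.

```r
r <- cor(mtcars)
n <- nrow(mtcars)

# t statistic for each Pearson correlation, with n - 2 degrees of freedom
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided p-values; on the diagonal r = 1, so the p-value is 0
# (or numerically indistinguishable from it)
p_fast <- 2 * pt(-abs(t_stat), df = n - 2)

# Compare one cell with cor.test()
all.equal(p_fast["mpg", "qsec"], cor.test(mtcars$mpg, mtcars$qsec)$p.value)
```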

Step 4: Create Scatterplot Heatmap

heatmaply_cor(
  r,
  node_type = "scatter",
  point_size_mat = -log10(p), 
  point_size_name = "-log10(p-value)",
  label_names = c("x", "y", "Correlation"),
  main = "Functional Correlation Plot with Significance"
)

Output: A scatterplot heatmap in which point color encodes the correlation and point size encodes -log10(p-value)

10. Data transformation (scaling, normalize, and percentize)

If we assume that all variables follow a normal distribution, standardizing them (subtracting the mean and dividing by the standard deviation) puts them on a common scale close to the standard normal distribution. In this standardized form, each data point expresses its distance from the mean in units of standard deviation. The scale parameter in heatmaply applies this standardization along columns or rows (scale = "column" or scale = "row"; the default is "none"). For variables that are far from normal, heatmaply also provides normalize(), which rescales each column to the 0-1 range, and percentize(), which replaces each value with its empirical percentile.
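
The three transformations can be sketched in base R to show what each one does to the numbers. The final block lists the corresponding heatmaply calls, which run only if the package is installed. The object names (`z`, `hp_norm`, `hp_pct`, `h_scaled`, and so on) are illustrative.

```r
# Column-wise standardization: subtract the mean, divide by the standard deviation
z <- scale(mtcars)
round(colMeans(z), 10)      # every column mean is 0
round(apply(z, 2, sd), 10)  # every column standard deviation is 1

# Min-max normalization of one column to the 0-1 range
hp_norm <- (mtcars$hp - min(mtcars$hp)) / (max(mtcars$hp) - min(mtcars$hp))
range(hp_norm)

# Empirical percentile of each value within its column
hp_pct <- ecdf(mtcars$hp)(mtcars$hp)
range(hp_pct)

# The corresponding heatmaply calls (run only if the package is installed)
if (requireNamespace("heatmaply", quietly = TRUE)) {
  library(heatmaply)
  h_scaled     <- heatmaply(mtcars, scale = "column")
  h_normalized <- heatmaply(normalize(mtcars))
  h_percentile <- heatmaply(percentize(mtcars))
  # Print any of these objects, e.g. h_scaled, to display the heatmap
}
```

Standardization keeps the shape of each distribution, while percentiles depend only on rank order, so the three heatmaps can look quite different for skewed variables.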

Key Parameters of the Scatterplot Heatmap (Section 9, Step 4) Explained:

`node_type = "scatter"`

Changes from traditional heatmap squares to scatterplot points
Each variable pair becomes a point instead of a colored square

`point_size_mat = -log10(p)`

Transformation: -log10(p-value)
- p-value 0.05 → -log10(0.05) = 1.30
- p-value 0.01 → -log10(0.01) = 2.00
- p-value 0.001 → -log10(0.001) = 3.00
Why this transformation?
- Larger values for more significant correlations
- Points get bigger as p-values get smaller
- Makes highly significant correlations more visible
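
The arithmetic above is easy to verify in R. The sketch also shows that a p-value of exactly 0 is transformed to `Inf`. Capping the transformed values is a hypothetical safeguard, and the cap of 10 is an arbitrary choice made for this illustration.

```r
p_values <- c(0.05, 0.01, 0.001)
round(-log10(p_values), 2)  # 1.3, 2, 3

# A p-value of exactly 0 maps to Inf
-log10(0)

# Hypothetical safeguard: cap the transformed values at an arbitrary maximum
cap <- 10
size_values <- pmin(-log10(c(0.05, 0.001, 0)), cap)
size_values
```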

Visual Interpretation:

Large red point: Strong positive AND statistically significant correlation
Large blue point: Strong negative AND statistically significant correlation
Small light-colored point: Weak correlation AND not statistically significant

📋 11. Summary

This tutorial covers:

✅ 1. Basic correlation heatmap creation

Using heatmaply_cor() to display correlation matrices
Setting axis titles with xlab and ylab

✅ 2. Data clustering

Using k_col and k_row for grouping
Guidelines for selecting cluster numbers based on variable count

✅ 3. Color customization

Basic colors with colors = c("color1", "color2", "color3")
RColorBrewer palettes with brewer.pal()
Viridis palettes for modern designs

✅ 4. Advanced analysis with p-values

Calculating statistical significance of correlations
Displaying scatterplot with node_type = "scatter"
Point size based on significance level

✅ 5. Practical applications

Identifying related variables
Detecting multicollinearity
Exploratory data analysis
Feature selection for modeling

Key Insights:

Strong positive correlation = Dark red color
Strong negative correlation = Dark blue color
No correlation = Light/neutral colors
Statistical significance = Larger point size in scatterplot mode

Next Steps for Learning:

Apply to larger datasets
Combine with other visualization techniques
Use in professional reports and presentations
Integrate with Shiny for interactive dashboards

The heatmaply package provides a powerful tool for interactive visualization of relationships in data and can be used at various stages of data analysis.

🎥 Previous Session Video Archive

Session Part	Filename	Direct Link
Part 1	part-1-R452-2025.mp4	📥 Download
Part 2	Part-2-2.mp4	📥 Download
Part 3	Part-3.mp4	📥 Download
Part 4	par-4.mp4	📥 Download

File Format: All videos are provided in the MP4 format. This is a universal container format commonly used for streaming and storing digital video and audio, ensuring broad compatibility with media players.
