🎨 imager

Image Processing & Art Analysis in R

Khoi Nguyen

2026-05-21

What is `imager`?

An R package for image processing and analysis
Built on the CImg C++ library — fast and powerful
Treats images as 4D arrays: (x, y, depth/frame, colour)
Bridges the gap between visual art and data science

install.packages("imager")
library(imager)
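As a rough illustration of that 4D layout, the sketch below builds a plain R array with the same dimension order imager uses (width, height, depth/frame, colour) and indexes into it. The name `toy` and the array sizes are made up for illustration, and no imager functions are involved.

```r
# A toy 4 x 3 image with one frame and three colour channels.
# Dimension order mirrors imager: x (width), y (height), z (depth/frame), cc (colour).
toy <- array(0, dim = c(4, 3, 1, 3))

# Paint the pixel at x = 2, y = 1 pure red: channel 1 becomes 1, channels 2 and 3 stay 0.
toy[2, 1, 1, 1] <- 1

# All three channel values for that single pixel.
pixel <- toy[2, 1, 1, ]

# One channel on its own is an ordinary width x height matrix.
red <- toy[, , 1, 1]
```

Because the colour index comes last, selecting a channel means fixing the fourth index, not the third.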

Why `imager` for Art Analysis?

🖼️ Load any .jpg, .png, .bmp painting or artwork
🎨 Extract dominant color palettes from paintings
📊 Analyze pixel distributions — brightness, saturation, hue
🔍 Apply convolution filters (the same math behind CNNs)
🗂️ Compare artworks quantitatively across styles or periods

“imager turns a painting into data — and data back into insight.”

Key Functions

Function	What it does
`load.image()`	Load any image file
`grayscale()`	Convert to grayscale
`as.data.frame()`	Convert pixels to a tidy data frame
`imresize()`	Resize an image
`imgradient()`	Compute edge gradients
`convolve()`	Apply a convolution filter
`imsplit()`	Split into color channels (R, G, B)
`plot()`	Display the image
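A minimal sketch of how several of these functions chain together, assuming imager is installed. It uses `boats`, the sample photo bundled with the package, so no download is needed; the variable names are illustrative only.

```r
library(imager)

im <- boats                        # sample colour photo bundled with imager

gray  <- grayscale(im)             # collapse to one colour channel
small <- imresize(im, 0.5)         # scale width and height by 0.5
chans <- imsplit(im, "c")          # split along the colour axis
grad  <- imgradient(gray, "xy")    # gradient images along x and y

c(width(im), width(small))         # compare widths before and after resizing
spectrum(gray)                     # number of colour channels after grayscale()
length(chans)                      # one image per colour channel
```

Each call returns either an image or a list of images, so results can be passed straight to `plot()`.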

Demo 1: Loading & Exploring a Painting

We’ll use a public domain painting downloaded from Wikimedia Commons.

library(imager)
library(ggplot2)
library(dplyr)

# Load a painting (Van Gogh - Starry Night, saved locally)
# download.file("https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg/1280px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg",
#               destfile = "starry_night.jpg", mode = "wb")  # "wb" keeps the binary file intact on Windows

img <- load.image("starry_night.jpg")

# Basic info
print(img)

Image. Width: 1280 pix Height: 1014 pix Depth: 1 Colour channels: 3

cat("Width:", width(img), "| Height:", height(img), "| Channels:", spectrum(img))

Width: 1280 | Height: 1014 | Channels: 3

Demo 1 (cont.): Plot the Painting

# Display the painting
plot(img, axes = FALSE, main = "Van Gogh — The Starry Night (1889)")

Demo 2: Color Palette Extraction

Turn pixel data into a tidy data frame and extract dominant colors.

# Convert image to data frame (one row per pixel)
df <- as.data.frame(img) |>
  filter(cc <= 3) |>                  # Keep only R, G, B channels
  mutate(channel = case_when(
    cc == 1 ~ "Red",
    cc == 2 ~ "Green",
    cc == 3 ~ "Blue"
  ))

# Plot color channel distributions
ggplot(df, aes(x = value, fill = channel)) +
  geom_density(alpha = 0.6, color = NA) +
  scale_fill_manual(values = c("Red" = "#e63946", 
                                "Green" = "#52b788", 
                                "Blue" = "#4895ef")) +
  labs(title = "Pixel Color Distribution — The Starry Night",
       x = "Pixel Intensity (0–1)", y = "Density",
       fill = "Channel") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
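To make the data frame layout concrete, here is a small sketch on a hypothetical 2 x 2 colour image (random values, chosen only to keep the output readable). It shows the default long layout used above and, as an alternative to reshaping with `pivot_wider()`, the `wide = "c"` option of imager's `as.data.frame()` method.

```r
library(imager)

# A hypothetical 2 x 2 colour image with random values, just to keep the output small.
set.seed(1)
tiny <- as.cimg(array(runif(12), dim = c(2, 2, 1, 3)))

long_df <- as.data.frame(tiny)              # columns: x, y, cc, value
wide_df <- as.data.frame(tiny, wide = "c")  # x, y, then one column per channel

nrow(long_df)   # 2 * 2 pixels * 3 channels
nrow(wide_df)   # 2 * 2 pixels
```

The long layout suits per-channel plots such as the density plot above, while the wide layout gives one row per pixel, which is the shape clustering needs.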

Demo 2 (cont.): Dominant Color Swatches

# Sample pixels and build dominant color swatches using k-means
library(tidyr)

# Get RGB values per pixel
rgb_df <- as.data.frame(img) |>
  filter(cc <= 3) |>
  pivot_wider(names_from = cc, values_from = value, 
              names_prefix = "ch") |>
  rename(R = ch1, G = ch2, B = ch3) |>
  na.omit()

# Sample 5000 pixels for speed
set.seed(42)
sample_px <- rgb_df |> slice_sample(n = 5000)

# K-means clustering to find 6 dominant colors
km <- kmeans(sample_px[, c("R","G","B")], centers = 6, nstart = 10)

# Build swatch data frame
swatches <- as.data.frame(km$centers) |>
  mutate(cluster = row_number(),
         hex = rgb(R, G, B),
         count = km$size)

# Plot swatches
ggplot(swatches, aes(x = reorder(cluster, -count), y = count, fill = hex)) +
  geom_col(width = 0.85, color = "white", linewidth = 0.3) +
  scale_fill_identity() +
  labs(title = "6 Dominant Colors — The Starry Night",
       x = "Color Cluster", y = "Pixel Count") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))

Demo 3: Convolution — Edge Detection

Convolution filters are the math behind how CNNs “see” structure in images.

# Convert to grayscale first
img_gray <- grayscale(img)

# Compute image gradients along x and y
# (imager's default scheme is a rotation-invariant kernel; scheme = 2 selects Sobel)
grad <- imgradient(img_gray, "xy")

# Magnitude of gradient = edge strength
edges <- with(grad, sqrt(x^2 + y^2))

# Plot original (grayscale) vs edges
plot(img_gray, axes = FALSE, main = "Grayscale")
plot(edges,    axes = FALSE, main = "Edge Detection")

Demo 3 (cont.): What is Convolution?

A convolution filter slides a small matrix (kernel) across every pixel and computes a weighted sum of neighbors.

Sobel kernel (horizontal gradient, which highlights vertical edges):

\[K = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}\]

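A CNN such as ResNet or VGG applies the same sliding-window operation in its convolutional layers; the difference is that its kernel weights are learned from data rather than fixed by hand.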
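The sliding weighted sum can be sketched in a few lines of base R. This is an illustration of the idea, not how imager implements it: the function name `slide_kernel` and the toy image are assumptions, borders are simply skipped, and the kernel is not flipped (the convention most CNN libraries use; for this kernel, flipping only changes the sign of the response).

```r
# Slide a square kernel over a matrix and take the weighted sum at each position.
# Border pixels without a full neighbourhood are skipped, so the output is smaller.
slide_kernel <- function(img, kernel) {
  k <- nrow(kernel)                      # assumes a square kernel
  out_rows <- nrow(img) - k + 1
  out_cols <- ncol(img) - k + 1
  out <- matrix(0, out_rows, out_cols)
  for (i in seq_len(out_rows)) {
    for (j in seq_len(out_cols)) {
      patch <- img[i:(i + k - 1), j:(j + k - 1)]
      out[i, j] <- sum(patch * kernel)   # weighted sum of the neighbourhood
    }
  }
  out
}

# The Sobel kernel shown above.
sobel_x <- matrix(c(-1, 0, 1,
                    -2, 0, 2,
                    -1, 0, 1), nrow = 3, byrow = TRUE)

# Toy image: dark left half (0), bright right half (1), so one vertical edge.
toy <- cbind(matrix(0, 5, 3), matrix(1, 5, 3))

response <- slide_kernel(toy, sobel_x)
response[1, ]   # 0 4 4 0: nonzero only where the window straddles the edge
```

Flat regions give a response of zero because the kernel weights sum to zero; only windows that contain the dark-to-bright transition produce a signal.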

Comparing Two Paintings

Does Monet use more blues than Van Gogh? Let’s check.

# Download Monet's Water Lilies
# download.file("https://upload.wikimedia.org/wikipedia/commons/a/aa/Claude_Monet_-_Water_Lilies_-_1906%2C_Ryerson.jpg",
#               destfile = "water_lilies.jpg", mode = "wb")  # "wb" keeps the binary file intact on Windows

img2 <- load.image("water_lilies.jpg")

extract_blue <- function(painting, name) {
  as.data.frame(painting) |>
    filter(cc == 3) |>          # Blue channel only
    summarise(mean_blue = mean(value), sd_blue = sd(value)) |>
    mutate(painting = name)
}

comparison <- bind_rows(
  extract_blue(img,  "Van Gogh — Starry Night"),
  extract_blue(img2, "Monet — Water Lilies")
)

ggplot(comparison, aes(x = painting, y = mean_blue, fill = painting)) +
  geom_col(width = 0.5) +
  geom_errorbar(aes(ymin = mean_blue - sd_blue, 
                    ymax = mean_blue + sd_blue), width = 0.15) +
  scale_fill_manual(values = c("#4895ef", "#52b788")) +
  labs(title = "Average Blue Channel Intensity",
       subtitle = "Mean of the blue channel (bright and white pixels also score high)",
       x = "", y = "Mean Blue Intensity (0–1)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "#16213e", color = NA),
        panel.background = element_rect(fill = "#16213e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
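Mean blue intensity mixes "blueness" with overall brightness, since a white pixel has a blue value of 1. One possible complement, sketched below in base R, is the share of pixels where blue is the largest channel. The function name `blue_dominant_share` and the toy arrays are hypothetical; the function expects a 4D array laid out as (x, y, frame, colour), which should also cover images loaded with imager, though that is not exercised here.

```r
# Share of pixels whose blue value exceeds both red and green.
# `im` is a 4D numeric array laid out as (x, y, frame, colour); only frame 1 is used.
blue_dominant_share <- function(im) {
  r <- im[, , 1, 1]
  g <- im[, , 1, 2]
  b <- im[, , 1, 3]
  mean(b > r & b > g)
}

# Toy 2 x 2 images: one pure white, one a muted blue.
white <- array(1, dim = c(2, 2, 1, 3))
blue  <- array(0.2, dim = c(2, 2, 1, 3))
blue[, , 1, 3] <- 0.6

mean(white[, , 1, 3])        # mean blue channel of the white image: 1
mean(blue[, , 1, 3])         # mean blue channel of the blue image: 0.6
blue_dominant_share(white)   # 0: no pixel is predominantly blue
blue_dominant_share(blue)    # 1: every pixel is predominantly blue
```

The white image wins on mean blue intensity but contains no blue pixels, which is why the two measures are worth reporting side by side.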

Why R + `imager` over Python?

Python strengths

Broader CV ecosystem (OpenCV, PIL)
Better GPU / deep learning support
Dominant in production ML

R + imager strengths

🔗 Native tidyverse integration
📊 as.data.frame() → instant ggplot2
📈 Statistical modeling is first-class
📄 Quarto reproducible reporting
⚡ C++ core: filtering and resizing run in compiled code

“If your pipeline lives in R, imager keeps it there — no context switching, no friction.”

This is a question that almost always comes up when presenting an R package for image analysis — “why not just use Python?”

And it’s a fair question. Python’s computer vision ecosystem IS broader. OpenCV, PIL, scikit-image, and PyTorch are all excellent tools.

But there are four specific reasons to choose imager in R:

First, workflow integration. If your entire analysis — data cleaning, statistics, visualization, reporting — is already in R, switching to Python just for image processing creates unnecessary friction. imager keeps everything in one place.

Second, the tidyverse bridge. as.data.frame() converts any image into a tidy table that works directly with dplyr and ggplot2. Getting pixel arrays into a clean pandas DataFrame in Python takes several more steps.

Third, statistical thinking. When you want to run a t-test on color intensity, fit a linear model on image features, or build confidence intervals — R is the natural home for that kind of analysis.

Fourth, reproducibility. imager plus Quarto gives you code, visualizations, and interpretation in one document. For academic and research contexts, that’s extremely valuable.

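And importantly, imager is built on C++, so its core operations are fast. For filtering, resizing, and similar steps, choosing R costs little in performance. The main overhead is converting large images to data frames, which creates one row per pixel per channel.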
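The t-test mentioned in the third point might look like the sketch below. The two vectors are simulated stand-ins for blue-channel values sampled from two paintings, not real measurements, and the names `blue_a` and `blue_b` are hypothetical.

```r
set.seed(1)

# Simulated stand-ins for blue-channel values sampled from two paintings,
# clamped to the 0-1 range that imager uses for pixel intensities.
blue_a <- pmin(pmax(rnorm(500, mean = 0.45, sd = 0.15), 0), 1)
blue_b <- pmin(pmax(rnorm(500, mean = 0.55, sd = 0.15), 0), 1)

# Welch two-sample t-test (t.test() does not assume equal variances by default).
result <- t.test(blue_a, blue_b)

result$estimate   # sample means of the two groups
result$conf.int   # 95% confidence interval for the difference in means
```

Neighbouring pixels in a real painting are strongly correlated, so they are not independent observations. A pixel-level test like this is best read as a descriptive comparison, not a formal inference.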

Summary

✅ imager loads and manipulates images as 4D data arrays
✅ as.data.frame() bridges images and tidyverse workflows
✅ Convolution filters like imgradient() reveal structural patterns
✅ Color analysis lets us compare artworks quantitatively
✅ The same convolution math powers CNNs in deep learning

Where to learn more:

📦 CRAN: install.packages("imager")
📖 Docs: dahtah.github.io/imager
🎨 Your own paintings!

Thank You!

Questions?

“Every painting is just a matrix waiting to be analyzed.”

library(imager)
img <- load.image("starry_night.jpg")  
plot(img, axes = FALSE)
