🎨 imager

Image Processing & Art Analysis in R

Khoi Nguyen

2026-05-21

What is imager?

  • An R package for image processing and analysis
  • Built on the CImg C++ library — fast and powerful
  • Treats images as 4D arrays: (x, y, depth/frame, color)
  • Bridges the gap between visual art and data science
install.packages("imager")
library(imager)
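
As a rough illustration of the 4D layout, the sketch below builds a tiny synthetic image and inspects its dimensions. The sizes and random values are made up for the example. imager reports dimensions in the order width, height, depth (frames), colour channels.

```r
library(imager)

# Synthetic RGB image: 4 px wide, 3 px tall, 1 frame, 3 colour channels
# (random values; purely illustrative)
set.seed(1)
arr <- array(runif(4 * 3 * 1 * 3), dim = c(4, 3, 1, 3))
im  <- as.cimg(arr)

dim(im)        # width, height, depth, colour channels
width(im)      # size along x
height(im)     # size along y
depth(im)      # number of frames / z-slices
spectrum(im)   # number of colour channels

# Red value of the pixel at x = 2, y = 1 in the first frame
im[2, 1, 1, 1]
```

Indexing follows the same four positions, so `im[, , 1, 3]` would pull out the blue channel of the first frame.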

Why imager for Art Analysis?

  • 🖼️ Load any .jpg, .png, .bmp painting or artwork
  • 🎨 Extract dominant color palettes from paintings
  • 📊 Analyze pixel distributions — brightness, saturation, hue
  • 🔍 Apply convolution filters (the same math behind CNNs)
  • 🗂️ Compare artworks quantitatively across styles or periods

“imager turns a painting into data — and data back into insight.”

Key Functions

Function          What it does
load.image()      Load any image file
grayscale()       Convert to grayscale
as.data.frame()   Convert pixels to a tidy data frame
imresize()        Resize an image
imgradient()      Compute edge gradients
convolve()        Apply a convolution filter
imsplit()         Split into color channels (R, G, B)
plot()            Display the image
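
A few of these functions are not used in the demos that follow, so here is a minimal sketch of how they might be chained. It assumes the `boats` example image that ships with imager. The 3x3 box-blur kernel is an illustrative choice, not something from the slides.

```r
library(imager)

im <- boats                           # example image bundled with imager

small <- imresize(im, scale = 0.5)    # halve width and height
chans <- imsplit(small, "c")          # split along the colour axis
length(chans)                         # one single-channel image per channel

# convolve() takes the kernel as an image; here a 3x3 box blur
box     <- as.cimg(matrix(1 / 9, nrow = 3, ncol = 3))
blurred <- convolve(grayscale(small), box)

plot(blurred, axes = FALSE, main = "Box-blurred grayscale")
```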

Demo 1: Loading & Exploring a Painting

We’ll use a public domain painting downloaded from Wikimedia Commons.

library(imager)
library(ggplot2)
library(dplyr)

# Load a painting (Van Gogh - Starry Night, saved locally)
# download.file("https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg/1280px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg",
#               destfile = "starry_night.jpg")

img <- load.image("starry_night.jpg")

# Basic info
print(img)
#> Image. Width: 1280 pix Height: 1014 pix Depth: 1 Colour channels: 3
cat("Width:", width(img), "| Height:", height(img), "| Channels:", spectrum(img))
#> Width: 1280 | Height: 1014 | Channels: 3

Demo 1 (cont.): Plot the Painting

# Display the painting
plot(img, axes = FALSE, main = "Van Gogh — The Starry Night (1889)")

Demo 2: Color Palette Extraction

Turn pixel data into a tidy data frame and extract dominant colors.

# Convert image to data frame (one row per pixel)
df <- as.data.frame(img) |>
  filter(cc <= 3) |>                  # Keep only R, G, B channels
  mutate(channel = case_when(
    cc == 1 ~ "Red",
    cc == 2 ~ "Green",
    cc == 3 ~ "Blue"
  ))

# Plot color channel distributions
ggplot(df, aes(x = value, fill = channel)) +
  geom_density(alpha = 0.6, color = NA) +
  scale_fill_manual(values = c("Red" = "#e63946", 
                                "Green" = "#52b788", 
                                "Blue" = "#4895ef")) +
  labs(title = "Pixel Color Distribution — The Starry Night",
       x = "Pixel Intensity (0–1)", y = "Density",
       fill = "Channel") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
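
To make the shape of that data frame concrete, the sketch below inspects what `as.data.frame()` returns for an image. It uses the bundled `boats` image as a stand-in for the painting, so no download is needed.

```r
library(imager)

im <- boats

# Long format: one row per pixel per channel
long <- as.data.frame(im)
names(long)                  # x, y, cc (channel index), value
nrow(long) == width(im) * height(im) * spectrum(im)

# Wide format: one row per pixel, one column per channel
wide <- as.data.frame(im, wide = "c")
names(wide)                  # x, y, c.1, c.2, c.3
head(wide, 3)
```

The long format suits per-channel plots like the density plot above. The wide format is handier when each pixel's R, G and B values are needed side by side, for example for clustering.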

Demo 2 (cont.): Dominant Color Swatches

# Sample pixels and build dominant color swatches using k-means
library(tidyr)

# Get RGB values per pixel
rgb_df <- as.data.frame(img) |>
  filter(cc <= 3) |>
  pivot_wider(names_from = cc, values_from = value, 
              names_prefix = "ch") |>
  rename(R = ch1, G = ch2, B = ch3) |>
  na.omit()

# Sample 5000 pixels for speed
set.seed(42)
sample_px <- rgb_df |> sample_n(5000)

# K-means clustering to find 6 dominant colors
km <- kmeans(sample_px[, c("R","G","B")], centers = 6, nstart = 10)

# Build swatch data frame
swatches <- as.data.frame(km$centers) |>
  mutate(cluster = row_number(),
         hex = rgb(R, G, B),
         count = km$size)

# Plot swatches
ggplot(swatches, aes(x = reorder(cluster, -count), y = count, fill = hex)) +
  geom_col(width = 0.85, color = "white", linewidth = 0.3) +
  scale_fill_identity() +
  labs(title = "6 Dominant Colors — The Starry Night",
       x = "Color Cluster", y = "Pixel Count") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))

Demo 3: Convolution — Edge Detection

Convolution filters are the math behind how CNNs “see” structure in images.

# Convert to grayscale first
img_gray <- grayscale(img)

# Compute image gradients along x and y
# (imgradient() defaults to a rotation-invariant scheme; scheme = 2 selects Sobel masks)
grad <- imgradient(img_gray, "xy")

# Magnitude of gradient = edge strength
edges <- with(grad, sqrt(x^2 + y^2))

# Plot original (grayscale) vs edges
plot(img_gray, axes = FALSE, main = "Grayscale")
plot(edges,    axes = FALSE, main = "Edge Detection")
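
For readers who want Sobel masks specifically, here is a hedged variant of the same step. It assumes the bundled `boats` image and relies on the `scheme` argument as described in the imager documentation, where 2 means Sobel. `enorm()` computes the same per-pixel magnitude as the `sqrt(x^2 + y^2)` expression.

```r
library(imager)

g <- grayscale(boats)

# Request Sobel masks explicitly instead of the default scheme
grad_sobel <- imgradient(g, "xy", scheme = 2)

# Euclidean norm across the x and y gradient images = edge strength
edges_sobel <- enorm(grad_sobel)

plot(edges_sobel, axes = FALSE, main = "Edge strength (Sobel masks)")
```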

Demo 3 (cont.): What is Convolution?

A convolution filter slides a small matrix (kernel) across every pixel and computes a weighted sum of neighbors.

Sobel kernel (horizontal gradient, which responds to vertical edges):

\[K = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}\]

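A convolutional layer in a CNN such as ResNet or VGG applies this same sliding weighted sum. The difference is that its kernel weights are learned from data rather than fixed by hand.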
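
The sliding weighted sum can be worked through by hand in base R. The sketch below applies the kernel K to a toy 5x5 matrix; the matrix and the helper name `slide_kernel` are invented for illustration. Like CNN layers, it does not flip the kernel, so strictly speaking it is a cross-correlation. For this kernel, flipping would only change the sign of the result.

```r
# Kernel K from the slide
K <- matrix(c(-1, 0, 1,
              -2, 0, 2,
              -1, 0, 1), nrow = 3, byrow = TRUE)

# Toy image: dark on the left, bright on the right (a vertical edge)
img_mat <- matrix(c(0, 0, 1, 1, 1), nrow = 5, ncol = 5, byrow = TRUE)

# Hypothetical helper: weighted sum of each 3x3 neighbourhood (interior pixels only)
slide_kernel <- function(m, k) {
  out <- matrix(0, nrow(m) - 2, ncol(m) - 2)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      out[i, j] <- sum(m[i:(i + 2), j:(j + 2)] * k)
    }
  }
  out
}

vertical_edge   <- slide_kernel(img_mat, K)     # strong response at the edge
horizontal_edge <- slide_kernel(t(img_mat), K)  # same edge rotated 90 degrees

vertical_edge
horizontal_edge
```

The first result is 4 wherever the window straddles the dark-to-bright boundary and 0 in the flat region. The rotated image gives 0 everywhere, because this kernel only measures change along x.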

Comparing Two Paintings

Does Monet use more blues than Van Gogh? Let’s check.

# Download Monet's Water Lilies
# download.file("https://upload.wikimedia.org/wikipedia/commons/a/aa/Claude_Monet_-_Water_Lilies_-_1906%2C_Ryerson.jpg",
#               destfile = "water_lilies.jpg")

img2 <- load.image("water_lilies.jpg")

extract_blue <- function(painting, name) {
  as.data.frame(painting) |>
    filter(cc == 3) |>          # Blue channel only
    summarise(mean_blue = mean(value), sd_blue = sd(value)) |>
    mutate(painting = name)
}

comparison <- bind_rows(
  extract_blue(img,  "Van Gogh — Starry Night"),
  extract_blue(img2, "Monet — Water Lilies")
)

ggplot(comparison, aes(x = painting, y = mean_blue, fill = painting)) +
  geom_col(width = 0.5) +
  geom_errorbar(aes(ymin = mean_blue - sd_blue, 
                    ymax = mean_blue + sd_blue), width = 0.15) +
  scale_fill_manual(values = c("#4895ef", "#52b788")) +
  labs(title = "Average Blue Channel Intensity",
       subtitle = "Higher = stronger blue channel on average (white and bright pixels also score high)",
       x = "", y = "Mean Blue Intensity (0–1)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "#16213e", color = NA),
        panel.background = element_rect(fill = "#16213e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
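
Mean blue intensity is also high for white or very bright pixels, so a complementary measure can help. The sketch below uses a hypothetical helper, `blue_share()`, which is not part of imager. It returns the fraction of pixels whose blue channel is the strongest of the three, and it is checked here on two synthetic images.

```r
library(imager)

# Hypothetical helper: share of pixels where blue exceeds both red and green
blue_share <- function(painting) {
  px <- as.data.frame(painting, wide = "c")   # columns x, y, c.1, c.2, c.3
  mean(px$c.3 > px$c.1 & px$c.3 > px$c.2)
}

# Synthetic sanity checks
pure_blue <- imfill(10, 10, val = c(0, 0, 1))
white     <- imfill(10, 10, val = c(1, 1, 1))

blue_share(pure_blue)   # every pixel is blue-dominant
blue_share(white)       # no pixel is blue-dominant, although mean blue is 1
```

Calling `blue_share(img)` and `blue_share(img2)` on the two paintings would give a second number to set beside the mean-intensity bars.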

Why R + imager over Python?

Python strengths

  • Broader CV ecosystem (OpenCV, PIL)
  • Better GPU / deep learning support
  • Dominant in production ML

R + imager strengths

  • 🔗 Native tidyverse integration
  • 📊 as.data.frame() → instant ggplot2
  • 📈 Statistical modeling is first-class
  • 📄 Quarto reproducible reporting
  • ⚡ C++ backend (CImg) keeps core image operations fast

“If your pipeline lives in R, imager keeps it there — no context switching, no friction.”

Summary

  • ✅ imager loads and manipulates images as 4D data arrays
  • ✅ as.data.frame() bridges images and tidyverse workflows
  • ✅ Convolution filters like imgradient() reveal structural patterns
  • ✅ Color analysis lets us compare artworks quantitatively
  • ✅ The same convolution math powers CNNs in deep learning

Where to learn more:

Thank You!

Questions?

“Every painting is just a matrix waiting to be analyzed.”

library(imager)
img <- load.image("starry_night.jpg")  
plot(img, axes = FALSE)