Image Processing & Art Analysis in R
2026-05-21
What is imager?
Why imager for Art Analysis?
Load a .jpg, .png, or .bmp image of a painting or artwork.
“imager turns a painting into data, and data back into insight.”
| Function | What it does |
|---|---|
| load.image() | Load any image file |
| grayscale() | Convert to grayscale |
| as.data.frame() | Convert pixels to a tidy data frame |
| imresize() | Resize an image |
| imgradient() | Compute edge gradients |
| convolve() | Apply a convolution filter |
| imsplit() | Split into color channels (R, G, B) |
| plot() | Display the image |
We’ll use a public domain painting downloaded from Wikimedia Commons.
library(imager)
library(ggplot2)
library(dplyr)
# Load a painting (Van Gogh - Starry Night, saved locally)
# download.file("https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg/1280px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg",
# destfile = "starry_night.jpg", mode = "wb") # "wb" keeps the binary file intact on Windows
img <- load.image("starry_night.jpg")
# Basic info
print(img)
Image. Width: 1280 pix Height: 1014 pix Depth: 1 Colour channels: 3
Width: 1280 | Height: 1014 | Channels: 3
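The four numbers printed above (width, height, depth, colour channels) are the dimensions of a 4D numeric array. As a rough illustration that needs neither imager nor an image file, the sketch below builds a small random array with the same layout in base R and flattens it by hand into the long x, y, cc, value shape that as.data.frame() returns for a colour image. The names toy, red, and long are hypothetical stand-ins, not imager objects.

```r
# Base R stand-in for an imager image: dims are x (width), y (height),
# z (depth, 1 for a still image), cc (colour channel).
set.seed(1)
toy <- array(runif(4 * 3 * 1 * 3), dim = c(4, 3, 1, 3))
dim(toy)

# A single channel is an ordinary width-by-height matrix
red <- toy[, , 1, 1]

# Long format by hand: one row per pixel per channel.
# expand.grid() varies x fastest, matching R's column-major array order.
long <- cbind(
  expand.grid(x = 1:4, y = 1:3, cc = 1:3),
  value = as.vector(toy[, , 1, ])
)
head(long, 3)
nrow(long)  # 4 * 3 * 3 = 36 rows
```

Each row of long is one channel value of one pixel, which is why the later code filters on cc to pick out red, green, or blue.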
Turn pixel data into a tidy data frame and extract dominant colors.
# Convert image to data frame (one row per pixel)
df <- as.data.frame(img) |>
  filter(cc <= 3) |> # Keep only R, G, B channels
  mutate(channel = case_when(
    cc == 1 ~ "Red",
    cc == 2 ~ "Green",
    cc == 3 ~ "Blue"
  ))
# Plot color channel distributions
ggplot(df, aes(x = value, fill = channel)) +
  geom_density(alpha = 0.6, color = NA) +
  scale_fill_manual(values = c("Red" = "#e63946",
                               "Green" = "#52b788",
                               "Blue" = "#4895ef")) +
  labs(title = "Pixel Color Distribution: The Starry Night",
       x = "Pixel Intensity (0–1)", y = "Density",
       fill = "Channel") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
# Sample pixels and build dominant color swatches using k-means
library(tidyr)
# Get RGB values per pixel
rgb_df <- as.data.frame(img) |>
  filter(cc <= 3) |>
  pivot_wider(names_from = cc, values_from = value,
              names_prefix = "ch") |>
  rename(R = ch1, G = ch2, B = ch3) |>
  na.omit()
# Sample 5000 pixels for speed
set.seed(42)
sample_px <- rgb_df |> sample_n(5000)
# K-means clustering to find 6 dominant colors
km <- kmeans(sample_px[, c("R","G","B")], centers = 6, nstart = 10)
# Build swatch data frame
swatches <- as.data.frame(km$centers) |>
  mutate(cluster = row_number(),
         hex = rgb(R, G, B),
         count = km$size)
# Plot swatches
ggplot(swatches, aes(x = reorder(cluster, -count), y = count, fill = hex)) +
  geom_col(width = 0.85, color = "white", linewidth = 0.3) +
  scale_fill_identity() +
  labs(title = "6 Dominant Colors: The Starry Night",
       x = "Color Cluster", y = "Sampled Pixel Count") +
  theme_minimal(base_size = 13) +
  theme(plot.background = element_rect(fill = "#1a1a2e", color = NA),
        panel.background = element_rect(fill = "#1a1a2e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
Convolution filters are the math behind how CNNs “see” structure in images.
# Convert to grayscale first
img_gray <- grayscale(img)
# Compute image gradient (Sobel-style edge detection)
grad <- imgradient(img_gray, "xy")
# Magnitude of gradient = edge strength
edges <- with(grad, sqrt(x^2 + y^2))
# Plot original (grayscale) vs edges
plot(img_gray, axes = FALSE, main = "Grayscale")
plot(edges, axes = FALSE, main = "Edge Detection")
A convolution filter slides a small matrix (kernel) across every pixel and computes a weighted sum of neighbors.
Sobel kernel (horizontal gradient, which highlights vertical edges):
\[K = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}\]
The first layer of a CNN like ResNet or VGG performs this same sliding weighted-sum operation, with kernel weights that are learned during training rather than fixed by hand.
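To make the sliding weighted sum concrete, here is a minimal base R sketch that applies the kernel K above to a tiny synthetic image. The helper apply_kernel() and the matrix toy_img are hypothetical names introduced for illustration; they are not part of imager. The loop does not flip the kernel, so strictly speaking it computes a cross-correlation, which is also what CNN layers compute; a textbook convolution would flip the kernel first and reverse the sign of this particular response.

```r
# Slide a kernel over a matrix and take the weighted sum at each position.
# Only positions where the kernel fits entirely are kept (no padding).
apply_kernel <- function(m, k) {
  kr <- nrow(k)
  kc <- ncol(k)
  out <- matrix(0, nrow(m) - kr + 1, ncol(m) - kc + 1)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      patch <- m[i:(i + kr - 1), j:(j + kc - 1)]
      out[i, j] <- sum(patch * k)  # weighted sum of the neighbourhood
    }
  }
  out
}

sobel_x <- matrix(c(-1, 0, 1,
                    -2, 0, 2,
                    -1, 0, 1), nrow = 3, byrow = TRUE)

# Toy 5x5 image: three dark columns (0) then two bright columns (1),
# i.e. a single vertical edge
toy_img <- matrix(rep(c(0, 0, 0, 1, 1), each = 5), nrow = 5)

response_vertical <- apply_kernel(toy_img, sobel_x)
response_horizontal <- apply_kernel(t(toy_img), sobel_x)  # same edge, rotated

response_vertical
response_horizontal
```

On the toy image the response is 0 over the flat dark region and 4 wherever the window straddles the dark-to-bright boundary. For the transposed image, whose only edge runs horizontally, the response is 0 everywhere.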
Does Monet use more blues than Van Gogh? Let’s check.
# Download Monet's Water Lilies
# download.file("https://upload.wikimedia.org/wikipedia/commons/a/aa/Claude_Monet_-_Water_Lilies_-_1906%2C_Ryerson.jpg",
# destfile = "water_lilies.jpg", mode = "wb") # "wb" keeps the binary file intact on Windows
img2 <- load.image("water_lilies.jpg")
extract_blue <- function(painting, name) {
  as.data.frame(painting) |>
    filter(cc == 3) |> # Blue channel only
    summarise(mean_blue = mean(value), sd_blue = sd(value)) |>
    mutate(painting = name)
}
comparison <- bind_rows(
  extract_blue(img, "Van Gogh: Starry Night"),
  extract_blue(img2, "Monet: Water Lilies")
)
ggplot(comparison, aes(x = painting, y = mean_blue, fill = painting)) +
  geom_col(width = 0.5) +
  geom_errorbar(aes(ymin = mean_blue - sd_blue,
                    ymax = mean_blue + sd_blue), width = 0.15) +
  scale_fill_manual(values = c("#4895ef", "#52b788")) +
  labs(title = "Average Blue Channel Intensity",
       subtitle = "Higher = stronger blue channel on average",
       x = "", y = "Mean Blue Intensity (0–1)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "#16213e", color = NA),
        panel.background = element_rect(fill = "#16213e", color = NA),
        text = element_text(color = "white"),
        axis.text = element_text(color = "grey80"),
        panel.grid = element_line(color = "#333355"))
Why imager over Python?
Python strengths
R + imager strengths
as.data.frame() → instant ggplot2
“If your pipeline lives in R, imager keeps it there: no context switching, no friction.”
imager loads and manipulates images as 4D data arrays
as.data.frame() bridges images and tidyverse workflows
Filters such as imgradient() reveal structural patterns
Where to learn more:
install.packages("imager")
“Every painting is just a matrix waiting to be analyzed.”
R Package of the Day — imager