Preamble

Principal Component Analysis (PCA) is a linear dimensionality reduction technique that reduces the number of dimensions in the data while retaining as much of the variability of the original data as possible.
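As a minimal sketch of this idea, the snippet below runs prcomp on a small synthetic dataset and reports how much of the total variance the leading components retain. The data and the names toy, toy.pca and cum_var are made up for illustration and are not part of the crane analysis.

```r
# Toy data (assumption: synthetic, for illustration only).
# Three columns that are all driven by one hidden variable plus a little noise.
set.seed(1)
n <- 200
latent <- rnorm(n)
toy <- cbind(
  a = latent + rnorm(n, sd = 0.1),
  b = 2 * latent + rnorm(n, sd = 0.1),
  c = -latent + rnorm(n, sd = 0.1)
)

toy.pca <- prcomp(toy, center = TRUE, scale. = FALSE)

# Share of variance carried by each component, and the running total.
var_explained <- toy.pca$sdev^2 / sum(toy.pca$sdev^2)
cum_var <- cumsum(var_explained)
print(round(cum_var, 4))
```

Because the three columns are nearly redundant, the first component alone should carry almost all of the variance, which is the situation PCA exploits.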

In image compression, PCA reduces the size of an image (in this case, the crane image) while preserving as much of the photo's quality as possible. PCA re-expresses the image in terms of a smaller number of components while maintaining most of its visual quality.

Images are compressed for various reasons: to reduce the space they take up on storage devices, to make them easier to upload and send by email, on social media platforms and the like, and because sites with high-resolution photos take longer to load.

library("jpeg")
library("png")
library("magick")

crane <- readJPEG("crane.jpg")
image_read(crane)

##Original crane size 
## [1] 1600  1207

PCA Introduction

PCA is performed separately on the red, green and blue color channels, abbreviated as r, g and b respectively. We use these colors simply because they are the primary colors and thus appear in every image. The crane image has a width of 1600 pixels and a height of 1207 pixels, and occupies 305.61 kilobytes.

  r<-crane[,,1] 
  g<-crane[,,2]
  b<-crane[,,3]

PCA concentrates most of the variance within the image into a few components, so keeping only those components tones down its size (in kilobytes). In the case of an image, all variables are equally important, and therefore the center and scale. arguments are both set to FALSE. Centering and scaling would change the original pixel values of the image, and yet the sole objective is to compress it.

  r.pca<-prcomp(r, center=FALSE, scale.=FALSE) 
  g.pca<-prcomp(g, center=FALSE, scale.=FALSE)
  b.pca<-prcomp(b, center=FALSE, scale.=FALSE)
  rgb.pca<-list(r.pca, g.pca, b.pca)
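One practical consequence of center=FALSE can be sketched on a toy matrix: without centering, multiplying the scores by the transposed loadings gives the matrix back directly, whereas with centering the column means have to be added back afterwards. The matrix m below is a made-up stand-in for a color channel, and all names in the snippet are illustrative assumptions.

```r
# Toy "channel" (assumption: a small random matrix standing in for r, g or b).
set.seed(42)
m <- matrix(runif(6 * 4), nrow = 6, ncol = 4)

# Without centering, scores times transposed loadings give the matrix back.
p_raw <- prcomp(m, center = FALSE, scale. = FALSE)
rec_raw <- p_raw$x %*% t(p_raw$rotation)

# With centering, the column means must be added back after the product.
p_cen <- prcomp(m, center = TRUE, scale. = FALSE)
rec_cen <- sweep(p_cen$x %*% t(p_cen$rotation), 2, p_cen$center, "+")

# Largest absolute reconstruction error in each case.
err_raw <- max(abs(rec_raw - m))
err_cen <- max(abs(rec_cen - m))
print(c(uncentered = err_raw, centered = err_cen))
```

Both errors should be at the level of floating-point rounding when all components are kept; the uncentered form simply needs one step fewer.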

Principal components representation

For the crane image, I chose 9 principal components so as to retain about 95% of the variance during compression, while the eigenvalues indicate how much of each color's variation is represented in the crane image.

  library(FactoMineR)
  library(factoextra)
  library(gridExtra)
  p1<-fviz_eig(r.pca, choice = 'eigenvalue', main="Red", barfill="red", ncp=9, addlabels=TRUE)
  p2<-fviz_eig(g.pca, choice = 'eigenvalue', main="Green", barfill="green", ncp=9, addlabels=TRUE)
  p3<-fviz_eig(b.pca, choice = 'eigenvalue', main="Blue", barfill="blue", ncp=9, addlabels=TRUE)
  
  grid.arrange(p1, p2, p3, ncol=3)

The eigenvalue plots above suggest that green dominates the first dimension, followed by red and blue, whereas blue dominates the second dimension.

Explained variance in percentages

pa <- fviz_eig(r.pca, main = "Red", barfill = "red", ncp = 9)
pb <- fviz_eig(g.pca, main = "Green", barfill = "green", ncp = 9)
pc <- fviz_eig(b.pca, main = "Blue", barfill = "blue", ncp = 9)

grid.arrange(pa, pb, pc, ncol = 3)
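The percentages in these plots can also be computed by hand from the sdev element of a prcomp object. The sketch below does this on a synthetic matrix and looks up the smallest number of components that reaches a target share; components_for_variance is a hypothetical helper written for this illustration, not a function from FactoMineR or factoextra.

```r
# Hypothetical helper: smallest number of components whose cumulative share
# of sdev^2 reaches `target`. Works on any prcomp object.
components_for_variance <- function(pca, target = 0.95) {
  share <- pca$sdev^2 / sum(pca$sdev^2)
  which(cumsum(share) >= target)[1]
}

# Toy stand-in for one color channel
# (assumption: a synthetic rank-one pattern plus a little noise).
set.seed(7)
toy_channel <- matrix(runif(40), 40, 1) %*% matrix(runif(30), 1, 30) +
  matrix(rnorm(40 * 30, sd = 0.01), 40, 30)
toy_pca <- prcomp(toy_channel, center = FALSE, scale. = FALSE)

k95 <- components_for_variance(toy_pca, 0.95)
print(k95)
```

Assuming the objects defined earlier, the same helper could be called as components_for_variance(r.pca), and likewise for g.pca and b.pca. Note that with center=FALSE the quantities in sdev^2 are measured about zero rather than about the column means, so the shares are best read as approximate.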

Crane image compression

Using the principal components, we compress the crane image at several levels and obtain a variety of images: some look flat, some look fair, and others look better but at the cost of a larger file size. This is the trade-off in compressing an image while trying to keep its quality constant.

library(abind)
library(ggplot2)
for (i in c(100,270,400,550,750,810)) {
  new_image <- abind(r.pca$x[,1:i] %*% t(r.pca$rotation[,1:i]),
                     g.pca$x[,1:i] %*% t(g.pca$rotation[,1:i]),
                     b.pca$x[,1:i] %*% t(b.pca$rotation[,1:i]),
                     along = 3)
  writeJPEG(new_image, paste0('Compressed_image_with_',i, '_components.jpg'))
}

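image_plot <- function(path, plot_name) {
  require('jpeg')
  img <- readJPEG(path)
  d <- dim(img)  # d[1] = height (rows), d[2] = width (columns)
  # asp = 1 keeps the image's proportions instead of stretching it to a square
  plot(0,0,type='n',asp=1,xlim=c(0,d[2]),ylim=c(0,d[1]),xaxt='n',yaxt='n',xlab='',ylab='',bty='n')
  title(plot_name, line = -0.35)
  rasterImage(img,0,0,d[2],d[1])
}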

par(mfrow = c(1,1), mar = c(0,0,2,2))
for (i in c(100,270,400,550,750,810)) {
  image_plot(paste0('Compressed_image_with_',i, '_components.jpg'), 
             paste0(round(i,0), ' Components'))
}

The photos displayed above show how the number of principal components determines the quality of the image compared to the original. It is evident that more components yield a better image; however, even the best-quality photo still uses fewer components than the original crane image.
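The link between the number of components and image quality can also be put in numbers. As a rough sketch, the snippet below rebuilds a synthetic grayscale matrix from its first k components and measures the mean squared error against the original; the matrix img and the helper reconstruct are assumptions made for this illustration.

```r
# Toy grayscale "image" (assumption: synthetic values in [0, 1]).
set.seed(3)
img <- matrix(runif(50 * 40), nrow = 50, ncol = 40)
pca <- prcomp(img, center = FALSE, scale. = FALSE)

# Hypothetical helper: rebuild the matrix from the first k components.
reconstruct <- function(pca, k) {
  pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
}

# Mean squared error of the reconstruction for an increasing number of components.
ks <- c(5, 10, 20, 40)
mse <- sapply(ks, function(k) mean((img - reconstruct(pca, k))^2))
print(data.frame(components = ks, mse = mse))
```

The error should shrink as k grows and vanish, up to rounding, once all 40 components are used. The same measure could be applied to each channel of the crane image to compare the compressed versions more objectively than by eye.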

Below is a comparison of the size of the crane image for each number of components and the amount of space freed as a result of compression.

library(knitr)

table <- matrix(0,7,3)
colnames(table) <- c("Number of components", "Crane size in kilobytes", "freed Space in kilobytes")
table[,1] <- c(100,270,400,550,750,810,"Original crane image")
table[7,2:3] <- round(c(file.info('crane.jpg')$size/1024, 0),2)
for (i in c(1:6)) {
  path <- paste0('Compressed_image_with_',table[i,1], '_components.jpg')
  table[i,2] <- round(file.info(path)$size/1024,2)
  table[i,3] <- round(as.numeric(table[7,2]) - as.numeric(table[i,2]),2)
}

kable(table)
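| Number of components | Crane size in kilobytes | freed Space in kilobytes |
|:---------------------|:------------------------|:-------------------------|
| 100                  | 269.28                  | 36.33                    |
| 270                  | 297.05                  | 8.56                     |
| 400                  | 294.7                   | 10.91                    |
| 550                  | 292.15                  | 13.46                    |
| 750                  | 286.03                  | 19.58                    |
| 810                  | 283                     | 22.61                    |
| Original crane image | 305.61                  | 0                        |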

PCA lets one try as many different numbers of components as one wishes in order to arrive at a better-looking image. At 810 components, 22.61 kilobytes of space were freed.
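Separately from the JPEG file sizes reported above, one way to reason about the choice of components is to count the numbers a k-component representation would need per channel: k columns of scores plus k columns of loadings, against the full grid of pixels. The sketch below does this arithmetic for the crane's dimensions as stated in the text; it is a back-of-the-envelope count under that assumption, not a measurement of file size.

```r
# Dimensions of the crane image as reported in the text.
h <- 1207
w <- 1600
ks <- c(100, 270, 400, 550, 750, 810)

# Numbers stored per channel: h*k scores plus w*k loadings, versus h*w pixels.
stored <- ks * (h + w)
ratio <- stored / (h * w)
print(data.frame(components = ks, stored = stored, ratio = round(ratio, 3)))

# Largest k for which the component form is still smaller than the raw matrix.
break_even <- floor((h * w) / (h + w))
print(break_even)
```

A ratio above 1 means the scores and loadings together would take more numbers than the pixels themselves. The files written in this analysis are JPEG encodings of the reconstructed image, so their sizes follow the JPEG encoder rather than this count.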

In conclusion, with the use of PCA in image compression, it is evident that you can reduce the size of an image without noticeably compromising its quality. This subsequently helps in freeing up space on physical devices and other storage platforms.