PCA favours features with higher variance, which is why variables measured on very different scales are usually standardised first: multiplying a feature by a coefficient greater than 1 increases its influence on the components, while a coefficient less than 1 reduces it. Here, however, every column of each colour matrix is a pixel intensity on the same scale, so scaling is not necessary for image compression. Centering is skipped as well; with center=FALSE the image can be rebuilt directly from the scores and loadings without adding the column means back.
red.pca <- prcomp(red, center=FALSE, scale.=FALSE)
green.pca <- prcomp(green, center=FALSE, scale.=FALSE)
blue.pca <- prcomp(blue, center=FALSE, scale.=FALSE)
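The effect of the centering choice on reconstruction can be checked on a small synthetic matrix. The sketch below is illustrative only: `channel` is a hypothetical random stand-in for one colour matrix, not the rock image, and it assumes intensities in the 0 to 1 range.

```r
# Hypothetical stand-in for one colour channel (values in [0, 1]).
set.seed(1)
channel <- matrix(runif(20 * 8), nrow = 20, ncol = 8)

# Uncentred PCA, as in the calls above: scores %*% t(loadings) rebuilds the data.
pca_raw <- prcomp(channel, center = FALSE, scale. = FALSE)
rebuilt_raw <- pca_raw$x %*% t(pca_raw$rotation)

# Centred PCA: the column means have to be added back after projecting.
pca_ctr <- prcomp(channel, center = TRUE, scale. = FALSE)
rebuilt_ctr <- sweep(pca_ctr$x %*% t(pca_ctr$rotation), 2, pca_ctr$center, "+")

# Both maximum absolute errors should be at floating-point noise level.
max_err_raw <- max(abs(rebuilt_raw - channel))
max_err_ctr <- max(abs(rebuilt_ctr - channel))
print(c(uncentred = max_err_raw, centred = max_err_ctr))
```

With all components kept, both variants recover the matrix; skipping centering simply removes the extra step of restoring the means.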
PRINCIPAL COMPONENTS REPRESENTATION
The bar colours in the plots below identify the red, green and blue channels, and each bar shows the eigenvalue of one principal component. Only the first seven (7) principal components are displayed.
library("factoextra")
library("gridExtra")
library("ggplot2")
f1 <- fviz_eig(red.pca, choice = 'eigenvalue', main = "Red", barfill = "red", ncp = 7, addlabels = TRUE)
f2 <- fviz_eig(green.pca, choice = 'eigenvalue', main = "Green", barfill = "green", ncp = 7, addlabels = TRUE)
f3 <- fviz_eig(blue.pca, choice = 'eigenvalue', main = "Blue", barfill = "blue", ncp = 7, addlabels = TRUE)
grid.arrange(f1, f2, f3, ncol=3)

Let us now look at the percentage of variance explained by each of these principal components.
f11 <- fviz_eig(red.pca, main = "Red", barfill = "red", ncp = 7)
f22 <- fviz_eig(green.pca, main = "Green", barfill = "green", ncp = 7)
f33 <- fviz_eig(blue.pca, main = "Blue", barfill = "blue", ncp = 7)
grid.arrange(f11, f22, f33, ncol = 3)

The scree plots above show that the first principal component of each colour channel explains most of the variance in that channel, about 80%. The second principal component adds a further noticeable share, while the remaining components each explain very little. (Because the data were not centred, this variance is measured about zero rather than about the column means.)
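The quantities shown in scree plots of this kind can also be computed by hand from a prcomp object. The following is a minimal sketch on hypothetical random data (`channel` is not the real image); it assumes the eigenvalues are the squared `sdev` values, which is how prcomp results are conventionally summarised.

```r
# Hypothetical stand-in for one colour channel.
set.seed(42)
channel <- matrix(runif(50 * 10), nrow = 50, ncol = 10)
pca <- prcomp(channel, center = FALSE, scale. = FALSE)

# Eigenvalues are the squared standard deviations of the components.
eigenvalues <- pca$sdev^2

# Share of the total explained by each component, and the running total.
explained_pct <- 100 * eigenvalues / sum(eigenvalues)
cumulative_pct <- cumsum(explained_pct)

# First seven components, mirroring ncp = 7 in the plots.
summary_table <- data.frame(eigenvalue = eigenvalues, explained_pct, cumulative_pct)
print(round(head(summary_table, 7), 2))
```

The cumulative column is a convenient way to decide how many components are needed to reach a target share of the variance.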
COMPRESSION OF THE IMAGE
Here we look at how the image appears after it is compressed with different numbers of principal components. As the number of principal components increases, the reconstructed image looks more and more like the original. This progressive improvement in quality occurs because each additional principal component adds back more of the variance (information) in the original image.
library(abind)
library(jpeg)  # provides writeJPEG
# Rebuild each channel from its first i principal components and save the result.
for (i in c(10, 30, 60, 100, 150, 200, 250, 300)) {
  new_image <- abind(red.pca$x[, 1:i] %*% t(red.pca$rotation[, 1:i]),
                     green.pca$x[, 1:i] %*% t(green.pca$rotation[, 1:i]),
                     blue.pca$x[, 1:i] %*% t(blue.pca$rotation[, 1:i]),
                     along = 3)
  writeJPEG(new_image, paste0('Compressed_image_with_', i, '_components.jpg'))
}
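The rank-limited reconstruction used in the loop above can be checked numerically on a small example. This sketch uses a hypothetical random matrix and a hypothetical helper named `reconstruct`; it only illustrates that the reconstruction error shrinks as more components are kept.

```r
# Hypothetical stand-in for one colour channel.
set.seed(7)
channel <- matrix(runif(40 * 30), nrow = 40, ncol = 30)
pca <- prcomp(channel, center = FALSE, scale. = FALSE)

# Rebuild the matrix from the first k components (hypothetical helper).
# drop = FALSE keeps matrix shape even when k = 1.
reconstruct <- function(pca, k) {
  pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
}

# Root-mean-square error of the reconstruction for several values of k.
ks <- c(1, 5, 10, 20, 30)
rmse <- sapply(ks, function(k) sqrt(mean((channel - reconstruct(pca, k))^2)))
print(data.frame(components = ks, rmse = rmse))
```

On real photographs the error typically falls much faster than on random data, because neighbouring pixels are strongly correlated.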
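# Display a saved JPEG with a title, preserving its aspect ratio.
image_plot <- function(path, plot_name) {
  require('jpeg')
  img <- readJPEG(path)
  d <- dim(img)  # d[1] = height (rows), d[2] = width (columns)
  plot(0, 0, type = 'n', xlim = c(0, d[2]), ylim = c(0, d[1]), asp = 1,
       xaxt = 'n', yaxt = 'n', xlab = '', ylab = '', bty = 'n')
  title(plot_name, line = -0.5)
  rasterImage(img, 0, 0, d[2], d[1])
}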
par(mfrow = c(1, 2), mar = c(0, 0, 1, 1))  # two images side by side
for (i in c(10, 30, 60, 100, 150, 200, 250, 300)) {
  image_plot(paste0('Compressed_image_with_', i, '_components.jpg'),
             paste0(i, ' Components'))
}




As can be seen from the images above, image quality keeps improving as the number of principal components increases. It is also worth noting that the last image uses only the first 300 principal components, yet it looks almost the same as the original.
Now let us compare the file sizes in kilobytes to see whether the compressed images are significantly smaller than the original, in support of the claim that PCA-based image compression saves disk space with little loss of quality because it discards only the least significant components. Note that each compressed image is re-encoded by writeJPEG, so the sizes below also reflect JPEG's own encoding.
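library(knitr)
# One row per compressed image plus one row for the original file.
size_table <- matrix(0, 9, 3)
colnames(size_table) <- c("Number of components", "Image size (kilobytes)", "Saved Disk Space (kilobytes)")
# The text label turns this into a character matrix, hence as.numeric() below.
size_table[, 1] <- c(10, 30, 60, 100, 150, 200, 250, 300, "Original Rock image")
# Size of the original file in kilobytes; it saves nothing relative to itself.
size_table[9, 2:3] <- round(c(file.info('paintrock.jpg')$size / 1024, 0), 2)
for (i in 1:8) {
  path <- paste0('Compressed_image_with_', size_table[i, 1], '_components.jpg')
  size_table[i, 2] <- round(file.info(path)$size / 1024, 2)
  size_table[i, 3] <- round(as.numeric(size_table[9, 2]) - as.numeric(size_table[i, 2]), 2)
}
kable(size_table)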
| Number of components | Image size (kilobytes) | Saved Disk Space (kilobytes) |
|:---|:---|:---|
| 10 | 32.68 | 117.6 |
| 30 | 42.67 | 107.61 |
| 60 | 46.17 | 104.11 |
| 100 | 47.5 | 102.78 |
| 150 | 47.3 | 102.98 |
| 200 | 46.59 | 103.69 |
| 250 | 45.65 | 104.63 |
| 300 | 43.86 | 106.42 |
| Original Rock image | 150.28 | 0 |
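The figures above are JPEG file sizes on disk. A complementary way to reason about the saving is to count how many numbers must be kept when only k components are stored: the scores (n × k) plus the loadings (p × k), instead of the full n × p matrix. The sketch below is a rough illustration; `stored_fraction` is a hypothetical helper and the 600 × 800 dimensions are assumed, since the actual size of the rock image is not stated here.

```r
# Fraction of the original n x p values needed to store k components:
# scores (n x k) plus loadings (p x k), relative to n * p.
stored_fraction <- function(n, p, k) {
  k * (n + p) / (n * p)
}

# Assumed 600 x 800 channel (hypothetical dimensions).
ks <- c(10, 30, 60, 100, 150, 200, 250, 300)
fractions <- stored_fraction(600, 800, ks)
print(data.frame(components = ks, fraction_of_original = round(fractions, 3)))
```

Under this count the saving disappears once k exceeds n × p / (n + p), which is one reason to keep the number of components well below the image dimensions.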
CONCLUSION
Image compression with principal component analysis reduced the file size by about 70% with little or no visible loss in image quality. Besides saving disk space, the smaller files are easier and more efficient to transmit between locations.
Other areas where PCA-based image compression can be applied include pattern recognition and the processing of digital images in medicine, among others.
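The percentage saved can be derived directly from the sizes reported in the table. The sketch below simply re-enters those reported values by hand (the variable names are illustrative) and expresses each saving as a share of the original file size.

```r
# Sizes in kilobytes, copied from the table above.
original_kb <- 150.28
components <- c(10, 30, 60, 100, 150, 200, 250, 300)
compressed_kb <- c(32.68, 42.67, 46.17, 47.5, 47.3, 46.59, 45.65, 43.86)

# Saving for each compressed image as a percentage of the original size.
saved_pct <- round(100 * (original_kb - compressed_kb) / original_kb, 1)
print(data.frame(components, compressed_kb, saved_pct))
```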
REFERENCES
https://royalsocietypublishing.org/doi/10.1098/rsta.2015.0202
https://www.uniassignment.com/essay-samples/information-technology/the-need-for-image-compression-inforrmation-technology-essay.php?vref=1
https://kidsactivitiesblog.com/112656/rock-painting-ideas/?utm_source=pocket_mylist