Dimensionality reduction is a machine learning (ML) or statistical technique for reducing the number of random variables in a problem by obtaining a set of principal variables.
There are four common ways of reducing dimensionality: Principal Component Analysis (PCA), Factor Analysis (FA), Linear Discriminant Analysis (LDA) and Truncated Singular Value Decomposition (SVD). In this project I will use PCA to reduce the size of an image.
First, we load the necessary packages.
library(jpeg)
library(factoextra)
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(gridExtra)
library(ggplot2)
library(magick)
## Linking to ImageMagick 6.9.12.3
## Enabled features: cairo, freetype, fftw, ghostscript, heic, lcms, pango, raw, rsvg, webp
## Disabled features: fontconfig, x11
library(png)
library(Metrics)
library(knitr)
library(imgpalr)
library(abind)
Next we read the image and open an empty plot to draw it in.
garfield <- readJPEG("C:/Users/user/Documents/garfield.jpg")
plot(1, type="n")
rasterImage(garfield, 0.5, 0.5, 1.5, 1.5)
The image contains several colors, mainly various tones of orange. We can convert it to greyscale by summing the three color channels and rescaling by the maximum.
garfield.sum<-garfield[,,1]+garfield[,,2]+garfield[,,3]
garfield.bw<-garfield.sum/max(garfield.sum)
plot(1, type="n")
rasterImage(garfield.bw, 0.5, 0.5, 1.5, 1.5)
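The greyscale conversion above can be checked on a tiny array. This is a minimal sketch on a hypothetical 2 x 2 image (the array img and its values are made up for illustration). It also shows a weighted alternative using the Rec. 601 luma weights, which is an assumption added here for comparison and not the method used in this project.

```r
# Toy 2 x 2 RGB image (hypothetical values in [0, 1]); dims: rows x cols x channels
img <- array(c(1.0, 0.2, 0.5, 0.0,   # red channel
               0.5, 0.2, 0.5, 0.0,   # green channel
               0.0, 0.2, 0.5, 1.0),  # blue channel
             dim = c(2, 2, 3))

# Approach used above: sum the channels, then rescale by the maximum
gray_sum <- img[, , 1] + img[, , 2] + img[, , 3]
gray_sum <- gray_sum / max(gray_sum)

# Alternative (an assumption, not the method used above): Rec. 601 luma weights
w <- c(0.299, 0.587, 0.114)
gray_luma <- w[1] * img[, , 1] + w[2] * img[, , 2] + w[3] * img[, , 3]

range(gray_sum)    # stays within [0, 1]
range(gray_luma)   # stays within [0, 1] because the weights sum to 1
```

Both results are single matrices that rasterImage() can draw directly as a greyscale picture.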
To apply Principal Component Analysis (PCA) to an image, we run a separate PCA on each of the R, G, and B color channels. Each PCA gives us a set of component vectors ordered by how much variation they explain: the first vector captures most of the variation, and each later vector slightly less. A compressed channel is then rebuilt by multiplying the first few columns of the “x” component of the PCA by the transpose of the same columns of the “rotation” component. Since we are working on matrices, each color channel (R, G, B) gets its own matrix and its own PCA.
r<-garfield[,,1] # individual matrix of R color component
g<-garfield[,,2] # individual matrix of G color component
b<-garfield[,,3] # individual matrix of B color component
r.pca<-prcomp(r, center=FALSE, scale.=FALSE) # PCA for r
g.pca<-prcomp(g, center=FALSE, scale.=FALSE) # PCA for g
b.pca<-prcomp(b, center=FALSE, scale.=FALSE) # PCA for b
Then we collect the three PCA results in one list.
rgb.pca<-list(r.pca, g.pca, b.pca)
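The reconstruction described above, multiplying the “x” and “rotation” components, can be tried on a small matrix first. The following is a minimal sketch on hypothetical random data (the matrix m, its size, and the value of k are assumptions for illustration, not values taken from the image).

```r
set.seed(1)                              # reproducible toy data
m <- matrix(runif(20 * 10), nrow = 20)   # hypothetical 20 x 10 "channel"

p <- prcomp(m, center = FALSE, scale. = FALSE)

# Using all 10 components recovers the matrix (up to floating-point error)
full <- p$x %*% t(p$rotation)
max(abs(full - m))

# Using only the first k components gives a rank-k approximation
k <- 3
approx_k <- p$x[, 1:k] %*% t(p$rotation[, 1:k])
mean((approx_k - m)^2)                   # reconstruction error for k = 3
```

Because center=FALSE is used, no column means have to be added back after the multiplication. With centering switched on, the means stored in the “center” component would need to be restored as well.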
At this step we can look at how much each dimension contributes. The scree plots below show the percentage of explained variance for the first five dimensions of each color channel.
f1<-fviz_eig(r.pca, main="red", barfill="red", ncp=5, addlabels=TRUE) # scree plot for the R channel
f2<-fviz_eig(g.pca, main="green", barfill="green", ncp=5, addlabels=TRUE) # scree plot for the G channel
f3<-fviz_eig(b.pca, main="blue", barfill="blue", ncp=5, addlabels=TRUE) # scree plot for the B channel
grid.arrange(f1, f2, f3, ncol=3)
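The percentages in a scree plot can also be computed by hand from the “sdev” component of a prcomp result. This is a minimal sketch on hypothetical random data (the matrix m is an assumption for illustration). To my understanding it matches what fviz_eig() plots by default, but treat that as an assumption rather than a statement about the package internals.

```r
set.seed(2)                              # reproducible toy data
m <- matrix(rnorm(50 * 6), nrow = 50)    # hypothetical 50 x 6 data matrix
p <- prcomp(m, center = FALSE, scale. = FALSE)

eig <- p$sdev^2                          # variance attributed to each component
pct <- 100 * eig / sum(eig)              # share of the total, in percent
round(pct, 1)
cumsum(pct)                              # cumulative share; reaches 100 at the end
```

The cumulative share is a convenient way to choose how many components to keep, for example the smallest number that reaches a chosen percentage.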
In addition, we can show how the image changes from a blurry to a clear version. For this we reconstruct the same image nine times, each time with a different number of principal components, so that you can see the differences between the images.
vec<-round(seq.int(3, nrow(garfield), length.out=9)) # whole numbers of components, so file names match what is used
for(i in vec){
  # rebuild each color channel from its first i principal components
  garfield.pca<-sapply(rgb.pca, function(j) {
    new.RGB<-j$x[,1:i] %*% t(j$rotation[,1:i])}, simplify="array")
  assign(paste("garfield_", round(i,0), sep=""), garfield.pca) # saving as object
  writeJPEG(garfield.pca, paste("garfield_", round(i,0), "_princ_comp.jpg", sep=""))
}
par(mfrow=c(3,3))
par(mar=c(1,1,1,1))
for(k in vec){
  # draw the reconstruction saved for this number of components
  plot(image_read(get(paste("garfield_", round(k,0), sep=""))))
}
As you can see, the images become clearer from the first to the last.
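The trade-off between clarity and storage can be sketched with simple arithmetic. Keeping k components of one channel means keeping k columns of “x” and k columns of “rotation”. The sketch below counts those numbers for a few values of k. The helper stored_values() and the image width n_col are hypothetical and only serve the illustration. The count refers to the PCA factors, not to the size of a JPEG file.

```r
# Hypothetical channel dimensions; the width is an assumption for illustration
n_row <- 720
n_col <- 1280

# Numbers needed to keep k components: k columns of x plus k columns of rotation
stored_values <- function(k, n_row, n_col) k * (n_row + n_col)

k <- c(3, 93, 362, 720)
data.frame(
  k          = k,
  stored     = stored_values(k, n_row, n_col),
  full       = n_row * n_col,
  proportion = round(stored_values(k, n_row, n_col) / (n_row * n_col), 3)
)
```

For small k the factors take far less space than the full matrix, while for large k they take more, so the saving comes only from truncating to few components.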
We can also create our own palette from this picture. At first glance we mostly see different tones of orange; below we extract the colors exactly.
colors1<-image_pal("C:/Users/user/Documents/garfield.jpg", n=8, type="div", saturation=c(0.75, 1), brightness=c(0.75, 1), plot=TRUE)
colors2<-image_pal("C:/Users/user/Documents/garfield.jpg", n=11, type="seq", k=2, saturation=c(0.5, 1), brightness=c(0.25, 1), seq_by="hsv", plot=TRUE)
The main goal of this project is to reduce the size of the image. Below we look at the results of compression with principal components. For this we need the package “Metrics”, which lets us compare the Mean Squared Error (MSE) of each reconstruction.
library(Metrics)
sizes<-matrix(0, nrow=9, ncol=4)
colnames(sizes)<-c("Number of PC", "Photo size", "Compression ratio", "MSE-Mean Squared Error")
sizes[,1]<-round(vec,0)
for(i in 1:9) {
path<-paste("garfield_", round(vec[i],0), "_princ_comp.jpg", sep="")
sizes[i,2]<-file.info(path)$size
garfield_mse<-readJPEG(path)
sizes[i,4]<-mse(garfield, garfield_mse) # from Metrics::
}
sizes[,3]<-round(as.numeric(sizes[,2])/as.numeric(sizes[9,2]),3)
sizes
##      Number of PC Photo size Compression ratio MSE-Mean Squared Error
## [1,]            3      89664             0.630           0.0509757748
## [2,]           93     159200             1.119           0.0064841413
## [3,]          182     161350             1.134           0.0046093539
## [4,]          272     156259             1.098           0.0033688492
## [5,]          362     150336             1.056           0.0023020855
## [6,]          451     147357             1.035           0.0015079893
## [7,]          541     146623             1.030           0.0010093820
## [8,]          630     144379             1.015           0.0006405192
## [9,]          720     142313             1.000           0.0005139943
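The table shows the number of principal components, the file size in bytes, the compression ratio, and the MSE for each reconstruction. The compression ratio is the file size divided by the size of the 720-component file, and the Mean Squared Error measures the difference between the original and the compressed photo. The MSE falls steadily from about 0.051 with 3 components to about 0.0005 with 720, so the image becomes clearer as components are added. The file size, however, is clearly reduced only for the 3-component version (ratio 0.630); the versions with 93 to 630 components are slightly larger than the 720-component file (ratios between 1.015 and 1.134).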
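The Mean Squared Error itself is simple to compute by hand, which makes the last column of the table easier to read. This is a minimal sketch on hypothetical pixel values (the two small matrices are made up for illustration). It uses base R only and follows the same definition as mse() from Metrics: the mean of the squared differences.

```r
# Hypothetical pixel values for an original and a reconstructed image
original      <- matrix(c(0.10, 0.50, 0.90, 0.30), nrow = 2)
reconstructed <- matrix(c(0.12, 0.47, 0.85, 0.30), nrow = 2)

# MSE: average of the squared pixel-wise differences
mse_manual <- mean((original - reconstructed)^2)
mse_manual
```

Since the pixel values lie between 0 and 1, an MSE close to 0 means the reconstruction is close to the original.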