Introduction

This analysis aims to recognize flower images. The dataset is obtained from Kaggle: https://www.kaggle.com/alxmamaev/flowers-recognition

“The pictures are divided into five classes: chamomile, tulip, rose, sunflower, dandelion. For each class there are about 800 photos. Photos are not high resolution, about 320x240 pixels. Photos are not reduced to a single size, they have different proportions.”
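The startup messages below come from loading the packages used throughout this analysis. A minimal sketch of the library calls, with the package set inferred from those messages and from the functions used later:

```r
# Packages inferred from the startup messages below: magick prints the
# ImageMagick linking message, and lime must be loaded before tidyverse
# because dplyr::explain() masks lime::explain()
library(keras)      # model definition and training
library(lime)       # model explanations
library(magick)     # image handling
library(tidyverse)  # data wrangling and plotting
```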

## Linking to ImageMagick 6.9.9.14
## Enabled features: cairo, freetype, fftw, ghostscript, lcms, pango, rsvg, webp
## Disabled features: fontconfig, x11
## -- Attaching packages ----------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0     v purrr   0.3.3
## v tibble  3.0.0     v dplyr   0.8.5
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts -------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::explain() masks lime::explain()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()

Loading Images

In this step, we will scale the pixel values; we will not augment the data.

Next, using flow_images_from_directory(), I will load the images from the directory defined earlier into memory and resize them.
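A sketch of this step with the keras R interface; the directory path and batch size are assumptions, the 64x64 target size matches the input shape in the model summary further down, and passing classes explicitly is one way to reproduce the class-to-index mapping printed below:

```r
# Rescale pixel values to [0, 1]; no augmentation is applied
train_datagen <- image_data_generator(rescale = 1/255)

# "flowers/" and the batch size are assumptions
train_gen <- flow_images_from_directory(
  directory   = "flowers/",
  generator   = train_datagen,
  classes     = c("Tulip", "Sunflower", "Rose", "Dandelion", "Daisy"),
  target_size = c(64, 64),
  batch_size  = 32,
  class_mode  = "categorical"
)

# Class counts and the class-to-index mapping printed below
table(train_gen$classes)
train_gen$class_indices
```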

## 
##   0   1   2   3   4 
## 884 654 696 942 692
## $Tulip
## [1] 0
## 
## $Sunflower
## [1] 1
## 
## $Rose
## [1] 2
## 
## $Dandelion
## [1] 3
## 
## $Daisy
## [1] 4

Define Model

Model used: a simple sequential convolutional neural network with the following hidden layers: two convolutional layers, one pooling layer, and one dense layer.

First, we will initialise the model.

Then we will add layers.
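A sketch of the architecture, reconstructed from the model summary printed further down; the layer order, filter counts, and the 64x64x3 input follow from the summary, while the 3x3 kernel sizes, the ReLU choices, and the dropout rates are assumptions:

```r
model <- keras_model_sequential() %>%
  # 32 filters; "same" padding keeps the 64x64 spatial size (896 params)
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), padding = "same",
                input_shape = c(64, 64, 3)) %>%
  layer_activation("relu") %>%
  # second convolution with 16 filters (4,624 params)
  layer_conv_2d(filters = 16, kernel_size = c(3, 3), padding = "same") %>%
  layer_activation_leaky_relu() %>%
  layer_batch_normalization() %>%
  # pooling halves the spatial size to 32x32
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%   # rate is an assumption
  layer_flatten() %>%
  layer_dense(units = 100) %>%
  layer_activation("relu") %>%
  layer_dropout(rate = 0.5) %>%    # rate is an assumption
  layer_dense(units = 5) %>%
  layer_activation("softmax")
```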

Here I will use fit_generator to train the model.
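A sketch of the compile and training calls; the optimizer, loss, and number of epochs are assumptions:

```r
model %>% compile(
  loss      = "categorical_crossentropy",
  optimizer = "adam",   # optimizer choice is an assumption
  metrics   = "accuracy"
)

history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = ceiling(train_gen$n / train_gen$batch_size),
  epochs          = 30  # epoch count is an assumption
)

# Plotting the training history produces the geom_smooth() message below
plot(history)
```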

## `geom_smooth()` using formula 'y ~ x'

Looking at the graph, we can conclude that the model is reasonably accurate.

## Model
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d (Conv2D)                     (None, 64, 64, 32)              896         
## ________________________________________________________________________________
## activation (Activation)             (None, 64, 64, 32)              0           
## ________________________________________________________________________________
## conv2d_1 (Conv2D)                   (None, 64, 64, 16)              4624        
## ________________________________________________________________________________
## leaky_re_lu (LeakyReLU)             (None, 64, 64, 16)              0           
## ________________________________________________________________________________
## batch_normalization (BatchNormaliza (None, 64, 64, 16)              64          
## ________________________________________________________________________________
## max_pooling2d (MaxPooling2D)        (None, 32, 32, 16)              0           
## ________________________________________________________________________________
## dropout (Dropout)                   (None, 32, 32, 16)              0           
## ________________________________________________________________________________
## flatten (Flatten)                   (None, 16384)                   0           
## ________________________________________________________________________________
## dense (Dense)                       (None, 100)                     1638500     
## ________________________________________________________________________________
## activation_1 (Activation)           (None, 100)                     0           
## ________________________________________________________________________________
## dropout_1 (Dropout)                 (None, 100)                     0           
## ________________________________________________________________________________
## dense_1 (Dense)                     (None, 5)                       505         
## ________________________________________________________________________________
## activation_2 (Activation)           (None, 5)                       0           
## ================================================================================
## Total params: 1,644,589
## Trainable params: 1,644,557
## Non-trainable params: 32
## ________________________________________________________________________________

Making Predictions and Explanations

For demonstration purposes, we will take one image from each category to see the predictions.
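A sketch of selecting these examples; the file names below are placeholders, since the actual paths are not shown in the output:

```r
# One example image per class; file names are placeholders
img_paths <- c("flowers/Tulip/tulip_example.jpg",
               "flowers/Sunflower/sunflower_example.jpg",
               "flowers/Rose/rose_example.jpg",
               "flowers/Dandelion/dandelion_example.jpg",
               "flowers/Daisy/daisy_example.jpg")
```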

Superpixels

I will plot the superpixels using the lime package. Here is the explanation of plot_superpixels from https://www.rdocumentation.org/packages/lime/versions/0.5.1/topics/plot_superpixels:

“The segmentation of an image into superpixels are an important step in generating explanations for image models. It is both important that the segmentation is correct and follows meaningful patterns in the picture, but also that the size/number of superpixels are appropriate. If the important features in the image are chopped into too many segments the permutations will probably damage the picture beyond recognition in almost all cases leading to a poor or failing explanation model. As the size of the object of interest is varying it is impossible to set up hard rules for the number of superpixels to segment into - the larger the object is relative to the size of the image, the fewer superpixels should be generated.”
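A sketch of the superpixel plot for one of the example images; the path comes from the hypothetical img_paths above, and using 35 superpixels is an assumption made to match the 35 features used for the explanation later:

```r
plot_superpixels(img_paths[1],
                 n_superpixels = 35,  # assumed; the package default is larger
                 weight = 20)
```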

Image Preparation

In this step, I will prepare the images for prediction and for the explanation example.
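A sketch of this preparation step, assuming the keras image helpers and abind for stacking the arrays; the 64x64 target size matches the model input, and batch_imgs is a hypothetical vector of image paths:

```r
library(abind)  # used to stack image arrays; an assumption of this sketch

# Turn a vector of image file paths into a 4-D tensor the model can consume
image_prep <- function(paths) {
  arrays <- lapply(paths, function(path) {
    img <- image_load(path, target_size = c(64, 64))
    arr <- image_to_array(img)
    arr <- array_reshape(arr, c(1, dim(arr)))
    arr / 255  # same rescaling as during training
  })
  do.call(abind, c(arrays, list(along = 1)))
}

# Class names in index order, matching the mapping printed below
class_labels <- c("Tulip", "Sunflower", "Rose", "Dandelion", "Daisy")

# Predicted probabilities, transposed so rows are classes as in the matrix
# printed below; batch_imgs is hypothetical
preds <- t(predict(model, image_prep(batch_imgs)))
rownames(preds) <- class_labels
round(preds, 2)
```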

##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## Tulip     0.35 0.00 0.06 0.32 0.03 0.14 0.61 0.45 0.19  0.03  0.13  0.36  0.05
## Sunflower 0.05 0.69 0.83 0.16 0.00 0.05 0.02 0.01 0.08  0.06  0.29  0.19  0.82
## Rose      0.22 0.01 0.02 0.22 0.06 0.17 0.29 0.54 0.04  0.10  0.10  0.23  0.02
## Dandelion 0.12 0.27 0.06 0.17 0.77 0.16 0.04 0.00 0.45  0.41  0.31  0.11  0.10
## Daisy     0.25 0.03 0.02 0.13 0.13 0.48 0.05 0.01 0.25  0.39  0.17  0.11  0.02
##           [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
## Tulip      0.23  0.66  0.11  0.16  0.13  0.19  0.03  0.31  0.04  0.07  0.94
## Sunflower  0.01  0.00  0.42  0.13  0.10  0.16  0.09  0.01  0.02  0.82  0.00
## Rose       0.53  0.34  0.05  0.14  0.29  0.23  0.16  0.50  0.02  0.03  0.06
## Dandelion  0.03  0.00  0.36  0.30  0.25  0.18  0.73  0.06  0.47  0.07  0.00
## Daisy      0.20  0.00  0.07  0.26  0.24  0.23  0.00  0.11  0.44  0.01  0.00
##           [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32]
## Tulip      0.08  0.30  0.16  0.43  0.06  0.30  0.52  0.03
## Sunflower  0.12  0.00  0.14  0.20  0.02  0.13  0.01  0.01
## Rose       0.28  0.69  0.28  0.28  0.03  0.27  0.46  0.01
## Dandelion  0.26  0.00  0.21  0.05  0.61  0.18  0.01  0.62
## Daisy      0.25  0.00  0.21  0.04  0.27  0.12  0.01  0.33
##           0           1           2           3           4 
##     "Tulip" "Sunflower"      "Rose" "Dandelion"     "Daisy"

In this chunk, we will train the explainer.
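A sketch of training the explainer with lime, reusing image_prep, class_labels, and the hypothetical img_paths from the sketches above; as_classifier() attaches readable class labels to the keras model:

```r
explainer <- lime(img_paths,
                  as_classifier(model, class_labels),
                  preprocess = image_prep)
```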

I will choose the top 1 class and use 35 features.
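A sketch of the explain() call with those settings, followed by plotting; the choice of 35 superpixels and explaining only the first example image are assumptions:

```r
explanation <- explain(
  img_paths[1],          # explain a single example image; img_paths is hypothetical
  explainer,
  n_labels      = 1,     # top 1 class
  n_features    = 35,    # 35 superpixel features
  n_superpixels = 35     # assumed to match n_features
)

plot_image_explanation(explanation)
```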