Ho.. Ho.. Ho.. Where is Santa?

Introduction

Who does not recognize Santa? He’s such a popular man, especially during Christmas. He is a white-bearded man, often with spectacles, wearing a red coat with a white fur collar and cuffs, white-fur-cuffed red trousers, a red hat with white fur, and a black leather belt and boots, carrying a bag full of gifts for children [^1].

In this study, we build a binary image classifier using a deep learning model (a Convolutional Neural Network) that predicts whether an image is “Santa Claus” or “Not Santa Claus”. The process and workflow of this study follow code from Adyatama [^2]. The dataset can be downloaded from the Kaggle website [^3].

Library And Setup

The following libraries are used in this study. We also need to specify the conda environment that will be used.
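
Based on the functions used later in this post, the setup below is a minimal sketch: keras for the model and image generators, tidyverse for data wrangling, imager for reading images, and caret for the confusion matrix. The conda environment name is only a placeholder; replace it with your own environment that has TensorFlow installed.

# Load libraries (list inferred from the functions used in this post)
library(keras)      # deep learning model and image data generators
library(tidyverse)  # map(), mutate(), str_extract(), pipes
library(imager)     # load.image() for reading pictures
library(caret)      # confusionMatrix() for model evaluation

# Point reticulate to the conda environment that has TensorFlow installed
# ("r-tensorflow" is a placeholder name)
reticulate::use_condaenv("r-tensorflow", required = TRUE)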

Exploratory Data Analysis

First, we need to locate the folder for each target class. Here, we have two classes: “not-a-santa” and “santa”.

# Locate the folder for target
folder_list <- list.files("santa/train/")

# Observe the folder_list
folder_list
## [1] "not-a-santa" "santa"

We combine the train directory with each folder name so we can observe the pictures inside each folder.

# Combine directory and folder list target
folder_path <- paste0("santa/train/", folder_list, "/")

# Observe folder_path 
folder_path
## [1] "santa/train/not-a-santa/" "santa/train/santa/"

To loop over the folders and gather the file names from each one (“not-a-santa” and “santa”), we use the function map(). Then, we pipe the result into unlist() to combine the file names from the two folders.

# Get file name
file_name <- map(folder_path, 
                 function(x) paste0(x, list.files(x))) %>% 
                 unlist()

# First 6 file names
head(file_name)
## [1] "santa/train/not-a-santa/10.not-a-santa.jpg" 
## [2] "santa/train/not-a-santa/101.not-a-santa.jpg"
## [3] "santa/train/not-a-santa/102.not-a-santa.jpg"
## [4] "santa/train/not-a-santa/106.not-a-santa.jpg"
## [5] "santa/train/not-a-santa/108.not-a-santa.jpg"
## [6] "santa/train/not-a-santa/11.not-a-santa.jpg"
# Last 6 file names
tail(file_name)
## [1] "santa/train/santa/94.Santa.jpg" "santa/train/santa/95.Santa.jpg"
## [3] "santa/train/santa/96.Santa.jpg" "santa/train/santa/97.Santa.jpg"
## [5] "santa/train/santa/98.Santa.jpg" "santa/train/santa/99.Santa.jpg"

The total number of files used to train and validate the model is 614, as shown in the output below.

length(file_name)
## [1] 614

We use the function load.image() to read the content of the image files. Here, we observe 6 random samples.

# Randomly select image
set.seed(500)

# Sample file_name
sample_image <- sample(file_name, 6)

# Load image into R
img <- map(sample_image, load.image)

# Plot image and Create 2 x 3 image grid
par(mfrow = c(2, 3)) 
map(img, plot)

## [[1]]
## Image. Width: 225 pix Height: 225 pix Depth: 1 Colour channels: 3 
## 
## [[2]]
## Image. Width: 178 pix Height: 264 pix Depth: 1 Colour channels: 3 
## 
## [[3]]
## Image. Width: 1024 pix Height: 667 pix Depth: 1 Colour channels: 3 
## 
## [[4]]
## Image. Width: 250 pix Height: 250 pix Depth: 1 Colour channels: 3 
## 
## [[5]]
## Image. Width: 683 pix Height: 1024 pix Depth: 1 Colour channels: 3 
## 
## [[6]]
## Image. Width: 250 pix Height: 250 pix Depth: 1 Colour channels: 3

Check Image Dimensions

The image dimension information consists of:

  • The height and width of the image in pixels.
  • The colour channels, which describe the colour format (1 = grayscale, 3 = RGB).

# Full image description
img <- load.image(file_name[1])

# Observe `img`
img
## Image. Width: 250 pix Height: 250 pix Depth: 1 Colour channels: 3

As shown in the output above, the picture has a width and height of 250 and 250 pixels, respectively, and it has RGB colour channels. Now, we create a function to extract the width and height of an image; we call this function get_dim().

# Function for acquiring width and height of an image
get_dim <- function(x){
  
  img <- load.image(x) 
  
  df_img <- data.frame(height = height(img),
                       width = width(img),
                       filename = x
                       )
  
  return(df_img)
}

# Observe the function to the first file name
get_dim(file_name[1])
##   height width                                   filename
## 1    250   250 santa/train/not-a-santa/10.not-a-santa.jpg

We sample 100 images from the file list to obtain the height and width of each image, and observe the first 10 results.

# Randomly get 100 sample images
set.seed(500)
sample_file <- sample(file_name, 100)

# Run the `get_dim()` function for each image
file_dim <- map_df(sample_file, get_dim)

# Check for the first 10 data
head(file_dim, 10)
##    height width                                    filename
## 1     225   225             santa/train/santa/193.Santa.jpg
## 2     264   178             santa/train/santa/279.Santa.jpg
## 3     667  1024             santa/train/santa/311.Santa.jpg
## 4     250   250 santa/train/not-a-santa/384.not-a-santa.jpg
## 5    1024   683 santa/train/not-a-santa/534.not-a-santa.jpg
## 6     250   250  santa/train/not-a-santa/19.not-a-santa.jpg
## 7    1061  1600             santa/train/santa/546.Santa.jpg
## 8     250   250 santa/train/not-a-santa/198.not-a-santa.jpg
## 9     259   194 santa/train/not-a-santa/349.not-a-santa.jpg
## 10    189   267             santa/train/santa/502.Santa.jpg

Observe the summary of file_dim. The image dimensions range from 150 to 5176 pixels (a quick check of this range follows the summary).

summary(file_dim)
##      height           width          filename        
##  Min.   : 150.0   Min.   : 175.0   Length:100        
##  1st Qu.: 224.0   1st Qu.: 250.0   Class :character  
##  Median : 250.0   Median : 250.0   Mode  :character  
##  Mean   : 461.2   Mean   : 532.9                     
##  3rd Qu.: 278.5   3rd Qu.: 336.2                     
##  Max.   :3444.0   Max.   :5176.0
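
As a quick check of that range, we can combine the height and width columns of file_dim; this is a small sketch using only objects created above.

# Overall range of the sampled image dimensions (height and width combined)
range(c(file_dim$height, file_dim$width))
# expected from the summary above: 150 and 5176 pixels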

Data Processing

In this section, we determine the input size for the images so that all inputs have the same dimensions; we use 128 x 128 pixels. Then, we set the batch size to 100.

# Desired height and width of images
target_size <- c(128,128)

# Batch size for training the model
batch_size <- 100

In this section, we use image augmentation to increase the amount of training data without acquiring new images, by means of an image data generator. We create the image generator with the following properties:

  • Scale the pixel values by dividing them by 255
  • Flip the image horizontally
  • Flip the image vertically
  • Rotate the image between 0 and 45 degrees
  • Zoom in or zoom out by up to 25% (zoom 75% to 125%)
  • Use 20% of the data as a validation dataset

# Image Generator
train_data_gen <- image_data_generator(rescale = 1/255, # Scaling pixel value
                                       horizontal_flip = T, # Flip image horizontally
                                       vertical_flip = T, # Flip image vertically 
                                       rotation_range = 45, # Rotate image from 0 to 45 degrees
                                       zoom_range = 0.25, # Zoom in or zoom out range
                                       validation_split = 0.2 # 20% data as validation data
                                       )
## Loaded Tensorflow version 2.0.0

We load the training and validation data into the image generator using flow_images_from_directory(). The directory is santa/train/. After creating both generators, we will take a quick look at one augmented batch.

# Train Dataset
train_image_array_gen <- flow_images_from_directory(directory = "santa/train/", # Folder of the data
                                                    target_size = target_size, # target of the image dimension (128 x 128)  
                                                    color_mode = "rgb", # use RGB color
                                                    batch_size = batch_size , # batch size is 100
                                                    seed = 123,  # set random seed
                                                    subset = "training", # declare that this is for training data
                                                    generator = train_data_gen
                                                    )

# Validation Dataset
val_image_array_gen <- flow_images_from_directory(directory = "santa/train/",
                                                  target_size = target_size, 
                                                  color_mode = "rgb", 
                                                  batch_size = batch_size ,
                                                  seed = 123,
                                                  subset = "validation", 
                                                  generator = train_data_gen
                                                  )
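
To get a feel for what the augmentation does, we can pull a single batch from the training generator and plot one augmented image. This is only a quick sketch; generator_next() comes from the keras package and returns the images together with their one-hot labels.

# Draw one augmented batch from the training generator
batch <- generator_next(train_image_array_gen)

# batch[[1]] holds the images (batch_size x 128 x 128 x 3, already rescaled to 0-1)
dim(batch[[1]])

# Plot the first augmented image in the batch
plot(as.raster(batch[[1]][1, , , ]))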

In the code below, we observe the proportion of the two classes in the training dataset. We also store the number of target classes in output_n.

# Number of training samples
train_samples <- train_image_array_gen$n

# Number of validation samples
valid_samples <- val_image_array_gen$n

# Number of target classes/categories
output_n <- n_distinct(train_image_array_gen$classes)

# Get the class proportion
table("\nClass Proportion" = factor(train_image_array_gen$classes)) %>% 
  prop.table()
## 
## Class Proportion
##   0   1 
## 0.5 0.5

The class proportions in the training dataset are balanced. The indices represent the labels for each class, ordered alphabetically (0 = not-a-santa, 1 = santa).
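
If we want to confirm this mapping directly, the generator stores it; the quick check below assumes the class_indices attribute of the keras generator object.

# Confirm the alphabetical label-to-index mapping
train_image_array_gen$class_indices
# expected: not-a-santa = 0, santa = 1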

Original Model

Model Architecture

A Convolutional Neural Network (ConvNet/CNN) is a deep learning method that processes an input image and assigns importance (learnable weights and biases) to various aspects of the image so that one image can be differentiated from another [^4]. CNNs can also be applied to speech or audio signal inputs. The following is the architecture of the original model.

  • Convolutional layer as the first layer of the CNN. This layer extracts features from the 2D image and identifies larger portions of the image.
  • Max pooling layer to downsample the image features
  • Flattening layer to flatten the data from a 2D array to a 1D array
  • Dense layer to capture more information
  • Dense layer for the output with a softmax activation function

We set the input shape, which consists of the target_size and the number of colour channels (RGB = 3).

# input shape of the image
c(target_size, 3) 
## [1] 128 128   3
# Set Initial Random Weight
tensorflow::tf$random$set_seed(123)

model <- keras_model_sequential() %>% 
  
  # Convolution Layer 1
  layer_conv_2d(filters = 64,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu",
                input_shape = c(target_size, 3) 
                ) %>% 

  # Max Pooling Layer 1
  layer_max_pooling_2d(pool_size = c(2,2),
                       strides = c(2,2)) %>% 
  
  # Convolution Layer 2
  layer_conv_2d(filters = 64,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu"
                ) %>% 

  # Max Pooling Layer 2
  layer_max_pooling_2d(pool_size = c(2,2),
                       strides = c(2,2)) %>% 
  
  # Flattening Layer
  layer_flatten() %>% 
  
  # Dense Layer1
  layer_dense(units = 128,
              activation = "relu") %>%

  # Output Layer
  layer_dense(units = output_n,
              activation = "softmax",
              name = "Output")
  
model
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d_1 (Conv2D)                   (None, 128, 128, 64)            1792        
## ________________________________________________________________________________
## max_pooling2d_1 (MaxPooling2D)      (None, 64, 64, 64)              0           
## ________________________________________________________________________________
## conv2d (Conv2D)                     (None, 64, 64, 64)              36928       
## ________________________________________________________________________________
## max_pooling2d (MaxPooling2D)        (None, 32, 32, 64)              0           
## ________________________________________________________________________________
## flatten (Flatten)                   (None, 65536)                   0           
## ________________________________________________________________________________
## dense (Dense)                       (None, 128)                     8388736     
## ________________________________________________________________________________
## Output (Dense)                      (None, 2)                       258         
## ================================================================================
## Total params: 8,427,714
## Trainable params: 8,427,714
## Non-trainable params: 0
## ________________________________________________________________________________

  • The input image is 128 x 128 pixels, and the first convolutional layer applies 64 filters.
  • Set padding = same to keep the 128 x 128 dimension after the features are extracted.
  • Downsample over each 2 x 2 pooling area, so the data becomes 64 x 64 pixels with 64 filters.
  • Add another convolutional layer and downsample again, so the data becomes 32 x 32 pixels with 64 filters.
  • Flatten the 2D array into a 1D array with 32 x 32 x 64 = 65536 nodes.
  • Extract information using a simple dense layer.
  • The output layer applies the softmax activation function to obtain the probability of each class (the parameter counts are verified in the sketch after this list).
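
As a sanity check, the parameter counts reported in the model summary can be reproduced by hand using the usual Keras formulas (kernel weights plus one bias per filter or unit); the arithmetic below is only a verification sketch.

# Convolutional layer parameters: (kernel_h * kernel_w * channels_in + 1) * filters
(3 * 3 * 3  + 1) * 64    # first convolutional layer : 1,792
(3 * 3 * 64 + 1) * 64    # second convolutional layer: 36,928

# Dense layer parameters: (units_in + 1) * units_out
(65536 + 1) * 128        # dense layer               : 8,388,736
(128   + 1) * 2          # output layer              : 258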

Model Fitting

Since we are dealing with a binary classification dataset, we define the loss as binary_crossentropy and the optimizer as optimizer_adam.

# Compile Model
model %>% 
  compile(
    loss = "binary_crossentropy",
    optimizer = optimizer_adam(learning_rate = 0.001),
    metrics = "accuracy"
  )

The model uses the validation dataset from the generator for evaluation. We also set epochs = 5 and save the training history in an object called history.

# Fit data into model
history <- model %>% 
  fit(
  # training data
  train_image_array_gen,

  # training epochs
  steps_per_epoch = as.integer(train_samples / batch_size), 
  epochs = 5, 
  
  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size)
) 

# Plot the history
plot(history)

Model Evaluation

We evaluate the model with a confusion matrix, using the validation data from the generator. We create an object called val_data containing the file names of the validation images; from each file name, we extract the class label as the actual value of the target variable.

val_data <- data.frame(file_name = paste0("santa/train/", val_image_array_gen$filenames)) %>% 
  mutate(class = str_extract(file_name, "not-a-santa|Santa"))

# Observe the val_data
tail(val_data, 10)
##                           file_name class
## 113 santa/train/santa/193.Santa.jpg Santa
## 114 santa/train/santa/194.Santa.jpg Santa
## 115 santa/train/santa/197.Santa.jpg Santa
## 116 santa/train/santa/198.Santa.jpg Santa
## 117   santa/train/santa/2.Santa.jpg Santa
## 118  santa/train/santa/20.Santa.jpg Santa
## 119 santa/train/santa/203.Santa.jpg Santa
## 120 santa/train/santa/204.Santa.jpg Santa
## 121 santa/train/santa/205.Santa.jpg Santa
## 122 santa/train/santa/206.Santa.jpg Santa

We convert each image into an array with dimensions of 128 x 128 pixels and 3 colour channels (RGB), so that the validation images are in the same format as the data used to train the model.

# Function to convert image to array
image_prep <- function(x) {
  arrays <- lapply(x, function(path) {
    img <- image_load(path, target_size = target_size, 
                      grayscale = F # Set FALSE if image is RGB
                      )
    
    x <- image_to_array(img)
    x <- array_reshape(x, c(1, dim(x)))
    x <- x/255 # rescale image pixel
  })
  do.call(abind::abind, c(arrays, list(along = 1)))
}
test_x <- image_prep(val_data$file_name)

# Check dimension of testing data set
dim(test_x)
## [1] 122 128 128   3

The validation data consists of 122 images with dimensions of 128 x 128 pixels and 3 colour channels (RGB). Then, we perform prediction on the test_x data.

# Prediction
pred_val <- predict_classes(model, test_x) 

# Observe first 10 prediction
head(pred_val, 10)
##  [1] 0 0 1 0 0 0 1 0 0 0

We convert the encoded predictions into proper class labels using a custom function, decode().

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "not-a-santa",
            x == 1 ~ "Santa"
            )
}

# Apply the function to `pred_val`
pred_val <- sapply(pred_val, decode) 

# Observe `pred_val`
head(pred_val, 10)
##  [1] "not-a-santa" "not-a-santa" "Santa"       "not-a-santa" "not-a-santa"
##  [6] "not-a-santa" "Santa"       "not-a-santa" "not-a-santa" "not-a-santa"

We compute the confusion matrix on the validation data, with “Santa” as the positive class.

confusionMatrix(as.factor(pred_val), as.factor(val_data$class), positive="Santa")
## Confusion Matrix and Statistics
## 
##              Reference
## Prediction    not-a-santa Santa
##   not-a-santa          48     3
##   Santa                13    58
##                                               
##                Accuracy : 0.8689              
##                  95% CI : (0.7958, 0.9231)    
##     No Information Rate : 0.5                 
##     P-Value [Acc > NIR] : < 0.0000000000000002
##                                               
##                   Kappa : 0.7377              
##                                               
##  Mcnemar's Test P-Value : 0.02445             
##                                               
##             Sensitivity : 0.9508              
##             Specificity : 0.7869              
##          Pos Pred Value : 0.8169              
##          Neg Pred Value : 0.9412              
##              Prevalence : 0.5000              
##          Detection Rate : 0.4754              
##    Detection Prevalence : 0.5820              
##       Balanced Accuracy : 0.8689              
##                                               
##        'Positive' Class : Santa               
## 

For this study, we want the model to have the highest possible accuracy. As shown in the confusion matrix, the accuracy of the model is quite good, but it could be improved. We can perform model tuning and observe whether the tuned model improves on the accuracy of the original model.
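
Before tuning, it is worth hand-checking how the headline metrics follow from the confusion matrix above (with “Santa” as the positive class); the counts below are read directly from that matrix.

# Counts taken from the confusion matrix above
TP <- 58; TN <- 48; FP <- 13; FN <- 3

(TP + TN) / (TP + TN + FP + FN)  # Accuracy    : 0.8689
TP / (TP + FN)                   # Sensitivity : 0.9508
TN / (TN + FP)                   # Specificity : 0.7869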

Model Tuning

Model Architecture

The improved model has additional CNN layers so that more information can be captured. The following is the improved model architecture:

  • 1st Convolutional layer to extract features from 2D image with relu activation function
  • 2nd Convolutional layer to extract features from 2D image with relu activation function
  • Max pooling layer
  • 3rd Convolutional layer to extract features from 2D image with relu activation function
  • Max pooling layer
  • 4th Convolutional layer to extract features from 2D image with relu activation function
  • Max pooling layer
  • 5th Convolutional layer to extract features from 2D image with relu activation function
  • Max pooling layer
  • Flattening layer from 2D array to 1D array
  • Dense layer to capture more information
  • Dense layer for output layer


tensorflow::tf$random$set_seed(123)

model_big <- keras_model_sequential() %>% 
  
  # First convolutional layer
  layer_conv_2d(filters = 128,
                kernel_size = c(5,5), # 5 x 5 filters
                padding = "same",
                activation = "relu",
                input_shape = c(target_size, 3)
                ) %>% 
  
  # Second convolutional layer
  layer_conv_2d(filters = 128,
                kernel_size = c(3,3), # 3 x 3 filters
                padding = "same",
                activation = "relu"
                ) %>% 
  
  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
  
  # Third convolutional layer
  layer_conv_2d(filters = 64,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu"
                ) %>% 

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
  
  # Fourth convolutional layer
  layer_conv_2d(filters = 128,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu"
                ) %>% 
  
  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 

  # Fifth convolutional layer
  layer_conv_2d(filters = 256,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu"
                ) %>% 
  
  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
  
  # Flattening layer
  layer_flatten() %>% 
  
  # Dense layer
  layer_dense(units = 64,
              activation = "relu") %>% 
  
  # Output layer
  layer_dense(name = "Output",
              units = 2, 
              activation = "softmax")

model_big
## Model: "sequential_1"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d_6 (Conv2D)                   (None, 128, 128, 128)           9728        
## ________________________________________________________________________________
## conv2d_5 (Conv2D)                   (None, 128, 128, 128)           147584      
## ________________________________________________________________________________
## max_pooling2d_5 (MaxPooling2D)      (None, 64, 64, 128)             0           
## ________________________________________________________________________________
## conv2d_4 (Conv2D)                   (None, 64, 64, 64)              73792       
## ________________________________________________________________________________
## max_pooling2d_4 (MaxPooling2D)      (None, 32, 32, 64)              0           
## ________________________________________________________________________________
## conv2d_3 (Conv2D)                   (None, 32, 32, 128)             73856       
## ________________________________________________________________________________
## max_pooling2d_3 (MaxPooling2D)      (None, 16, 16, 128)             0           
## ________________________________________________________________________________
## conv2d_2 (Conv2D)                   (None, 16, 16, 256)             295168      
## ________________________________________________________________________________
## max_pooling2d_2 (MaxPooling2D)      (None, 8, 8, 256)               0           
## ________________________________________________________________________________
## flatten_1 (Flatten)                 (None, 16384)                   0           
## ________________________________________________________________________________
## dense_1 (Dense)                     (None, 64)                      1048640     
## ________________________________________________________________________________
## Output (Dense)                      (None, 2)                       130         
## ================================================================================
## Total params: 1,648,898
## Trainable params: 1,648,898
## Non-trainable params: 0
## ________________________________________________________________________________

Model Fitting

We compile the model with the same loss and optimizer as before, then train it with more epochs (30) and save the training history as history_tune.

model_big %>% 
  compile(
    loss = "binary_crossentropy",
    optimizer = optimizer_adam(learning_rate = 0.001),
    metrics = "accuracy")

history_tune <- model_big %>% 
  fit_generator(
  # training data
  train_image_array_gen,
  
  # epochs
  steps_per_epoch = as.integer(train_samples / batch_size), 
  epochs = 30, 
  
  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size)
  )

plot(history_tune)
## `geom_smooth()` using formula 'y ~ x'

Model Evaluation

We repeat the evaluation process on the validation data to obtain the confusion matrix.

pred_tune <- predict_classes(model_big, test_x) 

head(pred_tune, 10)
##  [1] 0 0 0 0 0 0 1 0 0 0

# Decode the predicted labels
pred_tune <- sapply(pred_tune, decode) 

# Observe `pred_tune`
head(pred_tune, 10)
##  [1] "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa"
##  [6] "not-a-santa" "Santa"       "not-a-santa" "not-a-santa" "not-a-santa"

Now, let’s observe the confusion matrix.

confusionMatrix(as.factor(pred_tune), 
                as.factor(val_data$class), 
                positive="Santa"
                )
## Confusion Matrix and Statistics
## 
##              Reference
## Prediction    not-a-santa Santa
##   not-a-santa          54     2
##   Santa                 7    59
##                                              
##                Accuracy : 0.9262             
##                  95% CI : (0.8646, 0.9657)   
##     No Information Rate : 0.5                
##     P-Value [Acc > NIR] : <0.0000000000000002
##                                              
##                   Kappa : 0.8525             
##                                              
##  Mcnemar's Test P-Value : 0.1824             
##                                              
##             Sensitivity : 0.9672             
##             Specificity : 0.8852             
##          Pos Pred Value : 0.8939             
##          Neg Pred Value : 0.9643             
##              Prevalence : 0.5000             
##          Detection Rate : 0.4836             
##    Detection Prevalence : 0.5410             
##       Balanced Accuracy : 0.9262             
##                                              
##        'Positive' Class : Santa              
## 

The accuracy of the model improves after tuning. Thus, we can now perform prediction on our test dataset.

Prediction on the Test Dataset

We process our test dataset in the same way as the validation dataset, using image_data_generator().

# Generate test_data_gen
test_data_gen <- image_data_generator(rescale = 1/255, # Scaling pixel value
                                       horizontal_flip = T, # Flip image horizontally
                                       vertical_flip = T, # Flip image vertically 
                                       rotation_range = 45, # Rotate image from 0 to 45 degrees
                                       zoom_range = 0.25, # Zoom in or zoom out range
                                       validation_split = 0.5 # 50% data as validation data
                                       )
# Generate array for the test_data_gen
test_image_array_gen <- flow_images_from_directory(directory = "santa/test/", # Folder of the data
                                                    target_size = target_size, # target of the image dimension (128 x 128)  
                                                    color_mode = "rgb", # use RGB color
                                                    batch_size = batch_size , # batch size is 100
                                                    seed = 123,  # set random seed
                                                    generator = test_data_gen,
                                                    subset="validation"
                                                    )
# Create test_data
test_data <- data.frame(file_name = paste0("santa/test/", test_image_array_gen$filenames)) %>% 
  mutate(class = str_extract(file_name, "not-a-santa|Santa"))

# Observe `test_data`
tail(test_data)
##                          file_name class
## 303 santa/test/santa/386.Santa.jpg Santa
## 304 santa/test/santa/388.Santa.jpg Santa
## 305 santa/test/santa/389.Santa.jpg Santa
## 306  santa/test/santa/39.Santa.jpg Santa
## 307 santa/test/santa/391.Santa.jpg Santa
## 308 santa/test/santa/393.Santa.jpg Santa

Now, we check the dimensions of the test dataset.

# Create the test data set
test <- image_prep(test_data$file_name)

# Check dimension of testing data set
dim(test)
## [1] 308 128 128   3

We have 308 images, and each has already been converted to 128 x 128 pixels. Now, we generate the predictions and decode them into class labels.

# Generate the prediction 
pred_test <- predict_classes(model_big, test) 


# Decode the predicted labels
pred_test <- sapply(pred_test, decode) 

# Observe `pred_test`
head(pred_test, 10)
##  [1] "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa"
##  [6] "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa" "not-a-santa"

Finally, we compute the confusion matrix for the test dataset.

confusionMatrix(as.factor(pred_test), 
                as.factor(test_data$class), 
                positive="Santa"
                )
## Confusion Matrix and Statistics
## 
##              Reference
## Prediction    not-a-santa Santa
##   not-a-santa         136     8
##   Santa                18   146
##                                               
##                Accuracy : 0.9156              
##                  95% CI : (0.8788, 0.9441)    
##     No Information Rate : 0.5                 
##     P-Value [Acc > NIR] : < 0.0000000000000002
##                                               
##                   Kappa : 0.8312              
##                                               
##  Mcnemar's Test P-Value : 0.07756             
##                                               
##             Sensitivity : 0.9481              
##             Specificity : 0.8831              
##          Pos Pred Value : 0.8902              
##          Neg Pred Value : 0.9444              
##              Prevalence : 0.5000              
##          Detection Rate : 0.4740              
##    Detection Prevalence : 0.5325              
##       Balanced Accuracy : 0.9156              
##                                               
##        'Positive' Class : Santa               
## 

The model predicts the test dataset very well, as shown by the high accuracy value.

Conclusion

The Convolutional Neural Network technique can be used to classify images (Santa or Not Santa). The tuned model improves on the accuracy of the original model. CNNs are a powerful technique for image classification; however, the time required to train the model is quite long.