Congratulations! You have reached the end of the Neural Network course and of our Machine Learning Specialization. This quiz closes the last part of the course.

To complete this assignment, you need to build a classification model that predicts the category of fashion images, using a neural network in the Keras framework, by following these steps:

1 Data Preparation

Let us start our neural network experience by preparing the data. In this quiz, you will use the fashionmnist dataset. The data is stored as csv files in this repository, in the fashionmnist folder. Please load the fashionmnist data from the data_input folder. The dataset contains a train set and a test set of 28 x 28 pixel fashion images in 10 different categories. Use the following glossary for your target labels:

categories <- c("T-Shirt", "Trouser", "Pullover", "Dress", 
    "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Boot")

1.1 Load the library and data

library(readr)
library(keras)
library(caret)

## Loading required package: lattice

## Loading required package: ggplot2

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

In this phase, please load and inspect the fashionmnist data and store it in the fashion_train and fashion_test objects. Use the read_csv() function from the readr package to read the data faster.

fashion_train <- read_csv("data_input/train.csv")

## Parsed with column specification:
## cols(
##   .default = col_double()
## )

## See spec(...) for full column specifications.

fashion_test <- read_csv("data_input/test.csv")

## Parsed with column specification:
## cols(
##   .default = col_double()
## )
## See spec(...) for full column specifications.

head(fashion_test)

## # A tibble: 6 x 785
##   label pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9
##   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1     0      0      0      0      0      0      0      0      9      8
## 2     1      0      0      0      0      0      0      0      0      0
## 3     2      0      0      0      0      0      0     14     53     99
## 4     2      0      0      0      0      0      0      0      0      0
## 5     3      0      0      0      0      0      0      0      0      0
## 6     2      0      0      0      0      0     44    105     44     10
## # ... with 775 more variables: pixel10 <dbl>, pixel11 <dbl>,
## #   pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>,
## #   pixel16 <dbl>, pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>,
## #   pixel20 <dbl>, pixel21 <dbl>, pixel22 <dbl>, pixel23 <dbl>,
## #   pixel24 <dbl>, pixel25 <dbl>, pixel26 <dbl>, pixel27 <dbl>,
## #   pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>, pixel31 <dbl>,
## #   pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>,
## #   pixel36 <dbl>, pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>,
## #   pixel40 <dbl>, pixel41 <dbl>, pixel42 <dbl>, pixel43 <dbl>,
## #   pixel44 <dbl>, pixel45 <dbl>, pixel46 <dbl>, pixel47 <dbl>,
## #   pixel48 <dbl>, pixel49 <dbl>, pixel50 <dbl>, pixel51 <dbl>,
## #   pixel52 <dbl>, pixel53 <dbl>, pixel54 <dbl>, pixel55 <dbl>,
## #   pixel56 <dbl>, pixel57 <dbl>, pixel58 <dbl>, pixel59 <dbl>,
## #   pixel60 <dbl>, pixel61 <dbl>, pixel62 <dbl>, pixel63 <dbl>,
## #   pixel64 <dbl>, pixel65 <dbl>, pixel66 <dbl>, pixel67 <dbl>,
## #   pixel68 <dbl>, pixel69 <dbl>, pixel70 <dbl>, pixel71 <dbl>,
## #   pixel72 <dbl>, pixel73 <dbl>, pixel74 <dbl>, pixel75 <dbl>,
## #   pixel76 <dbl>, pixel77 <dbl>, pixel78 <dbl>, pixel79 <dbl>,
## #   pixel80 <dbl>, pixel81 <dbl>, pixel82 <dbl>, pixel83 <dbl>,
## #   pixel84 <dbl>, pixel85 <dbl>, pixel86 <dbl>, pixel87 <dbl>,
## #   pixel88 <dbl>, pixel89 <dbl>, pixel90 <dbl>, pixel91 <dbl>,
## #   pixel92 <dbl>, pixel93 <dbl>, pixel94 <dbl>, pixel95 <dbl>,
## #   pixel96 <dbl>, pixel97 <dbl>, pixel98 <dbl>, pixel99 <dbl>,
## #   pixel100 <dbl>, pixel101 <dbl>, pixel102 <dbl>, pixel103 <dbl>,
## #   pixel104 <dbl>, pixel105 <dbl>, pixel106 <dbl>, pixel107 <dbl>,
## #   pixel108 <dbl>, pixel109 <dbl>, ...

Peek at the fashion_train data using the head() function.

head(fashion_train)

## # A tibble: 6 x 785
##   label pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9
##   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1     2      0      0      0      0      0      0      0      0      0
## 2     9      0      0      0      0      0      0      0      0      0
## 3     6      0      0      0      0      0      0      0      5      0
## 4     0      0      0      0      1      2      0      0      0      0
## 5     3      0      0      0      0      0      0      0      0      0
## 6     4      0      0      0      5      4      5      5      3      5
## # ... with 775 more variables: pixel10 <dbl>, pixel11 <dbl>,
## #   pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>,
## #   pixel16 <dbl>, pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>,
## #   pixel20 <dbl>, pixel21 <dbl>, pixel22 <dbl>, pixel23 <dbl>,
## #   pixel24 <dbl>, pixel25 <dbl>, pixel26 <dbl>, pixel27 <dbl>,
## #   pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>, pixel31 <dbl>,
## #   pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>,
## #   pixel36 <dbl>, pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>,
## #   pixel40 <dbl>, pixel41 <dbl>, pixel42 <dbl>, pixel43 <dbl>,
## #   pixel44 <dbl>, pixel45 <dbl>, pixel46 <dbl>, pixel47 <dbl>,
## #   pixel48 <dbl>, pixel49 <dbl>, pixel50 <dbl>, pixel51 <dbl>,
## #   pixel52 <dbl>, pixel53 <dbl>, pixel54 <dbl>, pixel55 <dbl>,
## #   pixel56 <dbl>, pixel57 <dbl>, pixel58 <dbl>, pixel59 <dbl>,
## #   pixel60 <dbl>, pixel61 <dbl>, pixel62 <dbl>, pixel63 <dbl>,
## #   pixel64 <dbl>, pixel65 <dbl>, pixel66 <dbl>, pixel67 <dbl>,
## #   pixel68 <dbl>, pixel69 <dbl>, pixel70 <dbl>, pixel71 <dbl>,
## #   pixel72 <dbl>, pixel73 <dbl>, pixel74 <dbl>, pixel75 <dbl>,
## #   pixel76 <dbl>, pixel77 <dbl>, pixel78 <dbl>, pixel79 <dbl>,
## #   pixel80 <dbl>, pixel81 <dbl>, pixel82 <dbl>, pixel83 <dbl>,
## #   pixel84 <dbl>, pixel85 <dbl>, pixel86 <dbl>, pixel87 <dbl>,
## #   pixel88 <dbl>, pixel89 <dbl>, pixel90 <dbl>, pixel91 <dbl>,
## #   pixel92 <dbl>, pixel93 <dbl>, pixel94 <dbl>, pixel95 <dbl>,
## #   pixel96 <dbl>, pixel97 <dbl>, pixel98 <dbl>, pixel99 <dbl>,
## #   pixel100 <dbl>, pixel101 <dbl>, pixel102 <dbl>, pixel103 <dbl>,
## #   pixel104 <dbl>, pixel105 <dbl>, pixel106 <dbl>, pixel107 <dbl>,
## #   pixel108 <dbl>, pixel109 <dbl>, ...

The fashion_train data consists of 60000 observations and 785 variables (1 target and 784 predictors). Each predictor holds the value of one pixel of the 28 x 28 image (28 x 28 = 784).
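Because each row stores a 28 x 28 image flattened into 784 values, one way to inspect a single observation is to fold the pixel columns back into a square grid. The following is a minimal sketch in base R that uses a synthetic vector in place of a real row. The names `pixels` and `img` are illustrative, and the sketch assumes the pixels are stored row by row.

```r
# Synthetic stand-in for one row of pixel values (pixel1 ... pixel784).
pixels <- 1:784

# Fold the 784 values back into a 28 x 28 grid, filling row by row
# (assumption: the csv stores each image in row-major order).
img <- matrix(pixels, nrow = 28, ncol = 28, byrow = TRUE)

dim(img)      # dimensions of the rebuilt image
img[1, 1:3]   # first three pixels of the first image row
img[2, 1]     # the 29th pixel starts the second image row
```

With the real data, something like `as.numeric(fashion_train[1, -1])` could take the place of `pixels`.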

1.2 Convert the data to matrix

The data we loaded above stores the pixel values in a data frame, but the data must be in matrix form before we can model it. Please convert the data to matrix format using the data.matrix() function, storing the fashion_train matrix as train_m and the fashion_test matrix as test_m.

train_m <- data.matrix(fashion_train)
test_m <- data.matrix(fashion_test)

class(train_m)

## [1] "matrix"

class(test_m)

## [1] "matrix"
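As a small illustration of what data.matrix() does, the sketch below converts a toy data frame with the same layout (a label column followed by pixel columns). The object names `toy_df` and `toy_m` are made up for this example.

```r
# Toy data frame with the same layout: a label column followed by pixel columns.
toy_df <- data.frame(label = c(2, 9), pixel1 = c(0, 0), pixel2 = c(5, 128))

# data.matrix() returns a numeric matrix and keeps the column names.
toy_m <- data.matrix(toy_df)

is.matrix(toy_m)   # check the type
dim(toy_m)         # 2 rows, 3 columns
colnames(toy_m)    # column names are carried over
```

Note that on R 4.0 or later, class() on a matrix returns both "matrix" and "array", so is.matrix() is the more stable check across versions.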

1.3 Cross Validation

After that, we should separate the predictors and the target in train_m and test_m. The first column holds the label, and the remaining 784 columns hold the pixel values.

# Predictor variables in `train_m`
train_x <-  train_m[,-1]
str(train_x)

##  num [1:60000, 1:784] 0 0 0 0 0 0 0 0 0 0 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:784] "pixel1" "pixel2" "pixel3" "pixel4" ...

# Predictor variables in `test_m`
test_x <- test_m[,-1]
str(test_x)

##  num [1:10000, 1:784] 0 0 0 0 0 0 0 0 0 0 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:784] "pixel1" "pixel2" "pixel3" "pixel4" ...

# Target variables in `train_m`
train_y <- train_m[,1]
str(train_y)

##  num [1:60000] 2 9 6 0 3 4 4 5 4 8 ...

# Target variables in `test_m`
test_y <- test_m[,1]
str(test_y)

##  num [1:10000] 0 1 2 2 3 2 8 6 5 0 ...

1.4 Prepare training and testing set (change to an array)

Next, convert the matrices that contain the predictor variables to arrays. Please use array_reshape(data, dim(data)) to do that.

train_x_array <- array_reshape(train_x, c(nrow(train_x), 784))
test_x_array <- array_reshape(test_x, c(nrow(test_x), 784))
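The reason array_reshape() is used instead of assigning dim() directly is the fill order: array_reshape() uses row-major (C-style) ordering by default, while base R fills arrays column by column. The sketch below shows the difference with base R only; `row_major` and `col_major` are illustrative names.

```r
values <- 1:6

# Row-major (C-style) fill, the ordering array_reshape() uses by default.
row_major <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# Column-major (Fortran-style) fill, which base R's array() and dim<- use.
col_major <- array(values, dim = c(2, 3))

row_major[1, ]   # 1 2 3
col_major[1, ]   # 1 3 5
```

For train_x and test_x, which are already 2-dimensional with 784 columns, the reshape keeps every value in place. The ordering starts to matter when the data is reshaped into a different number of dimensions.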

1.5 Features scaling

Then scale train_x_array and test_x_array by dividing by 255, so that every pixel value lies between 0 and 1.

train_x.keras <- train_x_array/255
test_x.keras <- test_x_array/255
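A quick way to convince yourself that the scaling behaves as intended is to apply it to a few known values. This is a minimal sketch on a toy matrix, assuming pixel intensities in the range 0 to 255; `raw` and `scaled` are illustrative names.

```r
# Toy pixel intensities, including the assumed minimum (0) and maximum (255).
raw <- matrix(c(0, 51, 102, 255), nrow = 2)

# Same operation as in the quiz: divide every element by 255.
scaled <- raw / 255

range(scaled)   # 0 1
scaled[2, 1]    # 51 / 255 = 0.2
```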

To prepare the target for training, one-hot encode the label vector (train_y) into a binary class matrix using the to_categorical() function from Keras, and store it as the train_y.keras object.

train_y.keras <- to_categorical(fashion_train$label,10)
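To make the one-hot encoding concrete, the sketch below builds the same kind of binary class matrix in base R for a handful of labels. It is an illustration of the idea, not the implementation of to_categorical(); `labels`, `n_classes`, and `one_hot` are illustrative names.

```r
# A few labels in the same 0-9 coding as the label column.
labels <- c(2, 9, 6, 0)
n_classes <- 10

# Row k of the identity matrix is the one-hot vector for class k - 1,
# so indexing with labels + 1 yields one row per observation.
one_hot <- diag(n_classes)[labels + 1, ]

dim(one_hot)               # 4 observations, 10 classes
which(one_hot[1, ] == 1)   # 3, the column for label 2
rowSums(one_hot)           # exactly one 1 per row
```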

2 Build Neural Network Model

2.1 Build a model base using `keras_model_sequential()`

To organize the layers, we first create a base model, which is a Sequential model. Call the keras_model_sequential() function and pipe the base model into the model architecture.

2.2 Building Architecture (define layers, neurons, and activation function)

To define the architecture for each layer, we will build several models by tuning several parameters. Before building the architecture, we set a seed and a seeded initializer so that the initial weights are the same on every run.

set.seed(100)
initializer <- initializer_random_normal(seed = 100)
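The effect of seeding can be illustrated without Keras. The sketch below draws "weights" from a normal distribution with base R's generator, which is not the generator Keras uses. The helper `draw_weights()` and the standard deviation of 0.05 are assumptions chosen for illustration only.

```r
# Hypothetical helper: draw n initial weights from a seeded normal distribution.
draw_weights <- function(seed, n, mean = 0, sd = 0.05) {
  set.seed(seed)
  rnorm(n, mean = mean, sd = sd)
}

first_run  <- draw_weights(seed = 100, n = 5)
second_run <- draw_weights(seed = 100, n = 5)
other_seed <- draw_weights(seed = 101, n = 5)

identical(first_run, second_run)   # same seed, same weights
identical(first_run, other_seed)   # different seed, different weights
```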

First, create a model (store it under model_init) with these parameters:
- the first layer contains 32 nodes, relu activation function, 784 input shape
- the second layer contains 32 nodes, relu activation function
- the third layer contains 10 nodes, softmax activation function

model_init <- keras_model_sequential() %>% 
  layer_dense(units = 32, activation = 'relu', input_shape = 784,
              kernel_initializer = initializer, bias_initializer = initializer) %>% 
  layer_dense(units = 32, activation = 'relu',
              kernel_initializer = initializer, bias_initializer = initializer) %>% 
  layer_dense(units = 10, activation = 'softmax', 
              kernel_initializer = initializer, bias_initializer = initializer)
summary(model_init)

## Model: "sequential"
## ___________________________________________________________________________
## Layer (type)                     Output Shape                  Param #     
## ===========================================================================
## dense (Dense)                    (None, 32)                    25120       
## ___________________________________________________________________________
## dense_1 (Dense)                  (None, 32)                    1056        
## ___________________________________________________________________________
## dense_2 (Dense)                  (None, 10)                    330         
## ===========================================================================
## Total params: 26,506
## Trainable params: 26,506
## Non-trainable params: 0
## ___________________________________________________________________________

Second, create a model (store it under model_bigger) with these parameters:
- the first layer contains 512 nodes, relu activation function, 784 input shape
- the second layer contains 512 nodes, relu activation function
- the third layer contains 10 nodes, softmax activation function

model_bigger <- keras_model_sequential() %>% 
  layer_dense(units = 512, activation = 'relu', input_shape = 784,
              kernel_initializer = initializer, bias_initializer = initializer) %>% 
  layer_dense(units = 512, activation = 'relu',
              kernel_initializer = initializer, bias_initializer = initializer) %>% 
  layer_dense(units = 10, activation = 'softmax',
              kernel_initializer = initializer, bias_initializer = initializer)
summary(model_bigger)

## Model: "sequential_1"
## ___________________________________________________________________________
## Layer (type)                     Output Shape                  Param #     
## ===========================================================================
## dense_3 (Dense)                  (None, 512)                   401920      
## ___________________________________________________________________________
## dense_4 (Dense)                  (None, 512)                   262656      
## ___________________________________________________________________________
## dense_5 (Dense)                  (None, 10)                    5130        
## ===========================================================================
## Total params: 669,706
## Trainable params: 669,706
## Non-trainable params: 0
## ___________________________________________________________________________
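The Param # column in both summaries can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below reproduces the counts for model_init and model_bigger; `dense_params()` is a hypothetical helper written for this check.

```r
# Parameters of a dense layer: weights (inputs x units) plus one bias per unit.
dense_params <- function(n_inputs, n_units) n_inputs * n_units + n_units

# model_init: 784 -> 32 -> 32 -> 10
init_params <- c(dense_params(784, 32),
                 dense_params(32, 32),
                 dense_params(32, 10))

# model_bigger: 784 -> 512 -> 512 -> 10
bigger_params <- c(dense_params(784, 512),
                   dense_params(512, 512),
                   dense_params(512, 10))

init_params          # 25120 1056 330
sum(init_params)     # 26506
bigger_params        # 401920 262656 5130
sum(bigger_params)   # 669706
```

The totals match the "Total params" lines in the two summaries, and they show that model_bigger has roughly 25 times as many trainable parameters as model_init.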

2.3 Building Architecture (define cost function and optimizer)

In this step, we still need a few settings before model_init and model_bigger are ready for training. Compile each model by defining the loss, the optimizer, and the evaluation metric:
- categorical crossentropy as the loss function
- adam as the optimizer, with learning rate 0.001
- accuracy as the metric

model_init %>% 
  compile(loss = 'categorical_crossentropy', 
          optimizer = optimizer_adam(lr = 0.001), 
          metrics = c('accuracy'))

model_bigger %>% 
  compile(loss = 'categorical_crossentropy', 
          optimizer = optimizer_adam(lr = 0.001), 
          metrics = c('accuracy'))
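To see what the categorical crossentropy loss measures, it can be computed by hand for a single observation. The sketch below is a plain base R version written for illustration (the clipping constant `eps` is an assumption of this sketch), not the Keras implementation.

```r
# Categorical crossentropy for one observation:
# minus the log of the probability assigned to the true class.
categorical_crossentropy <- function(y_true, y_pred, eps = 1e-7) {
  y_pred <- pmin(pmax(y_pred, eps), 1 - eps)   # avoid log(0)
  -sum(y_true * log(y_pred))
}

y_true    <- c(0, 1, 0)          # one-hot target: class 2 of 3
confident <- c(0.1, 0.8, 0.1)    # most probability on the true class
unsure    <- c(0.4, 0.3, 0.3)    # probability spread across classes

loss_confident <- categorical_crossentropy(y_true, confident)   # -log(0.8)
loss_unsure    <- categorical_crossentropy(y_true, unsure)      # -log(0.3)
```

The loss is smaller when more probability is placed on the true class, which is what the optimizer tries to achieve during training.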

2.4 Fitting model in the training set (define epoch and batch size)

In this step, we fit both model_init and model_bigger using epochs = 10 and batch_size = 100. Please save the training history in the history_init and history_bigger objects.

history_init <- model_init %>%
  fit(train_x.keras, 
      train_y.keras, 
      epochs = 10, 
      batch_size = 100)

history_bigger <- model_bigger %>% 
  fit(train_x.keras, 
      train_y.keras, 
      epochs = 10, 
      batch_size = 100)
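The two settings determine how many weight updates each model receives. The arithmetic below is a small sketch using the numbers from this quiz (60000 training observations); the variable names are illustrative.

```r
n_obs      <- 60000   # rows in the training set
batch_size <- 100
epochs     <- 10

# One weight update per batch, so each epoch runs this many steps.
steps_per_epoch <- ceiling(n_obs / batch_size)   # 600

# Total number of weight updates over the whole training run.
total_updates <- steps_per_epoch * epochs        # 6000
```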

3 Predicting on the testing set

After building the models, we predict the testing data (test_x.keras) with each of them. Please predict using the predict_classes() function from the keras package and store the results under pred_init and pred_bigger.

pred_init <- keras::predict_classes(object = model_init, x = test_x.keras)
head(pred_init)

## [1] 0 1 2 2 4 6

pred_bigger <- keras::predict_classes(object = model_bigger, x = test_x.keras)
head(pred_bigger)

## [1] 0 1 2 2 4 6
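Conceptually, the predicted class is the column with the highest softmax probability, shifted to the 0-9 label coding. The sketch below shows that step in base R on a made-up probability matrix. It can also serve as a fallback where predict_classes() is not available, since it was dropped from later Keras releases, by applying it to the matrix returned by predict().

```r
# Made-up softmax output: 2 observations, 3 classes (rows sum to 1).
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.3, 0.6),
                nrow = 2, byrow = TRUE)

# max.col() gives the 1-based column of the row maximum;
# subtracting 1 converts it to the 0-based label coding.
pred_class <- max.col(probs, ties.method = "first") - 1

pred_class   # 0 2
```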

4 Evaluating the neural network model

The labels are still stored as numbers (dbl type), so please decode them into their category names.

decode <- function(data){
  sapply(as.character(data), switch,
       "0" = "T-Shirt",
       "1" = "Trouser",
       "2" = "Pullover",
       "3" = "Dress",
       "4" = "Coat",
       "5" = "Sandal",
       "6" = "Shirt",
       "7" = "Sneaker",
       "8" = "Bag",
       "9" = "Boot")
}
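An alternative to the switch-based decode() is a vectorised lookup into the category glossary. This is a sketch of that approach, not a replacement required by the quiz. `decode_lookup()` is a hypothetical name, and the vector uses the "T-Shirt" spelling so that it matches decode().

```r
# Category names in label order: position 1 corresponds to label 0.
categories <- c("T-Shirt", "Trouser", "Pullover", "Dress", "Coat",
                "Sandal", "Shirt", "Sneaker", "Bag", "Boot")

# Labels are 0-based, R vectors are 1-based, hence the + 1.
decode_lookup <- function(labels) categories[labels + 1]

decode_lookup(c(0, 1, 2, 2, 4, 6))
```

Unlike the sapply() version, this returns an unnamed character vector, which is enough for confusionMatrix() after coercion with as.factor().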

Then, decode pred_init and pred_bigger before we evaluate the model performance using a confusion matrix.

reference <- decode(test_y)
pred_decode_in <- decode(pred_init)
head(pred_decode_in)

##          0          1          2          2          4          6 
##  "T-Shirt"  "Trouser" "Pullover" "Pullover"     "Coat"    "Shirt"

pred_decode_big <- decode(pred_bigger)
head(pred_decode_big)

##          0          1          2          2          4          6 
##  "T-Shirt"  "Trouser" "Pullover" "Pullover"     "Coat"    "Shirt"

4.1 Confusion Matrix (classification)

After decoding the target variable, you can evaluate the model using several metrics. In this quiz, please check the accuracy in the confusion matrices below.

Note: do not forget the explicit coercion with as.factor().

library(caret)
confusionMatrix(as.factor(pred_decode_in), as.factor(reference))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
##   Bag      965    2    3     1        5      5    10       3      17
##   Boot       1  914    0     0        0      9     0      23       0
##   Coat       5    0  814    19       90      0    73       0       0
##   Dress      3    0   38   909       18      1    38       0      38
##   Pullover   7    0   85     8      806      0    99       0      13
##   Sandal     2   21    0     1        0    944     0      34       2
##   Shirt      7    0   60    16       65      0   605       0      88
##   Sneaker    5   63    0     0        0     39     0     940       0
##   T-Shirt    5    0    0    29       16      2   174       0     841
##   Trouser    0    0    0    17        0      0     1       0       1
##           Reference
## Prediction Trouser
##   Bag            0
##   Boot           0
##   Coat           1
##   Dress         20
##   Pullover       1
##   Sandal         1
##   Shirt          2
##   Sneaker        0
##   T-Shirt        2
##   Trouser      973
## 
## Overall Statistics
##                                           
##                Accuracy : 0.8711          
##                  95% CI : (0.8644, 0.8776)
##     No Information Rate : 0.1             
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.8568          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity              0.9650      0.9140      0.8140       0.9090
## Specificity              0.9949      0.9963      0.9791       0.9827
## Pos Pred Value           0.9545      0.9652      0.8124       0.8535
## Neg Pred Value           0.9961      0.9905      0.9793       0.9898
## Prevalence               0.1000      0.1000      0.1000       0.1000
## Detection Rate           0.0965      0.0914      0.0814       0.0909
## Detection Prevalence     0.1011      0.0947      0.1002       0.1065
## Balanced Accuracy        0.9799      0.9552      0.8966       0.9458
##                      Class: Pullover Class: Sandal Class: Shirt
## Sensitivity                   0.8060        0.9440       0.6050
## Specificity                   0.9763        0.9932       0.9736
## Pos Pred Value                0.7910        0.9393       0.7177
## Neg Pred Value                0.9784        0.9938       0.9569
## Prevalence                    0.1000        0.1000       0.1000
## Detection Rate                0.0806        0.0944       0.0605
## Detection Prevalence          0.1019        0.1005       0.0843
## Balanced Accuracy             0.8912        0.9686       0.7893
##                      Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity                  0.9400         0.8410         0.9730
## Specificity                  0.9881         0.9747         0.9979
## Pos Pred Value               0.8978         0.7867         0.9808
## Neg Pred Value               0.9933         0.9822         0.9970
## Prevalence                   0.1000         0.1000         0.1000
## Detection Rate               0.0940         0.0841         0.0973
## Detection Prevalence         0.1047         0.1069         0.0992
## Balanced Accuracy            0.9641         0.9078         0.9854

confusionMatrix(as.factor(pred_decode_big), as.factor(reference))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
##   Bag      976    0    0     2        1      1     6       0       3
##   Boot       1  961    0     0        0     12     0      52       0
##   Coat       1    0  877    51      101      0    67       0       1
##   Dress      3    0   13   881       11      0    18       0       9
##   Pullover   4    0   48     5      781      0    45       0       5
##   Sandal     2   17    0     1        0    950     0      11       1
##   Shirt      4    0   60    22       79      1   710       0      89
##   Sneaker    1   22    0     0        0     35     0     937       0
##   T-Shirt    8    0    2    29       27      1   153       0     891
##   Trouser    0    0    0     9        0      0     1       0       1
##           Reference
## Prediction Trouser
##   Bag            0
##   Boot           0
##   Coat           0
##   Dress         10
##   Pullover       0
##   Sandal         0
##   Shirt          1
##   Sneaker        0
##   T-Shirt        3
##   Trouser      986
## 
## Overall Statistics
##                                           
##                Accuracy : 0.895           
##                  95% CI : (0.8888, 0.9009)
##     No Information Rate : 0.1             
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.8833          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity              0.9760      0.9610      0.8770       0.8810
## Specificity              0.9986      0.9928      0.9754       0.9929
## Pos Pred Value           0.9869      0.9366      0.7987       0.9323
## Neg Pred Value           0.9973      0.9957      0.9862       0.9869
## Prevalence               0.1000      0.1000      0.1000       0.1000
## Detection Rate           0.0976      0.0961      0.0877       0.0881
## Detection Prevalence     0.0989      0.1026      0.1098       0.0945
## Balanced Accuracy        0.9873      0.9769      0.9262       0.9369
##                      Class: Pullover Class: Sandal Class: Shirt
## Sensitivity                   0.7810        0.9500       0.7100
## Specificity                   0.9881        0.9964       0.9716
## Pos Pred Value                0.8795        0.9674       0.7350
## Neg Pred Value                0.9760        0.9945       0.9679
## Prevalence                    0.1000        0.1000       0.1000
## Detection Rate                0.0781        0.0950       0.0710
## Detection Prevalence          0.0888        0.0982       0.0966
## Balanced Accuracy             0.8846        0.9732       0.8408
##                      Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity                  0.9370         0.8910         0.9860
## Specificity                  0.9936         0.9752         0.9988
## Pos Pred Value               0.9417         0.7998         0.9890
## Neg Pred Value               0.9930         0.9877         0.9984
## Prevalence                   0.1000         0.1000         0.1000
## Detection Rate               0.0937         0.0891         0.0986
## Detection Prevalence         0.0995         0.1114         0.0997
## Balanced Accuracy            0.9653         0.9331         0.9924
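The reported accuracy is the share of test images on the diagonal of the confusion matrix, that is, the images whose predicted category equals the reference. The sketch below recomputes it from the diagonal counts printed above for model_init and model_bigger; the vector names are illustrative.

```r
# Diagonal counts (correct predictions per class) copied from the two tables,
# in the order Bag, Boot, Coat, Dress, Pullover, Sandal, Shirt, Sneaker, T-Shirt, Trouser.
diag_init   <- c(965, 914, 814, 909, 806, 944, 605, 940, 841, 973)
diag_bigger <- c(976, 961, 877, 881, 781, 950, 710, 937, 891, 986)

n_test <- 10000   # number of test observations

acc_init   <- sum(diag_init) / n_test     # 0.8711
acc_bigger <- sum(diag_bigger) / n_test   # 0.895
```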

4.2 Model Tuning

It turns out that our boss wants the best model, so he asks you to compare the previous models against one more model (store it under model_tuning). Let us build model_tuning by changing the following when compiling the model:
- use sgd as the optimizer, with learning rate 0.001
- the rest is the same as model_init

model_tuning <- keras_model_sequential() %>% 
  layer_dense(units = 32, activation = 'relu', input_shape = c(784)) %>% 
  layer_dense(units = 32, activation = 'relu') %>% 
  layer_dense(units = 10, activation = 'softmax')

model_tuning <- model_tuning %>% 
  compile(loss = 'categorical_crossentropy', 
          optimizer = optimizer_sgd(lr = 0.001), 
          metrics = c('accuracy'))

history_tuning <- model_tuning %>%
  fit(train_x.keras, train_y.keras, epochs = 10, batch_size = 100)
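For intuition about what changes with the sgd optimizer, the sketch below shows a single plain gradient descent update on made-up weights and gradients. It illustrates the basic update rule only. `sgd_step()` is a hypothetical helper, and the numbers are not taken from the model.

```r
# One plain SGD update: move each weight against its gradient,
# scaled by the learning rate.
sgd_step <- function(weights, gradient, lr = 0.001) {
  weights - lr * gradient
}

w <- c(0.5, -0.2)   # made-up current weights
g <- c(2, -4)       # made-up gradients of the loss

w_new <- sgd_step(w, g)   # 0.498 -0.196
```

With the same learning rate, every weight moves by a fixed multiple of its gradient, whereas adam adapts the step size per weight. This is one plausible reason for the slower progress of model_tuning within 10 epochs.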

After tuning the model, please predict test_x.keras using model_tuning.

pred_tuning <- keras::predict_classes(object = model_tuning, x = test_x.keras)

Then, decode pred_tuning and check the model performance using confusionMatrix().

pred_decode_tun <- decode(pred_tuning)
head(pred_decode_tun)

##          0          1          2          6          1          6 
##  "T-Shirt"  "Trouser" "Pullover"    "Shirt"  "Trouser"    "Shirt"

confusionMatrix(as.factor(pred_decode_tun), as.factor(reference))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
##   Bag      913    0    6     2       18      9    30       0      19
##   Boot       2  922    0     0        0    127     0     109       1
##   Coat       8    0  717    27      222      0   285       0       4
##   Dress     14    1   60   865        5      3    51       0      99
##   Pullover   4    0  166     6      614      3   162       0      11
##   Sandal     7   12    0     0        0    596     1      21       2
##   Shirt     37    0   46    44      131      1   238       0      90
##   Sneaker   12   65    0     0        0    258     0     870       1
##   T-Shirt    2    0    0    34        8      3   226       0     761
##   Trouser    1    0    5    22        2      0     7       0      12
##           Reference
## Prediction Trouser
##   Bag            0
##   Boot           0
##   Coat           5
##   Dress         33
##   Pullover      21
##   Sandal         0
##   Shirt          5
##   Sneaker        0
##   T-Shirt        9
##   Trouser      927
## 
## Overall Statistics
##                                           
##                Accuracy : 0.7423          
##                  95% CI : (0.7336, 0.7509)
##     No Information Rate : 0.1             
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.7137          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity              0.9130      0.9220      0.7170       0.8650
## Specificity              0.9907      0.9734      0.9388       0.9704
## Pos Pred Value           0.9157      0.7941      0.5655       0.7648
## Neg Pred Value           0.9903      0.9912      0.9676       0.9848
## Prevalence               0.1000      0.1000      0.1000       0.1000
## Detection Rate           0.0913      0.0922      0.0717       0.0865
## Detection Prevalence     0.0997      0.1161      0.1268       0.1131
## Balanced Accuracy        0.9518      0.9477      0.8279       0.9177
##                      Class: Pullover Class: Sandal Class: Shirt
## Sensitivity                   0.6140        0.5960       0.2380
## Specificity                   0.9586        0.9952       0.9607
## Pos Pred Value                0.6221        0.9327       0.4020
## Neg Pred Value                0.9572        0.9568       0.9190
## Prevalence                    0.1000        0.1000       0.1000
## Detection Rate                0.0614        0.0596       0.0238
## Detection Prevalence          0.0987        0.0639       0.0592
## Balanced Accuracy             0.7863        0.7956       0.5993
##                      Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity                  0.8700         0.7610         0.9270
## Specificity                  0.9627         0.9687         0.9946
## Pos Pred Value               0.7214         0.7296         0.9498
## Neg Pred Value               0.9852         0.9733         0.9919
## Prevalence                   0.1000         0.1000         0.1000
## Detection Rate               0.0870         0.0761         0.0927
## Detection Prevalence         0.1206         0.1043         0.0976
## Balanced Accuracy            0.9163         0.8648         0.9608
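The per-class statistics can be traced back to the rows and columns of the confusion matrix. As a sketch, the code below recomputes the sensitivity and the positive predictive value of the Shirt class for model_tuning from the counts printed above. The vector names are illustrative.

```r
# Shirt row of the model_tuning table: everything predicted as Shirt,
# split by the reference category.
shirt_row <- c(Bag = 37, Boot = 0, Coat = 46, Dress = 44, Pullover = 131,
               Sandal = 1, Shirt = 238, Sneaker = 0, `T-Shirt` = 90, Trouser = 5)

# Shirt column: every true Shirt, split by the predicted category.
shirt_col <- c(Bag = 30, Boot = 0, Coat = 285, Dress = 51, Pullover = 162,
               Sandal = 1, Shirt = 238, Sneaker = 0, `T-Shirt` = 226, Trouser = 7)

# Sensitivity: share of true Shirts that were predicted as Shirt.
sensitivity <- unname(shirt_col["Shirt"] / sum(shirt_col))   # 238 / 1000

# Positive predictive value: share of Shirt predictions that were correct.
pos_pred_value <- unname(shirt_row["Shirt"] / sum(shirt_row))   # 238 / 592
```

Both values agree with the Class: Shirt column (0.2380 and 0.4020) and show that most true shirts are assigned to Coat, T-Shirt, or Pullover by this model.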

##                      Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity                  0.9340         0.8160         0.9740
## Specificity                  0.9872         0.9806         0.9968
## Pos Pred Value               0.8904         0.8234         0.9711
## Neg Pred Value               0.9926         0.9796         0.9971
## Prevalence                   0.1000         0.1000         0.1000
## Detection Rate               0.0934         0.0816         0.0974
## Detection Prevalence         0.1049         0.0991         0.1003
## Balanced Accuracy            0.9606         0.8983         0.9854