Sign MNIST

1. Introduction

Sign Language is a means of communication that is predominantly used by those who are deaf or hard of hearing. This sort of gesture-based language allows users to effortlessly transmit ideas and thoughts while overcoming hurdles created by hearing impairments.

The great majority of the world’s population does not understand Sign Language, which is a serious obstacle to this otherwise simple mode of communication. Learning Sign Language, like learning any other language, takes time and effort, which discourages the general public from studying it.

However, Machine Learning and image detection offer a clear solution to this problem. A predictive model that automatically classifies Sign Language signs could provide real-time captioning for virtual conferences such as Zoom meetings. This would considerably improve access to such services for people with hearing impairments, since it would work in tandem with voice-based captioning to establish two-way online communication for those with hearing challenges.

Dataset

Many large Sign Language training datasets can be found on Kaggle, a popular data science platform. The one used for this model is called “Sign Language MNIST”, a free-to-use public-domain dataset containing pixel data for roughly 1,000 images of each of 24 ASL letters. J and Z are excluded because they are signed with motion rather than a static hand shape.

Image Data

Pixels, or picture elements, make up a digital image. The pixel, according to the Cambridge Dictionary, is the smallest unit of a picture on a digital platform. Each pixel holds a number known as the pixel value. In an 8-bit image, a pixel value ranges from 0 to 255 and describes the brightness (in grayscale images) or the color intensity (in color images).
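As a small illustration, a grayscale image can be stored as a matrix of pixel values. The 3 x 3 matrix below is made up for this example and is not taken from the dataset; it only sketches the 0 to 255 range and the rescaling to [0, 1] that is applied later in this document.

```r
# A made-up 3 x 3 grayscale "image": 0 = black, 255 = white
img <- matrix(c(  0, 128, 255,
                 64, 192,  32,
                255,   0, 100),
              nrow = 3, byrow = TRUE)

range(img)           # darkest and brightest pixel values

# Rescale to the interval [0, 1], as done before modelling
scaled <- img / 255
range(scaled)
```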

2. Preparation

Import Library

# data processing & modelling
library(rsample) # For sampling data test and data train
library(keras) # Neural Network
library(dplyr) # Wrangling
library(tensorflow) # Model architecture

# model evaluation
library(caret)

Read Data

mnist_train <- read.csv("archive (2)/sign_mnist_train.csv")
mnist_test <- read.csv("archive (2)/sign_mnist_test.csv")

# check dimensions
dim(mnist_train)
#> [1] 27455   785

Insight :

  • There are 27,455 rows and 785 columns (1 label column; the remaining 784 columns are pixels)

What are the dimensions of each image?

sqrt(784)
#> [1] 28

Insight :

  • Each image is 28 x 28 pixels. This size is used by the vizTrain function in the EDA subsection.
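To make the link between 784 columns and a 28 x 28 image concrete, here is a minimal sketch. The vector `pixels` is a made-up stand-in for one row of pixel columns, and the row-by-row layout is an assumption about how the CSV flattens each image.

```r
# Hypothetical pixel row: 784 values standing in for pixel1..pixel784
pixels <- seq_len(784)

side <- sqrt(length(pixels))   # 28

# Assumption: pixels are stored row by row, so fill the matrix by row
img <- matrix(pixels, nrow = side, ncol = side, byrow = TRUE)

dim(img)       # image height and width
img[1, 1:3]    # first three pixels of the first image row
img[2, 1]      # the 29th pixel starts the second image row
```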

3. EDA

Check Data

head(mnist_train)

Insight :

  • Label = Label Class
  • pixel1 ~ pixel784 = Predictors (pixel values)

Check Class Proportion

prop.table(table(mnist_train$label))
#> 
#>          0          1          2          3          4          5          6 
#> 0.04101257 0.03678747 0.04166818 0.04356219 0.03485704 0.04385358 0.03970133 
#>          7          8         10         11         12         13         14 
#> 0.03689674 0.04232380 0.04057549 0.04520124 0.03842652 0.04192315 0.04356219 
#>         15         16         17         18         19         20         21 
#> 0.03962848 0.04658532 0.04713167 0.04367146 0.04319796 0.04228738 0.03940994 
#>         22         23         24 
#> 0.04461847 0.04239665 0.04072118

Insight :

  • The class proportions look balanced, ranging from about 3.5% to 4.7%. Label 9 (J) is absent, as expected.

Data Visualization

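# function to visualize image data from csv
vizTrain <- function(input){
  
  # image side length: 784 pixel columns -> 28
  dimmax <- sqrt(ncol(input[,-1]))
  
  # square plotting grid large enough for all rows
  dimn <- ceiling(sqrt(nrow(input)))
  par(mfrow=c(dimn, dimn), mar=c(.1, .1, .1, .1))
  
  for (i in seq_len(nrow(input))){
      # reshape the pixel columns of row i into a square matrix
      m1 <- as.matrix(input[i,-1])
      dim(m1) <- c(dimmax, dimmax)
      
      # reorient the matrix for plotting with image()
      m1 <- apply(apply(m1, 1, rev), 1, t)
      
      image(1:dimmax, 1:dimmax, 
            m1, col=grey.colors(255), 
            # remove axis text
            xaxt = 'n', yaxt = 'n')
      # print the class label on top of the image
      text(2, 20, col="white", cex=1.2, input[i, 1])
  }
  
}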

vizTrain(head(mnist_train,9))

4. Data Pre-processing

Split Dataset

library(rsample)
set.seed(100)

index <- initial_split(data = mnist_train, # data train
                       prop = 0.8, # 80% data train
                       strata = "label") # divided based on label
  
data_mnist_train <- training(index)
data_mnist_test <- testing(index)

Scale Dataset and Separate Predictors from Target

library(dplyr)
 
train_x <- data_mnist_train %>%
  select(-label) %>% 
  as.matrix()/255

train_y <- data_mnist_train$label # target variable

test_x <- data_mnist_test %>% # predictor
  select(-label) %>% 
  as.matrix()/255

test_y <- data_mnist_test$label # target variable

Convert Data to Arrays

Keras only accepts predictors in the form of arrays and labels in the form of one-hot-encoded categories. One-hot encoding means that each class is given a “binary” code: a vector with a 1 in the position of that class and 0 everywhere else.
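The idea of one-hot encoding can be sketched in base R. The helper `one_hot` below is a hypothetical stand-in written for this illustration, not the keras implementation, and the labels are made up. It assumes integer labels starting at 0, so labels 0 to 24 produce 25 columns even though label 9 never occurs in this dataset.

```r
# Minimal base-R one-hot encoder for integer labels starting at 0.
# Illustrative stand-in only, not the keras to_categorical() code.
one_hot <- function(labels, num_classes = max(labels) + 1) {
  out <- matrix(0, nrow = length(labels), ncol = num_classes)
  # R indices start at 1, so label k goes into column k + 1
  out[cbind(seq_along(labels), labels + 1)] <- 1
  out
}

labels  <- c(0, 3, 24)        # made-up labels in the 0-24 range
encoded <- one_hot(labels)

dim(encoded)                  # one row per label, one column per class
which(encoded[2, ] == 1) - 1  # recovers the original label of row 2
```

This also suggests why the output layer later in this document has 25 units rather than 24.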

library(keras)
train_x <- array_reshape(x = train_x, dim = dim(train_x))
test_x <- array_reshape(x = test_x, dim = dim(test_x))
# One hot encoding target variable
train_y <- to_categorical(train_y)
test_y <- to_categorical(test_y)
input_dim <- ncol(train_x) # number of predictors (784 pixels)
# number of distinct labels (24); to_categorical() still yields 25 columns for labels 0-24
num_class <- n_distinct(data_mnist_train$label)

5. The Machine Learning Model

DNN

We will employ a Deep Neural Network (DNN). A DNN is a neural network with more than three layers, including the input and output layers.
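Before building the model in keras, it may help to see what a stack of dense layers computes. The following is a minimal base-R sketch of a forward pass with one hidden layer, using made-up sizes and random weights; the helpers `relu` and `softmax` are written here for illustration and are not the keras functions.

```r
relu    <- function(z) pmax(z, 0)
softmax <- function(z) { e <- exp(z - max(z)); e / sum(e) }

set.seed(1)
x  <- runif(4)                         # 4 made-up input features
W1 <- matrix(rnorm(4 * 3), nrow = 4)   # input -> hidden layer (3 units)
b1 <- rep(0, 3)
W2 <- matrix(rnorm(3 * 2), nrow = 3)   # hidden -> output layer (2 classes)
b2 <- rep(0, 2)

# Each dense layer: weighted sum plus bias, then an activation function
hidden <- relu(as.vector(x %*% W1) + b1)
probs  <- softmax(as.vector(hidden %*% W2) + b2)

probs                  # class probabilities
sum(probs)             # probabilities add up to 1
which.max(probs) - 1   # predicted class, counted from 0
```

The model in this section follows the same pattern, only with 784 inputs, four hidden layers, and 25 output units.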

# set seed for the initial weights
set_random_seed(100)

# build the architecture
model_tuning1 <- keras_model_sequential(name = "model_tuning1") %>% # model name
  
  # input layer + hidden layer 1
  layer_dense(input_shape = input_dim, # input size (number of predictors)
              units = 512, # number of node
              activation = "relu", # activation function
              name = "hidden_1") %>%
  
  # hidden layer 2
  layer_dense(units = 256,
              activation = "relu", # activation function
              name = "hidden_2") %>% 
  
  # hidden layer 3
  layer_dense(units = 128,
              activation = "relu", # activation function
              name = "hidden_3") %>% 
  # hidden layer 4
  layer_dense(units = 64,
              activation = "relu", # activation function
              name = "hidden_4") %>%
  
  # output layer
  layer_dense(units = 25, # number of class (0-24)
              activation = "softmax",
              name = "output")

# compile
model_tuning1 %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_adamax(learning_rate = 0.00146),
  metrics = "accuracy"
)

# fit model
history_tuning1 <- model_tuning1 %>% fit(x = train_x,
                                         y = train_y,
                                         epochs = 20,
                                         validation_data = list(test_x, test_y),
                                         verbose = 1)

plot(history_tuning1)

pred <- predict(model_tuning1, test_x) %>% 
  k_argmax() %>% 
  as.array() %>% 
  as.factor()
confusionMatrix(data = pred, reference = as.factor(data_mnist_test$label))
#> Confusion Matrix and Statistics
#> 
#>           Reference
#> Prediction   0   1   2   3   4   5   6   7   8  10  11  12  13  14  15  16  17
#>         0  196   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         1    0 211   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         2    0   0 246   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         3    0   0   0 239   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         4    0   0   0   0 191   0   0   0   0   0   0   0   0   0   0   0   0
#>         5    0   0   0   0   0 250   0   0   0   0   0   0   0   0   0   0   0
#>         6    0   0   0   0   0   0 213   0   0   0   0   0   0   0   0   0   0
#>         7    0   0   0   0   0   0   0 207   0   0   0   0   0   0   0   0   0
#>         8    0   0   0   0   0   0   0   0 231   0   0   0   0   0   0   0   0
#>         10   0   0   0   0   0   0   0   0   0 236   0   0   0   0   0   0   0
#>         11   0   0   0   0   0   0   0   0   0   0 236   0   0   0   0   0   0
#>         12   0   0   0   0   0   0   0   0   0   0   0 209   0   0   0   0   0
#>         13   0   0   0   0   0   0   0   0   0   0   0   0 229   0   0   0   0
#>         14   0   0   0   0   0   0   0   0   0   0   0   0   0 264   0   0   0
#>         15   0   0   0   0   0   0   0   0   0   0   0   0   0   0 217   0   0
#>         16   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 243   0
#>         17   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 249
#>         18   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         19   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         20   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         21   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         22   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         23   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         24   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>           Reference
#> Prediction  18  19  20  21  22  23  24
#>         0    0   0   0   0   0   0   0
#>         1    0   0   0   0   0   0   0
#>         2    0   0   0   0   0   0   0
#>         3    0   0   0   0   0   0   0
#>         4    0   0   0   0   0   0   0
#>         5    0   0   0   0   0   0   0
#>         6    0   0   0   0   0   0   0
#>         7    0   0   0   0   0   0   0
#>         8    0   0   0   0   0   0   0
#>         10   0   0   0   0   0   0   0
#>         11   0   0   0   0   0   0   0
#>         12   0   0   0   0   0   0   0
#>         13   0   0   0   0   0   0   0
#>         14   0   0   0   0   0   0   0
#>         15   0   0   0   0   0   0   0
#>         16   0   0   0   0   0   0   0
#>         17   0   0   0   0   0   0   0
#>         18 227   0   0   0   0   0   0
#>         19   0 249   0   0   0   0   0
#>         20   0   0 215   0   0   0   0
#>         21   0   0   0 207   0   0   0
#>         22   0   0   0   0 255   0   0
#>         23   0   0   0   0   0 236   0
#>         24   0   0   0   0   0   0 237
#> 
#> Overall Statistics
#>                                                
#>                Accuracy : 1                    
#>                  95% CI : (0.9993, 1)          
#>     No Information Rate : 0.0481               
#>     P-Value [Acc > NIR] : < 0.00000000000000022
#>                                                
#>                   Kappa : 1                    
#>                                                
#>  Mcnemar's Test P-Value : NA                   
#> 
#> Statistics by Class:
#> 
#>                      Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
#> Sensitivity           1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
#> Specificity           1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
#> Pos Pred Value        1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
#> Neg Pred Value        1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
#> Prevalence            0.03568  0.03841  0.04478  0.04351  0.03477  0.04551
#> Detection Rate        0.03568  0.03841  0.04478  0.04351  0.03477  0.04551
#> Detection Prevalence  0.03568  0.03841  0.04478  0.04351  0.03477  0.04551
#> Balanced Accuracy     1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
#>                      Class: 6 Class: 7 Class: 8 Class: 10 Class: 11 Class: 12
#> Sensitivity           1.00000  1.00000  1.00000   1.00000   1.00000   1.00000
#> Specificity           1.00000  1.00000  1.00000   1.00000   1.00000   1.00000
#> Pos Pred Value        1.00000  1.00000  1.00000   1.00000   1.00000   1.00000
#> Neg Pred Value        1.00000  1.00000  1.00000   1.00000   1.00000   1.00000
#> Prevalence            0.03878  0.03768  0.04205   0.04296   0.04296   0.03805
#> Detection Rate        0.03878  0.03768  0.04205   0.04296   0.04296   0.03805
#> Detection Prevalence  0.03878  0.03768  0.04205   0.04296   0.04296   0.03805
#> Balanced Accuracy     1.00000  1.00000  1.00000   1.00000   1.00000   1.00000
#>                      Class: 13 Class: 14 Class: 15 Class: 16 Class: 17
#> Sensitivity            1.00000   1.00000    1.0000   1.00000   1.00000
#> Specificity            1.00000   1.00000    1.0000   1.00000   1.00000
#> Pos Pred Value         1.00000   1.00000    1.0000   1.00000   1.00000
#> Neg Pred Value         1.00000   1.00000    1.0000   1.00000   1.00000
#> Prevalence             0.04169   0.04806    0.0395   0.04424   0.04533
#> Detection Rate         0.04169   0.04806    0.0395   0.04424   0.04533
#> Detection Prevalence   0.04169   0.04806    0.0395   0.04424   0.04533
#> Balanced Accuracy      1.00000   1.00000    1.0000   1.00000   1.00000
#>                      Class: 18 Class: 19 Class: 20 Class: 21 Class: 22
#> Sensitivity            1.00000   1.00000   1.00000   1.00000   1.00000
#> Specificity            1.00000   1.00000   1.00000   1.00000   1.00000
#> Pos Pred Value         1.00000   1.00000   1.00000   1.00000   1.00000
#> Neg Pred Value         1.00000   1.00000   1.00000   1.00000   1.00000
#> Prevalence             0.04133   0.04533   0.03914   0.03768   0.04642
#> Detection Rate         0.04133   0.04533   0.03914   0.03768   0.04642
#> Detection Prevalence   0.04133   0.04533   0.03914   0.03768   0.04642
#> Balanced Accuracy      1.00000   1.00000   1.00000   1.00000   1.00000
#>                      Class: 23 Class: 24
#> Sensitivity            1.00000   1.00000
#> Specificity            1.00000   1.00000
#> Pos Pred Value         1.00000   1.00000
#> Neg Pred Value         1.00000   1.00000
#> Prevalence             0.04296   0.04315
#> Detection Rate         0.04296   0.04315
#> Detection Prevalence   0.04296   0.04315
#> Balanced Accuracy      1.00000   1.00000

Insight :

  • The model reaches 100% accuracy on the validation split (data_mnist_test) held out from the training file. We should double check this result on the separate test file (mnist_test).

Model Evaluation

test_x1 <- mnist_test %>% # predictor
  select(-label) %>% 
  as.matrix()/255

pred1 <- predict(model_tuning1,test_x1) %>% 
  k_argmax() %>% 
  as.array() %>% 
  as.factor()
confusionMatrix(data = pred1, reference = as.factor(mnist_test$label))
#> Confusion Matrix and Statistics
#> 
#>           Reference
#> Prediction   0   1   2   3   4   5   6   7   8  10  11  12  13  14  15  16  17
#>         0  330   0   0   0   0   0   0   0   0   0   0   0  42   0   0   0   0
#>         1    0 432   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         2    0   0 289   0   0  21   0   0   0   0  28   0   0  21   0   0   0
#>         3    0   0   0 225   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         4    0   0   0   0 497   0   0   0   0   0   0  21   2   0   0   0   0
#>         5    0   0  21   0   0 213   0   0   0   0   0   0   0   0   0   0   0
#>         6    0   0   0   0   0   0 279  20   0   0   0   0   0   6   0   0   0
#>         7    0   0   0   0   0   0  26 413   0   0   0   0   0   0   0   0   0
#>         8    0   0   0   0   0   0   0   0 203  22   0   0   0   0   0   0   0
#>         10   0   0   0   0   0   0   0   0   0 181   0   0   0   0   0   0   0
#>         11   0   0   0   0   0  13   0   0   0   0 160   0   0   0   0   0   0
#>         12   0   0   0   0   0   0   0   0   0   0   0 303   0   0   0   0   0
#>         13   1   0   0   0   0   0  21   0   7   0   0  24 176   0   0   0   0
#>         14   0   0   0   0   0   0   0   0   0   0   0   0  25 197   0   0   0
#>         15   0   0   0   0   0   0   0   0   0   0   0   0   0   0 347   0   0
#>         16   0   0   0   0   0   0  20   0   0   0   0   0  29   2   0 163   0
#>         17   0   0   0   0   0   0   0   0  42  43   0   0   0   0   0   0 103
#>         18   0   0   0   0   1   0   0   0   0   0   0  46   0   0   0   1   0
#>         19   0   0   0   0   0   0   2   3  13   0   0   0  17  20   0   0   0
#>         20   0   0   0   0   0   0   0   0   0  62   0   0   0   0   0   0  20
#>         21   0   0   0   0   0   0   0   0   0   2   1   0   0   0   0   0  21
#>         22   0   0   0   0   0   0   0   0   0  21   0   0   0   0   0   0   0
#>         23   0   0   0  20   0   0   0   0   0   0  20   0   0   0   0   0   0
#>         24   0   0   0   0   0   0   0   0  23   0   0   0   0   0   0   0   0
#>           Reference
#> Prediction  18  19  20  21  22  23  24
#>         0    0   0   0   0   0   0   0
#>         1    0   0   0   0   0   0   4
#>         2    0   0   0   0   0   0   0
#>         3    0   4  20   0   0   0   0
#>         4   21   0   0   0   0   0   0
#>         5    0   0   0  23   1   0   0
#>         6    0   0   0   9   0   0   0
#>         7    0   0   0   0   0   0   0
#>         8   22   0   0   0  20   0  33
#>         10   0   0  32   5   0   0  20
#>         11   0   3   0   0   0   3  21
#>         12  41   0   0   0   0   0   0
#>         13   0   0   0   0   0   0   0
#>         14   0  17   0   0   0   0   0
#>         15   0   0   0  20   0   0   0
#>         16  21   0   0   0   0   0   0
#>         17   0   0  29   0  19   0  14
#>         18 141   0   0   0   0   0  16
#>         19   0 166   0  26   0   0  21
#>         20   0   0 180   0  20   0   0
#>         21   0   0   3 236  21   0  20
#>         22   0   0   0  26 125  60   0
#>         23   0  58   0   1   0 204   0
#>         24   0   0   2   0   0   0 183
#> 
#> Overall Statistics
#>                                                
#>                Accuracy : 0.8012               
#>                  95% CI : (0.7917, 0.8104)     
#>     No Information Rate : 0.0694               
#>     P-Value [Acc > NIR] : < 0.00000000000000022
#>                                                
#>                   Kappa : 0.7919               
#>                                                
#>  Mcnemar's Test P-Value : NA                   
#> 
#> Statistics by Class:
#> 
#>                      Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
#> Sensitivity           0.99698  1.00000  0.93226  0.91837  0.99799  0.86235
#> Specificity           0.99386  0.99941  0.98980  0.99654  0.99341  0.99350
#> Pos Pred Value        0.88710  0.99083  0.80501  0.90361  0.91867  0.82558
#> Neg Pred Value        0.99985  1.00000  0.99692  0.99711  0.99985  0.99508
#> Prevalence            0.04615  0.06023  0.04322  0.03416  0.06944  0.03444
#> Detection Rate        0.04601  0.06023  0.04030  0.03137  0.06930  0.02970
#> Detection Prevalence  0.05187  0.06079  0.05006  0.03472  0.07543  0.03597
#> Balanced Accuracy     0.99542  0.99970  0.96103  0.95745  0.99570  0.92792
#>                      Class: 6 Class: 7 Class: 8 Class: 10 Class: 11 Class: 12
#> Sensitivity           0.80172  0.94725  0.70486   0.54683   0.76555   0.76904
#> Specificity           0.99487  0.99614  0.98591   0.99167   0.99426   0.99395
#> Pos Pred Value        0.88854  0.94077  0.67667   0.76050   0.80000   0.88081
#> Neg Pred Value        0.98994  0.99658  0.98763   0.97837   0.99297   0.98667
#> Prevalence            0.04852  0.06079  0.04016   0.04615   0.02914   0.05494
#> Detection Rate        0.03890  0.05759  0.02830   0.02524   0.02231   0.04225
#> Detection Prevalence  0.04378  0.06121  0.04183   0.03318   0.02789   0.04796
#> Balanced Accuracy     0.89830  0.97169  0.84539   0.76925   0.87990   0.88149
#>                      Class: 13 Class: 14 Class: 15 Class: 16 Class: 17
#> Sensitivity            0.60481   0.80081   1.00000   0.99390   0.71528
#> Specificity            0.99230   0.99394   0.99707   0.98973   0.97908
#> Pos Pred Value         0.76856   0.82427   0.94550   0.69362   0.41200
#> Neg Pred Value         0.98344   0.99293   1.00000   0.99986   0.99408
#> Prevalence             0.04057   0.03430   0.04838   0.02287   0.02008
#> Detection Rate         0.02454   0.02747   0.04838   0.02273   0.01436
#> Detection Prevalence   0.03193   0.03332   0.05117   0.03277   0.03486
#> Balanced Accuracy      0.79855   0.89737   0.99853   0.99181   0.84718
#>                      Class: 18 Class: 19 Class: 20 Class: 21 Class: 22
#> Sensitivity            0.57317   0.66935   0.67669   0.68208   0.60680
#> Specificity            0.99076   0.98527   0.98523   0.99004   0.98464
#> Pos Pred Value         0.68780   0.61940   0.63830   0.77632   0.53879
#> Neg Pred Value         0.98493   0.98812   0.98752   0.98398   0.98833
#> Prevalence             0.03430   0.03458   0.03709   0.04824   0.02872
#> Detection Rate         0.01966   0.02315   0.02510   0.03291   0.01743
#> Detection Prevalence   0.02858   0.03737   0.03932   0.04239   0.03235
#> Balanced Accuracy      0.78197   0.82731   0.83096   0.83606   0.79572
#>                      Class: 23 Class: 24
#> Sensitivity            0.76404   0.55120
#> Specificity            0.98566   0.99635
#> Pos Pred Value         0.67327   0.87981
#> Neg Pred Value         0.99083   0.97860
#> Prevalence             0.03723   0.04629
#> Detection Rate         0.02844   0.02552
#> Detection Prevalence   0.04225   0.02900
#> Balanced Accuracy      0.87485   0.77377

Insight :

  • The model appears to overfit: accuracy drops from 100% on the validation split to 80.12% on the test set (unseen data).
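The headline numbers in the output above can be reproduced by hand from a confusion matrix. The sketch below uses a made-up 3-class matrix (not the model's results) with predictions in rows and reference labels in columns, the same orientation as the printed output.

```r
# Made-up 3-class confusion matrix: rows = predictions, columns = reference
cm <- matrix(c(50,  2,  0,
                3, 45,  5,
                0,  4, 41),
             nrow = 3, byrow = TRUE,
             dimnames = list(Prediction = 0:2, Reference = 0:2))

accuracy    <- sum(diag(cm)) / sum(cm)   # share of correct predictions
sensitivity <- diag(cm) / colSums(cm)    # correct / all cases of that class
pos_pred    <- diag(cm) / rowSums(cm)    # correct / all predictions of that class

round(accuracy, 4)
round(sensitivity, 4)
round(pos_pred, 4)
```

A class with low sensitivity is often missed, while a class with a low positive predictive value is often predicted when another class was correct.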

CNN

A Convolutional Neural Network (CNN) is our solution for reducing the model’s overfitting.

A CNN typically consists of these layers.

  • Convolutional layer: the convolution layer (CONV) uses filters that perform convolution operations as they scan the input with respect to its dimensions.

  • Pooling layer: the pooling layer (POOL) is a downsampling operation, typically applied after a convolution layer, which provides some spatial invariance. In particular, max pooling and average pooling take the maximum and the average value of each window, respectively.

  • Fully connected layer: the fully connected layer (FC) operates on a flattened input where each input is connected to all neurons. If present, FC layers are usually found towards the end of CNN architectures and can be used to optimize objectives such as class scores.
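The three layer types above can be sketched in base R on a tiny made-up image. The helpers `conv2d` and `max_pool` are hypothetical functions written for this illustration (no padding, single channel, single filter) and are much simpler than the keras layers used below.

```r
# 2D convolution without padding, implemented as cross-correlation
# (the filter is slid over the image without being flipped)
conv2d <- function(img, kernel) {
  kh <- nrow(kernel); kw <- ncol(kernel)
  out <- matrix(0, nrow(img) - kh + 1, ncol(img) - kw + 1)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      patch <- img[i:(i + kh - 1), j:(j + kw - 1)]
      out[i, j] <- sum(patch * kernel)
    }
  }
  out
}

# 2 x 2 max pooling with stride 2 (assumes even dimensions)
max_pool <- function(x) {
  out <- matrix(0, nrow(x) / 2, ncol(x) / 2)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      out[i, j] <- max(x[(2 * i - 1):(2 * i), (2 * j - 1):(2 * j)])
    }
  }
  out
}

img    <- matrix(1:36, nrow = 6, byrow = TRUE)          # made-up 6 x 6 image
kernel <- matrix(c(1, 0, -1,
                   1, 0, -1,
                   1, 0, -1), nrow = 3, byrow = TRUE)   # vertical-edge filter

feature_map <- conv2d(img, kernel)    # CONV: 6 x 6 -> 4 x 4
pooled      <- max_pool(feature_map)  # POOL: 4 x 4 -> 2 x 2
fc_input    <- as.vector(pooled)      # flattened input for an FC layer
```

Because this image increases by exactly 1 from one column to the next, the filter gives the same response at every position, so every entry of the feature map is -6.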

train_x <- data_mnist_train %>%
  select(-label) %>% 
  as.matrix()/255

train_y <- data_mnist_train$label # target variable

test_x <- data_mnist_test %>% # predictor
  select(-label) %>% 
  as.matrix()/255

test_y <- data_mnist_test$label # target variable

Reshape into 4 dimensions, as required by the CNN

train_x <- array_reshape(train_x, dim=c(dim(train_x)[1],28,28,1))

test_x <- array_reshape(test_x, dim=c(dim(test_x)[1],28,28,1))

What does c(dim(train_x)[1],28,28,1) mean? Explanation:

  • dim(train_x)[1] = the number of rows (images) in train_x

  • 28,28 = the dimension/size of the image

  • 1 = the number of color channel(s), 1 for grayscale and 3 for RGB
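By default, array_reshape fills the new array in row-major (C-style) order, which differs from base R's column-major `dim<-`. The sketch below imitates that row-major reshape in base R on two made-up 2 x 3 "images", to show where each pixel ends up; the data and variable names are illustrative only.

```r
# Two made-up "images" of 2 x 3 pixels, one image per row (pixels row by row)
flat <- rbind(1:6, 11:16)

n <- nrow(flat); h <- 2; w <- 3

# Row-major reshape to (n, h, w, 1) in base R:
# fill an array with the dimensions reversed, then permute them back
arr <- aperm(array(t(flat), dim = c(1, w, h, n)), c(4, 3, 2, 1))

dim(arr)         # images, height, width, channels
arr[1, , , 1]    # first image as a 2 x 3 matrix
arr[2, , , 1]    # second image
```

Each image keeps its pixels in reading order: the first row of the first image is 1, 2, 3 and the second row is 4, 5, 6.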

train_y <- to_categorical(train_y)
test_y <- to_categorical(test_y)
# set seed for the initial weights
set_random_seed(100)

# build the architecture
model_tuning2 <- keras_model_sequential(name = "model_tuning2") %>% 
  layer_conv_2d(filters = 75,
                kernel_size = c(3,3), # 3 x 3 filters
                padding = "same",
                activation = "relu",
                input_shape = c(28,28,1),
                strides = 1
                ) %>%
  
  layer_batch_normalization() %>%
  
  layer_max_pooling_2d(pool_size = c(2,2), strides = 2, padding = "same") %>%
  
  layer_conv_2d(filters = 50,
                kernel_size = c(3,3), # 3 x 3 filters
                padding = "same",
                activation = "relu",
                strides = 1
                ) %>%
  
  layer_dropout(rate = 0.2) %>%
  
  layer_batch_normalization() %>%
  
  layer_max_pooling_2d(pool_size = c(2,2), strides = 2, padding = "same") %>% 
  
  layer_conv_2d(filters = 25,
                kernel_size = c(3,3), # 3 x 3 filters
                padding = "same",
                activation = "relu",
                strides = 1
                ) %>% 
  
  layer_batch_normalization() %>%
  
  layer_max_pooling_2d(pool_size = c(2,2), strides = 2, padding = "same") %>%
  
  layer_flatten() %>% 
  
  layer_dense(units = 512,
              activation = "relu") %>%
  
  layer_dropout(rate = 0.3) %>%
  
  layer_dense(units = 25,
              activation = "softmax")

model_tuning2
#> Model: "model_tuning2"
#> _____________________________________________________________________
#> Layer (type)                   Output Shape               Param #    
#> =====================================================================
#> conv2d_2 (Conv2D)              (None, 28, 28, 75)         750        
#> _____________________________________________________________________
#> batch_normalization_2 (BatchNo (None, 28, 28, 75)         300        
#> _____________________________________________________________________
#> max_pooling2d_2 (MaxPooling2D) (None, 14, 14, 75)         0          
#> _____________________________________________________________________
#> conv2d_1 (Conv2D)              (None, 14, 14, 50)         33800      
#> _____________________________________________________________________
#> dropout_1 (Dropout)            (None, 14, 14, 50)         0          
#> _____________________________________________________________________
#> batch_normalization_1 (BatchNo (None, 14, 14, 50)         200        
#> _____________________________________________________________________
#> max_pooling2d_1 (MaxPooling2D) (None, 7, 7, 50)           0          
#> _____________________________________________________________________
#> conv2d (Conv2D)                (None, 7, 7, 25)           11275      
#> _____________________________________________________________________
#> batch_normalization (BatchNorm (None, 7, 7, 25)           100        
#> _____________________________________________________________________
#> max_pooling2d (MaxPooling2D)   (None, 4, 4, 25)           0          
#> _____________________________________________________________________
#> flatten (Flatten)              (None, 400)                0          
#> _____________________________________________________________________
#> dense_1 (Dense)                (None, 512)                205312     
#> _____________________________________________________________________
#> dropout (Dropout)              (None, 512)                0          
#> _____________________________________________________________________
#> dense (Dense)                  (None, 25)                 12825      
#> =====================================================================
#> Total params: 264,562
#> Trainable params: 264,262
#> Non-trainable params: 300
#> _____________________________________________________________________
# Compile model
model_tuning2 %>% 
  compile(
    loss = "categorical_crossentropy",
    optimizer = optimizer_adamax(learning_rate = 0.001),
    metrics = "accuracy"
  )

# Fit model
history_tuning2 <- model_tuning2 %>% fit(x = train_x,
                                         y = train_y,
                                         epochs = 7,
                                         validation_data = list(test_x , test_y),
                                         verbose = 1)

# Plot history
plot(history_tuning2)
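The parameter counts in the model summary printed earlier for model_tuning2 can be checked by hand. The helper functions below are written for this check and are not part of keras; they use the usual formulas for convolution, dense, and batch normalization layers, with layer sizes read from the summary.

```r
# (kernel height * kernel width * input channels + 1 bias) per filter
conv_params  <- function(k, in_ch, filters) (k * k * in_ch + 1) * filters
# (inputs + 1 bias) per unit
dense_params <- function(inputs, units) (inputs + 1) * units
# gamma, beta, moving mean, moving variance per channel
bn_params    <- function(channels) 4 * channels

params <- c(
  conv_1  = conv_params(3, 1, 75),
  bn_1    = bn_params(75),
  conv_2  = conv_params(3, 75, 50),
  bn_2    = bn_params(50),
  conv_3  = conv_params(3, 50, 25),
  bn_3    = bn_params(25),
  dense_1 = dense_params(4 * 4 * 25, 512),  # flattened 4 x 4 x 25 = 400 inputs
  output  = dense_params(512, 25)
)

params
sum(params)                               # total parameters
non_trainable <- 2 * (75 + 50 + 25)       # moving mean and variance only
non_trainable
```

The totals agree with the summary: 264,562 parameters, of which 300 (the batch normalization moving statistics) are non-trainable.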

Model Evaluation

test_x1 <- mnist_test %>% # predictor
  select(-label) %>% 
  as.matrix()/255

test_x1 <- array_reshape(test_x1, dim=c(dim(test_x1)[1],28,28,1))

pred2 <- predict(model_tuning2,test_x1) %>% 
  k_argmax() %>% 
  as.array() %>% 
  as.factor()
confusionMatrix(data = pred2, reference = as.factor(mnist_test$label))
#> Confusion Matrix and Statistics
#> 
#>           Reference
#> Prediction   0   1   2   3   4   5   6   7   8  10  11  12  13  14  15  16  17
#>         0  331   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         1    0 432   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         2    0   0 310   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         3    0   0   0 245   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         4    0   0   0   0 490   0   0   0   0   0   0   0   0   0   0   0   0
#>         5    0   0   0   0   0 247   0   0   0   0   0   0   0   0   0   0   0
#>         6    0   0   0   0   0   0 313  20   0   0   0   0   0   0   0   0   0
#>         7    0   0   0   0   0   0  13 416   0   0   0   0   0   0   0   0   0
#>         8    0   0   0   0   0   0   0   0 260   0   0   0   0   0   0   0   0
#>         10   0   0   0   0   0   0   0   0   0 315   0   0   0   0   0   0   0
#>         11   0   0   0   0   0   0   0   0   0   0 209   0   0   0   0   0   0
#>         12   0   0   0   0   0   0   0   0   0   0   0 371   0   0   0   0   0
#>         13   0   0   0   0   0   0   0   0   0   0   0  22 290   0   0   0   9
#>         14   0   0   0   0   0   0   0   0   0   0   0   0   0 246   0   0   0
#>         15   0   0   0   0   0   0   4   0   0   0   0   0   0   0 344   0   0
#>         16   0   0   0   0   0   0   0   0   0   0   0   0   0   0   3 164   0
#>         17   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 111
#>         18   0   0   0   0   8   0   0   0   0  12   0   1   0   0   0   0   3
#>         19   0   0   0   0   0   0  18   0   0   0   0   0   0   0   0   0   0
#>         20   0   0   0   0   0   0   0   0   0   0   0   0   1   0   0   0   9
#>         21   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  12
#>         22   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         23   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>         24   0   0   0   0   0   0   0   0  28   4   0   0   0   0   0   0   0
#>           Reference
#> Prediction  18  19  20  21  22  23  24
#>         0    0   0   0   0   0   0   0
#>         1    0   0   0   0   0   0   0
#>         2    0   0   0   0   0   0   0
#>         3    0   0   0   0   0   0   0
#>         4   24   0   0   0   0   0   0
#>         5    0   0   0   0   0   0   0
#>         6    0   4   0   0   0   0   0
#>         7    0   0   0   0   0   0   0
#>         8    0   0   0   0   0   0   0
#>         10   0   0   0   0   0   0   0
#>         11   0  19   0   0   0   0   0
#>         12   0   0   0   0   0   0   0
#>         13   0   0   0   0   0   0   0
#>         14   0   0   0   0   0   0   0
#>         15   0   0   0   0   0   0   0
#>         16   0   0   0   0   0   0   0
#>         17   0   0   0   0   0   0   0
#>         18 222   0   0   0   0   0   0
#>         19   0 204   0   0   0   0   0
#>         20   0   0 266   0   0   0   0
#>         21   0   0   0 326   0   0   0
#>         22   0   0   0  20 206   0  39
#>         23   0  21   0   0   0 267   0
#>         24   0   0   0   0   0   0 293
#> 
#> Overall Statistics
#>                                                
#>                Accuracy : 0.959                
#>                  95% CI : (0.9542, 0.9635)     
#>     No Information Rate : 0.0694               
#>     P-Value [Acc > NIR] : < 0.00000000000000022
#>                                                
#>                   Kappa : 0.9571               
#>                                                
#>  Mcnemar's Test P-Value : NA                   
#> 
#> Statistics by Class:
#> 
#>                      Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
#> Sensitivity           1.00000  1.00000  1.00000  1.00000  0.98394  1.00000
#> Specificity           1.00000  1.00000  1.00000  1.00000  0.99640  1.00000
#> Pos Pred Value        1.00000  1.00000  1.00000  1.00000  0.95331  1.00000
#> Neg Pred Value        1.00000  1.00000  1.00000  1.00000  0.99880  1.00000
#> Prevalence            0.04615  0.06023  0.04322  0.03416  0.06944  0.03444
#> Detection Rate        0.04615  0.06023  0.04322  0.03416  0.06832  0.03444
#> Detection Prevalence  0.04615  0.06023  0.04322  0.03416  0.07167  0.03444
#> Balanced Accuracy     1.00000  1.00000  1.00000  1.00000  0.99017  1.00000
#>                      Class: 6 Class: 7 Class: 8 Class: 10 Class: 11 Class: 12
#> Sensitivity           0.89943  0.95413  0.90278   0.95166   1.00000   0.94162
#> Specificity           0.99648  0.99807  1.00000   1.00000   0.99727   1.00000
#> Pos Pred Value        0.92878  0.96970  1.00000   1.00000   0.91667   1.00000
#> Neg Pred Value        0.99488  0.99703  0.99595   0.99767   1.00000   0.99662
#> Prevalence            0.04852  0.06079  0.04016   0.04615   0.02914   0.05494
#> Detection Rate        0.04364  0.05800  0.03625   0.04392   0.02914   0.05173
#> Detection Prevalence  0.04699  0.05982  0.03625   0.04392   0.03179   0.05173
#> Balanced Accuracy     0.94795  0.97610  0.95139   0.97583   0.99864   0.97081
#>                      Class: 13 Class: 14 Class: 15 Class: 16 Class: 17
#> Sensitivity            0.99656    1.0000   0.99135   1.00000   0.77083
#> Specificity            0.99549    1.0000   0.99941   0.99957   1.00000
#> Pos Pred Value         0.90343    1.0000   0.98851   0.98204   1.00000
#> Neg Pred Value         0.99985    1.0000   0.99956   1.00000   0.99533
#> Prevalence             0.04057    0.0343   0.04838   0.02287   0.02008
#> Detection Rate         0.04044    0.0343   0.04796   0.02287   0.01548
#> Detection Prevalence   0.04476    0.0343   0.04852   0.02328   0.01548
#> Balanced Accuracy      0.99603    1.0000   0.99538   0.99979   0.88542
#>                      Class: 18 Class: 19 Class: 20 Class: 21 Class: 22
#> Sensitivity            0.90244   0.82258   1.00000   0.94220   1.00000
#> Specificity            0.99653   0.99740   0.99855   0.99824   0.99153
#> Pos Pred Value         0.90244   0.91892   0.96377   0.96450   0.77736
#> Neg Pred Value         0.99653   0.99367   1.00000   0.99707   1.00000
#> Prevalence             0.03430   0.03458   0.03709   0.04824   0.02872
#> Detection Rate         0.03095   0.02844   0.03709   0.04545   0.02872
#> Detection Prevalence   0.03430   0.03095   0.03848   0.04713   0.03695
#> Balanced Accuracy      0.94949   0.90999   0.99928   0.97022   0.99577
#>                      Class: 23 Class: 24
#> Sensitivity            1.00000   0.88253
#> Specificity            0.99696   0.99532
#> Pos Pred Value         0.92708   0.90154
#> Neg Pred Value         1.00000   0.99430
#> Prevalence             0.03723   0.04629
#> Detection Rate         0.03723   0.04085
#> Detection Prevalence   0.04016   0.04532
#> Balanced Accuracy      0.99848   0.93893

Insight :

  • The CNN overfits far less than the DNN: accuracy on the test set improves from 80.12% with the DNN to 95.9% with the CNN.