Congratulations! You have reached the end of the Neural Network course and of our Machine Learning Specialization. This quiz is the final part of the course.
To complete this assignment, build a model that classifies fashion images into categories, using a neural network implemented in Keras, by following these steps:
Let us start by preparing the data. In this quiz, you will use the fashionmnist dataset. The data is stored as CSV files in the fashionmnist folder of this repository. Please load the fashionmnist data from the data_input folder. The fashionmnist folder contains the train and test sets for 10 categories of 28 x 28 pixel fashion images. Use the following glossary for your target labels:
categories <- c("T-shirt", "Trouser", "Pullover", "Dress",
"Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Boot")
library(readr)
library(keras)
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
In this phase, please load and inspect the fashionmnist data and store it in the fashion_train and fashion_test objects. Please use the read_csv() function from the readr package to speed up reading the data.
fashion_train <- read_csv("data_input/train.csv")
## Parsed with column specification:
## cols(
## .default = col_double()
## )
## See spec(...) for full column specifications.
fashion_test <- read_csv("data_input/test.csv")
## Parsed with column specification:
## cols(
## .default = col_double()
## )
## See spec(...) for full column specifications.
head(fashion_test)
## # A tibble: 6 x 785
## label pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 0 0 0 0 0 0 0 9 8
## 2 1 0 0 0 0 0 0 0 0 0
## 3 2 0 0 0 0 0 0 14 53 99
## 4 2 0 0 0 0 0 0 0 0 0
## 5 3 0 0 0 0 0 0 0 0 0
## 6 2 0 0 0 0 0 44 105 44 10
## # ... with 775 more variables: pixel10 <dbl>, pixel11 <dbl>,
## # pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>,
## # pixel16 <dbl>, pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>,
## # pixel20 <dbl>, pixel21 <dbl>, pixel22 <dbl>, pixel23 <dbl>,
## # pixel24 <dbl>, pixel25 <dbl>, pixel26 <dbl>, pixel27 <dbl>,
## # pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>, pixel31 <dbl>,
## # pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>,
## # pixel36 <dbl>, pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>,
## # pixel40 <dbl>, pixel41 <dbl>, pixel42 <dbl>, pixel43 <dbl>,
## # pixel44 <dbl>, pixel45 <dbl>, pixel46 <dbl>, pixel47 <dbl>,
## # pixel48 <dbl>, pixel49 <dbl>, pixel50 <dbl>, pixel51 <dbl>,
## # pixel52 <dbl>, pixel53 <dbl>, pixel54 <dbl>, pixel55 <dbl>,
## # pixel56 <dbl>, pixel57 <dbl>, pixel58 <dbl>, pixel59 <dbl>,
## # pixel60 <dbl>, pixel61 <dbl>, pixel62 <dbl>, pixel63 <dbl>,
## # pixel64 <dbl>, pixel65 <dbl>, pixel66 <dbl>, pixel67 <dbl>,
## # pixel68 <dbl>, pixel69 <dbl>, pixel70 <dbl>, pixel71 <dbl>,
## # pixel72 <dbl>, pixel73 <dbl>, pixel74 <dbl>, pixel75 <dbl>,
## # pixel76 <dbl>, pixel77 <dbl>, pixel78 <dbl>, pixel79 <dbl>,
## # pixel80 <dbl>, pixel81 <dbl>, pixel82 <dbl>, pixel83 <dbl>,
## # pixel84 <dbl>, pixel85 <dbl>, pixel86 <dbl>, pixel87 <dbl>,
## # pixel88 <dbl>, pixel89 <dbl>, pixel90 <dbl>, pixel91 <dbl>,
## # pixel92 <dbl>, pixel93 <dbl>, pixel94 <dbl>, pixel95 <dbl>,
## # pixel96 <dbl>, pixel97 <dbl>, pixel98 <dbl>, pixel99 <dbl>,
## # pixel100 <dbl>, pixel101 <dbl>, pixel102 <dbl>, pixel103 <dbl>,
## # pixel104 <dbl>, pixel105 <dbl>, pixel106 <dbl>, pixel107 <dbl>,
## # pixel108 <dbl>, pixel109 <dbl>, ...
Peek at the fashion_train data using the head() function.
head(fashion_train)
## # A tibble: 6 x 785
## label pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 0 0 0 0 0 0 0 0 0
## 2 9 0 0 0 0 0 0 0 0 0
## 3 6 0 0 0 0 0 0 0 5 0
## 4 0 0 0 0 1 2 0 0 0 0
## 5 3 0 0 0 0 0 0 0 0 0
## 6 4 0 0 0 5 4 5 5 3 5
## # ... with 775 more variables: pixel10 <dbl>, pixel11 <dbl>,
## # pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>,
## # pixel16 <dbl>, pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>,
## # pixel20 <dbl>, pixel21 <dbl>, pixel22 <dbl>, pixel23 <dbl>,
## # pixel24 <dbl>, pixel25 <dbl>, pixel26 <dbl>, pixel27 <dbl>,
## # pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>, pixel31 <dbl>,
## # pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>,
## # pixel36 <dbl>, pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>,
## # pixel40 <dbl>, pixel41 <dbl>, pixel42 <dbl>, pixel43 <dbl>,
## # pixel44 <dbl>, pixel45 <dbl>, pixel46 <dbl>, pixel47 <dbl>,
## # pixel48 <dbl>, pixel49 <dbl>, pixel50 <dbl>, pixel51 <dbl>,
## # pixel52 <dbl>, pixel53 <dbl>, pixel54 <dbl>, pixel55 <dbl>,
## # pixel56 <dbl>, pixel57 <dbl>, pixel58 <dbl>, pixel59 <dbl>,
## # pixel60 <dbl>, pixel61 <dbl>, pixel62 <dbl>, pixel63 <dbl>,
## # pixel64 <dbl>, pixel65 <dbl>, pixel66 <dbl>, pixel67 <dbl>,
## # pixel68 <dbl>, pixel69 <dbl>, pixel70 <dbl>, pixel71 <dbl>,
## # pixel72 <dbl>, pixel73 <dbl>, pixel74 <dbl>, pixel75 <dbl>,
## # pixel76 <dbl>, pixel77 <dbl>, pixel78 <dbl>, pixel79 <dbl>,
## # pixel80 <dbl>, pixel81 <dbl>, pixel82 <dbl>, pixel83 <dbl>,
## # pixel84 <dbl>, pixel85 <dbl>, pixel86 <dbl>, pixel87 <dbl>,
## # pixel88 <dbl>, pixel89 <dbl>, pixel90 <dbl>, pixel91 <dbl>,
## # pixel92 <dbl>, pixel93 <dbl>, pixel94 <dbl>, pixel95 <dbl>,
## # pixel96 <dbl>, pixel97 <dbl>, pixel98 <dbl>, pixel99 <dbl>,
## # pixel100 <dbl>, pixel101 <dbl>, pixel102 <dbl>, pixel103 <dbl>,
## # pixel104 <dbl>, pixel105 <dbl>, pixel106 <dbl>, pixel107 <dbl>,
## # pixel108 <dbl>, pixel109 <dbl>, ...
The fashion_train data consists of 60000 observations and 785 variables (1 target and 784 predictors). Each predictor holds the value of one pixel of the 28 x 28 image.
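Since 784 = 28 x 28, each row of predictors can be folded back into an image grid. The sketch below uses a synthetic row of pixel values rather than the real data, and it assumes the pixels are stored row by row, which this quiz does not state explicitly.

```r
# Synthetic stand-in for one row of 784 pixel values (not real data).
set.seed(1)
pixels <- sample(0:255, 784, replace = TRUE)

# Fold the flat vector into a 28 x 28 grid, filling row by row.
# Assumption: pixel1..pixel28 form the first image row.
img <- matrix(pixels, nrow = 28, ncol = 28, byrow = TRUE)

dim(img)
# The first image row equals the first 28 values of the flat vector.
identical(img[1, ], pixels[1:28])
```

If the real files use a different pixel order, only the `byrow` argument would need to change.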
The data we have loaded above holds the pixel values in a data frame. However, we have to convert the data into a matrix before we can model it. Please convert the data to matrix format using the data.matrix() function, storing the fashion_train matrix as train_m and the fashion_test matrix as test_m.
train_m <- data.matrix(fashion_train)
test_m <- data.matrix(fashion_test)
class(train_m)
## [1] "matrix"
class(test_m)
## [1] "matrix"
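To see what data.matrix() does without loading the full dataset, here is a small sketch on a made-up data frame (`toy_df` and its values are illustrative only). Note that on R 4.0 and later, class() on a matrix returns both "matrix" and "array", so is.matrix() is the more portable check.

```r
# Toy data frame shaped like the CSV: a label column plus pixel columns.
toy_df <- data.frame(label = c(2, 9), pixel1 = c(0, 0), pixel2 = c(5, 128))

# data.matrix() converts every column to numeric and returns a matrix.
toy_m <- data.matrix(toy_df)

is.matrix(toy_m)   # portable check across R versions
dim(toy_m)         # 2 rows, 3 columns
colnames(toy_m)    # column names are carried over from the data frame
```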
After that, we separate the predictors from the target in both train_m and test_m.
# Predictor variables in `train_m`
train_x <- train_m[,-1]
str(train_x)
## num [1:60000, 1:784] 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:784] "pixel1" "pixel2" "pixel3" "pixel4" ...
# Predictor variables in `test_m`
test_x <- test_m[,-1]
str(test_x)
## num [1:10000, 1:784] 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:784] "pixel1" "pixel2" "pixel3" "pixel4" ...
# Target variables in `train_m`
train_y <- train_m[,1]
str(train_y)
## num [1:60000] 2 9 6 0 3 4 4 5 4 8 ...
# Target variables in `test_m`
test_y <- test_m[,1]
str(test_y)
## num [1:10000] 0 1 2 2 3 2 8 6 5 0 ...
Next, convert the matrices that hold the predictor variables into arrays. Please use array_reshape(data, dim(data)) to do that.
train_x_array <- array_reshape(train_x, c(nrow(train_x), 784))
test_x_array <- array_reshape(test_x, c(nrow(test_x), 784))
Then scale train_x_array and test_x_array by dividing them by 255.
train_x.keras <- train_x_array/255
test_x.keras <- test_x_array/255
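Pixel values range from 0 to 255, so dividing by 255 maps them onto the interval [0, 1]. A minimal sketch on a made-up 2 x 2 matrix (`raw` is illustrative, not part of the quiz data):

```r
# Made-up pixel values covering the full 0-255 range.
raw <- matrix(c(0, 51, 102, 255), nrow = 2)

# Same scaling as above: divide every entry by the maximum pixel value.
scaled <- raw / 255

range(scaled)   # smallest and largest scaled values
scaled[2, 1]    # 51 / 255
```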
To prepare the target for training, we one-hot encode the label vector (train_y) into a binary class matrix using the to_categorical() function from Keras and store it as the train_y.keras object.
train_y.keras <- to_categorical(fashion_train$label,10)
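To make the encoding concrete, the following base-R sketch builds the same kind of binary class matrix by hand. `one_hot` is a hypothetical helper written for illustration; it is not the Keras implementation.

```r
# Hypothetical helper: base-R illustration only, not the Keras function.
one_hot <- function(labels, num_classes) {
  out <- matrix(0, nrow = length(labels), ncol = num_classes)
  # Labels start at 0, so class k goes into column k + 1.
  out[cbind(seq_along(labels), labels + 1)] <- 1
  out
}

labels <- c(2, 9, 0)
encoded <- one_hot(labels, 10)

dim(encoded)       # one row per label, one column per class
rowSums(encoded)   # every row contains exactly one 1
```

Each row has a single 1 in the column of its class, which is the format the softmax output layer is compared against.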
keras_model_sequential()
To organize the layers, we first create a base model, which is a Sequential model. Call the keras_model_sequential() function and pipe the base model into the layers that define the architecture.
We will build several models with different settings. Before defining the architecture, we set a seed and a seeded initializer so that the initial weights, and hence the results, are as reproducible as possible.
set.seed(100)
initializer <- initializer_random_normal(seed = 100)
First, create a model (store it as model_init) with these parameters:
- the first layer contains 32 nodes, relu activation function, 784 input shape
- the second layer contains 32 nodes, relu activation function
- the third layer contains 10 nodes, softmax activation function
model_init <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = 'relu', input_shape = 784,
              kernel_initializer = initializer, bias_initializer = initializer) %>%
  layer_dense(units = 32, activation = 'relu',
              kernel_initializer = initializer, bias_initializer = initializer) %>%
  layer_dense(units = 10, activation = 'softmax',
              kernel_initializer = initializer, bias_initializer = initializer)
summary(model_init)
## Model: "sequential"
## ___________________________________________________________________________
## Layer (type) Output Shape Param #
## ===========================================================================
## dense (Dense) (None, 32) 25120
## ___________________________________________________________________________
## dense_1 (Dense) (None, 32) 1056
## ___________________________________________________________________________
## dense_2 (Dense) (None, 10) 330
## ===========================================================================
## Total params: 26,506
## Trainable params: 26,506
## Non-trainable params: 0
## ___________________________________________________________________________
Second, create a model (store it as model_bigger) with these parameters:
- the first layer contains 512 nodes, relu activation function, 784 input shape
- the second layer contains 512 nodes, relu activation function
- the third layer contains 10 nodes, softmax activation function
model_bigger <- keras_model_sequential() %>%
  layer_dense(units = 512, activation = 'relu', input_shape = 784,
              kernel_initializer = initializer, bias_initializer = initializer) %>%
  layer_dense(units = 512, activation = 'relu',
              kernel_initializer = initializer, bias_initializer = initializer) %>%
  layer_dense(units = 10, activation = 'softmax',
              kernel_initializer = initializer, bias_initializer = initializer)
summary(model_bigger)
## Model: "sequential_1"
## ___________________________________________________________________________
## Layer (type) Output Shape Param #
## ===========================================================================
## dense_3 (Dense) (None, 512) 401920
## ___________________________________________________________________________
## dense_4 (Dense) (None, 512) 262656
## ___________________________________________________________________________
## dense_5 (Dense) (None, 10) 5130
## ===========================================================================
## Total params: 669,706
## Trainable params: 669,706
## Non-trainable params: 0
## ___________________________________________________________________________
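The Param # column in both summaries can be checked by hand: a dense layer has one weight per input-output pair plus one bias per unit. The sketch below reproduces the counts in base R; `dense_params` and `count_params` are hypothetical helper names, not Keras functions.

```r
# Parameters of a dense layer: n_in * n_out weights plus n_out biases.
dense_params <- function(n_in, n_out) n_in * n_out + n_out

# sizes = c(input, hidden1, hidden2, output); returns one count per layer.
count_params <- function(sizes) {
  mapply(dense_params, head(sizes, -1), tail(sizes, -1))
}

init_params   <- count_params(c(784, 32, 32, 10))
bigger_params <- count_params(c(784, 512, 512, 10))

init_params     # per-layer counts for model_init
sum(init_params)
bigger_params   # per-layer counts for model_bigger
sum(bigger_params)
```

The totals agree with the summaries above: 26,506 parameters for model_init and 669,706 for model_bigger, so the bigger model has roughly 25 times as many weights to fit.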
In this step, we still need to configure a few settings before model_init and model_bigger are ready for training. We compile each model by defining the loss, optimizer type, and evaluation metrics. Please compile the models with these parameters:
- categorical crossentropy as the loss function
- adam as the optimizer with learning rate 0.001
- accuracy as the metric
model_init %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_adam(lr = 0.001),
          metrics = c('accuracy'))
model_bigger %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_adam(lr = 0.001),
          metrics = c('accuracy'))
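As a rough illustration of the loss chosen above, categorical crossentropy averages the negative log of the probability that the model assigns to the true class. The helper below is a simplified, hypothetical base-R version (Keras computes the loss internally and also guards against log(0), which this sketch does not).

```r
# Hypothetical helper for illustration; simplified, no clipping of y_pred.
categorical_crossentropy <- function(y_true, y_pred) {
  # Mean over observations of -sum(y_true * log(y_pred)).
  -mean(rowSums(y_true * log(y_pred)))
}

# Two made-up observations with three classes each.
y_true <- rbind(c(0, 1, 0), c(1, 0, 0))              # one-hot targets
y_pred <- rbind(c(0.1, 0.8, 0.1), c(0.5, 0.3, 0.2))  # softmax-like outputs

loss <- categorical_crossentropy(y_true, y_pred)
loss
# Only the probability of the true class contributes to each row.
all.equal(loss, -(log(0.8) + log(0.5)) / 2)
```

A confident correct prediction (0.8) contributes less to the loss than an uncertain one (0.5).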
In this step, we fit both model_init and model_bigger using epochs = 10 and batch_size = 100. Please save the training history in the history_init and history_bigger objects.
history_init <- model_init %>%
  fit(train_x.keras,
      train_y.keras,
      epochs = 10,
      batch_size = 100)
history_bigger <- model_bigger %>%
  fit(train_x.keras,
      train_y.keras,
      epochs = 10,
      batch_size = 100)
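These two settings determine how many weight updates each model receives. A short arithmetic sketch, using the 60000 training observations reported earlier (the variable names are illustrative):

```r
n_train    <- 60000   # rows in the training set
batch_size <- 100
epochs     <- 10

# One weight update per batch; ceiling() covers a partial last batch.
steps_per_epoch <- ceiling(n_train / batch_size)
total_updates   <- steps_per_epoch * epochs

steps_per_epoch
total_updates
```

So each model sees the full training set 10 times, in 600 batches per pass.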
After training the models, we predict the test data (test_x.keras) with each of them. Please predict using the predict_classes() function from the Keras package and store the results as pred_init and pred_bigger.
pred_init <- keras::predict_classes(object = model_init, x= test_x.keras)
head(pred_init)
## [1] 0 1 2 2 4 6
pred_bigger <- keras::predict_classes(object = model_bigger, x= test_x.keras)
head(pred_bigger)
## [1] 0 1 2 2 4 6
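Conceptually, a predicted class is the position of the largest softmax probability in each row, shifted to start at 0. The base-R sketch below shows that idea on a made-up probability matrix; it illustrates the principle and is not a description of how Keras implements it.

```r
# Made-up softmax outputs: 3 observations, 3 classes.
probs <- rbind(c(0.7, 0.2, 0.1),
               c(0.1, 0.1, 0.8),
               c(0.2, 0.5, 0.3))

# Column index of the row maximum, minus 1 so that labels start at 0.
pred <- max.col(probs, ties.method = "first") - 1
pred
```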
As the labels are still numeric (dbl), please decode them into their category names.
decode <- function(data){
  sapply(as.character(data), switch,
         "0" = "T-Shirt",
         "1" = "Trouser",
         "2" = "Pullover",
         "3" = "Dress",
         "4" = "Coat",
         "5" = "Sandal",
         "6" = "Shirt",
         "7" = "Sneaker",
         "8" = "Bag",
         "9" = "Boot")
}
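An alternative way to express the same mapping is a vectorized lookup: because the labels run from 0 to 9, label k is simply element k + 1 of the category vector. This is only a sketch of an equivalent approach (`decode_lookup` is a hypothetical name); unlike decode() above, it returns an unnamed character vector.

```r
# Category names in label order (0 = "T-Shirt", ..., 9 = "Boot").
categories <- c("T-Shirt", "Trouser", "Pullover", "Dress", "Coat",
                "Sandal", "Shirt", "Sneaker", "Bag", "Boot")

# Hypothetical helper: index the name vector with label + 1.
decode_lookup <- function(labels) categories[labels + 1]

decode_lookup(c(0, 1, 2, 2, 4, 6))
```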
Then, decode pred_init and pred_bigger before evaluating model performance with a confusion matrix.
reference <- decode(test_y)
pred_decode_in <- decode(pred_init)
head(pred_decode_in)
## 0 1 2 2 4 6
## "T-Shirt" "Trouser" "Pullover" "Pullover" "Coat" "Shirt"
pred_decode_big <- decode(pred_bigger)
head(pred_decode_big)
## 0 1 2 2 4 6
## "T-Shirt" "Trouser" "Pullover" "Pullover" "Coat" "Shirt"
After decoding the target variable, you can evaluate the model using several metrics. In this quiz, please check the accuracy in the confusion matrices below.
Note: do not forget the explicit coercion with as.factor.
library(caret)
confusionMatrix(as.factor(pred_decode_in), as.factor(reference))
## Confusion Matrix and Statistics
##
## Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
## Bag 965 2 3 1 5 5 10 3 17
## Boot 1 914 0 0 0 9 0 23 0
## Coat 5 0 814 19 90 0 73 0 0
## Dress 3 0 38 909 18 1 38 0 38
## Pullover 7 0 85 8 806 0 99 0 13
## Sandal 2 21 0 1 0 944 0 34 2
## Shirt 7 0 60 16 65 0 605 0 88
## Sneaker 5 63 0 0 0 39 0 940 0
## T-Shirt 5 0 0 29 16 2 174 0 841
## Trouser 0 0 0 17 0 0 1 0 1
## Reference
## Prediction Trouser
## Bag 0
## Boot 0
## Coat 1
## Dress 20
## Pullover 1
## Sandal 1
## Shirt 2
## Sneaker 0
## T-Shirt 2
## Trouser 973
##
## Overall Statistics
##
## Accuracy : 0.8711
## 95% CI : (0.8644, 0.8776)
## No Information Rate : 0.1
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.8568
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity 0.9650 0.9140 0.8140 0.9090
## Specificity 0.9949 0.9963 0.9791 0.9827
## Pos Pred Value 0.9545 0.9652 0.8124 0.8535
## Neg Pred Value 0.9961 0.9905 0.9793 0.9898
## Prevalence 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0965 0.0914 0.0814 0.0909
## Detection Prevalence 0.1011 0.0947 0.1002 0.1065
## Balanced Accuracy 0.9799 0.9552 0.8966 0.9458
## Class: Pullover Class: Sandal Class: Shirt
## Sensitivity 0.8060 0.9440 0.6050
## Specificity 0.9763 0.9932 0.9736
## Pos Pred Value 0.7910 0.9393 0.7177
## Neg Pred Value 0.9784 0.9938 0.9569
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0806 0.0944 0.0605
## Detection Prevalence 0.1019 0.1005 0.0843
## Balanced Accuracy 0.8912 0.9686 0.7893
## Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity 0.9400 0.8410 0.9730
## Specificity 0.9881 0.9747 0.9979
## Pos Pred Value 0.8978 0.7867 0.9808
## Neg Pred Value 0.9933 0.9822 0.9970
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0940 0.0841 0.0973
## Detection Prevalence 0.1047 0.1069 0.0992
## Balanced Accuracy 0.9641 0.9078 0.9854
confusionMatrix(as.factor(pred_decode_big), as.factor(reference))
## Confusion Matrix and Statistics
##
## Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
## Bag 976 0 0 2 1 1 6 0 3
## Boot 1 961 0 0 0 12 0 52 0
## Coat 1 0 877 51 101 0 67 0 1
## Dress 3 0 13 881 11 0 18 0 9
## Pullover 4 0 48 5 781 0 45 0 5
## Sandal 2 17 0 1 0 950 0 11 1
## Shirt 4 0 60 22 79 1 710 0 89
## Sneaker 1 22 0 0 0 35 0 937 0
## T-Shirt 8 0 2 29 27 1 153 0 891
## Trouser 0 0 0 9 0 0 1 0 1
## Reference
## Prediction Trouser
## Bag 0
## Boot 0
## Coat 0
## Dress 10
## Pullover 0
## Sandal 0
## Shirt 1
## Sneaker 0
## T-Shirt 3
## Trouser 986
##
## Overall Statistics
##
## Accuracy : 0.895
## 95% CI : (0.8888, 0.9009)
## No Information Rate : 0.1
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.8833
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity 0.9760 0.9610 0.8770 0.8810
## Specificity 0.9986 0.9928 0.9754 0.9929
## Pos Pred Value 0.9869 0.9366 0.7987 0.9323
## Neg Pred Value 0.9973 0.9957 0.9862 0.9869
## Prevalence 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0976 0.0961 0.0877 0.0881
## Detection Prevalence 0.0989 0.1026 0.1098 0.0945
## Balanced Accuracy 0.9873 0.9769 0.9262 0.9369
## Class: Pullover Class: Sandal Class: Shirt
## Sensitivity 0.7810 0.9500 0.7100
## Specificity 0.9881 0.9964 0.9716
## Pos Pred Value 0.8795 0.9674 0.7350
## Neg Pred Value 0.9760 0.9945 0.9679
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0781 0.0950 0.0710
## Detection Prevalence 0.0888 0.0982 0.0966
## Balanced Accuracy 0.8846 0.9732 0.8408
## Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity 0.9370 0.8910 0.9860
## Specificity 0.9936 0.9752 0.9988
## Pos Pred Value 0.9417 0.7998 0.9890
## Neg Pred Value 0.9930 0.9877 0.9984
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0937 0.0891 0.0986
## Detection Prevalence 0.0995 0.1114 0.0997
## Balanced Accuracy 0.9653 0.9331 0.9924
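The reported accuracies can be recovered from the confusion matrices directly: accuracy is the sum of the diagonal (correct predictions) divided by the number of test images. The sketch below copies the diagonal counts from the two matrices above, in the order Bag, Boot, Coat, Dress, Pullover, Sandal, Shirt, Sneaker, T-Shirt, Trouser.

```r
# Diagonal counts copied from the confusion matrices above.
diag_init   <- c(965, 914, 814, 909, 806, 944, 605, 940, 841, 973)
diag_bigger <- c(976, 961, 877, 881, 781, 950, 710, 937, 891, 986)
n_test <- 10000   # rows in the test set

acc_init   <- sum(diag_init) / n_test
acc_bigger <- sum(diag_bigger) / n_test
acc_init
acc_bigger

# Each class has 1000 test images (prevalence 0.1), so the per-class
# sensitivity is the diagonal count divided by 1000.
sens_shirt_init <- diag_init[7] / 1000
sens_shirt_init
```

The values match the output above (0.8711 and 0.895), and the Shirt sensitivity of 0.605 shows where model_init struggles most.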
It turns out that our boss wants the best model and asks you to compare the previous models with one more model (store it as model_tuning). Now, let us build model_tuning by changing these settings when compiling the model:
- use sgd as the optimizer with learning rate 0.001
- the rest is the same as model_init
model_tuning <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = 'relu', input_shape = c(784)) %>%
  layer_dense(units = 32, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')
model_tuning <- model_tuning %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_sgd(lr = 0.001),
          metrics = c('accuracy'))
history_tuning <- model_tuning %>%
  fit(train_x.keras, train_y.keras, epochs = 10, batch_size = 100)
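For intuition about the optimizer swapped in here, plain stochastic gradient descent (without momentum) moves each weight a small step against its gradient: w <- w - lr * gradient. The toy sketch below applies that rule to a single made-up weight minimizing (w - 3)^2; `sgd_step` is a hypothetical helper, and the example is unrelated to the fashion data.

```r
# Hypothetical helper: one plain SGD update (no momentum).
sgd_step <- function(w, grad, lr) w - lr * grad

# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w <- 0
for (i in 1:10) {
  w <- sgd_step(w, grad = 2 * (w - 3), lr = 0.001)
}
w   # still far from the minimum at 3 after 10 small steps
```

With a learning rate of 0.001, each step closes only 0.2% of the remaining distance in this toy problem, which illustrates how small fixed steps can make progress slow.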
After tuning the model, please predict test_x.keras using model_tuning.
pred_tuning <- keras::predict_classes(object = model_tuning, x= test_x.keras)
Then, decode pred_tuning and check the model performance using confusionMatrix.
pred_decode_tun <- decode(pred_tuning)
head(pred_decode_tun)
## 0 1 2 6 1 6
## "T-Shirt" "Trouser" "Pullover" "Shirt" "Trouser" "Shirt"
confusionMatrix(as.factor(pred_decode_tun), as.factor(reference))
## Confusion Matrix and Statistics
##
## Reference
## Prediction Bag Boot Coat Dress Pullover Sandal Shirt Sneaker T-Shirt
## Bag 913 0 6 2 18 9 30 0 19
## Boot 2 922 0 0 0 127 0 109 1
## Coat 8 0 717 27 222 0 285 0 4
## Dress 14 1 60 865 5 3 51 0 99
## Pullover 4 0 166 6 614 3 162 0 11
## Sandal 7 12 0 0 0 596 1 21 2
## Shirt 37 0 46 44 131 1 238 0 90
## Sneaker 12 65 0 0 0 258 0 870 1
## T-Shirt 2 0 0 34 8 3 226 0 761
## Trouser 1 0 5 22 2 0 7 0 12
## Reference
## Prediction Trouser
## Bag 0
## Boot 0
## Coat 5
## Dress 33
## Pullover 21
## Sandal 0
## Shirt 5
## Sneaker 0
## T-Shirt 9
## Trouser 927
##
## Overall Statistics
##
## Accuracy : 0.7423
## 95% CI : (0.7336, 0.7509)
## No Information Rate : 0.1
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.7137
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Bag Class: Boot Class: Coat Class: Dress
## Sensitivity 0.9130 0.9220 0.7170 0.8650
## Specificity 0.9907 0.9734 0.9388 0.9704
## Pos Pred Value 0.9157 0.7941 0.5655 0.7648
## Neg Pred Value 0.9903 0.9912 0.9676 0.9848
## Prevalence 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0913 0.0922 0.0717 0.0865
## Detection Prevalence 0.0997 0.1161 0.1268 0.1131
## Balanced Accuracy 0.9518 0.9477 0.8279 0.9177
## Class: Pullover Class: Sandal Class: Shirt
## Sensitivity 0.6140 0.5960 0.2380
## Specificity 0.9586 0.9952 0.9607
## Pos Pred Value 0.6221 0.9327 0.4020
## Neg Pred Value 0.9572 0.9568 0.9190
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0614 0.0596 0.0238
## Detection Prevalence 0.0987 0.0639 0.0592
## Balanced Accuracy 0.7863 0.7956 0.5993
## Class: Sneaker Class: T-Shirt Class: Trouser
## Sensitivity 0.8700 0.7610 0.9270
## Specificity 0.9627 0.9687 0.9946
## Pos Pred Value 0.7214 0.7296 0.9498
## Neg Pred Value 0.9852 0.9733 0.9919
## Prevalence 0.1000 0.1000 0.1000
## Detection Rate 0.0870 0.0761 0.0927
## Detection Prevalence 0.1206 0.1043 0.0976
## Balanced Accuracy 0.9163 0.8648 0.9608
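To compare the three models side by side, the accuracies reported in the confusion matrix outputs above can be collected in a named vector. This is a small sketch using those reported values; the vector name is illustrative.

```r
# Test-set accuracies copied from the three confusionMatrix outputs.
accuracy <- c(model_init = 0.8711, model_bigger = 0.8950, model_tuning = 0.7423)

# Rank the models from best to worst.
sort(accuracy, decreasing = TRUE)

# Name of the model with the highest accuracy.
best_model <- names(which.max(accuracy))
best_model
```

On this test set, model_bigger has the highest accuracy and model_tuning the lowest.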