Fashion-MNIST Data Classification with a Neural Network
Data Description
Objective
This data set contains images of clothing from Zalando articles. It consists of 60,000 training observations and 10,000 test observations, spread over 10 classes. Both the training and the test data use the following labels:
0 : T-shirt/top
1 : Trouser
2 : Pullover
3 : Dress
4 : Coat
5 : Sandal
6 : Shirt
7 : Sneaker
8 : Bag
9 : Ankle boot
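As a small illustration of how the label codes above map to class names, the sketch below stores them in a named vector. The name class_names and the example codes are assumptions made for this illustration and do not appear in the original analysis.

```r
# Hypothetical lookup vector: names are the label codes, values the class names
class_names <- c("0" = "T-shirt/top", "1" = "Trouser", "2" = "Pullover",
                 "3" = "Dress", "4" = "Coat", "5" = "Sandal",
                 "6" = "Shirt", "7" = "Sneaker", "8" = "Bag",
                 "9" = "Ankle boot")

# Example label codes as they would appear in the "label" column
labels <- c(9, 0, 6)

# Index by character so that code 0 is matched by name, not by position
decoded <- unname(class_names[as.character(labels)])
print(decoded)
```

Indexing by character matters here because R vectors are 1-based, so a numeric index of 0 would silently return nothing.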
Libraries for the Neural Network
library(tidyverse)
library(keras)
library(dplyr)
library(rsample)
library(caret)
# checking tensorflow version
tensorflow::tf_version()
## [1] '2.7'
# set seed
tensorflow::tf$random$set_seed(42)
theme_set(theme_minimal())
options(scipen = 999)
Reading the Data
fm_train <- read.csv("train.csv")
fm_test <- read.csv("test.csv")
dim(fm_train)
## [1] 60000 785
Inspecting the column names of the data
colnames(fm_train)[c(1:5, 780:784)]
## [1] "label" "pixel1" "pixel2" "pixel3" "pixel4" "pixel779"
## [7] "pixel780" "pixel781" "pixel782" "pixel783"
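We can treat the label column as the response variable and the columns pixel1 through pixel784 as the predictor variables. The output above stops at pixel783 only because column 784 was the last index requested, while the data has 785 columns.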
Visualizing the first 25 observations
vizTrain <- function(input) {
  # side length of each square image (28 when there are 784 pixel columns)
  dimmax <- sqrt(ncol(input) - 1)
  # size of the plotting grid needed to fit every row of the input
  dimn <- ceiling(sqrt(nrow(input)))
  par(mfrow = c(dimn, dimn), mar = c(0.1, 0.1, 0.1, 0.1))
  for (i in 1:nrow(input)) {
    m1 <- matrix(input[i, 2:ncol(input)], nrow = dimmax,
                 byrow = TRUE)
    m1 <- apply(m1, 2, as.numeric)
    m1 <- t(apply(m1, 2, rev))
    image(1:dimmax, 1:dimmax, m1, col = grey.colors(255),
          xaxt = "n", yaxt = "n")
    # read the label from the row being plotted (input, not fm_train)
    label <- sapply(as.character(input[i, 1]), switch,
                    "0" = "T-shirt",
                    "1" = "Trouser",
                    "2" = "Pullover",
                    "3" = "Dress",
                    "4" = "Coat",
                    "5" = "Sandal",
                    "6" = "Shirt",
                    "7" = "Sneaker",
                    "8" = "Bag",
                    "9" = "Boot")
    text(2, 20, col = "white", cex = 1.2, label)
  }
}
vizTrain(fm_train[1:25, ])
Data Preparation
Prepare the data for the keras model by converting the data frames to matrices.
data_train <- data.matrix(fm_train)
data_test <- data.matrix(fm_test)
Data from data_train
train_x <- data_train[, -1] / 255 # drop the "label" column (column 1) and divide by 255 to scale pixel intensities to [0, 1]
train_y <- data_train[, 1]
dim(train_x) # check dimensions
## [1] 60000 784
Data from data_test
test_x <- data_test[, -1] / 255
test_y <- data_test[, 1]
dim(test_x)
## [1] 10000 784
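To make the effect of dividing by 255 concrete, here is a minimal base-R sketch on a toy matrix. The matrix raw_pixels is invented for this illustration; only the division itself is taken from the code above.

```r
# Toy 2 x 3 matrix of raw pixel intensities (0 = darkest, 255 = brightest)
raw_pixels <- matrix(c(0, 51, 255,
                       102, 204, 153), nrow = 2, byrow = TRUE)

# Same operation as applied to train_x and test_x
scaled_pixels <- raw_pixels / 255

# Values now lie in [0, 1]; the shape of the matrix is unchanged
print(range(scaled_pixels))
print(dim(scaled_pixels))
```

Scaling changes only the value range of each pixel, not the number of pixels per image.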
Converting the data to arrays
train_x <- array_reshape(x = train_x,
                         dim = dim(train_x))
test_x <- array_reshape(x = test_x,
                        dim = dim(test_x))
We one-hot encode the response variable of each data set into 10 categories (matching the 10 class labels).
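As a rough illustration of what this encoding produces, the base-R sketch below builds a one-hot matrix by hand. The helper one_hot is hypothetical and written only for this example; the analysis itself relies on to_categorical from keras.

```r
# Hypothetical helper: one row per observation, one column per class
one_hot <- function(y, num_classes) {
  out <- matrix(0, nrow = length(y), ncol = num_classes)
  # labels are 0-based, matrix columns are 1-based, hence y + 1
  out[cbind(seq_along(y), y + 1)] <- 1
  out
}

y <- c(0, 3, 9)  # example label codes
encoded <- one_hot(y, num_classes = 10)
print(encoded)
```

Each row contains a single 1 in the column of its class, which is the shape the categorical_crossentropy loss expects.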
train_y <- to_categorical(y = train_y,
                          num_classes = 10)
test_y <- to_categorical(y = test_y,
                         num_classes = 10)
Modelling
Model 1
The first model has 3 dense layers (two hidden layers and one output layer) and uses the sgd optimizer without setting learning_rate.
model1 <- keras_model_sequential() %>%
  layer_dense(units = 128, input_shape = 784, activation = "relu") %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
model1 %>% compile(loss = "categorical_crossentropy",
                   optimizer = optimizer_sgd(),
                   metrics = "accuracy")
summary(model1)
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type) Output Shape Param #
## ================================================================================
## dense_2 (Dense) (None, 128) 100480
## dense_1 (Dense) (None, 64) 8256
## dense (Dense) (None, 10) 650
## ================================================================================
## Total params: 109,386
## Trainable params: 109,386
## Non-trainable params: 0
## ________________________________________________________________________________
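The parameter counts in the summary above can be reproduced by hand: a dense layer has one weight per input-output pair plus one bias per output unit. The sketch below checks this for model 1; the helper dense_params is a name introduced here for illustration.

```r
# Hypothetical helper: weights (n_in * n_out) plus biases (n_out)
dense_params <- function(n_in, n_out) n_in * n_out + n_out

# Input size followed by the units of each dense layer in model 1
layer_sizes <- c(784, 128, 64, 10)

# Pair every layer with the size of the layer feeding into it
params <- dense_params(head(layer_sizes, -1), tail(layer_sizes, -1))
print(params)
print(sum(params))
```

The three values and their total should agree with the Param # column and the Total params line of the summary.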
history1 <- model1 %>%
  fit(x = train_x,
      y = train_y,
      epochs = 30,
      batch_size = 128)
history1
##
## Final epoch (plot to see history):
## loss: 0.3468
## accuracy: 0.8773
plot(history1)
pred1 <- predict(model1, test_x) %>%
  k_argmax() %>%
  as.array() %>%
  as.factor()
confusionMatrix(data = pred1,
                reference = as.factor(fm_test[,1]))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1 2 3 4 5 6 7 8 9
## 0 857 1 12 27 1 0 179 0 3 0
## 1 4 973 2 17 2 0 3 0 0 0
## 2 19 5 804 16 70 1 100 0 9 0
## 3 35 17 11 907 31 0 36 0 4 0
## 4 1 1 116 21 858 0 98 0 5 0
## 5 3 2 1 1 0 940 0 40 8 13
## 6 65 1 45 9 34 0 569 0 4 0
## 7 0 0 0 0 0 35 0 899 4 31
## 8 16 0 9 2 4 5 15 2 961 1
## 9 0 0 0 0 0 19 0 59 2 955
##
## Overall Statistics
##
## Accuracy : 0.8723
## 95% CI : (0.8656, 0.8788)
## No Information Rate : 0.1
## P-Value [Acc > NIR] : < 0.00000000000000022
##
## Kappa : 0.8581
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
## Sensitivity 0.8570 0.9730 0.8040 0.9070 0.8580 0.9400
## Specificity 0.9752 0.9969 0.9756 0.9851 0.9731 0.9924
## Pos Pred Value 0.7935 0.9720 0.7852 0.8713 0.7800 0.9325
## Neg Pred Value 0.9840 0.9970 0.9782 0.9896 0.9840 0.9933
## Prevalence 0.1000 0.1000 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0857 0.0973 0.0804 0.0907 0.0858 0.0940
## Detection Prevalence 0.1080 0.1001 0.1024 0.1041 0.1100 0.1008
## Balanced Accuracy 0.9161 0.9849 0.8898 0.9461 0.9156 0.9662
## Class: 6 Class: 7 Class: 8 Class: 9
## Sensitivity 0.5690 0.8990 0.9610 0.9550
## Specificity 0.9824 0.9922 0.9940 0.9911
## Pos Pred Value 0.7827 0.9278 0.9468 0.9227
## Neg Pred Value 0.9535 0.9888 0.9957 0.9950
## Prevalence 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0569 0.0899 0.0961 0.0955
## Detection Prevalence 0.0727 0.0969 0.1015 0.1035
## Balanced Accuracy 0.7757 0.9456 0.9775 0.9731
Result: model 1 reaches an accuracy of 87.23% on the test data.
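The prediction and scoring steps above can be sketched in base R on toy data: pick the class with the highest probability, cross-tabulate against the true labels, and divide the diagonal by the total. The matrix probs and the vector truth are invented for this illustration, and max.col stands in for k_argmax.

```r
# Toy class probabilities for 4 observations and 3 classes (rows sum to 1)
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.3, 0.3, 0.4,
                  0.6, 0.3, 0.1), ncol = 3, byrow = TRUE)
truth <- c(0, 1, 2, 1)  # true 0-based labels

# Column index of the largest probability, shifted to 0-based labels
pred <- max.col(probs, ties.method = "first") - 1

# Rows are predictions, columns are reference labels
cm <- table(Prediction = factor(pred, levels = 0:2),
            Reference = factor(truth, levels = 0:2))

# Accuracy is the share of observations on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
print(cm)
print(accuracy)
```

The Accuracy line reported by confusionMatrix is the same ratio computed on the 10,000 test images.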
Model 2
For the second model we add a hidden layer (a 256-unit layer in front of the existing two) and switch to the adam optimizer, again leaving learning_rate at its default.
model2 <- keras_model_sequential() %>%
  layer_dense(units = 256, input_shape = 784, activation = "relu") %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
model2 %>% compile(loss = "categorical_crossentropy",
                   optimizer = optimizer_adam(),
                   metrics = "accuracy")
summary(model2)
## Model: "sequential_1"
## ________________________________________________________________________________
## Layer (type) Output Shape Param #
## ================================================================================
## dense_6 (Dense) (None, 256) 200960
## dense_5 (Dense) (None, 128) 32896
## dense_4 (Dense) (None, 64) 8256
## dense_3 (Dense) (None, 10) 650
## ================================================================================
## Total params: 242,762
## Trainable params: 242,762
## Non-trainable params: 0
## ________________________________________________________________________________
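To give a feel for what changing the optimizer means, the sketch below applies a plain SGD update and a textbook Adam update to the toy objective f(x) = x^2. It is not the keras implementation. The hyperparameter values are assumed to mirror the keras defaults, and the function names are introduced here for illustration. The example only shows the update rules and says nothing about which optimizer trains the network better.

```r
# Gradient of the toy objective f(x) = x^2
grad <- function(x) 2 * x

# Plain SGD: step against the gradient (lr = 0.01 assumed as the sgd default)
sgd_run <- function(x, steps, lr = 0.01) {
  for (t in seq_len(steps)) x <- x - lr * grad(x)
  x
}

# Textbook Adam update (values assumed to match the adam defaults)
adam_run <- function(x, steps, lr = 0.001, beta1 = 0.9, beta2 = 0.999,
                     eps = 1e-7) {
  m <- 0  # running mean of gradients
  v <- 0  # running mean of squared gradients
  for (t in seq_len(steps)) {
    g <- grad(x)
    m <- beta1 * m + (1 - beta1) * g
    v <- beta2 * v + (1 - beta2) * g^2
    m_hat <- m / (1 - beta1^t)  # bias correction for the early steps
    v_hat <- v / (1 - beta2^t)
    x <- x - lr * m_hat / (sqrt(v_hat) + eps)
  }
  x
}

x_sgd <- sgd_run(2, steps = 100)
x_adam <- adam_run(2, steps = 100)
print(c(sgd = x_sgd, adam = x_adam))
```

Adam rescales every step by a running estimate of the gradient magnitude, so its step size depends far less on the raw gradient scale than the SGD step does.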
history2 <- model2 %>%
  fit(x = train_x,
      y = train_y,
      epochs = 30,
      batch_size = 128)
history2
##
## Final epoch (plot to see history):
## loss: 0.1211
## accuracy: 0.9531
plot(history2)
pred2 <- predict(model2, test_x) %>%
  k_argmax() %>%
  as.array() %>%
  as.factor()
confusionMatrix(data = pred2,
                reference = as.factor(fm_test[,1]))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1 2 3 4 5 6 7 8 9
## 0 832 1 11 16 1 0 108 0 2 1
## 1 2 989 0 8 0 0 2 0 0 0
## 2 14 1 804 11 49 0 75 0 2 0
## 3 20 8 9 907 22 1 21 0 2 0
## 4 3 0 105 29 869 0 70 0 4 0
## 5 1 1 0 1 0 954 0 14 0 5
## 6 119 0 67 26 57 0 719 0 11 0
## 7 0 0 0 0 0 26 0 940 3 21
## 8 9 0 4 2 2 2 5 1 975 0
## 9 0 0 0 0 0 17 0 45 1 973
##
## Overall Statistics
##
## Accuracy : 0.8962
## 95% CI : (0.8901, 0.9021)
## No Information Rate : 0.1
## P-Value [Acc > NIR] : < 0.00000000000000022
##
## Kappa : 0.8847
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
## Sensitivity 0.8320 0.9890 0.8040 0.9070 0.8690 0.9540
## Specificity 0.9844 0.9987 0.9831 0.9908 0.9766 0.9976
## Pos Pred Value 0.8560 0.9880 0.8410 0.9162 0.8046 0.9775
## Neg Pred Value 0.9814 0.9988 0.9783 0.9897 0.9853 0.9949
## Prevalence 0.1000 0.1000 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0832 0.0989 0.0804 0.0907 0.0869 0.0954
## Detection Prevalence 0.0972 0.1001 0.0956 0.0990 0.1080 0.0976
## Balanced Accuracy 0.9082 0.9938 0.8936 0.9489 0.9228 0.9758
## Class: 6 Class: 7 Class: 8 Class: 9
## Sensitivity 0.7190 0.9400 0.9750 0.9730
## Specificity 0.9689 0.9944 0.9972 0.9930
## Pos Pred Value 0.7197 0.9495 0.9750 0.9392
## Neg Pred Value 0.9688 0.9933 0.9972 0.9970
## Prevalence 0.1000 0.1000 0.1000 0.1000
## Detection Rate 0.0719 0.0940 0.0975 0.0973
## Detection Prevalence 0.0999 0.0990 0.1000 0.1036
## Balanced Accuracy 0.8439 0.9672 0.9861 0.9830
Result: model 2 reaches an accuracy of 89.62% on the test data.
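The two confusion matrices can also be compared class by class. The sketch below copies the diagonal entries (correct predictions per class) from the two outputs above and recomputes the overall accuracy and the per-class change; the variable names are introduced here for illustration.

```r
# Correct predictions per class (confusion-matrix diagonals, classes 0 to 9)
correct_m1 <- c(857, 973, 804, 907, 858, 940, 569, 899, 961, 955)
correct_m2 <- c(832, 989, 804, 907, 869, 954, 719, 940, 975, 973)
n_test <- 10000  # number of rows in the test set

# Overall accuracy is the diagonal total divided by the test set size
accuracy <- c(model1 = sum(correct_m1), model2 = sum(correct_m2)) / n_test
print(accuracy)

# Change in correct predictions per class when moving from model 1 to model 2
change <- setNames(correct_m2 - correct_m1, 0:9)
print(change)
```

On these numbers the largest gain is in class 6 (Shirt), while class 0 (T-shirt/top) loses some correct predictions.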
Conclusion
Based on the analysis above, model 2 performs better than model 1. After 30 epochs of fitting it reaches a training accuracy of 95.31% with a loss of 0.1211, compared with 87.73% and 0.3468 for model 1. Its predictive performance on the test data is also better, at 89.62% against 87.23%. The gap between training and test accuracy is larger for model 2, however, which may indicate some overfitting. Because the extra layer and the optimizer were changed together, we can only say that the two changes combined probably drive the improvement in accuracy.
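The training and test accuracies reported above can be placed side by side to see how much each model loses on unseen data. This is a minimal sketch using the figures from the outputs; the data frame results is a name introduced here, and the training figure is the final-epoch value printed by the fit history.

```r
# Final-epoch training accuracy and test accuracy from the outputs above
results <- data.frame(
  model = c("model1", "model2"),
  train_acc = c(0.8773, 0.9531),
  test_acc = c(0.8723, 0.8962)
)

# Difference between training and test accuracy for each model
results$gap <- results$train_acc - results$test_acc
print(results)
```

A larger gap suggests the model fits the training data more closely than it generalizes, which a validation split during fitting could help monitor.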