library(keras)
# Parameters --------------------------------------------------------------
batch_size <- 32
epochs <- 20
data_augmentation <- FALSE
# Data Preparation --------------------------------------------------------
# See ?dataset_cifar10 for more info
cifar10 <- dataset_cifar10()
Part 2.
Go to Alex Krizhevsky's website and look over the CIFAR-10 dataset. This is an interesting dataset with 10 different labeled classes of images. Run all of the code from the Posit Keras website example cifar10 and explain each step presented. Clearly describe what kind of neural network is being fitted. First change the number of epochs to a value <= 50 (the current value is 200, at the top), so it does not run forever. This program will take a long time to run. Does it appear that it has run long enough?
Here we build a convolutional neural network (CNN). It consists of multiple layers, including convolutional layers, activation layers, max-pooling layers, dropout layers, and dense layers. The architecture starts with a 2D convolutional layer that takes 32x32-pixel RGB images as input. After two convolution/pooling stages, the features are flattened and passed through a 256-unit dense layer; a final dense layer projects onto 10 output units, one logit per class of this multi-class classification task.
Here we choose a batch size of 32 and, following the instructions, reduce the number of epochs from 200 to 20.
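As a quick look at the data (not part of the Posit example), we can confirm the training set holds 50,000 32x32 RGB images with integer labels 0-9, 5,000 per class:
dim(cifar10$train$x)   # 50000 32 32 3
dim(cifar10$test$x)    # 10000 32 32 3
table(cifar10$train$y) # 5000 images in each of the 10 classes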
# Feature scale RGB values in test and train inputs
x_train <- cifar10$train$x/255
x_test <- cifar10$test$x/255
y_train <- cifar10$train$y
y_test <- cifar10$test$y
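A one-line sanity check (our addition) confirms the rescaled pixel values now lie in [0, 1]:
range(x_train)  # 0 1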
# Defining Model ----------------------------------------------------------
# Initialize sequential model
model <- keras_model_sequential()
if (data_augmentation) {
# Build a small pipeline of random flips and rotations, then prepend it
# to the model (renamed so it does not shadow the logical flag above)
augmentation_pipeline <- keras_model_sequential() %>%
layer_random_flip("horizontal") %>%
layer_random_rotation(0.2)
model <- model %>%
augmentation_pipeline()
}
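Since data_augmentation is FALSE above, this branch is skipped. If enabled, the pipeline randomly flips images horizontally and rotates them by up to 20% of a full turn. As a hypothetical standalone check (assuming, as in keras3, that a sequential model can be called directly on a batch, with training = TRUE keeping the random layers active):
aug <- keras_model_sequential() %>%
layer_random_flip("horizontal") %>%
layer_random_rotation(0.2)
out <- aug(x_train[1:8, , , ], training = TRUE)
dim(out)  # 8 32 32 3 -- same shape, randomly transformed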
model <- model %>%
# Start with hidden 2D convolutional layer being fed 32x32 pixel images
layer_conv_2d(
filters = 16, kernel_size = c(3,3), padding = "same",
input_shape = c(32, 32, 3)
) %>%
layer_activation_leaky_relu(0.1) %>%
# Second hidden layer
layer_conv_2d(filters = 32, kernel_size = c(3,3)) %>%
layer_activation_leaky_relu(0.1) %>%
# Use max pooling
layer_max_pooling_2d(pool_size = c(2,2)) %>%
layer_dropout(0.25) %>%
# 2 additional hidden 2D convolutional layers
layer_conv_2d(filters = 32, kernel_size = c(3,3), padding = "same") %>%
layer_activation_leaky_relu(0.1) %>%
layer_conv_2d(filters = 64, kernel_size = c(3,3)) %>%
layer_activation_leaky_relu(0.1) %>%
# Use max pooling once more
layer_max_pooling_2d(pool_size = c(2,2)) %>%
layer_dropout(0.25) %>%
# Flatten max filtered output into feature vector
# and feed into dense layer
layer_flatten() %>%
layer_dense(256) %>%
layer_activation_leaky_relu(0.1) %>%
layer_dropout(0.5) %>%
# Outputs from dense layer are projected onto 10 unit output layer
layer_dense(10)

opt <- optimizer_adamax(learning_rate = learning_rate_schedule_exponential_decay(
initial_learning_rate = 5e-3,
decay_rate = 0.96,
decay_steps = 1500,
staircase = TRUE
))
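Before compiling, it helps to trace the tensor shapes through the network (our own bookkeeping; summary(model) reports the same numbers). Convolutions with padding = "same" preserve height and width, unpadded 3x3 convolutions trim one pixel from each border, and each 2x2 max-pool halves the spatial size:
# 32x32x3 -> conv (same) -> 32x32x16 -> conv 3x3 -> 30x30x32 -> pool -> 15x15x32
# -> conv (same) -> 15x15x32 -> conv 3x3 -> 13x13x64 -> pool -> 6x6x64
# flatten: 6 * 6 * 64 = 2304 features -> dense 256 -> dense 10 logits
summary(model)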
model %>% compile(
loss = loss_sparse_categorical_crossentropy(from_logits = TRUE),
optimizer = opt,
metrics = "accuracy"
)
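A small sketch (ours, not in the example) of how the staircase schedule behaves: with 50,000 training images and a batch size of 32, one epoch is about 1,563 optimizer steps, so the learning rate drops by a factor of 0.96 roughly once per epoch:
lr_at <- function(step) 5e-3 * 0.96 ^ floor(step / 1500)
lr_at(0)          # 0.005 at the start
lr_at(1500)       # 0.0048 after the first decay
lr_at(1563 * 20)  # ~0.0022 by the end of 20 epochs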
# Training ----------------------------------------------------------------
history <- model %>% fit(
x_train, y_train,
batch_size = batch_size,
epochs = epochs,
validation_data = list(x_test, y_test),
shuffle = TRUE
)
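fit() returns the training history, which we saved above; plotting it (our addition) shows whether the learning curves have leveled off:
plot(history)  # if validation accuracy has flattened, 20 epochs is enough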
model %>% evaluate(x_test, y_test)
     loss  accuracy
0.6048965 0.8014000
Training takes a long time to finish, but the model achieves fairly good performance: about 80% accuracy on the test set. Whether 20 epochs is long enough is best judged from the history plot above: if the validation accuracy has flattened out by the final epoch, the model has trained long enough; if it is still climbing, more epochs would likely improve it.
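Finally, because we compiled with from_logits = TRUE, the final dense layer emits raw logits rather than probabilities. As a usage sketch (our addition), class predictions are obtained by applying a softmax and taking the arg-max:
logits <- model %>% predict(x_test[1:5, , , ])
probs <- exp(logits) / rowSums(exp(logits))  # softmax over the 10 classes
preds <- max.col(probs) - 1                  # CIFAR-10 labels run 0-9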