Welcome to a quick tutorial on applying deep learning with the Keras library in R. Keras is one of the more approachable deep learning libraries and runs on machines with either CPUs or GPUs. It is often praised for its simple APIs for Convolutional Neural Networks (CNNs), though in this tutorial we will use a simple two-layer sequential neural network.

Load the Necessary Packages

#install.packages(c("tidyverse","keras"))
library(tidyverse)

## -- Attaching packages ---------------------------- tidyverse 1.3.0 --

## v ggplot2 3.3.0     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   0.8.5
## v tidyr   1.0.3     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0

## -- Conflicts ------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(keras)

Grab the MNIST data from the Keras package and store the pre-split train and test images and labels in separate objects

mnist <- dataset_mnist()
train_images <- mnist$train$x
train_labels <- mnist$train$y
test_images <- mnist$test$x
test_labels <- mnist$test$y

Reshape and normalize images for training the model

train_images <- array_reshape(train_images, c(60000, 28*28))
train_images <- train_images / 255
test_images <- array_reshape(test_images, c(10000, 28*28))
test_images <- test_images / 255

# One-hot encode the digit labels (0-9) as 10-column binary matrices
train_labels <- to_categorical(train_labels)
test_labels <- to_categorical(test_labels)
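
To make the preprocessing above concrete, here is a minimal base-R sketch on two made-up 2x2 "images" instead of MNIST. The helper names flatten_images and one_hot are hypothetical; they only mimic what array_reshape() (row-by-row flattening) and to_categorical() do and are not part of Keras.

```r
# Toy stand-in for MNIST: 2 grayscale images of 2x2 pixels, values in 0-255.
toy_images <- array(0, dim = c(2, 2, 2))
toy_images[1, , ] <- matrix(c(0, 255, 51, 102), nrow = 2, byrow = TRUE)
toy_images[2, , ] <- matrix(c(255, 255, 0, 0), nrow = 2, byrow = TRUE)
toy_labels <- c(3, 0)

# Hypothetical helper: flatten each image row by row into one matrix row.
flatten_images <- function(images) {
  t(apply(images, 1, function(img) as.vector(t(img))))
}

# Hypothetical helper: one-hot encode integer labels 0..(num_classes - 1).
one_hot <- function(labels, num_classes) {
  encoded <- matrix(0, nrow = length(labels), ncol = num_classes)
  encoded[cbind(seq_along(labels), labels + 1)] <- 1
  encoded
}

toy_flat <- flatten_images(toy_images) / 255  # scale pixels to [0, 1]
toy_onehot <- one_hot(toy_labels, num_classes = 10)

print(dim(toy_flat))    # one row per image, one column per pixel
print(toy_flat[1, ])    # first image, flattened and scaled
print(toy_onehot[1, ])  # label 3 becomes a 1 in the fourth column
```

Each image ends up as a single row of pixel values between 0 and 1, and each label becomes a row of zeros with a single 1 marking its digit.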

Build a Simple 2-Layer NN Model

network <- keras_model_sequential() %>% 
  layer_dense(units = 512, activation = "relu", input_shape = c(28*28)) %>% 
  layer_dense(units = 10, activation = "softmax") #output layer needs 10 units/neurons since there are 10 categories of digits

Each layer in a neural network can be thought of as a data filter that extracts representations of the data. The first layer needs an input_shape, which here is the length of each flattened image: the 28 * 28 = 784 values per row that we produced when reshaping above. units is the number of neurons in a layer; note that the second layer (our output layer) has 10 units, one for each class of our outcome, the digits 0-9. We also specify an activation function for each layer, which is what lets the network model non-linear patterns in the data. Here we use relu for the hidden layer and softmax for the output layer, where softmax turns the 10 outputs into probabilities that sum to 1.
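
As a rough illustration of what these two layers compute, the following base-R sketch runs a forward pass by hand. The weights are random and untrained, the input is fake data, and all names (relu, softmax, W1, b1, W2, b2) are assumptions made for this example rather than anything Keras exposes under those names.

```r
relu <- function(x) pmax(x, 0)

# Row-wise softmax; subtracting each row's max keeps exp() numerically stable.
softmax <- function(z) {
  shifted <- z - apply(z, 1, max)
  exp(shifted) / rowSums(exp(shifted))
}

set.seed(1)
n_inputs <- 28 * 28
n_hidden <- 512
n_classes <- 10

# Random, untrained weights and zero biases, purely for illustration.
W1 <- matrix(rnorm(n_inputs * n_hidden, sd = 0.01), n_inputs, n_hidden)
b1 <- rep(0, n_hidden)
W2 <- matrix(rnorm(n_hidden * n_classes, sd = 0.01), n_hidden, n_classes)
b2 <- rep(0, n_classes)

x <- matrix(runif(2 * n_inputs), nrow = 2)  # two fake flattened images

hidden <- relu(sweep(x %*% W1, 2, b1, "+"))         # dense layer 1 + relu
probs <- softmax(sweep(hidden %*% W2, 2, b2, "+"))  # dense layer 2 + softmax

# Trainable parameters: weights plus biases of both layers.
n_params <- length(W1) + length(b1) + length(W2) + length(b2)

print(dim(probs))
print(rowSums(probs))
print(n_params)
```

Every row of probs sums to 1, and the parameter count works out to 784 * 512 + 512 + 512 * 10 + 10 = 407050.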

Compile Keras Model

network %>% compile(optimizer = "rmsprop",
                    loss = "categorical_crossentropy",
                    metrics = c("accuracy"))

Before training a deep learning model in Keras, you need to specify a loss function, an optimizer, and the metrics to monitor during training and testing, as done in the compile() call above.

Loss Function - Measures the network's performance on the training data; training tries to minimize it
Optimizer - The network's update strategy for weights and biases, basically how the network will fit the data
Metrics - For human monitoring/understanding; here we track accuracy, the fraction of correct predictions
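
To show what the loss and the metric measure, here is a small base-R sketch that computes categorical cross-entropy and accuracy by hand on made-up predictions. It is a simplified version: real implementations typically also guard against log(0), and the function names are chosen for this example only.

```r
# Made-up predicted probabilities for 3 samples over 3 classes.
y_pred <- matrix(c(0.7, 0.2, 0.1,
                   0.1, 0.8, 0.1,
                   0.3, 0.4, 0.3), nrow = 3, byrow = TRUE)

# One-hot true labels: sample 1 is class 1, sample 2 is class 2, sample 3 is class 3.
y_true <- diag(3)

# Mean negative log-probability assigned to the true class.
categorical_crossentropy <- function(y_true, y_pred) {
  -mean(rowSums(y_true * log(y_pred)))
}

# Fraction of samples whose most probable class matches the true class.
accuracy <- function(y_true, y_pred) {
  mean(apply(y_pred, 1, which.max) == apply(y_true, 1, which.max))
}

loss_value <- categorical_crossentropy(y_true, y_pred)
acc_value <- accuracy(y_true, y_pred)

print(loss_value)
print(acc_value)
```

The third sample is misclassified (class 2 gets the highest probability while the true class is 3), so accuracy is 2/3, and its low probability on the true class also contributes the most to the loss.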

Train Keras Model

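# fit() trains the network in place and returns the training history
history <- network %>% fit(train_images, train_labels, epochs = 5, batch_size = 64)

# Plot the loss and accuracy recorded after each epoch
plot(history, lty = 5, type = "l", col = "blue")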
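
For a sense of how much work epochs = 5 and batch_size = 64 imply, this short sketch does the arithmetic, assuming the default behaviour where the final, smaller batch of each epoch is still used.

```r
n_train <- 60000   # number of training images
batch_size <- 64
epochs <- 5

# Batches (weight updates) per pass over the training data.
steps_per_epoch <- ceiling(n_train / batch_size)

# The last batch holds whatever is left after the full batches.
last_batch_size <- n_train - (steps_per_epoch - 1) * batch_size

# Weight updates over the whole training run.
total_updates <- steps_per_epoch * epochs

print(steps_per_epoch)
print(last_batch_size)
print(total_updates)
```

With these settings each epoch consists of 938 batches, the last of which contains 32 images, for 4690 weight updates in total.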

Evaluate Model on Test Data

results <- network %>% evaluate(test_images, test_labels)
results

##       loss   accuracy 
## 0.06874596 0.98100001
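
To put the accuracy above in more tangible terms, the sketch below converts it into a count of misclassified test images and shows how a vector of softmax probabilities maps back to a digit. The probability vector is made up for illustration and is not actual model output.

```r
test_accuracy <- 0.98100001  # accuracy reported by evaluate() above
n_test <- 10000              # number of MNIST test images

# Approximate number of test images the network got wrong.
n_wrong <- round((1 - test_accuracy) * n_test)

# Made-up probabilities for one image over the digits 0-9.
example_probs <- c(0.01, 0.02, 0.01, 0.05, 0.01, 0.01, 0.01, 0.85, 0.02, 0.01)

# which.max() is 1-based, while the digits start at 0.
predicted_digit <- which.max(example_probs) - 1

print(n_wrong)
print(predicted_digit)
```

An accuracy of about 98.1% on 10000 test images corresponds to roughly 190 misclassified digits.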