Final Part 1

Read the excellent blog post from the Data Science Heroes website, How to create a sequential model in Keras for R. Run all of the code from this blog post in your Quarto notebook and explain each step presented. Clearly describe what kind of neural network is being fitted. Change the neural network to use units = 4 for the first hidden layer and units = 2 for the second layer, and change the number of epochs to 40 in the fit() call. How well does this neural network perform compared to the original neural network run?

Preparing the data

We create the data. The input will be 10,000 rows and three columns drawn from the uniform distribution on [0, 1]. The target is 1 when the sum of the three numbers exceeds a threshold of 1.5, and 0 otherwise; this is the rule the model will learn to recognize.

# Input: 10000 rows and 3 columns of uniform distribution
x_data=matrix(data=runif(30000), nrow=10000, ncol=3)

# Output
y_data=ifelse(rowSums(x_data) > 1.5, 1, 0)
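
As a quick sanity check (my addition, not in the original post), the 1.5 threshold should split the labels roughly in half, since the sum of three uniform values on [0, 1] has mean 1.5:

# Proportion of each class; expect roughly 50/50
prop.table(table(y_data))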

Loading packages

# install.packages("keras")
library(keras)
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   0.3.5 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Since this is a binary classification task, we one-hot encode the target using the to_categorical() function, which turns the 0/1 vector into a two-column indicator matrix.

y_data_oneh=to_categorical(y_data, num_classes = 2)

head(y_data_oneh)
     [,1] [,2]
[1,]    0    1
[2,]    0    1
[3,]    1    0
[4,]    1    0
[5,]    0    1
[6,]    0    1
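
As a quick check (my addition): to_categorical() puts class 0 in column 1 and class 1 in column 2, so column 2 should reproduce the original labels:

# Column 2 is the indicator for class 1
all(y_data_oneh[, 2] == y_data)
[1] TRUE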

Creating a sequential model in Keras

Here we build the neural network: a feed-forward, fully connected network (a multilayer perceptron) for binary classification, defined as a Keras sequential model with three dense layers.

  • The first hidden layer has 64 units and uses the ReLU activation function.

  • The second hidden layer also has 64 units with ReLU activation.

  • The output layer has as many units as there are output categories (ncol(y_data_oneh), here 2) and uses the softmax activation function, so it outputs class probabilities.

## Creating the sequential model
model = keras_model_sequential() %>%   
  layer_dense(units = 64, activation = "relu", input_shape = ncol(x_data)) %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = ncol(y_data_oneh), activation = "softmax")

model
Model: "sequential"
________________________________________________________________________________
 Layer (type)                       Output Shape                    Param #     
================================================================================
 dense_2 (Dense)                    (None, 64)                      256         
 dense_1 (Dense)                    (None, 64)                      4160        
 dense (Dense)                      (None, 2)                       130         
================================================================================
Total params: 4,546
Trainable params: 4,546
Non-trainable params: 0
________________________________________________________________________________
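
The parameter counts in the summary follow from weights plus biases for each dense layer. As a quick check of the arithmetic (my addition, not in the original post):

# layer 1: 3 inputs * 64 units + 64 biases = 256
# layer 2: 64 * 64 + 64                    = 4160
# layer 3: 64 * 2 + 2                      = 130
(3*64 + 64) + (64*64 + 64) + (64*2 + 2)
[1] 4546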

Now we compile the model with the categorical_crossentropy loss, the RMSprop optimizer, and accuracy as the reported metric. We then fit for 20 epochs with a batch size of 128, holding out 20% of the data for validation.

compile(model, loss = "categorical_crossentropy", optimizer = optimizer_rmsprop(), metrics = "accuracy")

history = fit(model,  x_data, y_data_oneh, epochs = 20, batch_size = 128, validation_split = 0.2)

plot(history)
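
The plot shows the loss and accuracy curves across epochs for the training and validation splits. As a minimal sketch (my addition; the metric names below assume the "accuracy" metric passed to compile() and a recent keras version), the same numbers can be read from the history object:

# Validation accuracy at the final epoch
n_epochs <- length(history$metrics$val_accuracy)
history$metrics$val_accuracy[n_epochs]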

Validating with unseen data

Next, we create ‘unseen’ input test data (1,000 rows, 3 columns) and predict on it.

## Validating with unseen data

x_data_test=matrix(data=runif(3000), nrow=1000, ncol=3)
dim(x_data_test) 
[1] 1000    3
# predict_classes() is deprecated and was removed in newer versions of
# Keras/TensorFlow; take the argmax of the predicted probabilities instead
y_data_pred <- model %>% predict(x_data_test) %>% k_argmax()

glimpse(y_data_pred)
<tf.Tensor: shape=(1000), dtype=int64, numpy=…>
# predict() returns the predicted probability for each class
y_data_pred_oneh <- model %>% predict(x_data_test)
dim(y_data_pred_oneh)
[1] 1000    2
head(y_data_pred_oneh)
             [,1]         [,2]
[1,] 0.9957333207 4.266660e-03
[2,] 0.0604446121 9.395554e-01
[3,] 0.0050984682 9.949015e-01
[4,] 1.0000000000 2.528492e-09
[5,] 0.9999995232 4.542511e-07
[6,] 0.0004546779 9.995453e-01
y_data_real=ifelse(rowSums(x_data_test) > 1.5, 1, 0)
y_data_real_oneh=to_categorical(y_data_real)
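
As a hedged sketch (my addition), we can also check the predictions directly against the true labels, converting the k_argmax() tensor back to an R vector first:

# Confusion matrix and manual accuracy on the test set
pred_class <- as.integer(as.array(y_data_pred))
table(predicted = pred_class, actual = y_data_real)
mean(pred_class == y_data_real)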

Evaluating the model on training vs. test data

## Evaluation on training data
evaluate(model, x_data, y_data_oneh, verbose = 0)
      loss   accuracy 
0.05076142 0.97180003 
## Evaluation on Test data (we need the one-hot version)
evaluate(model, x_data_test, y_data_real_oneh, verbose = 0)
      loss   accuracy 
0.04936779 0.97200000 

Change the neural network

Next, we try the neural network with units = 4 for the first hidden layer and units = 2 for the second layer, and we change the number of epochs to 40 in the fit() call.

model = keras_model_sequential() %>%   
  layer_dense(units = 4, activation = "relu", input_shape = ncol(x_data)) %>%
  layer_dense(units = 2, activation = "relu") %>%
  layer_dense(units = ncol(y_data_oneh), activation = "softmax")

compile(model, loss = "categorical_crossentropy", optimizer = optimizer_rmsprop(), metrics = "accuracy")

history = fit(model,  x_data, y_data_oneh, epochs = 40, batch_size = 128, validation_split = 0.2)

plot(history)
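
For comparison (my addition), the smaller network has only (3*4 + 4) + (4*2 + 2) + (2*2 + 2) = 16 + 10 + 6 = 32 trainable parameters, versus 4,546 in the original model:

# Printing the model should show Total params: 32
model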

## Evaluation on training data
evaluate(model, x_data, y_data_oneh, verbose = 0)
      loss   accuracy 
0.04318167 0.99699998 
## Evaluation on Test data (we need the one-hot version)
evaluate(model, x_data_test, y_data_real_oneh, verbose = 0)
      loss   accuracy 
0.04320972 0.99900001 

Conclusion

After changing the neural network, test accuracy improved from 0.972 to 0.999. The task here (a fixed threshold on the sum of the inputs) is simple, so the much smaller network, trained for twice as many epochs, actually outperforms the original. In general it depends on the case; we can adjust the model's layers or train for more epochs to improve the model's performance.