Ex. 1 Use the nnet package to analyze the iris data set. Use 80% of the 150 samples as the training data and the rest for validation. Discuss the results.

library(nnet)
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.1.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
set.seed(199)               # make the random split reproducible
rows <- sample(nrow(iris))  # shuffle the row indices
iris_rand <- iris[rows,]
train <- iris_rand[1:120,]  # 80% of the 150 samples for training
test <- iris_rand[121:150,] # remaining 20% held out for validation
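
Since the split is purely random, a quick sanity check that all three species are reasonably represented in the training set doesn't hurt (a check of my own; output not shown):

table(train$Species)  # counts per species in the 120 training rows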

# single hidden layer with 5 units, weight decay 5e-4, at most 200 iterations
nnet_fit <- nnet(Species ~ ., data = train, size = 5, decay = 5e-4, maxit = 200)
## # weights:  43
## initial  value 160.443958 
## iter  10 value 56.393902
## iter  20 value 53.611342
## iter  30 value 7.865334
## iter  40 value 7.022950
## iter  50 value 6.925476
## iter  60 value 6.891608
## iter  70 value 6.869529
## iter  80 value 6.850776
## iter  90 value 6.498715
## iter 100 value 5.020462
## iter 110 value 4.540204
## iter 120 value 4.427847
## iter 130 value 4.319533
## iter 140 value 4.163031
## iter 150 value 3.865988
## iter 160 value 2.120853
## iter 170 value 1.598539
## iter 180 value 1.485419
## iter 190 value 1.317687
## iter 200 value 1.275337
## final  value 1.275337 
## stopped after 200 iterations
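
The 43 weights reported above follow directly from the architecture: (4 inputs + 1 bias) × 5 hidden units = 25 weights, plus (5 hidden units + 1 bias) × 3 output classes = 18, for 43 in total. Note also that the optimizer stopped because it hit maxit = 200, although the objective had largely flattened out by then.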

After we have fitted our model, we want to see how it performs on new data, so we use the predict function to predict the species of our test set.

y_predL <- predict(nnet_fit, newdata = test[,-5])  # matrix of class probabilities
y_predL <- round(y_predL)                          # round to a 0/1 indicator per class
y_predL <- as.data.frame(y_predL)
# reshape to one row per observation, keeping the class whose indicator is 1
y_predL <- y_predL %>% pivot_longer(everything()) %>% filter(value == 1) %>% mutate(name = as.factor(name))
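
As an aside, the round-and-pivot approach is a little fragile: if no class probability rounds to 1 for some row, that row silently drops out of the predictions. A simpler sketch (using the same fitted model) is to ask nnet for class labels directly:

y_pred_class <- predict(nnet_fit, newdata = test[,-5], type = "class")  # character labels
y_pred_class <- factor(y_pred_class, levels = levels(test$Species))     # align factor levels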

We can compare our predictions against the test set in a confusion matrix and explore the accuracy. Wow: nnet predicted the iris species perfectly, as the output below shows.

caret::confusionMatrix(y_predL$name, test[,5])
## Confusion Matrix and Statistics
## 
##             Reference
## Prediction   setosa versicolor virginica
##   setosa         10          0         0
##   versicolor      0         11         0
##   virginica       0          0         9
## 
## Overall Statistics
##                                      
##                Accuracy : 1          
##                  95% CI : (0.8843, 1)
##     No Information Rate : 0.3667     
##     P-Value [Acc > NIR] : 8.475e-14  
##                                      
##                   Kappa : 1          
##                                      
##  Mcnemar's Test P-Value : NA         
## 
## Statistics by Class:
## 
##                      Class: setosa Class: versicolor Class: virginica
## Sensitivity                 1.0000            1.0000              1.0
## Specificity                 1.0000            1.0000              1.0
## Pos Pred Value              1.0000            1.0000              1.0
## Neg Pred Value              1.0000            1.0000              1.0
## Prevalence                  0.3333            0.3667              0.3
## Detection Rate              0.3333            0.3667              0.3
## Detection Prevalence        0.3333            0.3667              0.3
## Balanced Accuracy           1.0000            1.0000              1.0
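
One caveat for the discussion: with only 30 validation samples, even perfect accuracy leaves real uncertainty about generalization, which is why the 95% confidence interval above is as wide as (0.8843, 1). The iris classes are also famously well separated, so a small network has an easy job here.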

Ex. 2 As a mini project, install the keras package and learn how to use it. Then, carry out various tasks that may be useful to your project and studies.

This package is very new to me. We are going to work through the installation and application as laid out in the TensorFlow for R tutorial linked below.

tensorflow tutorial

First we need to load the keras and tensorflow packages. You need to follow the tensorflow installation instructions first, otherwise you will run into trouble in short order.

That being said, I could not get ‘reticulate’ to build on my machine and ended up using ‘r-miniconda’, which I believe is why I am getting the messages below.

library(keras)
## Warning: package 'keras' was built under R version 4.1.2
library(tensorflow)
## Warning: package 'tensorflow' was built under R version 4.1.2

We are using the MNIST handwritten-digit dataset to perform some image classification. First we load the data into a training set and a test set. Then we convert the pixel values from integers in 0-255 to floating-point numbers in 0-1 by dividing by 255.

# %<-% (from zeallot, re-exported by keras) unpacks the train/test splits
c(c(x_train, y_train), c(x_test, y_test)) %<-% keras::dataset_mnist()
x_train <- x_train / 255  # scale pixel values to the 0-1 range
x_test <-  x_test / 255
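
If you want to confirm what was loaded: MNIST ships 60,000 training images and 10,000 test images, each 28 x 28 pixels, so the dimensions should check out (a quick look; output not shown):

dim(x_train)  # 60000 28 28
dim(x_test)   # 10000 28 28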

Now we are building our model. The input layer takes the 28 x 28 images, layer_flatten() unrolls each image into a single 784-element vector, and layer_dense() adds a densely connected hidden layer of 128 units. The dropout layer randomly sets 20% of the activations to 0 during training to prevent overfitting.

model <- keras_model_sequential(input_shape = c(28, 28)) %>%
  layer_flatten() %>%                        # 28 x 28 image -> 784-vector
  layer_dense(128, activation = "relu") %>%  # densely connected hidden layer
  layer_dropout(0.2) %>%                     # zero out 20% of activations in training
  layer_dense(10)                            # one logit per digit class
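
A quick sanity check on the size of this model: the hidden layer has 784 × 128 + 128 = 100,480 weights and biases, the output layer has 128 × 10 + 10 = 1,290, and summary(model) should report 101,770 trainable parameters in total.

summary(model)  # layer shapes and parameter counts (output not shown)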

Our model returns a vector of logits (unnormalized log-odds scores), one for each of the ten classes.

predictions <- predict(model, x_train[1:2, , ])
predictions
##            [,1]      [,2]        [,3]       [,4]       [,5]       [,6]
## [1,] 0.35004312 0.4894799  0.16333592 0.09530532 -0.2847350 -0.8690768
## [2,] 0.06318349 0.4005950 -0.06793498 0.09019655  0.1042261 -0.1768343
##           [,7]        [,8]      [,9]      [,10]
## [1,] 0.5155104  0.01514847 0.7744809 -0.1699134
## [2,] 0.7210672 -0.27364829 0.7415282 -0.1057346

Here we are converting the logits to probabilities with the softmax function.

tf$nn$softmax(predictions)
## tf.Tensor(
## [[0.11629786 0.13369906 0.09649078 0.09014476 0.06164404 0.03436487
##   0.137225   0.08320105 0.17778811 0.06914447]
##  [0.0862497  0.12086305 0.07565081 0.08861132 0.08986326 0.06784521
##   0.16652248 0.06158478 0.16996478 0.07284461]], shape=(2, 10), dtype=float64)
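
For intuition, softmax just exponentiates each logit and divides by the row sum. A hand-rolled version in plain R (equivalent up to floating-point noise) looks like this:

softmax <- function(x) exp(x) / sum(exp(x))
t(apply(predictions, 1, softmax))  # should match tf$nn$softmax(predictions) above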

Here we are defining our loss function that will be used in training.

loss_fn <- loss_sparse_categorical_crossentropy(from_logits = TRUE)

“This loss is equal to the negative log probability of the true class” - TensorFlow for R

A loss of zero means the model is completely certain of the true class. An untrained model assigns roughly 1/10 probability to each class, so the initial loss should be near -log(1/10) ≈ 2.3, the same ballpark as the value below.

loss_fn(y_train[1:2], predictions)
## tf.Tensor(2.9106146370132016, shape=(), dtype=float64)

We need to configure and compile the model before it can be used.

model %>% compile(
  optimizer = "adam",
  loss = loss_fn,
  metrics = "accuracy"
)

This is pretty cool. Here we fit the model and get to see the decrease in loss and the improvement in accuracy.

model %>% fit(x_train, y_train, epochs = 5)

Next, let's check the accuracy on the held-out test set. 97.8%, not too shabby!

model %>% evaluate(x_test,  y_test, verbose = 2)
##       loss   accuracy 
## 0.07202643 0.97770000

Lastly, we want to return the class with the highest probability for each image.

probability_model <- keras_model_sequential() %>%
  model() %>%
  layer_activation_softmax() %>%
  layer_lambda(tf$argmax)
probability_model(x_test[1:5, , ])
## tf.Tensor([3 2 1 0 4 2 2 0 2 4], shape=(10), dtype=int64)
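
One thing to note: tf$argmax reduces over the first axis by default, which is why five input images came back as ten values above (one argmax per class, taken across the batch). Assuming we want one predicted class per image, a sketch of a fix is to take the argmax over the last axis instead:

probability_model <- keras_model_sequential() %>%
  model() %>%
  layer_activation_softmax() %>%
  layer_lambda(function(x) tf$argmax(x, axis = -1L))  # argmax over the class axis
probability_model(x_test[1:5, , ])  # should now return 5 class predictions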