Neural Networks

Neural networks have become a cornerstone of machine learning in the last decade. They were created in the late 1940s with the intention of building computer programs that mimic the way neurons process information.

Create Test Data

set.seed(42)              # make the random draws reproducible
x<-runif(200, -10, 10)    # 200 inputs drawn uniformly from [-10, 10]
y<-sin(x)                 # the target is the sine of each input
plot(x,y)

Randomize the initial weights

A neural network is defined by three components: the topology of the network of neurons, the weights of the connections between neurons, and the activation function of each neuron. For this example, we'll use a feed-forward neural network and the logistic activation function, which are the defaults for the nnet package. We take one number as the input of our neural network and we want one number as the output, so the input and output layers both have a size of one. For the hidden layer, we'll start with three neurons. Counting the bias units, a 1-3-1 network has (1+1)*3 + (3+1)*1 = 10 weights. It's good practice to randomize the initial weights, so create a vector of 10 random values picked in the interval [-1, 1].

weight<-runif(10, -1, 1)
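
As a sanity check on that count, here is a small helper of our own (not part of nnet) that counts the weights in a fully connected single-hidden-layer network with bias units:

n.weights<-function(n.in,n.hidden,n.out){
  (n.in+1)*n.hidden+(n.hidden+1)*n.out   # input-to-hidden plus hidden-to-output
}
n.weights(1,3,1)   # 10, the length of our weight vector
n.weights(1,7,1)   # 22, used for the larger model later on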

Create Training Set and Test Set

Neural networks have a strong tendency to overfit your data, meaning they become very good at describing the relationships between the values in your data set but perform poorly on data that wasn't used to train the model. As a consequence, we need to cross-validate our model. Set the seed to 42, then create a training set containing 75% of the values in your initial data set and a test set containing the rest of your data.

set.seed(42)
index<-sample(1:length(x),round(0.75*length(x)),replace=FALSE)  # sample 75% of the indices
reg.train<-data.frame(X=x[index],Y=y[index])     # training set
reg.test<-data.frame(X=x[-index],Y=y[-index])    # held-out test set
head(reg.train)
##           X          Y
## 1 -3.660495  0.4959275
## 2  8.358081  0.8756100
## 3  3.545537 -0.3930479
## 4 -6.060110  0.2212295
## 5 -2.100539 -0.8629371
## 6 -5.656846  0.5861828
head(reg.test)
##           X          Y
## 1 -4.277209  0.9067943
## 2  6.608953  0.3200357
## 3  4.731766 -0.9998123
## 4  4.382245 -0.9459957
## 5  8.693445  0.6678624
## 6 -4.891424  0.9840161
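
As a quick check, the split should leave 150 of the 200 points for training and 50 for testing:

nrow(reg.train)   # 150
nrow(reg.test)    # 50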

Using nnet to create the model

Load the nnet package and use the function of the same name to create your model. Pass your weights via the Wts argument and set the maxit argument to 50. We want to fit a continuous output that is not restricted to the (0, 1) range of the logistic function, so set the linout argument to TRUE to make the output unit linear. Finally, take the time to look at the structure of your model.

library(nnet)
set.seed(42)
reg.model.1<-nnet(reg.train$X,reg.train$Y,size=3,maxit=50,Wts=weight,linout=TRUE)
## # weights:  10
## initial  value 103.169943 
## iter  10 value 70.636986
## iter  20 value 69.759785
## iter  30 value 63.215384
## iter  40 value 45.634297
## iter  50 value 39.876476
## final  value 39.876476 
## stopped after 50 iterations
str(reg.model.1)
## List of 15
##  $ n            : num [1:3] 1 3 1
##  $ nunits       : int 6
##  $ nconn        : num [1:7] 0 0 0 2 4 6 10
##  $ conn         : num [1:10] 0 1 0 1 0 1 0 2 3 4
##  $ nsunits      : num 5
##  $ decay        : num 0
##  $ entropy      : logi FALSE
##  $ softmax      : logi FALSE
##  $ censored     : logi FALSE
##  $ value        : num 39.9
##  $ wts          : num [1:10] -7.503 2.202 3.004 -0.806 -4.69 ...
##  $ convergence  : int 1
##  $ fitted.values: num [1:150, 1] -0.196 0.568 -0.353 -0.205 -0.161 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : NULL
##  $ residuals    : num [1:150, 1] 0.692 0.3079 -0.0398 0.4262 -0.7024 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : NULL
##  $ call         : language nnet.default(x = reg.train$X, y = reg.train$Y, size = 3, Wts = weight,      linout = TRUE, maxit = 50)
##  - attr(*, "class")= chr "nnet"
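
The wts component holds the fitted weights, and value holds the final value of the fitting criterion (with linout=TRUE this is the sum of squared errors on the training set), so a quick optional check of the training error is:

sqrt(reg.model.1$value/nrow(reg.train))   # root mean squared training error, about 0.52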

Predict and compute the RMSE

Predict the output for the test set and compute the RMSE of your predictions. Plot the function sin(x), then overlay your predictions as points.

predict.model.1<-predict(reg.model.1,data.frame(X=reg.test$X))
str(predict.model.1)
##  num [1:50, 1] -0.201 0.184 -0.873 -0.981 0.598 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : NULL
rmse.reg<-sqrt(sum((reg.test$Y-predict.model.1)^2))
rmse.reg
## [1] 3.41651
plot(sin, -10, 10)
points(reg.test$X,predict.model.1)
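
Note that the line above takes the square root of the sum of squared errors rather than of their mean; a conventional per-observation RMSE divides by the number of test points first, as in the one-liner below (the same caveat applies to the second model further down):

sqrt(mean((reg.test$Y-predict.model.1)^2))   # RMSE in the usual per-point sense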

The number of neurons in the hidden layer

The number of neurons in the hidden layer, as well as the number of hidden layers used, has a great influence on the effectiveness of your model. Repeat exercises three to five, but this time use a hidden layer of seven neurons and randomly initialize 22 weights, since a 1-7-1 network has (1+1)*7 + (7+1)*1 = 22 connections.

set.seed(42)
reg.model.2<-nnet(reg.train$X,reg.train$Y,size=7,maxit=50,Wts=runif(22, -1, 1),linout=TRUE)
## # weights:  22
## initial  value 353.642846 
## iter  10 value 55.906010
## iter  20 value 42.700328
## iter  30 value 22.757713
## iter  40 value 16.910492
## iter  50 value 12.770497
## final  value 12.770497 
## stopped after 50 iterations
str(reg.model.2)
## List of 15
##  $ n            : num [1:3] 1 7 1
##  $ nunits       : int 10
##  $ nconn        : num [1:11] 0 0 0 2 4 6 8 10 12 14 ...
##  $ conn         : num [1:22] 0 1 0 1 0 1 0 1 0 1 ...
##  $ nsunits      : num 9
##  $ decay        : num 0
##  $ entropy      : logi FALSE
##  $ softmax      : logi FALSE
##  $ censored     : logi FALSE
##  $ value        : num 12.8
##  $ wts          : num [1:22] 0.894 0.992 1.997 1.506 8.113 ...
##  $ convergence  : int 1
##  $ fitted.values: num [1:150, 1] 0.585 0.6145 -0.464 0.0943 -0.8862 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : NULL
##  $ residuals    : num [1:150, 1] -0.0891 0.2612 0.071 0.1269 0.0232 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : NULL
##  $ call         : language nnet.default(x = reg.train$X, y = reg.train$Y, size = 7, Wts = runif(22,      -1, 1), linout = TRUE, maxit = 50)
##  - attr(*, "class")= chr "nnet"
predict.model.2<-predict(reg.model.2,data.frame(X=reg.test$X))
str(predict.model.2)
##  num [1:50, 1] 1.13 0.153 -0.931 -0.962 0.647 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : NULL
rmse.reg<-sqrt(sum((reg.test$Y-predict.model.2)^2))
rmse.reg
## [1] 2.188407
plot(sin, -10, 10)
points(reg.test$X,predict.model.2)
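
To explore this effect more systematically, here is a small sketch of our own (not part of the original exercise) that scans several hidden-layer sizes under the same settings and reports the test RMSE of each; the length of Wts follows the weight count formula from earlier:

set.seed(42)
for (h in c(1, 3, 5, 7, 9)) {
  n.wts<-(1+1)*h+(h+1)*1                      # weights for a 1-h-1 network
  model<-nnet(reg.train$X, reg.train$Y, size=h, maxit=50,
              Wts=runif(n.wts, -1, 1), linout=TRUE, trace=FALSE)
  pred<-predict(model, data.frame(X=reg.test$X))
  cat("size =", h, "test RMSE =", sqrt(mean((reg.test$Y-pred)^2)), "\n")
}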

Using a Neural Network for a Classification Problem - the Iris Dataset

Now let's use a neural network to solve a classification problem, so load the iris data set! It is good practice to normalize your input data to standardize the behavior of the model across features with different ranges of values and to speed up training. Normalize each feature so that it has a mean of zero and a standard deviation of one, then create your training and test sets.

data<-iris

# scale() centers each numeric column to mean 0 and rescales it to sd 1
scale.data<-data.frame(lapply(data[,1:4], scale))
scale.data$Species<-data$Species

# note: no seed is set before sample() here, so the exact split (and the
# confusion table at the end) will vary from run to run
index<-sample(1:nrow(scale.data),round(0.75*nrow(scale.data)),replace=FALSE)
clust.train<-scale.data[index,]
clust.test<-scale.data[-index,]
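
A quick check that the scaling worked as intended:

sapply(scale.data[,1:4], mean)   # all approximately zero
sapply(scale.data[,1:4], sd)     # all equal to one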

Create the Model for the Iris Classifier

Use nnet() with a hidden layer of ten neurons to create your model. This time we want to predict class labels rather than a continuous value, so leave the linout argument at its default of FALSE, as in the code below. Look at the structure of your model. With a classification problem, the output is usually a factor that is coded as multiple dummy variables, one per level, instead of a single numeric value. As a consequence, the output layer has one neuron per level of the output factor: with four inputs, ten hidden neurons, and three output neurons, the network needs (4+1)*10 + (10+1)*3 = 83 weights, which is why we supply 83 initial values.

set.seed(42)
clust.model<-nnet(Species~.,size=10,Wts=runif(83, -1, 1),data=clust.train)
## # weights:  83
## initial  value 187.294915 
## iter  10 value 10.386561
## iter  20 value 5.337510
## iter  30 value 2.311922
## iter  40 value 1.426508
## iter  50 value 1.387440
## iter  60 value 1.386324
## final  value 1.386294 
## converged
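
nnet codes the factor internally with its class.ind() helper, one indicator column per level, and the weight count also checks out:

head(class.ind(clust.train$Species))   # one 0/1 column per species
(4+1)*10+(10+1)*3                      # 83, matching "# weights: 83" above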

Predict the values of the test set

Make predictions using the values of the test set.

predict.model.clust<-predict(clust.model,clust.test[,1:4],type="class")   # return predicted labels
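
Without type="class", predict() returns the raw network outputs instead, one score per species (these sum to one here, since nnet fitted a softmax output), which is handy for inspecting how confident the model is:

head(predict(clust.model,clust.test[,1:4]))   # matrix of per-class scores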

Create the confusion table

Create the confusion table of your predictions and compute the accuracy of the model.

Table<-table(clust.test$Species,predict.model.clust)   # rows: actual species, columns: predictions
Table
##             predict.model.clust
##              setosa versicolor virginica
##   setosa         16          0         0
##   versicolor      0          9         0
##   virginica       0          0        13
accuracy<-sum(diag(Table))/sum(Table)
accuracy
## [1] 1