Neural networks have become a cornerstone of machine learning in the last decade. They were created in the late 1940s with the intention of building computer programs that mimic the way neurons process information. To see one at work, start by generating a simple regression data set: 200 points drawn uniformly from [-10, 10], with sin(x) as the target, then plot it.
set.seed(42)
x<-runif(200, -10, 10)
y<-sin(x)
plot(x,y)
A neural network is made of three components: the network of neurons itself, the weight of each connection between neurons, and the activation function of each neuron. For this example, we'll use a feed-forward network with the logistic activation function, which are the defaults of the nnet package. We take one number as the input of our neural network and we want one number as the output, so the sizes of the input and output layers are both one. For the hidden layer, we'll start with three neurons. It's good practice to randomize the initial weights, so create a vector of 10 random values picked in the interval [-1, 1]. Ten is exactly the number of parameters in a 1-3-1 network: each of the three hidden neurons has one input weight plus a bias, and the output neuron has three incoming weights plus a bias.
weight<-runif(10, -1, 1)
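If you want to double-check that count, here is a small helper (our own addition, not part of nnet) that computes the number of parameters of a fully connected network with a single hidden layer:
# Parameters of a fully connected 1-hidden-layer network:
# one weight per connection plus one bias per neuron
n.weights<-function(n.in, n.hidden, n.out){
  (n.in + 1)*n.hidden + (n.hidden + 1)*n.out
}
n.weights(1, 3, 1)
## [1] 10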
Neural networks have a strong tendency to overfit your data, meaning they become very good at describing the relationship between the values in the training set, but generalize poorly to data that wasn't used to train the model. As a consequence, we need to cross-validate our model. Set the seed to 42, then create a training set containing 75% of the values in your initial data set and a test set containing the rest of your data.
set.seed(42)
index<-sample(1:length(x),round(0.75*length(x)),replace=FALSE)
reg.train<-data.frame(X=x[index],Y=y[index])
reg.test<-data.frame(X=x[-index],Y=y[-index])
head(reg.train)
##           X          Y
## 1 -3.660495  0.4959275
## 2  8.358081  0.8756100
## 3  3.545537 -0.3930479
## 4 -6.060110  0.2212295
## 5 -2.100539 -0.8629371
## 6 -5.656846  0.5861828
head(reg.test)
##           X          Y
## 1 -4.277209  0.9067943
## 2  6.608953  0.3200357
## 3  4.731766 -0.9998123
## 4  4.382245 -0.9459957
## 5  8.693445  0.6678624
## 6 -4.891424  0.9840161
Load the nnet package and use the function of the same name to create your model. Pass your weights via the Wts argument and set the maxit argument to 50. Since we are fitting a regression, whose output can be any real number rather than one of a few classes, set the linout argument to TRUE so the output neuron is linear instead of logistic. Finally, take the time to look at the structure of your model.
library(nnet)
set.seed(42)
reg.model.1<-nnet(reg.train$X,reg.train$Y,size=3,maxit=50,Wts=weight,linout=TRUE)
## # weights: 10
## initial value 103.169943
## iter 10 value 70.636986
## iter 20 value 69.759785
## iter 30 value 63.215384
## iter 40 value 45.634297
## iter 50 value 39.876476
## final value 39.876476
## stopped after 50 iterations
str(reg.model.1)
## List of 15
## $ n : num [1:3] 1 3 1
## $ nunits : int 6
## $ nconn : num [1:7] 0 0 0 2 4 6 10
## $ conn : num [1:10] 0 1 0 1 0 1 0 2 3 4
## $ nsunits : num 5
## $ decay : num 0
## $ entropy : logi FALSE
## $ softmax : logi FALSE
## $ censored : logi FALSE
## $ value : num 39.9
## $ wts : num [1:10] -7.503 2.202 3.004 -0.806 -4.69 ...
## $ convergence : int 1
## $ fitted.values: num [1:150, 1] -0.196 0.568 -0.353 -0.205 -0.161 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : NULL
## $ residuals : num [1:150, 1] 0.692 0.3079 -0.0398 0.4262 -0.7024 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : NULL
## $ call : language nnet.default(x = reg.train$X, y = reg.train$Y, size = 3, Wts = weight, linout = TRUE, maxit = 50)
## - attr(*, "class")= chr "nnet"
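Before looking at the test set, you can also measure how well the network fits the data it was trained on; this optional snippet (our addition) computes the training-set RMSE from the residuals stored in the fitted object. If it is much lower than the test-set RMSE computed below, that is a symptom of overfitting.
# Training-set RMSE, from the residuals nnet keeps in the model object
sqrt(mean(reg.model.1$residuals^2))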
Predict the output for the test set and compute the RMSE of your predictions. Plot the function sin(x) and then plot your predictions.
predict.model.1<-predict(reg.model.1,data.frame(X=reg.test$X))
str(predict.model.1)
## num [1:50, 1] -0.201 0.184 -0.873 -0.981 0.598 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : NULL
rmse.reg<-sqrt(mean((reg.test$Y-predict.model.1)^2))
round(rmse.reg, 4)
## [1] 0.4832
plot(sin, -10, 10)
points(reg.test$X,predict.model.1)
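To see the whole fitted curve instead of just the test points, you can overlay predictions computed over a fine grid; this optional addition assumes the plot from the previous chunk is still open.
# Fitted curve on a fine grid, drawn over the sin() reference plot
grid<-data.frame(X=seq(-10, 10, by=0.1))
lines(grid$X, predict(reg.model.1, grid), col="red")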
Now let us use a neural network to solve a classification problem, so load the iris data set! It is good practice to normalize your input data, both so that the model behaves consistently across variables measured on different scales and so that training converges faster. Scale each of the four numeric predictors so that it has a mean of zero and a standard deviation of one, then create your training and test sets.
data<-iris
scale.data<-data.frame(lapply(data[,1:4], scale))
scale.data$Species<-data$Species
index<-sample(1:nrow(scale.data),round(0.75*nrow(scale.data)),replace=FALSE)
clust.train<-scale.data[index,]
clust.test<-scale.data[-index,]
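If you want to confirm the scaling did what we asked (an optional check, our addition), every column mean should be zero up to floating-point noise and every standard deviation should be exactly one.
# Column means are ~0 (up to floating-point noise), standard deviations are 1
round(colMeans(scale.data[,1:4]), 10)
apply(scale.data[,1:4], 2, sd)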
Use nnet() with a hidden layer of ten neurons to create your model. This time we want to fit a function whose output is one of a finite set of classes rather than a continuous value, so leave the linout argument at its default of FALSE; since the response is a factor with three levels, nnet automatically uses softmax output units. Look at the structure of your model. With a classification problem, the output factor is coded as a set of dummy variables instead of a single numeric value, so the output layer has as many neurons as the output factor has levels. That is why this model needs 83 weights: (4 + 1) × 10 for the hidden layer plus (10 + 1) × 3 for the output layer.
set.seed(42)
clust.model<-nnet(Species~.,size=10,Wts=runif(83, -1, 1),data=clust.train)
## # weights: 83
## initial value 187.294915
## iter 10 value 10.386561
## iter 20 value 5.337510
## iter 30 value 2.311922
## iter 40 value 1.426508
## iter 50 value 1.387440
## iter 60 value 1.386324
## final value 1.386294
## converged
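The "# weights: 83" line above matches the count given by the helper we defined for the regression model:
# (4+1)*10 + (10+1)*3 = 83 parameters for a 4-10-3 network
n.weights(4, 10, 3)
## [1] 83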
Make predictions on the values of the test set.
predict.model.clust<-predict(clust.model,clust.test[,1:4],type="class")
Create the confusion table of your predictions and compute the accuracy of the model.
Table<-table(clust.test$Species,predict.model.clust)
Table
##             predict.model.clust
##              setosa versicolor virginica
##   setosa         16          0         0
##   versicolor      0          9         0
##   virginica       0          0        13
accuracy<-sum(diag(Table))/sum(Table)
accuracy
## [1] 1
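Accuracy alone can hide class-specific mistakes, so it is worth also looking at the row-wise proportions of the confusion table; this optional check is our addition.
# Row-wise proportions: the recall of each true class
prop.table(Table, 1)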