Forward Propagation or Feed Forward

Forward propagation is the process of propagating the signal from the input layer to the output (or visible) layer. Consider this three-layer network:

[Figure: Neural Network - two input neurons, two hidden neurons, one output neuron]

The desired output of \(N_{2,0}\) is 1.0. The inputs are [1, 1]. The learning rate is \(\beta = 0.45\) and the momentum is \(\alpha = 0.9\).

beta <- 0.45
alpha <- 0.9
input <- N0 <- matrix(c(1,1))
w0 <- matrix(c(.4,-.1,.1,-.1), nrow=2)
print(input)
##      [,1]
## [1,]    1
## [2,]    1
print(w0)
##      [,1] [,2]
## [1,]  0.4  0.1
## [2,] -0.1 -0.1

Each of the hidden and output neurons is a logistic neuron, meaning that it applies the logistic function

\(\sigma(t) = \frac{1}{1 + e^{-t}}\)

sigma <- function(t) 1/(1+exp(-t))

to the input before returning the response.
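As a quick sanity check, the logistic function returns 0.5 at zero and saturates toward 0 and 1 for large negative and positive inputs:

sigma(0)            # 0.5
sigma(c(-10, 10))   # close to 0 and close to 1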

Calculate the output of the hidden layer: the weighted inputs passed through the logistic function.

N1 <- sigma(w0 %*% input)
print(N1)
##           [,1]
## [1,] 0.6224593
## [2,] 0.4501660
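As a check on the arithmetic, the weighted sums feeding the hidden neurons before the logistic function is applied are 0.4 + 0.1 = 0.5 and -0.1 - 0.1 = -0.2:

# weighted sums into the hidden layer, before the logistic function
print(w0 %*% input)    # 0.5 and -0.2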

Calculate the output of the output layer from the weighted hidden-layer activations.

w1 <- matrix(c(0.06, -0.4), nrow=1)
print(w1)
##      [,1] [,2]
## [1,] 0.06 -0.4
N2 <- sigma(w1 %*% N1)

The output is 0.4643807.
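Checking the arithmetic by hand, the weighted sum entering the output neuron is 0.06 * 0.6224593 - 0.4 * 0.4501660, roughly -0.1427, and \(\sigma(-0.1427) \approx 0.4644\):

# weighted sum into the output neuron, before the logistic function
print(w1 %*% N1)    # approximately -0.1427
print(N2)           # approximately 0.4644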

Back Propagation

First, calculate the error at \(N_{2,0}\):

\(\delta_{2,0} = N_{2,0}\,(1 - N_{2,0})\,(\text{target} - N_{2,0})\)

Substitute the computed output and the target of 1.0:

N2.0.error <- N2 * (1-N2) * (1-N2)
print(N2.0.error)
##           [,1]
## [1,] 0.1332253
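Substituting the scalar values by hand gives the same number:

# the same error computed from the scalar output value
0.4643807 * (1 - 0.4643807) * (1 - 0.4643807)    # approximately 0.1332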

Once the error is known, it is propagated backward and used to adjust the weights.

Calculate the rate of change for each of the two weights into the output neuron with the equation

\(\Delta w_1 = \beta \, \delta_{2,0} \, N_1^{T}\)

w1.Rate <- (beta * N2.0.error[1,1]) * t(N1)
print(w1.Rate)
##            [,1]       [,2]
## [1,] 0.03731729 0.02698807
print(w1)
##      [,1] [,2]
## [1,] 0.06 -0.4

Calculate the new weights. Here t is the iteration of the back propagation. On the first pass the momentum term \(\alpha(t-1)\) is zero, so the momentum \(\alpha\) has no effect, but on subsequent iterations multiples of the momentum are added to the update.

t <- 1
w1.new <- w1 + w1.Rate + alpha*(t-1)
print(w1.new)
##            [,1]       [,2]
## [1,] 0.09731729 -0.3730119
# propagate the error back to the hidden layer through the updated weights
N1.0.error <- N2.0.error %*% w1.new

Calculate the rate of change for the weights between the input and hidden layer.

w0.Rate <- t(beta * N1.0.error) * (N0)
print(w0.Rate)
##              [,1]
## [1,]  0.005834304
## [2,] -0.022362575
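Working through the arithmetic: the back-propagated hidden-layer errors are 0.1332253 * 0.09731729 (about 0.01297) and 0.1332253 * -0.3730119 (about -0.04969); multiplied by \(\beta = 0.45\), they give the two rates above:

print(N1.0.error)    # approximately 0.01297 and -0.04969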

Note - w0.Rate[1] applies to the weights feeding hidden neuron \(N_{1,0}\) (the first row of w0) and w0.Rate[2] applies to the weights feeding \(N_{1,1}\) (the second row), so the column is replicated across both input weights.

w0.Rate <- matrix(c(w0.Rate[1,1], w0.Rate[2,1], w0.Rate[1,1], w0.Rate[2,1]), nrow=2)
print(w0.Rate)
##              [,1]         [,2]
## [1,]  0.005834304  0.005834304
## [2,] -0.022362575 -0.022362575
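Replicating the column works because both inputs are 1. More generally, the same 2x2 rate matrix can be written as an outer product of the scaled hidden-layer error and the input vector; the variable name w0.Rate.outer below is just for illustration:

# equivalent outer-product form: each entry is beta * hidden error * input
w0.Rate.outer <- t(beta * N1.0.error) %*% t(N0)
print(w0.Rate.outer)    # same values as w0.Rate above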

Calculate the new weights at the input layer.

w0.new <- w0 + w0.Rate + alpha*(t-1)
print(w0.new)
##            [,1]       [,2]
## [1,]  0.4058343  0.1058343
## [2,] -0.1223626 -0.1223626

Second Iteration

Run forward propagation again with the updated weights to see whether the output improves.

w0 <- w0.new
N1 <- sigma(w0 %*% input)

w1 <- w1.new
N2 <- sigma(w1 %*% N1)

print(N2)
##           [,1]
## [1,] 0.4742839
N2.1.error <- N2 * (1-N2) * (1-N2)
print(N2.1.error)
##           [,1]
## [1,] 0.1310814

Error improves from 0.1332253 to 0.1310814, and the output moves from 0.4643807 to 0.4742839, closer to the target of 1.0. This isn't much of an improvement, but as noted above, the momentum term grows on subsequent iterations, which speeds convergence toward the proper weights.
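To continue training beyond two passes, the same forward and backward steps can be wrapped in a loop. The sketch below repeats only the \(\beta \times \text{error} \times \text{activation}\) portion of the update (it omits the \(\alpha(t-1)\) momentum term used above); the iteration count and variable names are illustrative:

# illustrative sketch: repeat the forward and backward passes a few more times
target <- 1.0
for (i in 1:10) {
  N1 <- sigma(w0 %*% input)                 # forward pass, hidden layer
  N2 <- sigma(w1 %*% N1)                    # forward pass, output layer
  err <- N2 * (1 - N2) * (target - N2)      # error at the output neuron
  w1 <- w1 + beta * err[1,1] * t(N1)        # update hidden-to-output weights
  N1.err <- err %*% w1                      # error propagated back to the hidden layer
  w0 <- w0 + t(beta * N1.err) %*% t(input)  # update input-to-hidden weights
}
print(N2)                                   # output after the extra iterations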

Reference

Cilimkovic, Mirza, Neural Networks and Back Propagation Algorithm, http://www.dataminingmasters.com/uploads/studentProjects/NeuralNetworks.pdf.