A multilayer perceptron is a feedforward neural network with one or more hidden layers.
The network consists of an input layer of source neurons, at least one middle or hidden layer of computational neurons, and an output layer of computational neurons.
The input signals are propagated in a forward direction on a layer-by-layer basis.
Back-propagation training algorithm
Algorithm:
Step 1: Initialisation
Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range:
( -2.4/Fi , +2.4/Fi )
where Fi is the total number of inputs of neuron i in the network. The weight initialisation is done on a neuron-by-neuron basis.
Step 2: Activation
Activate the back-propagation neural network by applying inputs x1(p), x2(p),…, xn(p) and desired outputs yd,1(p), yd,2(p),…, yd,n(p).
(a) Calculate the actual outputs of the neurons in the hidden layer:
yj(p) = sigmoid[ Σ(i=1..n) xi(p) · wij(p) - θj ]
where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function, sigmoid(X) = 1 / (1 + e^-X).
(b) Calculate the actual outputs of the neurons in the output layer:
yk(p) = sigmoid[ Σ(j=1..m) xjk(p) · wjk(p) - θk ]
where m is the number of inputs of neuron k in the output layer.
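The two activation steps above can be sketched in Python. This is a minimal illustration, not the textbook's code; the 2-2-1 layer shape and the weight values are taken from the Exclusive-OR example later in this section, and the helper names are my own:

```python
import math

def sigmoid(x):
    # Sigmoid activation: squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def layer_outputs(inputs, weights, thresholds):
    # One layer of computational neurons: neuron j sums xi * wij over all
    # of its inputs, subtracts its threshold θj, and applies the sigmoid.
    # weights[i][j] is the weight from input i to neuron j.
    outputs = []
    for j in range(len(thresholds)):
        activation = sum(x * w[j] for x, w in zip(inputs, weights)) - thresholds[j]
        outputs.append(sigmoid(activation))
    return outputs

# Forward pass through a 2-2-1 network (weights from the XOR example)
x = [1.0, 1.0]
hidden = layer_outputs(x, weights=[[0.5, 0.9], [0.4, 1.0]], thresholds=[0.8, -0.1])
output = layer_outputs(hidden, weights=[[-1.2], [1.1]], thresholds=[0.3])
# hidden ~ [0.5250, 0.8808], output ~ [0.5097]
```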
Step 3: Weight training
Update the weights in the back-propagation network by propagating backward the errors associated with the output neurons.
(a) Calculate the error gradient for the neurons in the output layer:
δk(p) = yk(p) · [1 - yk(p)] · ek(p)
where
ek(p) = yd,k(p) - yk(p)
Calculate the weight corrections:
Δwjk(p) = α · yj(p) · δk(p)
Update the weights at the output neurons:
wjk(p+1) = wjk(p) + Δwjk(p)
(b) Calculate the error gradient for the neurons in the hidden layer:
δj(p) = yj(p) · [1 - yj(p)] · Σ(k=1..l) δk(p) · wjk(p)
where l is the number of neurons in the output layer.
Calculate the weight corrections:
Δwij(p) = α · xi(p) · δj(p)
Update the weights at the hidden neurons:
wij(p+1) = wij(p) + Δwij(p)
Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until the selected error criterion is satisfied.
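Putting Steps 1 to 4 together, a complete training loop for a small network might look like the following sketch. For reproducibility it starts from the fixed weights of the Exclusive-OR example below rather than random values, so Step 1 is shown only in a comment; the function name and structure are illustrative assumptions:

```python
import math

def sigmoid(x):
    # Sigmoid activation function: sigmoid(X) = 1 / (1 + e^-X)
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(alpha=0.1, target_sse=0.001, max_epochs=500_000):
    # Step 1: Initialisation. The algorithm calls for random weights uniformly
    # distributed in a small range; here we use the fixed initial values from
    # the XOR example instead, so the run is deterministic.
    w13, w14, w23, w24 = 0.5, 0.9, 0.4, 1.0   # input -> hidden weights
    w35, w45 = -1.2, 1.1                      # hidden -> output weights
    th3, th4, th5 = 0.8, -0.1, 0.3            # thresholds θ3, θ4, θ5
    data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # XOR truth table
    for epoch in range(1, max_epochs + 1):
        sse = 0.0
        for x1, x2, yd5 in data:
            # Step 2: Activation (forward pass)
            y3 = sigmoid(x1 * w13 + x2 * w23 - th3)
            y4 = sigmoid(x1 * w14 + x2 * w24 - th4)
            y5 = sigmoid(y3 * w35 + y4 * w45 - th5)
            e = yd5 - y5
            sse += e * e
            # Step 3: Weight training (backward pass)
            d5 = y5 * (1 - y5) * e           # output-layer error gradient
            d3 = y3 * (1 - y3) * d5 * w35    # hidden-layer error gradients
            d4 = y4 * (1 - y4) * d5 * w45    # (computed before any update)
            w35 += alpha * y3 * d5
            w45 += alpha * y4 * d5
            th5 += alpha * (-1) * d5
            w13 += alpha * x1 * d3
            w23 += alpha * x2 * d3
            th3 += alpha * (-1) * d3
            w14 += alpha * x1 * d4
            w24 += alpha * x2 * d4
            th4 += alpha * (-1) * d4
        # Step 4: Iteration -- repeat until the error criterion is satisfied
        if sse < target_sse:
            return epoch, sse
    return max_epochs, sse
```

Note that the hidden-layer gradients d3 and d4 are computed from the old values of w35 and w45, before those weights are corrected, matching the order of Steps 3(a) and 3(b).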
Example: Three-layer network for solving the Exclusive-OR operation
The effect of the threshold applied to a neuron in the hidden or output layer is represented by its weight, θ, connected to a fixed input equal to -1.
The initial weights and threshold levels are set randomly as follows:
w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = -1.2, w45 = 1.1, θ3 = 0.8, θ4 = -0.1 and θ5 = 0.3.
We consider a training set where inputs x1 and x2 are equal to 1 and the desired output yd,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as
y3 = sigmoid(x1·w13 + x2·w23 - θ3) = 1 / [1 + e^-(1·0.5 + 1·0.4 - 0.8)] = 0.5250
y4 = sigmoid(x1·w14 + x2·w24 - θ4) = 1 / [1 + e^-(1·0.9 + 1·1.0 + 0.1)] = 0.8808
Then the actual output of neuron 5 in the output layer is determined as
y5 = sigmoid(y3·w35 + y4·w45 - θ5) = 1 / [1 + e^-(-0.5250·1.2 + 0.8808·1.1 - 0.3)] = 0.5097
and the following error is obtained:
e = yd,5 - y5 = 0 - 0.5097 = -0.5097
The next step is weight training. First, we calculate the error gradient for neuron 5 in the output layer:
δ5 = y5 · (1 - y5) · e = 0.5097 · (1 - 0.5097) · (-0.5097) = -0.1274
Then we determine the weight corrections, assuming that the learning rate parameter, α, is equal to 0.1:
Δw35 = α · y3 · δ5 = 0.1 · 0.5250 · (-0.1274) = -0.0067
Δw45 = α · y4 · δ5 = 0.1 · 0.8808 · (-0.1274) = -0.0112
Δθ5 = α · (-1) · δ5 = 0.1 · (-1) · (-0.1274) = 0.0127
Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:
δ3 = y3 · (1 - y3) · δ5 · w35 = 0.5250 · (1 - 0.5250) · (-0.1274) · (-1.2) = 0.0381
δ4 = y4 · (1 - y4) · δ5 · w45 = 0.8808 · (1 - 0.8808) · (-0.1274) · 1.1 = -0.0147
We then determine the weight corrections:
Δw13 = α · x1 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
Δw23 = α · x2 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
Δθ3 = α · (-1) · δ3 = -0.0038
Δw14 = α · x1 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
Δw24 = α · x2 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
Δθ4 = α · (-1) · δ4 = 0.0015
At last, we update all weights and thresholds:
w13 = 0.5 + 0.0038 = 0.5038
w14 = 0.9 - 0.0015 = 0.8985
w23 = 0.4 + 0.0038 = 0.4038
w24 = 1.0 - 0.0015 = 0.9985
w35 = -1.2 - 0.0067 = -1.2067
w45 = 1.1 - 0.0112 = 1.0888
θ3 = 0.8 - 0.0038 = 0.7962
θ4 = -0.1 + 0.0015 = -0.0985
θ5 = 0.3 + 0.0127 = 0.3127
The training process is repeated until the sum of squared errors is less than 0.001.