•A single-layer perceptron can only express linear decision surfaces
![](https://img-blog.csdn.net/20160311143002702?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143007124?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143010796?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143016010?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143021124?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143031869?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143036526?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143039828?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
![](https://img-blog.csdn.net/20160311143058500?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
•We can build a multilayer network to represent highly nonlinear decision surfaces
Sigmoid Unit
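The sigmoid unit's activation, and the derivative that backpropagation reuses, can be sketched as follows (a minimal illustration; the function names are my own):

```python
import math

def sigmoid(x):
    # Squashes any real input into (0, 1); this smooth, differentiable
    # thresholding is what lets gradients flow through the unit.
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(o):
    # The derivative expressed in terms of the unit's output o = sigmoid(x):
    # sigma'(x) = o * (1 - o). Backpropagation uses exactly this form,
    # so the forward-pass outputs can be reused without recomputing x.
    return o * (1.0 - o)
```

Note that the derivative is largest at o = 0.5 and approaches zero as the unit saturates toward 0 or 1, which is why saturated units learn slowly.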
Back-propagation Algorithm
•For each training example, training involves the following steps
Step 1: Present the training sample, calculate the outputs
(Initially, the weight vector of every unit in each layer is set to small random values.)
Step 2: For each output unit k, calculate
(for each unit in the output layer)
Step 3: For hidden unit h, calculate
(for each unit in the hidden layer)
(where k ranges over the units in the output layer, or in the next layer downstream)
Step 4: Update the output layer weights, wh,k
(update the weight vector of each output-layer unit) where oh is the output of hidden unit h
Step 5: Update the hidden layer weights, wi,h
(update the weight vector of each hidden-layer unit)
Iterate over the training examples in this fashion.
Because the weight vectors are updated by gradient descent, we may reach only a local minimum rather than the global minimum; in practice, however, neural networks still perform well.
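The five steps above can be sketched in pure Python on the XOR task, which a single-layer perceptron cannot express. The network sizes, learning rate, epoch count, and the added bias weights are all illustrative assumptions not fixed by the text; the error terms follow the delta rules of Steps 2 and 3.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
n_in, n_hid, n_out = 2, 2, 1
eta = 0.5  # learning rate (assumed value)

# Small random initial weights; the extra column is a bias weight,
# an addition not spelled out in the text but standard for this network.
W_h = [[random.uniform(-0.05, 0.05) for _ in range(n_in + 1)] for _ in range(n_hid)]
W_o = [[random.uniform(-0.05, 0.05) for _ in range(n_hid + 1)] for _ in range(n_out)]

# XOR training set: (inputs, targets)
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.1, 1.0], [0.0])]

def forward(x):
    # Step 1: present the sample and compute every unit's output
    xb = x + [1.0]  # append constant 1 for the bias weight
    o_h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in W_h]
    hb = o_h + [1.0]
    o_k = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in W_o]
    return o_h, o_k

def total_error():
    return sum((tk - ok) ** 2
               for x, t in data
               for tk, ok in zip(t, forward(x)[1]))

mse_before = total_error()

for _ in range(20000):
    for x, t in data:
        o_h, o_k = forward(x)
        # Step 2: error term for each output unit k
        delta_k = [o * (1 - o) * (tk - o) for o, tk in zip(o_k, t)]
        # Step 3: error term for each hidden unit h, summing
        # delta_k over the downstream units k it feeds into
        delta_h = [oh * (1 - oh) * sum(W_o[k][h] * delta_k[k]
                                       for k in range(n_out))
                   for h, oh in enumerate(o_h)]
        # Step 4: update the output-layer weights w_{h,k}
        hb = o_h + [1.0]
        for k in range(n_out):
            for j, v in enumerate(hb):
                W_o[k][j] += eta * delta_k[k] * v
        # Step 5: update the hidden-layer weights w_{i,h}
        xb = x + [1.0]
        for h in range(n_hid):
            for i, v in enumerate(xb):
                W_h[h][i] += eta * delta_h[h] * v

mse_after = total_error()
```

After the loop, `mse_after` should be well below `mse_before`; with this few hidden units, gradient descent can still land in a local minimum, which is exactly the caveat noted below.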