# An Improved Cost Function: Cross-Entropy Makes Neural Networks Learn Faster

## How Does a Neural Network Learn?

### Why does a neural network learn very slowly at first and then speed up?

Consider a single sigmoid neuron with one input and the quadratic cost:

$$C = \frac{(y-a)^2}{2}$$

$$\begin{aligned}
\frac{\partial C}{\partial w} &= (a-y)\,\sigma'(z)\,x = a\,\sigma'(z) \quad\text{(1)}\\
\frac{\partial C}{\partial b} &= (a-y)\,\sigma'(z) = a\,\sigma'(z) \quad\text{(2)}
\end{aligned}$$

(substituting $x=1$, $y=0$). Both gradients carry a factor of $\sigma'(z)$, which is close to zero when the neuron saturates, so learning starts out slow.
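The slowdown can be checked numerically. A minimal sketch (plain Python, not from the book's code) comparing the single-neuron gradient $a\,\sigma'(z)$ at a moderate and a saturated $z$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Single neuron with x = 1, y = 0, so dC/dw = a * sigmoid_prime(z), z = w + b.
# The second (w, b) pair pushes z deep into the saturated region.
for w, b in [(0.6, 0.9), (2.0, 2.0)]:
    z = w + b
    a = sigmoid(z)
    grad_w = a * sigmoid_prime(z)
    print(f"w={w}, b={b}: a={a:.4f}, dC/dw={grad_w:.4f}")
```

The saturated neuron's gradient is several times smaller, even though its output is further from the target, which is exactly the slow-then-fast learning curve the section describes.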

## 介绍cross-entropy 损失函数（cost function）

$$C = -\frac{1}{n}\sum_x \left[\, y\ln a + (1-y)\ln(1-a) \,\right] \quad\text{(3)}$$
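As a quick sanity check (an illustrative snippet, not part of the original notes), the per-example cost is near zero when $a \approx y$ and large when the prediction is confidently wrong:

```python
import math

def cross_entropy(a, y):
    # Per-example cross-entropy for a single sigmoid output a and label y.
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

print(cross_entropy(0.99, 1))  # nearly correct prediction -> cost close to 0
print(cross_entropy(0.01, 1))  # confidently wrong prediction -> large cost
```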

Substituting $a = \sigma(z)$ into the above:

$$C = -\frac{1}{n}\sum_x \left[\, y\ln\sigma(z) + (1-y)\ln\left(1-\sigma(z)\right) \,\right] \quad\text{(4)}$$

Taking the partial derivative with respect to $w_j$:

$$\begin{aligned}
\frac{\partial C}{\partial w_j} &= -\frac{1}{n}\sum_x \left(\frac{y}{\sigma(z)} - \frac{1-y}{1-\sigma(z)}\right)\frac{\partial \sigma}{\partial w_j} \quad\text{(5)}\\
&= -\frac{1}{n}\sum_x \left(\frac{y}{\sigma(z)} - \frac{1-y}{1-\sigma(z)}\right)\sigma'(z)\,x_j \quad\text{(6)}\\
&= \frac{1}{n}\sum_x \frac{\sigma'(z)\,x_j}{\sigma(z)\left(1-\sigma(z)\right)}\left(\sigma(z) - y\right) \quad\text{(7, putting terms over a common denominator)}
\end{aligned}$$

Since $\sigma'(z) = \sigma(z)\left(1-\sigma(z)\right)$, the $\sigma'(z)$ factor cancels:

$$\frac{\partial C}{\partial w_j} = \frac{1}{n}\sum_x x_j\left(\sigma(z) - y\right) = \frac{1}{n}\sum_x x_j(a-y) \quad\text{(8)}$$

$$\frac{\partial C}{\partial b} = \frac{1}{n}\sum_x \left(\sigma(z) - y\right) = \frac{1}{n}\sum_x (a-y) \quad\text{(9)}$$
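To see the difference this makes, here is an illustrative single-neuron comparison (with $x=1$; not code from the book) of the two weight gradients for a saturated neuron whose output is badly wrong:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A badly wrong, saturated neuron: y = 0 but a = sigma(z) is close to 1.
z, y, x = 4.0, 0.0, 1.0
a = sigmoid(z)

quad_grad = (a - y) * a * (1 - a) * x  # quadratic cost: (a-y) * sigma'(z) * x
ce_grad = (a - y) * x                  # cross-entropy: sigma'(z) cancels

print(f"quadratic dC/dw     = {quad_grad:.4f}")  # tiny -> slow learning
print(f"cross-entropy dC/dw = {ce_grad:.4f}")    # proportional to the error
```

With cross-entropy, the gradient is driven directly by the error $a - y$, so the worse the prediction, the faster the neuron learns.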

The advantage of the cross-entropy cost is that its gradient is proportional to the error $a - y$: the $\sigma'(z)$ factor cancels out, so a saturated neuron that is badly wrong still learns quickly.

## Handwritten Digit Recognition with Cross-Entropy

```python
# coding=utf-8
'''
Created on 2018-05-14

@author: devkite
'''
# Test the effect of the cross-entropy cost.
# Cross-entropy avoids the learning slowdown and, compared with the
# quadratic cost, learns better.
import mnist_loader
import network2

# Load the dataset (using the book's mnist_loader helper)
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

# Use cross-entropy as the cost function
net = network2.Network([784, 30, 10], cost=network2.CrossEntropyCost)
# Initialize weights and biases the same way as before; a new
# initialization scheme is introduced in a later chapter, which is
# why the method got a new name.
net.large_weight_initializer()
net.SGD(training_data, 30, 10, 0.5,
        evaluation_data=test_data,
        monitor_evaluation_accuracy=True)
```

## Summary

For sigmoid output neurons, the cross-entropy cost is almost always a better choice than the quadratic cost.