weight_initialization
Occam's razor: the simplest, one-size-fits-all approach is to set every weight to the same constant, 0 or 1.
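As a minimal sketch of what constant initialization looks like in practice, the NumPy snippet below fills a weight matrix with a single value and passes a small random batch through one layer. The helper name `constant_init`, the layer sizes, and the `tanh` activation are illustrative assumptions, not part of the original notes. It shows one known consequence of this choice: every hidden unit computes the same output.

```python
import numpy as np

def constant_init(n_in, n_out, value):
    """Hypothetical helper: an (n_in, n_out) weight matrix filled with one constant."""
    return np.full((n_in, n_out), float(value))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))       # 4 samples, 3 input features (assumed sizes)
W = constant_init(3, 5, 1.0)      # every weight is 1
hidden = np.tanh(x @ W)           # activations of 5 hidden units

# Compare every column (hidden unit) against the first one.
units_identical = np.allclose(hidden, hidden[:, :1])
print(units_identical)
```

Because all columns of `W` are equal, the hidden units are indistinguishable from one another, which is the usual motivation for moving to small random weights.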
General rule for setting weights
The general rule for setting the weights in a neural network is to set them to be close to zero without being too small.
Good practice is to start your weights in the range of [-y, y], where y = 1/\sqrt{n} (n is the number of inputs to a given neuron).
Uniform Distribution:
Draw the weights uniformly from the interval [-y, y], where y = 1/\sqrt{n}.
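The uniform rule can be sketched in NumPy as follows. The helper name `uniform_init` and the layer sizes (784 inputs, 128 units) are assumptions chosen for illustration; only the bound y = 1/\sqrt{n} comes from the notes.

```python
import numpy as np

def uniform_init(n_in, n_out, rng):
    """Hypothetical helper: weights drawn uniformly from [-y, y), y = 1/sqrt(n_in)."""
    y = 1.0 / np.sqrt(n_in)       # n_in is the number of inputs to each neuron
    return rng.uniform(-y, y, size=(n_in, n_out))

rng = np.random.default_rng(0)
W_uniform = uniform_init(784, 128, rng)   # assumed layer sizes
print(W_uniform.min(), W_uniform.max())
```

With 784 inputs the bound is 1/28, so every weight falls within roughly ±0.0357.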
Normal Distribution:
Draw the weights from a normal distribution with a mean of 0 and a standard deviation of y = 1/\sqrt{n}.
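A matching sketch for the normal rule, again in NumPy. The helper name `normal_init` and the layer sizes are illustrative assumptions; the mean of 0 and the standard deviation of 1/\sqrt{n} follow the notes.

```python
import numpy as np

def normal_init(n_in, n_out, rng):
    """Hypothetical helper: weights drawn from N(0, y^2), y = 1/sqrt(n_in)."""
    y = 1.0 / np.sqrt(n_in)       # standard deviation, not a hard bound
    return rng.normal(loc=0.0, scale=y, size=(n_in, n_out))

rng = np.random.default_rng(0)
W_normal = normal_init(784, 128, rng)     # assumed layer sizes
print(W_normal.mean(), W_normal.std())
```

Unlike the uniform case, y is a standard deviation here, so a small fraction of weights will fall outside [-y, y].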
No explicit initialization
Use special network structures to reduce the influence of initialization:
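For example, batch normalization (BN) automatically brings each layer's activations close to a mean of 0 and a standard deviation of 1, which largely removes the dependence on the initial weights.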
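To illustrate the batch-normalization point, the sketch below normalizes the pre-activations of one layer over the batch for three very different weight scales. It is a simplified, assumed version of BN: the learned scale and shift parameters and the running statistics used at inference time are omitted, and the function name `batch_norm`, the batch size, and the layer sizes are hypothetical.

```python
import numpy as np

def batch_norm(z, eps=1e-5):
    """Simplified BN: normalize each feature (column) over the batch.
    Learned scale/shift and running statistics are intentionally left out."""
    mean = z.mean(axis=0)
    var = z.var(axis=0)
    return (z - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 64))            # assumed batch of 256 samples, 64 features

stats = {}
for scale in (0.01, 1.0, 10.0):           # deliberately mismatched weight scales
    W = rng.normal(0.0, scale, size=(64, 32))
    z_hat = batch_norm(x @ W)
    stats[scale] = (float(z_hat.mean()), float(z_hat.std()))

for scale, (m, s) in stats.items():
    print(scale, m, s)
```

Whatever the initial weight scale, the normalized activations come out with a mean near 0 and a standard deviation near 1, which is why the choice of initializer matters less in networks that use BN.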