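1. Xavier initialization
Suited to tanh and sigmoid activations.
The basic idea of Xavier initialization is to keep the variance of a layer's outputs equal to the variance of its inputs, which prevents the activations from all shrinking toward 0 as they pass through the network. Its derivation assumes a linear activation, so it is not well suited to ReLU.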
W = tf.Variable(np.random.randn(node_in, node_out) / np.sqrt(node_in))
node_in: size of the layer's input (number of input units)
node_out: size of the layer's output (number of output units)
TensorFlow implementation:
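The variance-preserving effect can be checked numerically. The following is a minimal NumPy sketch rather than anything from the original notes: the helper name final_activation_std, the depth, width, batch size, and the comparison value 0.01 are all illustrative assumptions. It pushes unit-variance inputs through a stack of tanh layers and reports the standard deviation of the last layer's activations, once with the 1/sqrt(node_in) scaling and once with a small fixed weight scale.

```python
import numpy as np

def final_activation_std(weight_std, depth=10, width=256, batch=1000, seed=0):
    """Std of the activations after `depth` tanh layers (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))  # unit-variance inputs
    for _ in range(depth):
        # Weights drawn from N(0, weight_std**2); node_in == node_out == width here.
        w = rng.standard_normal((width, width)) * weight_std
        x = np.tanh(x @ w)
    return float(x.std())

width = 256
# Xavier-style scaling from the text: std = 1 / sqrt(node_in)
xavier_std = final_activation_std(weight_std=1.0 / np.sqrt(width), width=width)
# Assumed comparison point: a small fixed std, under which activations collapse toward 0
tiny_std = final_activation_std(weight_std=0.01, width=width)

print("Xavier-scaled:", xavier_std)
print("std=0.01:     ", tiny_std)
```

Under these assumptions the Xavier-scaled activations should stay in a usable range, while the small fixed scale drives them many orders of magnitude closer to 0.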
tf.contrib.layers.xavier_initializer(
uniform=True,
seed=None,
dtype=tf.dtypes.float32
)
Args:
uniform: Whether to use uniform or normal distributed random initialization.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: The data type. Only floating point types are supported.
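2. He initialization
A variant of Xavier initialization, suited to ReLU.
The idea behind He initialization: in a ReLU network, assume that half of the neurons in each layer are active and the other half output 0. To keep the variance unchanged, it is then enough to divide node_in by 2 in the Xavier scaling, which doubles the weight variance.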
W = tf.Variable(np.random.randn(node_in, node_out) / np.sqrt(node_in / 2))
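The factor of 2 can be seen in a small experiment. This is a NumPy sketch under assumed settings (the helper name final_mean_square, 10 layers of width 256, batch of 1000), not the original author's code. It tracks the mean of the squared activations through a stack of ReLU layers, with weight variance 2/node_in (He) and 1/node_in (the Xavier-style scaling from section 1).

```python
import numpy as np

def final_mean_square(var_scale, depth=10, width=256, batch=1000, seed=0):
    """Mean of x**2 after `depth` ReLU layers (hypothetical helper).

    Weights are drawn from N(0, var_scale / node_in).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))  # mean square of the inputs is about 1
    for _ in range(depth):
        node_in, node_out = width, width
        w = rng.standard_normal((node_in, node_out)) * np.sqrt(var_scale / node_in)
        x = np.maximum(x @ w, 0.0)  # ReLU
    return float(np.mean(x ** 2))

he_ms = final_mean_square(var_scale=2.0)      # same as dividing by sqrt(node_in / 2)
xavier_ms = final_mean_square(var_scale=1.0)  # same as dividing by sqrt(node_in)

print("He scaling:    ", he_ms)
print("Xavier scaling:", xavier_ms)
```

With the He scaling the mean square should stay on the order of 1, whereas with the 1/node_in scaling it is roughly halved at every ReLU layer and ends up far smaller.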
TensorFlow implementation:
tf.contrib.layers.variance_scaling_initializer(
factor=2.0,
mode='FAN_IN',
uniform=False,
seed=None,
dtype=tf.dtypes.float32
)
Args:
factor: Float. A multiplicative factor.
mode: String. 'FAN_IN', 'FAN_OUT', 'FAN_AVG'.
uniform: Whether to use uniform or normal distributed random initialization.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: The data type. Only floating point types are supported.
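The default arguments of variance_scaling_initializer correspond to He initialization; changing them to factor=1.0, mode='FAN_AVG', uniform=True gives Xavier initialization.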
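To make the roles of factor, mode, and uniform concrete, here is a NumPy sketch of how the three arguments might combine for a 2-D weight matrix. The function variance_scaling_sample is a hypothetical stand-in, not the TensorFlow implementation. In particular, it assumes a plain normal distribution in the non-uniform case, whereas the TensorFlow initializer samples from a truncated normal.

```python
import math
import numpy as np

def variance_scaling_sample(shape, factor=2.0, mode="FAN_IN", uniform=False, seed=None):
    """Hypothetical NumPy stand-in for a variance-scaling initializer (2-D shapes only)."""
    fan_in, fan_out = shape
    if mode == "FAN_IN":
        n = fan_in
    elif mode == "FAN_OUT":
        n = fan_out
    elif mode == "FAN_AVG":
        n = (fan_in + fan_out) / 2.0
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    rng = np.random.default_rng(seed)
    if uniform:
        # U(-limit, limit) has variance limit**2 / 3 = factor / n
        limit = math.sqrt(3.0 * factor / n)
        return rng.uniform(-limit, limit, size=shape)
    # Simplifying assumption: plain (untruncated) normal with variance factor / n
    return rng.normal(0.0, math.sqrt(factor / n), size=shape)

# Defaults: He-style weights, variance 2 / fan_in
he_w = variance_scaling_sample((400, 200), seed=0)
# Xavier-style weights, variance 1 / ((fan_in + fan_out) / 2)
xavier_w = variance_scaling_sample((400, 200), factor=1.0, mode="FAN_AVG",
                                   uniform=True, seed=0)

print(he_w.var(), 2.0 / 400)
print(xavier_w.var(), 1.0 / 300)
```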
3. Truncated normal distribution
TensorFlow implementation:
tf.initializers.truncated_normal(
mean=0.0,
stddev=1.0,
seed=None,
dtype=tf.dtypes.float32
)
Args:
mean: a python scalar or a scalar tensor. Mean of the random values to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer. Only floating point types are supported.
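A truncated normal initializer behaves like a normal one, except that values more than two standard deviations from the mean are discarded and redrawn. The NumPy sketch below illustrates that rule. The function truncated_normal_sample, the shape, and stddev=0.1 are illustrative assumptions, and the code is not how TensorFlow implements the op internally.

```python
import numpy as np

def truncated_normal_sample(shape, mean=0.0, stddev=1.0, seed=None):
    """Hypothetical sketch: redraw any value farther than 2 * stddev from the mean."""
    rng = np.random.default_rng(seed)
    values = rng.normal(mean, stddev, size=shape)
    out_of_range = np.abs(values - mean) > 2.0 * stddev
    while out_of_range.any():
        # Redraw only the rejected entries and check them again.
        values[out_of_range] = rng.normal(mean, stddev, size=int(out_of_range.sum()))
        out_of_range = np.abs(values - mean) > 2.0 * stddev
    return values

w = truncated_normal_sample((256, 128), stddev=0.1, seed=0)
print("max |w|:", np.abs(w).max())  # never exceeds 2 * stddev
print("std:    ", w.std())          # somewhat below stddev because the tails are cut
```

Cutting the tails keeps a few extreme weights from saturating tanh or sigmoid units at the start of training, at the cost of a slightly smaller effective standard deviation.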
4. Normal distribution
TensorFlow implementation:
tf.initializers.random_normal(
mean=0.0,
stddev=1.0,
seed=None,
dtype=tf.dtypes.float32
)
Args:
mean: a python scalar or a scalar tensor. Mean of the random values to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer. Only floating point types are supported.
5. Uniform distribution
TensorFlow implementation:
tf.initializers.random_uniform(
minval=0,
maxval=None,
seed=None,
dtype=tf.dtypes.float32
)
Args:
minval: A python scalar or a scalar tensor. Lower bound of the range of random values to generate.
maxval: A python scalar or a scalar tensor. Upper bound of the range of random values to generate. Defaults to 1 for float types.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer.
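With the defaults shown above (minval=0, maxval defaulting to 1 for floats), all sampled values are non-negative, so in practice a symmetric range around 0 is usually chosen. Since U(-a, a) has variance a^2 / 3, the bounds can be derived from a target standard deviation. The following NumPy sketch shows that arithmetic; the helper uniform_limit_for_std and the layer sizes are assumptions made for illustration.

```python
import numpy as np

def uniform_limit_for_std(target_std):
    """Half-width a such that U(-a, a) has standard deviation target_std."""
    return float(np.sqrt(3.0) * target_std)

node_in, node_out = 300, 100  # assumed layer sizes
# Target the 1 / sqrt(node_in) scale used in section 1
limit = uniform_limit_for_std(1.0 / np.sqrt(node_in))

rng = np.random.default_rng(0)
w = rng.uniform(-limit, limit, size=(node_in, node_out))

print("limit:", limit)
print("std:  ", w.std(), "target:", 1.0 / np.sqrt(node_in))
```

The resulting limit would then be passed as minval=-limit and maxval=limit to the initializer.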