Parameter Initialization in Deep Learning

1. Xavier initialization

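Suitable for tanh and sigmoid.
The basic idea of Xavier initialization is to keep the variance of each layer's output equal to the variance of its input, which prevents the outputs from all shrinking toward 0 as the network gets deeper. The derivation assumes a linear activation, so it does not carry over to ReLU.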

W = tf.Variable(np.random.randn(node_in, node_out) / np.sqrt(node_in))
node_in: size of the input layer (fan-in)
node_out: size of the output layer (fan-out)
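To make the variance argument concrete, here is a minimal NumPy-only sketch (no TensorFlow needed). It pushes unit-variance inputs through a stack of tanh layers and records the standard deviation of each layer's output, once with a fixed small weight scale and once with the 1/sqrt(node_in) scale from the formula above. The helper name forward_std, the layer width, the depth, and the 0.01 baseline are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_std(scale_fn, n_layers=10, width=500, batch=1000):
    """Push unit-variance inputs through tanh layers; return each layer's output std."""
    x = rng.standard_normal((batch, width))
    stds = []
    for _ in range(n_layers):
        # Weights are standard normal draws multiplied by a scale that depends on fan-in.
        w = rng.standard_normal((width, width)) * scale_fn(width)
        x = np.tanh(x @ w)
        stds.append(float(x.std()))
    return stds

small = forward_std(lambda n: 0.01)               # fixed small scale (illustrative baseline)
xavier = forward_std(lambda n: 1.0 / np.sqrt(n))  # 1/sqrt(node_in), as in the formula above

print("small  :", ["%.2e" % s for s in small])
print("xavier :", ["%.2e" % s for s in xavier])
```

With the fixed small scale the activations should collapse toward 0 within a few layers, while the 1/sqrt(node_in) scale keeps them at a usable magnitude across all ten layers.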

TensorFlow implementation:

tf.contrib.layers.xavier_initializer(
    uniform=True,
    seed=None,
    dtype=tf.dtypes.float32
)

Args:
uniform: Whether to use uniform or normal distributed random initialization.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: The data type. Only floating point types are supported.

2. He initialization

A variant of Xavier initialization, suitable for ReLU.

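The idea behind He initialization: in a ReLU network, assume that in each layer half of the neurons are activated and the other half output 0. To keep the variance unchanged, it is then enough to divide the fan-in inside Xavier's square root by 2, that is, to scale by 1/sqrt(node_in/2) instead of 1/sqrt(node_in).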

W = tf.Variable(np.random.randn(node_in, node_out) / np.sqrt(node_in / 2))
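The effect of the extra factor of 2 can be checked with a small NumPy-only experiment. The sketch below is illustrative only: the helper relu_forward_std, the width, and the depth are assumptions. It runs the same stack of ReLU layers with the Xavier-style scale 1/sqrt(node_in) and with the He scale 1/sqrt(node_in/2).

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_forward_std(scale_fn, n_layers=10, width=500, batch=1000):
    """Push unit-variance inputs through ReLU layers; return each layer's output std."""
    x = rng.standard_normal((batch, width))
    stds = []
    for _ in range(n_layers):
        w = rng.standard_normal((width, width)) * scale_fn(width)
        x = np.maximum(x @ w, 0.0)  # ReLU zeroes roughly half of the pre-activations
        stds.append(float(x.std()))
    return stds

xavier = relu_forward_std(lambda n: 1.0 / np.sqrt(n))      # Xavier-style scale
he = relu_forward_std(lambda n: 1.0 / np.sqrt(n / 2.0))    # He scale

print("xavier :", [round(s, 4) for s in xavier])
print("he     :", [round(s, 4) for s in he])
```

Under the Xavier-style scale the mean squared activation is roughly halved at every ReLU layer, so the signal fades with depth; with the He scale it should stay roughly constant.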

TensorFlow implementation:

tf.contrib.layers.variance_scaling_initializer(
    factor=2.0,
    mode='FAN_IN',
    uniform=False,
    seed=None,
    dtype=tf.dtypes.float32
)

Args:
factor: Float. A multiplicative factor.
mode: String. 'FAN_IN', 'FAN_OUT', 'FAN_AVG'.
uniform: Whether to use uniform or normal distributed random initialization.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: The data type. Only floating point types are supported.

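The default arguments of variance_scaling_initializer correspond to He initialization; changing them (for example factor=1.0, mode='FAN_AVG', uniform=True) yields Xavier initialization.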
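How factor, mode, and uniform interact can be sketched in plain NumPy. The function below is a hypothetical re-implementation for illustration, not the TensorFlow source: it assumes 2-D weight shapes and, as a simplification, draws from an untruncated normal with variance factor/n in the non-uniform case.

```python
import numpy as np

def variance_scaling(shape, factor=2.0, mode="FAN_IN", uniform=False, seed=None):
    """Illustrative sketch: sample 2-D weights whose variance is factor / n."""
    fan_in, fan_out = shape
    # n is the fan count selected by `mode`.
    n = {"FAN_IN": fan_in,
         "FAN_OUT": fan_out,
         "FAN_AVG": (fan_in + fan_out) / 2.0}[mode]
    rng = np.random.default_rng(seed)
    if uniform:
        # U(-limit, limit) has variance limit**2 / 3 = factor / n.
        limit = np.sqrt(3.0 * factor / n)
        return rng.uniform(-limit, limit, size=shape)
    # Simplification: plain normal with variance factor / n (no truncation).
    return rng.normal(0.0, np.sqrt(factor / n), size=shape)

# Defaults (factor=2.0, FAN_IN, normal) give He-style scaling.
he = variance_scaling((400, 200), seed=0)
# factor=1.0, FAN_AVG, uniform gives Xavier-style scaling.
xavier = variance_scaling((400, 200), factor=1.0, mode="FAN_AVG", uniform=True, seed=0)

print("he var    :", he.var(), "target", 2.0 / 400)
print("xavier var:", xavier.var(), "target", 1.0 / 300)
```

The only thing that changes between the two schemes in this sketch is which fan count is used and the multiplicative factor; the choice between uniform and normal does not change the target variance.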


3. Truncated normal distribution

TensorFlow implementation:

tf.initializers.truncated_normal(
    mean=0.0,
    stddev=1.0,
    seed=None,
    dtype=tf.dtypes.float32
)

Args:
mean: a python scalar or a scalar tensor. Mean of the random values to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer. Only floating point types are supported.
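For intuition about what "truncated" means here: TensorFlow's truncated normal discards and redraws values that fall more than two standard deviations from the mean. The NumPy sketch below imitates that behaviour by simple resampling; the function name and the loop are an illustrative assumption, not TensorFlow's actual implementation.

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Illustrative sketch: redraw any sample farther than 2 * stddev from the mean."""
    rng = np.random.default_rng(seed)
    out = rng.normal(mean, stddev, size=shape)
    bad = np.abs(out - mean) > 2.0 * stddev
    while bad.any():
        # Replace only the out-of-range entries with fresh draws.
        out[bad] = rng.normal(mean, stddev, size=int(bad.sum()))
        bad = np.abs(out - mean) > 2.0 * stddev
    return out

w = truncated_normal((1000, 100), stddev=0.1, seed=0)
print("max |w|:", np.abs(w).max())   # never exceeds 2 * stddev = 0.2
print("std    :", w.std())           # somewhat below 0.1 because the tails are cut
```

Cutting the tails removes rare, very large initial weights, at the cost of an empirical standard deviation slightly smaller than the stddev argument.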

4. Normal distribution

TensorFlow implementation:

tf.initializers.random_normal(
    mean=0.0,
    stddev=1.0,
    seed=None,
    dtype=tf.dtypes.float32
)

Args:
mean: a python scalar or a scalar tensor. Mean of the random values to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer. Only floating point types are supported.

5. Uniform distribution

TensorFlow implementation:

tf.initializers.random_uniform(
    minval=0,
    maxval=None,
    seed=None,
    dtype=tf.dtypes.float32
)

Args:
minval: A python scalar or a scalar tensor. Lower bound of the range of random values to generate.
maxval: A python scalar or a scalar tensor. Upper bound of the range of random values to generate. Defaults to 1 for float types.
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
dtype: Default data type, used if no dtype argument is provided when calling the initializer.
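When choosing between the plain normal and uniform initializers, it can help to compare them at the same variance. A uniform distribution on [-a, a] has variance a^2/3, so a = stddev * sqrt(3) matches a normal distribution with that stddev. The NumPy sketch below checks this; the target value 0.05 and the shape are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
target_std = 0.05      # arbitrary illustrative value
shape = (500, 200)

# Normal draws with the target standard deviation.
w_normal = rng.normal(0.0, target_std, size=shape)

# Uniform draws on [-limit, limit] with the same variance: limit**2 / 3 == target_std**2.
limit = target_std * np.sqrt(3.0)
w_uniform = rng.uniform(-limit, limit, size=shape)

print("normal  std:", w_normal.std(), " max |w|:", np.abs(w_normal).max())
print("uniform std:", w_uniform.std(), " max |w|:", np.abs(w_uniform).max())
```

Both samples have about the same spread, but the uniform one is strictly bounded by limit while the normal one occasionally produces much larger values.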

References:

https://www.tensorflow.org/api_docs/python/tf/initializers

https://zhuanlan.zhihu.com/p/25110150
