The following methods are commonly used to initialize the weights of a convolutional layer:
1. Random uniform distribution
The initializer is defined as:
class RandomUniform(Initializer):
  """Initializer that generates tensors with a uniform distribution.

  Args:
    minval: A python scalar or a scalar tensor. Lower bound of the range
      of random values to generate.
    maxval: A python scalar or a scalar tensor. Upper bound of the range
      of random values to generate. Defaults to 1 for float types.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type.
  """

  def __init__(self, minval=0, maxval=None, seed=None, dtype=dtypes.float32):
    self.minval = minval
    self.maxval = maxval
    self.seed = seed
    self.dtype = dtype

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.random_uniform(shape, self.minval, self.maxval,
                                     dtype, seed=self.seed)
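This initializes the parameter w with values drawn from a uniform distribution over the half-open range [minval, maxval): minval is included, maxval is excluded.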
2. Random normal distribution
The initializer is defined as:
class RandomNormal(Initializer):
  """Initializer that generates tensors with a normal distribution.

  Args:
    mean: a python scalar or a scalar tensor. Mean of the random values
      to generate.
    stddev: a python scalar or a scalar tensor. Standard deviation of the
      random values to generate.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type. Only floating point types are supported.
  """

  def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtype)

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.random_normal(shape, self.mean, self.stddev,
                                    dtype, seed=self.seed)
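This initializes the parameter w with values drawn from a Gaussian distribution with mean `mean` and standard deviation `stddev` (so the variance is stddev squared).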
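Because the second argument is a standard deviation, the variance of the generated values is stddev squared. The following is a minimal sketch in plain Python, not TensorFlow's implementation; sample_normal and sample_std are hypothetical helper names, and the standard library's random.gauss stands in for the TensorFlow op.

```python
import math
import random


def sample_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n values from a normal distribution N(mean, stddev^2)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stddev) for _ in range(n)]


def sample_std(values):
    """Population standard deviation of a list of floats."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


samples = sample_normal(20000, mean=0.0, stddev=0.02, seed=0)
empirical_std = sample_std(samples)   # close to stddev
empirical_var = empirical_std ** 2    # close to stddev squared
```

With stddev=0.02 the empirical standard deviation comes out near 0.02 and the empirical variance near 0.0004.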
3. Truncated normal distribution
The initializer is defined as:
class TruncatedNormal(Initializer):
  """Initializer that generates a truncated normal distribution.

  These values are similar to values from a `random_normal_initializer`
  except that values more than two standard deviations from the mean
  are discarded and re-drawn. This is the recommended initializer for
  neural network weights and filters.

  Args:
    mean: a python scalar or a scalar tensor. Mean of the random values
      to generate.
    stddev: a python scalar or a scalar tensor. Standard deviation of the
      random values to generate.
    seed: A Python integer. Used to create random seeds. See
      @{tf.set_random_seed}
      for behavior.
    dtype: The data type. Only floating point types are supported.
  """

  def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtype)

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.truncated_normal(shape, self.mean, self.stddev,
                                       dtype, seed=self.seed)
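Like Random Normal, Truncated Normal initializes the weights from a normal distribution, but any value more than two standard deviations from the mean is discarded and re-drawn. Truncated Normal is a commonly used initializer for neural network weights and filters.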
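The "discard and re-draw" rule can be illustrated with a simple rejection loop. This is only a sketch in plain Python under that reading of the docstring, not how TensorFlow implements the op; truncated_normal here is a hypothetical stand-alone function.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples from N(mean, stddev^2), re-drawing any sample that
    lies more than two standard deviations from the mean."""
    rng = random.Random(seed)
    lo, hi = mean - 2.0 * stddev, mean + 2.0 * stddev
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, stddev)
        if lo <= x <= hi:
            # Keep only values within two standard deviations of the mean.
            samples.append(x)
        # Otherwise the value is discarded and the loop draws again.
    return samples


weights = truncated_normal(1000, mean=0.0, stddev=0.02, seed=0)
```

Every value in weights lies in [-0.04, 0.04], so no single weight starts out as an extreme outlier.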
Example TensorFlow calls for the three initializers:
# Uniform distribution
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.random_uniform_initializer(minval=0.0, maxval=1.0))
# Normal distribution
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.random_normal_initializer(mean=m, stddev=stddev))
# Truncated normal distribution
w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                    initializer=tf.truncated_normal_initializer(mean=m, stddev=stddev))
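Identity initialization
In a CNN, we sometimes want to initialize the weights so that the previous layer's feature map is passed to the next layer unchanged. That is, for the convolution F2 = F1 ∗ w, we want to initialize the weight tensor w so that F2 = F1. Initializing w this way is called identity initialization.
A TensorFlow implementation of identity initialization: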
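import numpy as np
import tensorflow as tf

def identity_initializer():
    def _initializer(shape, dtype=tf.float32):
        if len(shape) == 1:
            # 1-D variables (e.g. biases) are set to zero
            return tf.constant(0., dtype=dtype, shape=shape)
        elif len(shape) == 2 and shape[0] == shape[1]:
            # square 2-D weights become the identity matrix
            return tf.constant(np.identity(shape[0]), dtype=dtype)
        elif len(shape) == 4 and shape[2] == shape[3]:
            # 4-D conv kernels: identity across channels at the spatial centre
            array = np.zeros(shape, dtype=float)
            cx, cy = shape[0] // 2, shape[1] // 2  # integer division for indexing
            for i in range(shape[2]):
                array[cx, cy, i, i] = 1
            return tf.constant(array, dtype=dtype)
        else:
            raise ValueError('unsupported shape for identity initialization: %s' % (shape,))
    return _initializer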
A shorter variant that handles only 4-D convolution kernels and also accepts a partition_info argument:
def identity_initializer():
    def _initializer(shape, dtype=tf.float32, partition_info=None):
        array = np.zeros(shape, dtype=float)
        cx, cy = shape[0] // 2, shape[1] // 2
        for i in range(shape[2]):
            array[cx, cy, i, i] = 1
        return tf.constant(array, dtype=dtype)
    return _initializer
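After initialization, every entry of the weight array is 0 except array[cx, cy, :, :], which is the identity matrix. For example, with shape=[3,3,8,8] we get cx = cy = 1, and array[1,1,:,:] is: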
[[ 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1.]]
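To check that a kernel of this form passes a feature map through unchanged, here is a framework-free sketch in plain Python. identity_kernel and conv2d_same are hypothetical helper names, not TensorFlow functions. conv2d_same assumes stride 1, an odd kernel size, and zero padding that keeps the spatial size, which is what 'SAME' padding does in that case.

```python
def identity_kernel(k_h, k_w, channels):
    """Nested list indexed [k_h][k_w][c_in][c_out]: all zeros except the
    centre tap, which holds the identity matrix over channels."""
    cx, cy = k_h // 2, k_w // 2
    kernel = [[[[0.0] * channels for _ in range(channels)]
               for _ in range(k_w)] for _ in range(k_h)]
    for i in range(channels):
        kernel[cx][cy][i][i] = 1.0
    return kernel


def conv2d_same(image, kernel):
    """Stride-1 convolution (cross-correlation, as in CNN layers) with zero
    padding. image is indexed [H][W][c_in]; output is [H][W][c_out]."""
    k_h, k_w = len(kernel), len(kernel[0])
    c_in, c_out = len(kernel[0][0]), len(kernel[0][0][0])
    h, w = len(image), len(image[0])
    pad_h, pad_w = k_h // 2, k_w // 2
    out = [[[0.0] * c_out for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(k_h):
                for dx in range(k_w):
                    sy, sx = y + dy - pad_h, x + dx - pad_w
                    if not (0 <= sy < h and 0 <= sx < w):
                        continue  # zero padding contributes nothing
                    for ci in range(c_in):
                        v = image[sy][sx][ci]
                        for co in range(c_out):
                            out[y][x][co] += v * kernel[dy][dx][ci][co]
    return out


# A small 4x5 feature map with 2 channels.
feature_map = [[[float(10 * y + x + c) for c in range(2)]
                for x in range(5)] for y in range(4)]
kernel = identity_kernel(3, 3, 2)
output = conv2d_same(feature_map, kernel)
```

Here output equals feature_map element by element. Note that this only works when the number of input and output channels is the same.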
Example call:
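import tensorflow.contrib.slim as slim

# gm (number of output channels), lrelu (activation) and nm (normalizer)
# are defined elsewhere in the calling code.
net = slim.conv2d(input, gm, [3, 3], rate=1, activation_fn=lrelu,
                  normalizer_fn=nm, weights_initializer=identity_initializer(),
                  scope='g_conv1')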