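Main parameters
units: positive integer, the dimensionality of the output space;
activation: the activation function to use. If none is specified, no activation is applied, i.e. the linear activation f(x) = x;
use_bias: boolean flag, whether the layer uses a bias vector;
kernel_initializer: initializer for the kernel weight matrix;
bias_initializer: initializer for the bias vector;
kernel_regularizer: regularizer function applied to the kernel weight matrix;
bias_regularizer: regularizer function applied to the bias vector;
activity_regularizer: regularizer function applied to the output of the layer (its "activation"); it is not an activation function itself;
kernel_constraint: constraint function applied to the kernel weight matrix;
bias_constraint: constraint function applied to the bias vector.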
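Input shape, output shape: the input is a tensor of rank 2 or higher with shape (batch_size, ..., input_dim); the output has shape (batch_size, ..., units).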
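Main functionality
Dense implements the operation:
output = activation(dot(input, kernel) + bias)
where activation is the element-wise activation function,
kernel is the weight matrix created by the layer, and
bias is the bias vector created by the layer; it only takes effect when `use_bias` is True.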
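As a rough illustration of the formula above, here is a minimal pure-Python sketch for a 2-D input. The names `dense_forward` and `relu` are made up for this example and are not part of Keras; the real layer works on tensors and uses the TensorFlow ops shown in the call method later on.

```python
def relu(x):
    """Rectified linear unit for a single number."""
    return x if x > 0 else 0.0


def dense_forward(inputs, kernel, bias=None, activation=None):
    """Compute activation(dot(inputs, kernel) + bias) for 2-D lists.

    inputs: list of rows, shape (batch, input_dim)
    kernel: list of rows, shape (input_dim, units)
    bias:   list of length units, or None (like use_bias=False)
    """
    columns = list(zip(*kernel))  # one tuple of weights per output unit
    outputs = []
    for row in inputs:
        out_row = [sum(x * w for x, w in zip(row, col)) for col in columns]
        if bias is not None:
            out_row = [o + b for o, b in zip(out_row, bias)]
        if activation is not None:
            out_row = [activation(o) for o in out_row]
        outputs.append(out_row)
    return outputs


x = [[1.0, 2.0]]              # batch of 1, input_dim = 2
kernel = [[1.0, 0.0, -2.0],
          [0.0, 1.0, 0.0]]    # input_dim = 2, units = 3
bias = [0.5, 0.5, 0.5]

linear = dense_forward(x, kernel, bias)           # no activation: f(x) = x
activated = dense_forward(x, kernel, bias, relu)  # negative values clipped
print(linear)     # [[1.5, 2.5, -1.5]]
print(activated)  # [[1.5, 2.5, 0.0]]
```

Leaving `activation` as None gives the linear result; passing `relu` only changes the negative entry.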
Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with `kernel`.
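That is, when the input has rank greater than 2, the docstring describes it as being flattened before the dot product with `kernel`. (I did not understand this at first and would welcome corrections.) Judging from the call code further down, this version does not literally reshape the input: tensordot contracts only the last axis of the input with axis 0 of `kernel`, so every leading axis is treated like a batch axis, and an input of shape (batch, d0, d1) produces an output of shape (batch, d0, units).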
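One way to make sense of the note is to compare two routes numerically. The sketch below uses NumPy rather than TensorFlow (an assumption made to keep it self-contained), and the shapes are arbitrary example values. It checks that contracting the last axis with tensordot gives the same numbers as flattening the leading axes, doing an ordinary 2-D matrix product, and reshaping back.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.normal(size=(4, 5, 16))   # rank 3: (batch, steps, features)
kernel = rng.normal(size=(16, 32))     # (features, units)

# Route 1: contract the last input axis with axis 0 of the kernel.
out_tensordot = np.tensordot(inputs, kernel, axes=[[2], [0]])

# Route 2: flatten the leading axes, do a plain 2-D matmul, reshape back.
flat = inputs.reshape(-1, 16)          # (4 * 5, 16)
out_flat = (flat @ kernel).reshape(4, 5, 32)

print(out_tensordot.shape)                   # (4, 5, 32)
print(np.allclose(out_tensordot, out_flat))  # True
```

So "flattened" can be read as "all leading axes are treated as one big batch axis"; the output keeps those axes and only the last one changes from 16 to 32.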
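Current understanding
The user creates a Dense layer and may set each parameter, each of which controls one piece of behaviour. Beginners can leave most of them at their defaults (None for many of them, i.e. not specified). Dense takes care of creating the layer's weights and performing the dot product, i.e. the computation, and passes an output of the specified shape through the activation function to the next layer. If no activation function is specified, the layer is a purely linear (affine) operation.
Two important methods: Dense.build and Dense.call.
build creates and initializes the kernel and bias; call performs the layer's computation (applying the activation to the result if one is set) and returns the output.
Example: two layers. The first Dense added specifies the input shape; no layer added after it needs input_shape to be specified by hand.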
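The split between build and call can be sketched with a toy class. This is a simplified stand-in, not the real Keras base layer: `TinyLayer`, `build_calls` and the zero initialization are all invented for the illustration. It only shows the idea that weights are created lazily the first time the layer is used, and that the `built` flag prevents them from being created again.

```python
class TinyLayer:
    """Toy stand-in for a Keras layer; not the real base class."""

    def __init__(self, units):
        self.units = units
        self.built = False
        self.build_calls = 0  # hypothetical counter, for illustration only

    def build(self, input_dim):
        # Create the weights once the input size is known.
        self.kernel = [[0.0] * self.units for _ in range(input_dim)]
        self.bias = [0.0] * self.units
        self.build_calls += 1
        self.built = True

    def call(self, inputs):
        # The actual computation: dot(inputs, kernel) + bias.
        columns = list(zip(*self.kernel))
        return [[sum(x * w for x, w in zip(row, col)) + b
                 for col, b in zip(columns, self.bias)]
                for row in inputs]

    def __call__(self, inputs):
        # Build lazily on first use, then delegate to call().
        if not self.built:
            self.build(input_dim=len(inputs[0]))
        return self.call(inputs)


layer = TinyLayer(units=3)
print(layer.built)        # False: no weights yet
out = layer([[1.0, 2.0]])
print(layer.built, out)   # True [[0.0, 0.0, 0.0]]
layer([[3.0, 4.0]])
print(layer.build_calls)  # 1: build ran only once
```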
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)
# after the first layer, you don't need to specify
# the size of the input anymore:
model.add(Dense(32))
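The number of trainable weights in the two layers above follows from the kernel and bias shapes. The helper below is a small arithmetic sketch; `dense_param_count` is a made-up name, not a Keras function.

```python
def dense_param_count(input_dim, units, use_bias=True):
    """Number of trainable weights in one Dense layer."""
    kernel_params = input_dim * units       # kernel shape: (input_dim, units)
    bias_params = units if use_bias else 0  # bias shape: (units,)
    return kernel_params + bias_params


first = dense_param_count(16, 32)   # Dense(32, input_shape=(16,))
second = dense_param_count(32, 32)  # Dense(32) fed by the first layer
print(first, second, first + second)  # 544 1056 1600
```

The second layer needs no input_shape because its input size is simply the 32 units produced by the first layer.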
Constructor
def __init__(self,
             units,
             activation=None,
             use_bias=True,
             kernel_initializer='glorot_uniform',
             bias_initializer='zeros',
             kernel_regularizer=None,
             bias_regularizer=None,
             activity_regularizer=None,
             kernel_constraint=None,
             bias_constraint=None,
             **kwargs):
    if 'input_shape' not in kwargs and 'input_dim' in kwargs:
        kwargs['input_shape'] = (kwargs.pop('input_dim'),)
    super(Dense, self).__init__(
        # Resolve the regularizer; regularizers.get returns None when
        # activity_regularizer is not specified.
        activity_regularizer=regularizers.get(activity_regularizer), **kwargs)
    self.units = int(units)
    self.activation = activations.get(activation)
    self.use_bias = use_bias
    self.kernel_initializer = initializers.get(kernel_initializer)
    self.bias_initializer = initializers.get(bias_initializer)
    self.kernel_regularizer = regularizers.get(kernel_regularizer)
    self.bias_regularizer = regularizers.get(bias_regularizer)
    self.kernel_constraint = constraints.get(kernel_constraint)
    self.bias_constraint = constraints.get(bias_constraint)
    # Two attributes defined on the base layer (base_layer).
    self.supports_masking = True
    self.input_spec = InputSpec(min_ndim=2)
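The first two lines of the constructor turn an `input_dim` keyword into an `input_shape` tuple. The snippet below replays just that step on a plain dictionary; `normalize_input_kwargs` is a hypothetical helper written for this illustration and does not exist in Keras.

```python
def normalize_input_kwargs(**kwargs):
    """Mimic the first two lines of Dense.__init__ on a plain dict."""
    if 'input_shape' not in kwargs and 'input_dim' in kwargs:
        kwargs['input_shape'] = (kwargs.pop('input_dim'),)
    return kwargs


only_dim = normalize_input_kwargs(input_dim=16)
both = normalize_input_kwargs(input_dim=16, input_shape=(8,))
print(only_dim)  # {'input_shape': (16,)}
print(both)      # {'input_dim': 16, 'input_shape': (8,)}
```

So `Dense(32, input_dim=16)` and `Dense(32, input_shape=(16,))` reach the base class with the same `input_shape`; when both keywords are given, these two lines leave them untouched.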
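The build method: it validates the input shape, records it, and creates the weights. The final self.built = True (note: built, not build) marks the layer as built, i.e. its weights have been created, so build does not need to run again.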
def build(self, input_shape):
    input_shape = tensor_shape.TensorShape(input_shape)
    if input_shape[-1].value is None:
        raise ValueError('The last dimension of the inputs to `Dense` '
                         'should be defined. Found `None`.')
    self.input_spec = InputSpec(min_ndim=2,
                                axes={-1: input_shape[-1].value})
    self.kernel = self.add_weight(
        'kernel',
        shape=[input_shape[-1].value, self.units],
        initializer=self.kernel_initializer,
        regularizer=self.kernel_regularizer,
        constraint=self.kernel_constraint,
        dtype=self.dtype,
        trainable=True)
    if self.use_bias:
        self.bias = self.add_weight(
            'bias',
            shape=[self.units,],
            initializer=self.bias_initializer,
            regularizer=self.bias_regularizer,
            constraint=self.bias_constraint,
            dtype=self.dtype,
            trainable=True)
    else:
        self.bias = None
    self.built = True
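The call method
To follow it you need to know what the rank of a tensor is, and what the dot product (tensordot) and the matrix product (matmul) are.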
def call(self, inputs):
    # Convert the inputs to a tensor of the layer's dtype.
    inputs = ops.convert_to_tensor(inputs, dtype=self.dtype)
    # Get the rank (number of axes) of the tensor.
    rank = common_shapes.rank(inputs)
    # If the rank is greater than 2, contract the last axis of the inputs
    # with axis 0 of the kernel via tensordot; otherwise use a plain
    # matrix multiplication.
    if rank > 2:
        # Broadcasting is required for the inputs.
        outputs = standard_ops.tensordot(inputs, self.kernel, [[rank - 1], [0]])
        # Reshape the output back to the original ndim of the input.
        if not context.executing_eagerly():
            shape = inputs.get_shape().as_list()
            output_shape = shape[:-1] + [self.units]
            outputs.set_shape(output_shape)
    else:
        outputs = gen_math_ops.mat_mul(inputs, self.kernel)
    if self.use_bias:
        outputs = nn.bias_add(outputs, self.bias)
    if self.activation is not None:
        return self.activation(outputs)  # pylint: disable=not-callable
    return outputs
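The rank of a tensor is simply its number of axes, and the static shape rule used in call (`shape[:-1] + [self.units]`) can be tried out on plain tuples. The function below is a sketch of that rule only; `dense_output_shape` is a made-up name, and the error messages are paraphrased rather than copied from TensorFlow.

```python
def dense_output_shape(input_shape, units):
    """Keep every axis but the last, which becomes `units`."""
    shape = list(input_shape)
    if len(shape) < 2:  # mirrors InputSpec(min_ndim=2)
        raise ValueError('Dense expects inputs of rank 2 or higher.')
    if shape[-1] is None:  # mirrors the check in build
        raise ValueError('The last dimension of the inputs should be defined.')
    return shape[:-1] + [units]


print(len((None, 16)))                          # 2: rank of a (batch, 16) input
print(dense_output_shape((None, 16), 32))       # [None, 32]
print(dense_output_shape((None, 10, 16), 32))   # [None, 10, 32]
```

A rank-2 input takes the matmul branch and a rank-3 input takes the tensordot branch, but in both cases only the last axis changes.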