Preparation
!pip3 install tensorflow==2.0.0a0
%matplotlib inline
import tensorflow as tf
from tensorflow import keras
Layers
What a layer contains
A layer encapsulates state (weights) and computation (the logic that transforms inputs into outputs).
class Linear(keras.layers.Layer):
    def __init__(self, units=32, input_dim=32):
        super(Linear, self).__init__()
        self.w = tf.Variable(initial_value=tf.random_normal_initializer()(shape=(input_dim, units), dtype='float32'))
        self.b = tf.Variable(initial_value=tf.zeros_initializer()(shape=(units,), dtype='float32'))

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
x = tf.ones((2, 2))
linear_layer = Linear(4, 2)
print(linear_layer(x))
tf.Tensor(
[[ 0.0960293 0.09410477 -0.01649074 0.14715078]
[ 0.0960293 0.09410477 -0.01649074 0.14715078]], shape=(2, 4), dtype=float32)
Note that w and b are set as weights attributes of the layer and tracked automatically.
assert linear_layer.weights == [linear_layer.w, linear_layer.b]
You can use the layer's built-in add_weight method as a shortcut for creating weights and biases.
class Linear(keras.layers.Layer):
    def __init__(self, units=32, input_dim=32):
        super(Linear, self).__init__()
        self.w = self.add_weight(shape=(input_dim, units), initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(units,), initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
x = tf.ones((2, 2))
linear_layer = Linear(4, 2)
print(linear_layer(x))
tf.Tensor(
[[ 0.03054511 0.01756009 -0.04622959 -0.10992575]
[ 0.03054511 0.01756009 -0.04622959 -0.10992575]], shape=(2, 4), dtype=float32)
Layers can have non-trainable weights
A weight added to a layer can also be excluded from training; in other words, it will not take part in the backpropagation computation while you train.
Here is an example of adding a non-trainable weight.
class ComputeSum(keras.layers.Layer):
    def __init__(self, input_dim):
        super(ComputeSum, self).__init__()
        self.total = self.add_weight(shape=(input_dim,), initializer='zeros', trainable=False)

    def call(self, inputs):
        self.total.assign_add(tf.reduce_sum(inputs, axis=0))
        return self.total
x = tf.ones((2, 2))
my_sum = ComputeSum(2)
y = my_sum(x)
print(y.numpy())
y = my_sum(x)
print(y.numpy())
[2. 2.]
[4. 4.]
Non-trainable weights are still part of weights, but they are flagged as non-trainable.
print('weights: ', len(my_sum.weights))
print('non-trainable_weights: ', len(my_sum.non_trainable_weights))
# This layer has no trainable weights
print('trainable_weights: ', len(my_sum.trainable_weights))
weights: 1
non-trainable_weights: 1
trainable_weights: 0
Deferring weight creation
In the Linear example above, the __init__ method took an input_dim argument, which was used to compute the shapes of the weights.
In many cases, however, the input shape is not known in advance, so weight creation must be deferred, possibly until after the layer has been instantiated.
class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
The __call__ method invokes build on the first call, completing the deferred creation of the weights. This makes lazy weight creation straightforward.
linear_layer = Linear(32)
y = linear_layer(x)
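As a quick check (a minimal sketch; the (2, 4) input shape below is an arbitrary choice), the layer holds no weights at all until it is first called:

```python
import tensorflow as tf
from tensorflow import keras

# The same deferred-creation Linear layer as above.
class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Weight shapes are derived from the actual input on the first call.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = Linear(32)
assert len(layer.weights) == 0   # build() has not run yet
y = layer(tf.ones((2, 4)))       # the first call triggers build()
assert len(layer.weights) == 2   # w and b now exist
assert layer.w.shape == (4, 32)  # the kernel shape matches the input
```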
Layers can be nested
Sometimes a layer needs to call other layers; in that case the outer layer automatically tracks the weight attributes of the inner layers.
It is recommended to instantiate inner layers in the outer layer's __init__ method; if the inner layers implement build, their weight creation is then deferred until the outer layer receives its input.
class MLPBlock(keras.layers.Layer):
    def __init__(self):
        super(MLPBlock, self).__init__()
        self.linear_1 = Linear(32)
        self.linear_2 = Linear(32)
        self.linear_3 = Linear(1)

    def call(self, inputs):
        x = self.linear_1(inputs)
        x = keras.activations.relu(x)
        x = self.linear_2(x)
        x = keras.activations.relu(x)
        return self.linear_3(x)
x = tf.ones(shape=(3, 64))
mlp = MLPBlock()
y = mlp(x)
print('weights: ', len(mlp.weights))
print('trainable_weights: ', len(mlp.trainable_weights))
weights: 6
trainable_weights: 6
Recursively collecting added losses
When implementing a layer's call method, you can use self.add_loss(value) to record a tensor as a loss for later use.
class ActivityRegularizationLayer(keras.layers.Layer):
    def __init__(self, rate=1e-2):
        super(ActivityRegularizationLayer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        self.add_loss(self.rate * tf.reduce_sum(inputs))
        return inputs
The added losses (including those added by inner layers) can be retrieved from the layer's losses collection. This collection is reset on every call to __call__, so it only contains the values computed during the last forward pass.
class OuterLayer(keras.layers.Layer):
    def __init__(self):
        super(OuterLayer, self).__init__()
        self.activity_reg = ActivityRegularizationLayer()

    def call(self, inputs):
        return self.activity_reg(inputs)
layer = OuterLayer()
assert len(layer.losses) == 0  # __call__ has not been invoked yet, so losses is empty
_ = layer(tf.zeros((1, 1)))
assert len(layer.losses) == 1  # __call__ was invoked once, so one loss has been collected
# The losses collection is automatically reset on each call to __call__
_ = layer(tf.zeros((1, 1)))
assert len(layer.losses) == 1  # __call__ ran again, but the collection was reset first, so it still holds only one loss
In addition, the losses collection also contains regularization losses created for the weights and biases of inner layers.
class OuterLayer(keras.layers.Layer):
    def __init__(self):
        super(OuterLayer, self).__init__()
        self.dense = keras.layers.Dense(32, kernel_regularizer=keras.regularizers.l2(1e-3))

    def call(self, inputs):
        return self.dense(inputs)
layer = OuterLayer()
_ = layer(tf.zeros((1, 1)))
print(layer.losses)
[<tf.Tensor: id=234, shape=(), dtype=float32, numpy=0.0021141893>]
When writing a training loop, you need to gather these losses and add them to the total loss.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

for x_train_batch, y_train_batch in train_dataset:
    with tf.GradientTape() as tape:
        logits = model(x_train_batch)
        loss_value = loss_fn(y_train_batch, logits)
        loss_value += sum(model.losses)
    grads = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
For more training details, see the section on training and evaluation.
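The loop above assumes that a model and a train_dataset already exist. A runnable miniature, using a hypothetical toy model with one l2-regularized Dense layer (so that model.losses is non-empty) and random data, might look like this:

```python
import tensorflow as tf
from tensorflow import keras

# Hypothetical toy setup: a 3-class classifier over 4 features.
model = keras.Sequential([
    keras.layers.Dense(8, kernel_regularizer=keras.regularizers.l2(1e-3)),
    keras.layers.Dense(3),
])
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Random stand-in data, batched into a tf.data pipeline.
x = tf.random.normal((16, 4))
y = tf.random.uniform((16,), maxval=3, dtype=tf.int32)
train_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(8)

for x_train_batch, y_train_batch in train_dataset:
    with tf.GradientTape() as tape:
        logits = model(x_train_batch)
        # Main loss plus the regularization losses collected by the layers.
        loss_value = loss_fn(y_train_batch, logits) + sum(model.losses)
    grads = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```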
Serialization
If you want to be able to serialize a custom layer later, you can support this via get_config.
class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        return {'units': self.units}
layer = Linear(64)
config = layer.get_config()
print(config)
new_layer = Linear.from_config(config)
{'units': 64}
The base Layer class's __init__ accepts some keyword arguments, such as a name and a dtype. It is good practice to pass these through to the parent class in your subclass and to include them in get_config.
class Linear(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(Linear, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(Linear, self).get_config()
        config.update({'units': self.units})
        return config
layer = Linear(64)
config = layer.get_config()
print(config)
new_layer = Linear.from_config(config)
{'name': 'linear_10', 'trainable': True, 'dtype': None, 'units': 64}
If you need more flexibility when restoring a layer from its config, you can override from_config yourself. Here is the default from_config implementation:
@classmethod
def from_config(cls, config):
    return cls(**config)
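The get_config / from_config contract is plain Python, so it is easy to see in isolation. A framework-free sketch (the Box class here is hypothetical, purely to illustrate the round trip):

```python
class Box:
    """A minimal object following the get_config / from_config contract."""

    def __init__(self, units=32):
        self.units = units

    def get_config(self):
        # Everything needed to rebuild this object.
        return {'units': self.units}

    @classmethod
    def from_config(cls, config):
        # The default behavior: pass the config back to the constructor.
        return cls(**config)

box = Box(64)
clone = Box.from_config(box.get_config())
assert clone.units == 64  # the round trip preserves the configuration
```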
For more on serialization and deserialization, see the chapter on saving and serializing models.
A privileged training argument in the call method
Some layers, such as batch normalization and dropout, behave differently during training and inference. For layers of this kind, expose a training argument in the call method to control their behavior.
With this argument, the layer behaves correctly, and produces the right outputs, in both training and inference.
class CustomDropout(keras.layers.Layer):
    def __init__(self, rate, **kwargs):
        super(CustomDropout, self).__init__(**kwargs)
        self.rate = rate

    def call(self, inputs, training=None):
        if training:
            return tf.nn.dropout(inputs, rate=self.rate)
        return inputs
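A quick sanity check of the layer above (a sketch; the all-ones input is an arbitrary choice): with training=False the layer is the identity, while with training=True some activations are zeroed and the survivors are scaled by 1 / (1 - rate):

```python
import tensorflow as tf
from tensorflow import keras

# Same CustomDropout layer as above.
class CustomDropout(keras.layers.Layer):
    def __init__(self, rate, **kwargs):
        super(CustomDropout, self).__init__(**kwargs)
        self.rate = rate

    def call(self, inputs, training=None):
        if training:
            return tf.nn.dropout(inputs, rate=self.rate)
        return inputs

drop = CustomDropout(0.5)
x = tf.ones((2, 4))

out_infer = drop(x, training=False)  # identity at inference time
out_train = drop(x, training=True)   # randomly zeroed / rescaled

assert bool(tf.reduce_all(out_infer == x))  # inference leaves inputs untouched
assert out_train.shape == x.shape           # training changes values, not shape
```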