Keras Custom Models: Customizing the train_step Used by fit

Article link: https://blog.csdn.net/u012762410/article/details/127512066

Preface

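Following on from "Gradients and Automatic Differentiation in TensorFlow: Understanding tf.GradientTape", this article customizes how gradients are computed and applied during training. This addresses the case where the loss function cannot be written in the form loss(y_true, y_pred).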

This article draws on:
Keras documentation: The Model class
TensorFlow documentation: Customize what happens in Model.fit

Two ways to create a Keras Model

There are two ways to create a model in Keras:

Using the standard `Functional API`

import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(inputs)
outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x)
model1 = tf.keras.Model(inputs=inputs, outputs=outputs)
model1.summary()

Subclassing tf.keras.Model

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
        self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)
    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model2 = MyModel()

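model1 is an instance of the class keras.engine.functional.Functional (in TF 2.x), and model2 is an instance of the custom class MyModel, but both classes inherit from tf.keras.Model. Note that a subclassed model has to be built, for example by calling it once on an input with 3 features, before summary() can be called. The summary output shows that the two models are structurally equivalent: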

>>> model1.summary()
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
dense_4 (Dense)              (None, 4)                 16        
_________________________________________________________________
dense_5 (Dense)              (None, 5)                 25        
=================================================================
Total params: 41
Trainable params: 41
Non-trainable params: 0
_________________________________________________________________

>>> model2.summary()
Model: "my_model_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_16 (Dense)             multiple                  16        
_________________________________________________________________
dense_17 (Dense)             multiple                  25        
=================================================================
Total params: 41
Trainable params: 41
Non-trainable params: 0
_________________________________________________________________
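
The equivalence can also be checked programmatically. The following is a minimal sketch, assuming TensorFlow 2.x; the class name SmallModel, the variable names, and the zero-valued dummy input are illustrative and do not come from the original example. It checks that both objects are tf.keras.Model instances and that they hold the same number of parameters once the subclassed model has been built by calling it.

```python
import tensorflow as tf

# Functional API version of the two-layer network.
inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(4, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(5, activation="softmax")(hidden)
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)


class SmallModel(tf.keras.Model):
    """Subclassed version of the same network (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation="relu")
        self.dense2 = tf.keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        return self.dense2(self.dense1(inputs))


subclassed_model = SmallModel()
# A subclassed model only creates its weights on the first call,
# so feed it one dummy sample with 3 features.
_ = subclassed_model(tf.zeros((1, 3)))

both_are_models = isinstance(functional_model, tf.keras.Model) and isinstance(
    subclassed_model, tf.keras.Model
)
functional_params = functional_model.count_params()
subclassed_params = subclassed_model.count_params()
print(both_are_models, functional_params, subclassed_params)
```

Both counts should match the 41 parameters reported in the summaries: 3*4+4 = 16 for the first Dense layer and 4*5+5 = 25 for the second.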

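Customizing Model.fit

First, a look at the parameters of fit:

fit(x=None, y=None, batch_size=None, ...)
x: the input data
y: the target data (labels); if x is a tf.data.Dataset object, y should not be passed, because the dataset itself yields (inputs, targets) pairs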
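
The two calling conventions can be sketched as follows. This is a minimal illustration rather than code from the original article: the random data, the one-layer model, and the helper name make_model are all assumptions made for the example.

```python
import numpy as np
import tensorflow as tf

# Small random regression data set (illustrative only).
x = np.random.random((64, 3)).astype("float32")
y = np.random.random((64, 1)).astype("float32")


def make_model():
    """Hypothetical helper: a fresh one-layer regression model."""
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")
    return model


# Case 1: NumPy arrays. x and y are passed separately, and fit does the batching.
history_arrays = make_model().fit(x, y, batch_size=16, epochs=1, verbose=0)

# Case 2: a tf.data.Dataset that already yields (x, y) batches.
# Neither y nor batch_size is passed to fit in this case.
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(16)
history_dataset = make_model().fit(dataset, epochs=1, verbose=0)

print(history_arrays.history["loss"], history_dataset.history["loss"])
```

In both cases each batch reaches train_step as a single data argument, which is why train_step starts by unpacking it.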

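Implementing the train_step method

The official documentation describes train_step as follows:

The logic for one training step. This method can be overridden to support custom training logic; it typically contains the forward pass, loss calculation, backpropagation, and metric updates.

Writing a train_step involves the following steps: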

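1. Open a GradientTape and run the forward pass inside it, so that the tape records the operations that compute the loss from the trainable variables;
2. Call GradientTape.gradient to compute the gradients of the loss with respect to the trainable variables;
3. Update the weights with the optimizer, which expects (gradient, variable) pairs;
4. Update the metrics reported at each step (including the metric that tracks the loss);
5. Return a dict mapping metric names to their current values.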
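
The first three steps can be tried in isolation, without a model. The sketch below is an illustration under stated assumptions: it uses a single scalar variable w, a hand-picked data point, and a hand-written gradient-descent update (w.assign_sub) in place of a Keras optimizer, purely to keep the example small. Inside train_step the optimizer performs this update.

```python
import tensorflow as tf

w = tf.Variable(0.0)   # the only trainable variable
x = tf.constant(2.0)   # one input value (illustrative)
y = tf.constant(4.0)   # its target value (illustrative)
learning_rate = 0.1    # assumed step size


def loss_fn():
    """Squared error of the prediction w * x against y."""
    return tf.square(w * x - y)


loss_before = float(loss_fn().numpy())

# Step 1: run the forward pass inside the tape so it is recorded.
with tf.GradientTape() as tape:
    loss = loss_fn()

# Step 2: gradient of the loss with respect to the variable.
grad = tape.gradient(loss, w)

# Step 3: plain gradient descent, written by hand instead of using an optimizer.
w.assign_sub(learning_rate * grad)

loss_after = float(loss_fn().numpy())
print(loss_before, float(grad.numpy()), float(w.numpy()), loss_after)
```

With these numbers the gradient is 2 * (w*x - y) * x = -16, so w moves from 0 to 1.6 and the loss drops from 16 to about 0.64.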

Model

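import tensorflow as tf
from tensorflow import keras

class CustomModel(keras.Model):
    def train_step(self, data):
        # Unpack the batch that fit passes in
        x, y = data
        # 1. Open a GradientTape and run the forward pass so that the loss computation is recorded
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            # compiled_loss wraps the loss given to compile() (TF 2.x API)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)

        # 2. Compute the gradients
        gradients = tape.gradient(loss, self.trainable_variables)
        # 3. Apply the gradients to the model with the configured optimizer
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        # 4. Update the metrics reported at each step
        self.compiled_metrics.update_state(y, y_pred)
        # 5. Return the metrics; self.metrics includes the loss by default
        return {m.name: m.result() for m in self.metrics}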
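
For the case mentioned in the preface, where the loss does not fit the loss(y_true, y_pred) form, the loss can be computed directly inside train_step instead of going through compiled_loss. The following is only a sketch under assumptions: the class name ManualLossModel, the penalty term, and its 0.01 weight are invented for illustration, and the loss is tracked with a keras.metrics.Mean that the model owns.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


class ManualLossModel(keras.Model):
    """Hypothetical model whose loss is computed entirely inside train_step."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Running mean of the loss, reported at each step.
        self.loss_tracker = keras.metrics.Mean(name="loss")

    @property
    def metrics(self):
        # Listing the tracker here lets fit reset it at the start of each epoch.
        return [self.loss_tracker]

    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            y = tf.cast(y, y_pred.dtype)
            # Illustrative loss: MSE plus a penalty on the mean prediction,
            # which uses more than a per-sample (y_true, y_pred) comparison.
            mse = tf.reduce_mean(tf.square(y - y_pred))
            penalty = tf.square(tf.reduce_mean(y_pred))
            loss = mse + 0.01 * penalty
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.loss_tracker.update_state(loss)
        return {"loss": self.loss_tracker.result()}


inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
manual_model = ManualLossModel(inputs, outputs)
# No loss is passed to compile, because train_step computes it itself.
manual_model.compile(optimizer="adam")

x_train = np.random.random((64, 32)).astype("float32")
y_train = np.random.random((64, 1)).astype("float32")
history = manual_model.fit(x_train, y_train, epochs=2, verbose=0)
print(history.history["loss"])
```

Because the loss is built from ordinary tensor operations inside the tape, any quantity available in train_step (inputs, intermediate outputs, model weights) could take part in it.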

Training

import numpy as np

# Construct and compile an instance of CustomModel
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Just use `fit` as usual
x = np.random.random((1000, 32))
y = np.random.random((1000, 1))
model.fit(x, y, epochs=3)
