TensorFlow 2.6 Official Demo: A Detailed Walkthrough

Article link: https://blog.csdn.net/AI_eNyu/article/details/126434106

Tensorflow_demo

- 1. TensorFlow basics
- 2. demo

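This article introduces some TensorFlow basics and uses the official TensorFlow demo as an example to walk through the overall process of building and training a model with TensorFlow 2.6. It also explains in detail the TensorFlow APIs and other concepts involved along the way.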

1. TensorFlow basics

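Default dimension ordering of image tensors in TensorFlow: [batch, height, width, channel]
tf.keras module: since version 2.0, building networks with the Keras module is the officially recommended approach.
- Models can be built in two ways: the Keras Functional API (the style used to build models in TF1) or the Model Subclassing API (similar to how models are built in PyTorch).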

Demo: building a model with the Model Subclassing API

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model


class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10, activation='softmax')

    def call(self, x, **kwargs):
        x = self.conv1(x)      # input[batch, 28, 28, 1] output[batch, 26, 26, 32]
        x = self.flatten(x)    # output [batch, 21632]
        x = self.d1(x)         # output [batch, 128]
        return self.d2(x)      # output [batch, 10]

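Notes:

a. __init__(self) defines the layers the model needs; call() defines the forward pass of the network.

b. super(MyModel, self).__init__() calls the initializer of the parent class Model, so that the base class is set up correctly before any layers are assigned. Going through super() follows the method resolution order and avoids the problems that calling a parent class directly can cause under multiple inheritance.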

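The output size after a convolution is given by $N = \lfloor (W - F + 2P) / S \rfloor + 1$ (the division is rounded down), where:

a. $W$ is the height/width of the input image.

b. $F \times F$ is the filter (kernel) size.

c. $S$ is the stride.

d. $P$ is the number of padding pixels added on each side.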
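
As a quick check of the formula, the sketch below applies it to the convolution layer of MyModel (28 x 28 input, 3 x 3 filter, and the Keras defaults of stride 1 and no padding). The helper name conv_output_size is hypothetical and is not part of TensorFlow.

```python
def conv_output_size(w, f, s=1, p=0):
    """Output height/width of a convolution: floor((W - F + 2P) / S) + 1."""
    return (w - f + 2 * p) // s + 1


# conv1 in MyModel: 28x28 input, 3x3 filter, stride 1, no padding
side = conv_output_size(28, 3, s=1, p=0)

# Flatten() then sees side * side * 32 values per sample (32 filters)
flattened = side * side * 32

print(side, flattened)
```

The two numbers agree with the shape comments in call(): [batch, 26, 26, 32] after the convolution and [batch, 21632] after flattening.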

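Note: in TensorFlow, you normally do not specify the number of padding pixels yourself. Instead, you choose between two padding modes, VALID and SAME. The output feature map sizes for the two modes are as follows:

[Image: output feature map sizes for VALID and SAME padding]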
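
A minimal sketch of the two padding modes, assuming the commonly documented rules: VALID adds no padding and gives ceil((W - F + 1) / S), while SAME pads just enough to give ceil(W / S). The function names are hypothetical helpers, not TensorFlow APIs.

```python
import math


def valid_output_size(w, f, s):
    """VALID: no padding, the filter must fit entirely inside the input."""
    return math.ceil((w - f + 1) / s)


def same_output_size(w, s):
    """SAME: input is padded so the output size depends only on the stride."""
    return math.ceil(w / s)


for stride in (1, 2):
    print("stride", stride,
          "VALID:", valid_output_size(28, 3, stride),
          "SAME:", same_output_size(28, stride))
```

With stride 1, SAME keeps the 28 x 28 size unchanged, while VALID shrinks it to 26 x 26, which is what the demo model's convolution layer produces.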

2. demo

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
from model import MyModel


def main():
    mnist = tf.keras.datasets.mnist

    # download and load data
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    # Add a channels dimension: [N, 28, 28] -> [N, 28, 28, 1]
    x_train = x_train[..., tf.newaxis]
    x_test = x_test[..., tf.newaxis]

    # create data generator
    train_ds = tf.data.Dataset.from_tensor_slices(
        (x_train, y_train)).shuffle(10000).batch(32)
    test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

    # create model
    model = MyModel()

    # define loss
    loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
    # define optimizer
    optimizer = tf.keras.optimizers.Adam()

    # define train_loss and train_accuracy
    train_loss = tf.keras.metrics.Mean(name='train_loss')
    train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

    # define test_loss and test_accuracy
    test_loss = tf.keras.metrics.Mean(name='test_loss')
    test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')

    # define train function including calculating loss, applying gradient and calculating accuracy
    @tf.function
    def train_step(images, labels):
        with tf.GradientTape() as tape:
            predictions = model(images)
            loss = loss_object(labels, predictions)
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        train_loss(loss)
        train_accuracy(labels, predictions)

    # define test function including calculating loss and calculating accuracy
    @tf.function
    def test_step(images, labels):
        predictions = model(images)
        t_loss = loss_object(labels, predictions)

        test_loss(t_loss)
        test_accuracy(labels, predictions)

    EPOCHS = 5

    for epoch in range(EPOCHS):
        train_loss.reset_states()        # clear history info
        train_accuracy.reset_states()    # clear history info
        test_loss.reset_states()         # clear history info
        test_accuracy.reset_states()     # clear history info

        for images, labels in train_ds:
            train_step(images, labels)

        for test_images, test_labels in test_ds:
            test_step(test_images, test_labels)

        template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
        print(template.format(epoch + 1,
                              train_loss.result(),
                              train_accuracy.result() * 100,
                              test_loss.result(),
                              test_accuracy.result() * 100))


if __name__ == '__main__':
    main()

Notes:

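1. Display the first three test images and their labels

import numpy as np
import matplotlib.pyplot as plt

# assumes x_test, y_test come from mnist.load_data(), before the channel dimension is added
imgs = x_test[:3]
labels = y_test[:3]
print(labels)
plot_imgs = np.hstack(imgs)          # concatenate horizontally
plt.imshow(plot_imgs, cmap='gray')   # grayscale: cmap='gray'
plt.show()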

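2. Creating the training dataset:

```
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
```
Here, shuffle(10000) keeps a buffer of 10000 samples in memory and draws each next sample at random from that buffer; batch(32) then groups the shuffled samples into batches of 32. The larger the buffer, the closer the result is to a uniform random shuffle of the whole dataset (a buffer at least as large as the dataset gives a full shuffle). In practice the buffer size is limited by available memory, so it cannot be set arbitrarily large.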
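
To make the buffer idea concrete, here is a pure-Python sketch of a buffered shuffle followed by batching. It only mimics the behaviour described above and is not TensorFlow's implementation; buffered_shuffle and make_batches are hypothetical helper names.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in pseudo-random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)          # fill the buffer first
            continue
        idx = rng.randrange(len(buffer))
        yield buffer[idx]                # emit a random buffered item
        buffer[idx] = item               # replace it with the new item
    rng.shuffle(buffer)                  # flush what is left at the end
    yield from buffer


def make_batches(items, batch_size):
    """Group consecutive items into lists of at most batch_size."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


shuffled = list(buffered_shuffle(range(100), buffer_size=10))
batches = list(make_batches(shuffled, 32))
print([len(b) for b in batches])
```

With a buffer of 10, the k-th emitted sample can only come from the first k + 10 samples of the input, which is why a small buffer shuffles only locally and a buffer of size 1 does not shuffle at all.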

3. The tf.keras.losses.SparseCategoricalCrossentropy() loss is used when the labels are integer class indices rather than one-hot vectors, whereas CategoricalCrossentropy() expects one-hot encoded labels.
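
The difference between the two label formats can be illustrated with the per-sample math in plain Python. This is only a sketch of the underlying formula; the real Keras losses work on batches, clip probabilities, and average over samples, and the function names below are hypothetical.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Label is an integer class index."""
    return -math.log(probs[label])


def categorical_crossentropy(one_hot, probs):
    """Label is a one-hot vector of the same length as probs."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


probs = [0.1, 0.7, 0.2]          # softmax output for one sample
sparse_loss = sparse_categorical_crossentropy(1, probs)
dense_loss = categorical_crossentropy([0.0, 1.0, 0.0], probs)
print(sparse_loss, dense_loss)
```

Both calls give the same loss value; only the label format differs. MNIST labels from load_data() are integers from 0 to 9, which is why the demo uses the sparse variant.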

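4. Unlike PyTorch, TensorFlow does not automatically record operations on trainable parameters for differentiation in eager mode. The forward pass therefore has to run inside a tf.GradientTape() block, opened with the with context manager, so that tape.gradient() can compute the gradients afterwards.

How with works: the with statement evaluates the expression that follows it, calls the resulting object's __enter__() method, and binds the return value of __enter__() to the name after as. It then executes the body of the with block and calls the object's __exit__() method when the body finishes.

The advantage of with is that __exit__() is called even if an exception is raised inside the body. With open(), for example, this guarantees that the file is closed properly.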
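
The order of calls described above can be observed with a small context manager. This is a minimal sketch; the class Tracked and the function run_with_error are made up for illustration.

```python
class Tracked:
    """Context manager that records when it is entered and exited."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self                      # this value is bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("exit")          # runs even if the body raised
        return False                     # do not suppress the exception


def run_with_error():
    log = []
    try:
        with Tracked(log) as t:
            log.append("body")
            raise ValueError("boom")
    except ValueError:
        log.append("caught")
    return log


print(run_with_error())
```

The log shows that __exit__() runs before the exception reaches the surrounding except block.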

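5. The **@tf.function** decorator

Purpose: it traces the Python function into a TensorFlow graph, which TensorFlow can optimize and execute efficiently on CPU, GPU, or TPU. Breakpoints set inside a function decorated with @tf.function are generally not hit on every call, because the Python body only runs while the graph is being traced; in exchange, training usually runs considerably faster. For debugging, the decorator can be removed temporarily, or tf.config.run_functions_eagerly(True) can be called.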

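6. The metrics defined below, such as train_loss and train_accuracy, can be thought of as accumulators. Their values therefore need to be reset to zero at the start of each epoch.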

# define train_loss and train_accuracy
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

# define test_loss and test_accuracy
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')

train_loss.reset_states()        # clear history info
train_accuracy.reset_states()    # clear history info
test_loss.reset_states()         # clear history info
test_accuracy.reset_states()     # clear history info
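
The accumulator behaviour can be sketched in plain Python. RunningMean below is a hypothetical stand-in that only resembles tf.keras.metrics.Mean in spirit; it shows why skipping the reset mixes the statistics of different epochs.

```python
class RunningMean:
    """Accumulates values and reports their mean since the last reset."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total = 0.0
        self.count = 0


def epoch_means(losses_per_epoch, reset_each_epoch=True):
    """Return the reported mean loss after each epoch."""
    metric = RunningMean()
    means = []
    for losses in losses_per_epoch:
        if reset_each_epoch:
            metric.reset()               # clear history from earlier epochs
        for loss in losses:
            metric.update(loss)
        means.append(metric.result())
    return means


losses = [[1.0, 3.0], [0.5, 0.5]]
print(epoch_means(losses, reset_each_epoch=True))
print(epoch_means(losses, reset_each_epoch=False))
```

Without the reset, the second epoch's reported mean still includes the first epoch's losses, so it no longer reflects the current epoch alone.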