TensorFlow2 Code Walkthrough (3)

赛博炼丹师

Published 2023-10-16 11:20:35

Article link: https://blog.csdn.net/victor_li_/article/details/133854630

import tensorflow as tf
from tensorflow.keras import datasets,layers,optimizers,Sequential


(x,y),(x_test,y_test) = datasets.fashion_mnist.load_data()
x = tf.convert_to_tensor(x,dtype = tf.float32) / 255.
y = tf.convert_to_tensor(y,dtype = tf.int32)
x_test = tf.convert_to_tensor(x_test,dtype = tf.float32) / 255.
y_test = tf.convert_to_tensor(y_test,dtype = tf.int32)

batchs = 128

db = tf.data.Dataset.from_tensor_slices((x,y))
db = db.shuffle(10000).batch(batchs)

db_test = tf.data.Dataset.from_tensor_slices((x_test,y_test))
db_test = db_test.batch(batchs)


model = Sequential([
    layers.Dense(256,activation = tf.nn.relu),
    layers.Dense(128,activation = tf.nn.relu),
    layers.Dense(64,activation = tf.nn.relu),
    layers.Dense(32,activation = tf.nn.relu),
    layers.Dense(16,activation = tf.nn.relu),
    layers.Dense(10)
])
model.build(input_shape = [None,28*28])
optimizer = optimizers.Adam(learning_rate = 1e-3)
def main():
    for epoch in range(30):
        for step,(x,y) in enumerate(db):
            x = tf.reshape(x,[-1,28*28])
            with tf.GradientTape() as tape:
                logits = model(x)
                y_onehot = tf.one_hot(y,depth = 10)
                loss = tf.reduce_mean(tf.losses.MSE(y_onehot,logits))
            grads = tape.gradient(loss,model.trainable_variables)
            optimizer.apply_gradients(zip(grads,model.trainable_variables))
            if step % 100 == 0:
                print(epoch,step,'loss: ',float(loss))
        total_correct = 0
        total_num = 0
        for x,y in db_test:
            x = tf.reshape(x,[-1,28*28])
            logits = model(x)
            prob = tf.nn.softmax(logits,axis = 1)
            pred = tf.argmax(prob,axis = 1)
            pred = tf.cast(pred,dtype = tf.int32)
            correct = tf.equal(pred,y)
            correct = tf.reduce_sum(tf.cast(correct,dtype = tf.int32))
            
            total_correct += int(correct)
            total_num += x.shape[0]
        acc = total_correct / total_num
        print('acc: ',acc)
    
if __name__ == '__main__':
    main()

import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential

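First, we import TensorFlow and a few related modules. The datasets module provides loaders for several popular datasets, layers contains the classes used to build neural network layers, optimizers contains the optimization algorithms, and Sequential is Keras's container for a linear stack of layers.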

(x, y), (x_test, y_test) = datasets.fashion_mnist.load_data()

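Calling datasets.fashion_mnist.load_data() loads the Fashion MNIST dataset, already split into a training set and a test set. The training set consists of input images (x) and their labels (y); the test set likewise consists of input images (x_test) and labels (y_test).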

x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(y, dtype=tf.int32)
x_test = tf.convert_to_tensor(x_test, dtype=tf.float32) / 255.
y_test = tf.convert_to_tensor(y_test, dtype=tf.int32)

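Here we convert the images and labels to TensorFlow tensors and preprocess them. tf.convert_to_tensor() turns the loaded arrays into tensors, and dividing by 255 scales the pixel values from the range 0 to 255 down to the range 0 to 1. The labels need no scaling, so we only convert them to tf.int32.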
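
The scaling step can be illustrated without TensorFlow. The following is a minimal plain-Python sketch; the helper name normalize_pixels and the sample values are made up for illustration and are not part of the original script.

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0..255) to floats in the range 0.0..1.0."""
    return [p / 255.0 for p in pixels]


# Hypothetical row of three pixel values: black, dark gray, white.
raw_row = [0, 51, 255]
scaled_row = normalize_pixels(raw_row)
print(scaled_row)
```

Keeping inputs in a small, consistent range generally makes gradient-based training better behaved than feeding raw values up to 255.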

batchs = 128

batchs is the batch size, i.e. the number of samples drawn from the dataset for each training step. It is set to 128 here.

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.shuffle(10000).batch(batchs)

db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batchs)

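Here we use tf.data.Dataset.from_tensor_slices() to create dataset objects for the training set and the test set. The training set is shuffled with shuffle(), using a buffer of 10000 elements, and then split by batch() into mini-batches of batchs samples. The test set is only batched, not shuffled, because sample order does not affect the evaluation result.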

Batching lets the model process many samples in one step, which makes both training and evaluation more efficient.
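
To get a feel for what batch() produces, the sketch below works out the number of batches and the size of the last one in plain Python. It assumes the standard Fashion MNIST split of 60000 training and 10000 test images and the default behavior of batch(), which keeps the final partial batch; batch_sizes is a hypothetical helper, not a TensorFlow function.

```python
import math


def batch_sizes(num_samples, batch_size):
    """Return the size of every batch when the final partial batch is kept."""
    num_batches = math.ceil(num_samples / batch_size)
    sizes = [batch_size] * (num_batches - 1)
    # The last batch holds whatever is left over.
    sizes.append(num_samples - batch_size * (num_batches - 1))
    return sizes


train_sizes = batch_sizes(60000, 128)  # assumed training set size
test_sizes = batch_sizes(10000, 128)   # assumed test set size
print(len(train_sizes), train_sizes[-1])  # 469 96
print(len(test_sizes), test_sizes[-1])    # 79 16
```

Under these assumptions one epoch has 469 training steps, so a progress line printed every 100 steps appears five times per epoch.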

model = Sequential([
    layers.Dense(256, activation=tf.nn.relu),
    layers.Dense(128, activation=tf.nn.relu),
    layers.Dense(64, activation=tf.nn.relu),
    layers.Dense(32, activation=tf.nn.relu),
    layers.Dense(16, activation=tf.nn.relu),
    layers.Dense(10)
])
model.build(input_shape=[None, 28*28])

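In this code we use Keras's Sequential class to create a sequential model, defining its architecture by passing a list of layers. The model has 6 fully connected layers. The first five use the ReLU activation; the last layer specifies no activation and outputs 10 raw values (logits), one per class.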

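model.build(input_shape=[None, 28*28]) specifies the model's input shape and creates the weights. None means the batch dimension can be of any size, and 28*28 = 784 means each input sample is a 28x28 image flattened into a vector of 784 values.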
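
Flattening can be pictured as laying the image rows end to end. The following plain-Python sketch shows the row-major layout that a reshape to 784 values corresponds to; flatten_image and the synthetic image are illustrative assumptions, not code from the original script.

```python
def flatten_image(image):
    """Flatten a 2-D image (a list of rows) into a 1-D list in row-major order."""
    return [pixel for row in image for pixel in row]


# Hypothetical 28x28 image in which each pixel value encodes its own position.
image = [[r * 28 + c for c in range(28)] for r in range(28)]
flat = flatten_image(image)

print(len(flat))  # 784
# The pixel at row 5, column 7 ends up at index 5 * 28 + 7.
print(flat[5 * 28 + 7] == image[5][7])  # True
```

A fully connected layer treats these 784 values as independent features, so the 2-D neighborhood structure of the image is not used by this model.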

With this code we have defined the architecture of the neural network.
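
As a rough check on the size of this network, the sketch below counts its trainable parameters by hand. It assumes each Dense layer has a weight matrix of shape (inputs, outputs) plus one bias per output, which is the Keras default; dense_param_count is a hypothetical helper.

```python
def dense_param_count(layer_sizes):
    """Return the parameter count of each dense layer (weights plus biases)."""
    counts = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        counts.append(fan_in * fan_out + fan_out)
    return counts


# Input width followed by the width of each of the six Dense layers.
layer_sizes = [28 * 28, 256, 128, 64, 32, 16, 10]
per_layer = dense_param_count(layer_sizes)
total = sum(per_layer)

print(per_layer)  # [200960, 32896, 8256, 2080, 528, 170]
print(total)      # 244890
```

Most of the parameters sit in the first layer, because it connects all 784 input values to 256 units.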

optimizer = optimizers.Adam(learning_rate=1e-3)

Here we use the Adam optimizer with a learning rate of 1e-3. The optimizer adjusts the model's parameters to minimize the loss function.
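
For intuition about what the optimizer does with a gradient, here is a minimal sketch of one textbook Adam update for a single scalar parameter. The function adam_step is hypothetical, the values of beta1, beta2 and eps are assumed to match the Keras defaults, and the actual Keras implementation may differ in detail.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """Apply one textbook Adam update to a scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adam_step(theta, grad=0.5, m=m, v=v, t=1)
print(theta)
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.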

def main():
    for epoch in range(30):
        for step, (x, y) in enumerate(db):
            x = tf.reshape(x, [-1, 28*28])
            with tf.GradientTape() as tape:
                logits = model(x)
                y_onehot = tf.one_hot(y, depth=10)
                loss = tf.reduce_mean(tf.losses.MSE(y_onehot, logits))
            grads = tape.gradient(loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            if step % 100 == 0:
                print(epoch, step, 'loss:', float(loss))
        total_correct = 0
        total_num = 0
        for x, y in db_test:
            x = tf.reshape(x, [-1, 28*28])
            logits = model(x)
            prob = tf.nn.softmax(logits, axis=1)
            pred = tf.argmax(prob, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)
            correct = tf.equal(pred, y)
            correct = tf.reduce_sum(tf.cast(correct, dtype=tf.int32))
            total_correct += int(correct)
            total_num += x.shape[0]
        acc = total_correct / total_num
        print('acc:', acc)

if __name__ == '__main__':
    main()


This is our main function, main(), which contains the training and evaluation logic.

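First, we train with two nested loops. The outer loop runs for 30 epochs and the inner loop iterates over the batches of the training set. In each step we first reshape the input to [-1, 28*28], where -1 lets TensorFlow infer the batch size. Inside tf.GradientTape() we record the forward pass: the model produces logits, the labels are one-hot encoded, and the loss is the mean squared error between the two. After leaving the tape context, tape.gradient() computes the gradients by backpropagation and optimizer.apply_gradients() uses them to update the model parameters. Whenever the step number is divisible by 100, we print the current epoch, the step, and the loss.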
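
The loss used in the training loop can be reproduced by hand. The sketch below is a plain-Python approximation of one-hot encoding followed by mean squared error, averaged first over the 10 classes and then over the batch; the helper names and the sample logits are illustrative assumptions.

```python
def one_hot(label, depth):
    """Return a list of length depth with 1.0 at index label and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def mse(target, prediction):
    """Mean squared error between two equally long lists."""
    return sum((t - p) ** 2 for t, p in zip(target, prediction)) / len(target)


def batch_mse_loss(labels, logits_batch, depth=10):
    """Average the per-sample MSE over the whole batch."""
    per_sample = [mse(one_hot(y, depth), logits)
                  for y, logits in zip(labels, logits_batch)]
    return sum(per_sample) / len(per_sample)


# Hypothetical batch of two samples: one all-zero output, one perfect output.
labels = [3, 0]
logits_batch = [[0.0] * 10, one_hot(0, 10)]
loss = batch_mse_loss(labels, logits_batch)
print(loss)
```

The first sample misses only the target class, giving an error of 1/10, and the second sample is exact, so the batch loss is about 0.05.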

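Next, we evaluate on the test set at the end of every epoch. For each test batch we reshape the input and run the model to obtain logits. We normalize the logits into probabilities with softmax and take the class with the highest probability as the prediction; since softmax preserves ordering, taking argmax of the logits directly would give the same result. We compare the predictions with the true labels, count the correct ones, and add that count to a running total. Finally, we compute and print the classification accuracy.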
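
The evaluation logic can be sketched in plain Python as follows. This is an illustration only: softmax, argmax and accuracy are hypothetical helpers, and the logits and labels are made-up values rather than model output.

```python
import math


def softmax(logits):
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_batch, labels):
    """Fraction of samples whose most probable class matches the label."""
    correct = sum(1 for logits, y in zip(logits_batch, labels)
                  if argmax(softmax(logits)) == y)
    return correct / len(labels)


# Hypothetical logits for three samples and three classes.
logits_batch = [[2.0, -1.0, 0.5], [0.1, 0.2, 3.0], [1.0, 4.0, -2.0]]
labels = [0, 2, 0]
acc = accuracy(logits_batch, labels)
print(acc)
```

Because softmax is monotonic, argmax(softmax(logits)) equals argmax(logits) for every sample here, which is why the softmax call is optional when only the predicted class is needed.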

The if __name__ == '__main__': guard ensures that main() runs only when the script is executed directly, not when it is imported as a module.