TensorFlow 2 in Practice - Handwritten Digit Recognition (1): Hand-Rolled Version (No Model API). Steps: [Initialize parameters] --> [Loop: ① compute predictions Y from parameters W, B and inputs X; ② compute the loss; ③ compute gradients; ④ update the parameters via gradient descent]

Model computation steps (a distilled one-step skeleton follows this list; the full script comes after it):

  • Initialize the parameters w and b
  • Run the model's forward pass to map the input features to outputs
  • Compute the loss from the predictions and the ground-truth labels
  • Compute the gradients
  • Update the parameters with gradient descent
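
As a bridge from the list to the full script, here is a minimal, hedged sketch of a single training step on one dense layer. The names (x, y_onehot, w, b, lr) and the toy shapes are illustrative only, not taken from the script below:

import tensorflow as tf

# Toy data: 4 samples, 8 features, 10 classes (shapes chosen only for illustration)
x = tf.random.normal([4, 8])
y_onehot = tf.one_hot([0, 1, 2, 3], depth=10)

# Initialize the parameters
w = tf.Variable(tf.random.truncated_normal([8, 10], stddev=0.1))
b = tf.Variable(tf.zeros([10]))
lr = 1e-3

with tf.GradientTape() as tape:
    out = x @ w + b                                   # forward pass
    loss = tf.reduce_mean(tf.square(y_onehot - out))  # MSE loss

grads = tape.gradient(loss, [w, b])  # backpropagate to get the gradients
w.assign_sub(lr * grads[0])          # in-place gradient-descent update
b.assign_sub(lr * grads[1])

The full MNIST script below repeats exactly this pattern with three layers and a batched tf.data pipeline: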
import pandas as pd
import tensorflow as tf
from tensorflow.keras import datasets, optimizers
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# Load the dataset
# x: [60k, 28, 28], y: [60k]
mnist_dataset = datasets.mnist.load_data()
dataset_train, dataset_val = mnist_dataset[0], mnist_dataset[1]
(x, y) = dataset_train
(x_val, y_val) = dataset_val  # held out here; an evaluation sketch follows the output below

# Convert the ndarrays to Tensor format
x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.  # x: [0~255] => [0~1.]
print("x.shape = {0}, x[0] = \n{1}".format(x.shape, pd.DataFrame(x[0].numpy())))  # x.shape =  (60000, 28, 28)
y = tf.convert_to_tensor(y, dtype=tf.int32)
print("y.shape = {0}, y = {1}".format(y.shape, y))  # y.shape = (60000,), y = [5 0 4 ... 5 6 8]

print("\n数据集数据特征:")
print("x.shape = {0}, y.shape = {1}, x.dtype = {2}, y.dtype = {3}".format(x.shape, y.shape, x.dtype, y.dtype))
print("tf.reduce_min(x) = {0}, tf.reduce_max(x) = {1}".format(tf.reduce_min(x), tf.reduce_max(x)))
print("tf.reduce_min(y) = {0}, tf.reduce_max(y) = {1}".format(tf.reduce_min(y), tf.reduce_max(y)))

# Build the training dataset from (x, y) with batch_size=128
train_db = tf.data.Dataset.from_tensor_slices((x, y)).batch(128)
train_iter = iter(train_db)
sample = next(train_iter)
print('\nShape of each batch: x.shape = {0}, y.shape = {1}'.format(sample[0].shape, sample[1].shape))

# Initialize the parameters
# [b, 784] => [b, 256] => [b, 128] => [b, 10]
# [dim_in, dim_out], [dim_out]
w1 = tf.Variable(tf.random.truncated_normal([784, 256], mean=0, stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))
w2 = tf.Variable(tf.random.truncated_normal([256, 128], mean=0, stddev=0.1))
b2 = tf.Variable(tf.zeros([128]))
w3 = tf.Variable(tf.random.truncated_normal([128, 10], mean=0, stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))

# Set the learning rate
lr = 1e-3

# Train for 10 epochs
for epoch in range(10):  # iterate over the dataset 10 times
    # Iterate over all batches
    for batch_idx, (x, y) in enumerate(train_db):
        # x: [128, 28, 28]; y: [128]
        # Flatten the image features into vectors
        x = tf.reshape(x, [-1, 28 * 28])  # [b, 28, 28] => [b, 28*28]
        # TensorFlow uses a gradient tape (tf.GradientTape) to record the forward
        # computation, then obtains the gradients automatically via backpropagation.
        with tf.GradientTape() as tape:  # by default the tape only watches tf.Variable
            # Step 1: forward pass through the model to get the outputs for the inputs
            # ①: first layer
            # x: [b, 28*28]
            # h1 = x@w1 + b1
            # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
            h1 = x @ w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)

            # ②: second layer
            # [b, 256] => [b, 128]
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # ③: output layer
            # [b, 128] => [b, 10]
            out = h2 @ w3 + b3

            # Step 2: compute the loss
            # out: [b, 10]
            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)
            loss = tf.reduce_mean(tf.square(y_onehot - out))  # mse = mean((y_onehot - out)^2)

        # Step 3: compute the gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        # print(grads)

        # Step 4: update the parameters via gradient descent
        # w1 = w1 - lr * w1_grad
        # Option ①: apply the gradients with an optimizer, e.g.
        # optimizer = optimizers.SGD(learning_rate=lr)
        # optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2, w3, b3]))
        # Option ②: without an optimizer, perform gradient descent by hand.
        # The update must be in place: writing w1 = w1 - lr * grads[0] would rebind
        # w1 to a plain Tensor instead of a Variable and break training
        # (a short standalone demo follows the script).
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if batch_idx % 100 == 0:
            print("epoch = {0}, batch_idx = {1}, loss = {2}".format(epoch, batch_idx, float(loss)))

Output:

x.shape = (60000, 28, 28), x[0] = 
     0    1    2    3         4  ...         23   24   25   26   27
0   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
1   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
2   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
3   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
4   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
5   0.0  0.0  0.0  0.0  0.000000 ...   0.498039  0.0  0.0  0.0  0.0
6   0.0  0.0  0.0  0.0  0.000000 ...   0.250980  0.0  0.0  0.0  0.0
7   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
8   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
9   0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
10  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
11  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
12  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
13  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
14  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
15  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
16  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
17  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
18  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
19  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
20  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
21  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
22  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
23  0.0  0.0  0.0  0.0  0.215686 ...   0.000000  0.0  0.0  0.0  0.0
24  0.0  0.0  0.0  0.0  0.533333 ...   0.000000  0.0  0.0  0.0  0.0
25  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
26  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0
27  0.0  0.0  0.0  0.0  0.000000 ...   0.000000  0.0  0.0  0.0  0.0

[28 rows x 28 columns]
y.shape = (60000,), y = [5 0 4 ... 5 6 8]

Dataset characteristics:
x.shape = (60000, 28, 28), y.shape = (60000,), x.dtype = <dtype: 'float32'>, y.dtype = <dtype: 'int32'>
tf.reduce_min(x) = 0.0, tf.reduce_max(x) = 1.0
tf.reduce_min(y) = 0, tf.reduce_max(y) = 9

Shape of each batch: x.shape = (128, 28, 28), y.shape = (128,)
epoch = 0, batch_idx = 0, loss = 0.31570330262184143
epoch = 0, batch_idx = 100, loss = 0.2133779525756836
epoch = 0, batch_idx = 200, loss = 0.18124043941497803
epoch = 0, batch_idx = 300, loss = 0.185683935880661
epoch = 0, batch_idx = 400, loss = 0.16511934995651245
epoch = 1, batch_idx = 0, loss = 0.16189470887184143
epoch = 1, batch_idx = 100, loss = 0.1595584750175476
epoch = 1, batch_idx = 200, loss = 0.14553877711296082
epoch = 1, batch_idx = 300, loss = 0.151431605219841
epoch = 1, batch_idx = 400, loss = 0.13905295729637146
epoch = 2, batch_idx = 0, loss = 0.13708670437335968
epoch = 2, batch_idx = 100, loss = 0.13877740502357483
epoch = 2, batch_idx = 200, loss = 0.1271398961544037
epoch = 2, batch_idx = 300, loss = 0.13122180104255676
epoch = 2, batch_idx = 400, loss = 0.12365438044071198
epoch = 3, batch_idx = 0, loss = 0.12157578766345978
epoch = 3, batch_idx = 100, loss = 0.12561878561973572
epoch = 3, batch_idx = 200, loss = 0.11524558067321777
epoch = 3, batch_idx = 300, loss = 0.11790748685598373
epoch = 3, batch_idx = 400, loss = 0.11349649727344513
epoch = 4, batch_idx = 0, loss = 0.11081816256046295
epoch = 4, batch_idx = 100, loss = 0.1163901537656784
epoch = 4, batch_idx = 200, loss = 0.10680149495601654
epoch = 4, batch_idx = 300, loss = 0.1084493026137352
epoch = 4, batch_idx = 400, loss = 0.1061994805932045
epoch = 5, batch_idx = 0, loss = 0.10276894271373749
epoch = 5, batch_idx = 100, loss = 0.10943224281072617
epoch = 5, batch_idx = 200, loss = 0.10031596571207047
epoch = 5, batch_idx = 300, loss = 0.10134726762771606
epoch = 5, batch_idx = 400, loss = 0.10063724219799042
epoch = 6, batch_idx = 0, loss = 0.09647054225206375
epoch = 6, batch_idx = 100, loss = 0.10392111539840698
epoch = 6, batch_idx = 200, loss = 0.09510777145624161
epoch = 6, batch_idx = 300, loss = 0.09576252847909927
epoch = 6, batch_idx = 400, loss = 0.09620489180088043
epoch = 7, batch_idx = 0, loss = 0.09139622002840042
epoch = 7, batch_idx = 100, loss = 0.09940839558839798
epoch = 7, batch_idx = 200, loss = 0.09078116714954376
epoch = 7, batch_idx = 300, loss = 0.09128470718860626
epoch = 7, batch_idx = 400, loss = 0.09251183271408081
epoch = 8, batch_idx = 0, loss = 0.0871615782380104
epoch = 8, batch_idx = 100, loss = 0.09560022503137589
epoch = 8, batch_idx = 200, loss = 0.08711323887109756
epoch = 8, batch_idx = 300, loss = 0.08756624162197113
epoch = 8, batch_idx = 400, loss = 0.08939112722873688
epoch = 9, batch_idx = 0, loss = 0.08357226848602295
epoch = 9, batch_idx = 100, loss = 0.09233171492815018
epoch = 9, batch_idx = 200, loss = 0.08394805341959
epoch = 9, batch_idx = 300, loss = 0.08437744528055191
epoch = 9, batch_idx = 400, loss = 0.08669549226760864

Process finished with exit code 0
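
The script loads the validation split (x_val, y_val) but never uses it. As a rough, illustrative sketch (not part of the original post), the trained parameters w1..b3 could be evaluated on that split as follows; the preprocessing mirrors the training path:

# Illustrative evaluation on the held-out split, reusing the trained weights
x_val = tf.convert_to_tensor(x_val, dtype=tf.float32) / 255.  # same scaling as training
y_val = tf.convert_to_tensor(y_val, dtype=tf.int32)
x_val_flat = tf.reshape(x_val, [-1, 28 * 28])                 # [10000, 784]

h1 = tf.nn.relu(x_val_flat @ w1 + b1)  # forward pass; no tape needed for inference
h2 = tf.nn.relu(h1 @ w2 + b2)
out = h2 @ w3 + b3                     # [10000, 10]

pred = tf.cast(tf.argmax(out, axis=1), dtype=tf.int32)  # predicted digit per image
accuracy = tf.reduce_mean(tf.cast(tf.equal(pred, y_val), tf.float32))
print("validation accuracy = {0}".format(float(accuracy)))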