TensorFlow CNN in practice: MNIST handwritten digit recognition (building a shallow network)

The MNIST handwritten digit recognition example is roughly the "Hello, World" of CNNs, much like it is for C, so it is a good reference for anyone just getting started with CNNs. I am a beginner myself, and corrections are welcome.

TensorFlow in practice: MNIST handwritten digit recognition

I won't introduce the MNIST dataset in detail here; it is easy to look up.

Using the TensorFlow framework, import the required packages

import os  # file and directory utilities
import numpy as np  # import numpy under the alias np
import tensorflow as tf  # import tensorflow under the alias tf
from tqdm import trange  # progress bar
from tensorflow.core.framework import summary_pb2  # summary protobuf (for extracting results)
from tensorflow.python.framework.graph_util import convert_variables_to_constants  # freeze model variables into the graph for saving

GPU settings

tf.logging.set_verbosity(tf.logging.ERROR)  # only log TensorFlow errors (suppress INFO/WARNING messages)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)  # cap each process at 30% of the GPU memory

Download the MNIST data and print the sizes of the training and test sets

This step downloads the MNIST dataset into the local MNIST_data directory, which takes a while; alternatively, you can download it from the official site yourself.

If your TensorFlow installation is missing the examples/tutorials folder, it is easy to find and download online.

As for one-hot encoding: simply put, the position belonging to the class is set to 1 and every other position is set to 0.

For example, with 10 classes:

1,[0,1,0,0,0,0,0,0,0,0]

9,[0,0,0,0,0,0,0,0,0,1]
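
A minimal numpy sketch of this encoding (for illustration only; the helper name one_hot is not part of the tutorial code, and it uses the np imported above):

def one_hot(label, num_classes=10):
    vec = np.zeros(num_classes, dtype=np.float32)  # all zeros...
    vec[label] = 1.0                               # ...except a 1 at the class index
    return vec

print(one_hot(1))  # [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
print(one_hot(9))  # [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]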

from tensorflow.examples.tutorials.mnist import input_data  # MNIST input helper
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)  # read the data with one-hot encoded labels
print(mnist.train.num_examples, mnist.test.num_examples)  # print the number of training and test examples

Output:

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
55000 10000

Define a FLAGS class to hold the training parameters, and compute the number of batches per epoch

class FLAGS:
    num_classes = 10  # 10 classes (digits 0-9)
    batch_size = 10  # samples per training batch
    global_epoch = 30  # number of training epochs
    model_dir = "best_model"  # model directory

n_batch = mnist.train.num_examples // FLAGS.batch_size  # batches per epoch (55000 // 10 = 5500)

Declare tensors: placeholder operations

The signature is tf.placeholder(dtype, shape=None, name=None).
Purpose of the function: reduce the number of ops created, and thus the overhead of the graph.
The reasoning: every tensor value on the graph is an op. If we split the training data into minibatches and added each one to the graph directly, every minibatch would become its own op, the graph would fill up with ops, and the overhead would be huge. With tf.placeholder we instead feed each minibatch into the same node, e.g. x = tf.placeholder(tf.float32, [None, 32]); every new minibatch simply replaces the previous one, so all the minibatches share a single op and no extra ops are created, which keeps the graph small. An "op" here can be understood as the node that produces a tensor value when data passes through the network.
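
A toy sketch of this reuse pattern (the shape and names are made up for illustration and are separate from the MNIST code below; it uses the tf and np imported above):

x = tf.placeholder(tf.float32, [None, 32], name='x')  # one op, reused for every minibatch
total = tf.reduce_sum(x)

with tf.Session() as sess:
    for _ in range(3):
        minibatch = np.random.rand(8, 32).astype(np.float32)
        print(sess.run(total, feed_dict={x: minibatch}))  # same placeholder node, new data each run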

inputs = tf.placeholder(tf.float32, [None, 784], name='inputs')  # shape [None, 784]: 784 is one flattened 28*28 MNIST image; None means the batch dimension can be any size; name is optional
y_true = tf.placeholder(tf.float32, [None, 10], name='ground_truth')  # labels
global_step = tf.get_variable('global_step', [], initializer=tf.constant_initializer(0), trainable=False)  # step counter initialized to 0 via tf.constant_initializer
is_training = tf.placeholder(tf.bool, name="phase_train")  # True during training (controls BN and dropout)

Define the convolution operation (input, number of filters, stride)

Add a BN (batch normalization) layer, and use ReLU as the activation function.

def conv2d(x, filters, kernel_size=3, strides=1, is_training=True):  # convolution block
    x = tf.layers.conv2d(inputs=x, filters=filters, kernel_size=kernel_size, strides=strides, padding='same')
    x = tf.layers.batch_normalization(inputs=x, training=is_training)  # BN layer
    return tf.nn.relu(x)  # ReLU activation

Define max pooling (input, stride)

def maxpool2d(x, strides=2):  # max pooling
    return tf.layers.max_pooling2d(inputs=x, pool_size=2, strides=strides, padding='same')

CNN (convolutional neural network)

In [-1, 28, 28, 1], -1 means the size of that dimension is computed automatically, 28*28 is the image height*width, and 1 is the number of channels; MNIST images are grayscale, so there is 1 channel (RGB color images would have 3).
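
A small numpy sketch of what the -1 dimension does (illustration only, using the np imported above):

batch = np.zeros((5, 784), dtype=np.float32)  # 5 flattened MNIST-sized images
images = batch.reshape(-1, 28, 28, 1)         # -1 is inferred as 5 from the total element count
print(images.shape)                           # (5, 28, 28, 1)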

The network structure is shallow: two convolutional layers, two pooling layers, and two fully connected layers.

def conv_net(inputs, num_classes=10, is_training=True):  # CNN
    inputs = tf.reshape(inputs, shape=[-1, 28, 28, 1])  # reshape the flat input into images
    conv1 = conv2d(inputs, 32, kernel_size=5, strides=1, is_training=is_training)  # convolution 1
    conv1 = maxpool2d(conv1, strides=2)  # pooling 1

    conv2 = conv2d(conv1, 64, kernel_size=5, strides=1, is_training=is_training)  # convolution 2
    conv2 = maxpool2d(conv2, strides=2)  # pooling 2

Flatten: after the second pooling the feature map has shape [-1, 7, 7, 64], and flattening it gives (n, 3136); a quick check of that number follows the function.

Two fully connected layers (input, number of neurons, activation function). Dropout is used against overfitting, i.e. the model fitting the training data too closely and generalizing poorly. The last fully connected layer has 10 neurons, one per class (digits 0-9).

    fc1 = tf.reshape(conv2, [-1, 7 * 7 * 64])  # flatten conv2
    fc1 = tf.layers.dense(fc1, 1024, activation=tf.nn.relu)  # fully connected layer (input, units, activation)
    fc1 = tf.layers.dropout(fc1, rate=0.3, training=is_training)  # drop 30% of activations to reduce overfitting
    end_point = tf.layers.dense(fc1, num_classes)  # fully connected output layer (units = 10)
    return end_point
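
A quick check of where 7 * 7 * 64 = 3136 comes from (a throwaway calculation, not part of the tutorial code): each 'same'-padded max pool with stride 2 halves the spatial size.

h = 28
h = h // 2          # after maxpool1: 14
h = h // 2          # after maxpool2: 7
print(h * h * 64)   # 3136, the flattened length fed into fc1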

Build the network for training

end_point = conv_net(inputs, num_classes=FLAGS.num_classes, is_training=is_training)  # build the network graph
result_cls = tf.argmax(end_point, axis=-1)  # index of the largest logit, i.e. the predicted class
result_cls = tf.identity(result_cls, name="result_cls")  # give the prediction node a fixed name so it can be looked up later

Compute the accuracy

accurary = tf.metrics.accuracy(labels=tf.argmax(y_true, axis=-1), predictions=result_cls)[1]  # tf.metrics.accuracy returns (value, update_op); [1] keeps the update op of a streaming accuracy
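
Note that this is a streaming metric backed by local variables, which is why tf.local_variables_initializer() is part of the init op below, and why the reported TrainAcc/TestAcc fold in everything evaluated so far. If a plain per-batch accuracy were wanted instead, a minimal sketch (an alternative, not what this tutorial uses) would be:

correct = tf.equal(tf.argmax(y_true, axis=-1), result_cls)     # element-wise: prediction == label
batch_accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))  # fraction correct in the current feed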

Compute the cross-entropy loss

loss = tf.losses.softmax_cross_entropy(onehot_labels=y_true, logits=end_point)  # softmax cross-entropy between the one-hot labels and the logits
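
For reference, the softmax cross-entropy of one example with one-hot label y and logits z is -sum_i y_i * log(softmax(z)_i). A tiny numpy sketch with made-up numbers (illustration only, using the np imported above):

logits = np.array([2.0, 0.5, -1.0])            # made-up logits for a 3-class example
label = np.array([1.0, 0.0, 0.0])              # one-hot ground truth
probs = np.exp(logits) / np.exp(logits).sum()  # softmax
print(-np.sum(label * np.log(probs)))          # cross-entropy, roughly 0.24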

Backpropagate the loss, make sure the batch-normalization statistics are updated, and optimize with Adam

update_opts = tf.get_collection(tf.GraphKeys.UPDATE_OPS)  # collect the BN moving-average update ops
with tf.control_dependencies([tf.group(*update_opts)]):  # run those update ops before anything inside this block
    var_list = tf.trainable_variables()  # all variables created with trainable=True
    g_list = tf.global_variables()  # all global variables
    bn_moving_vars = [g for g in g_list if 'moving_mean' in g.name]  # BN moving means
    bn_moving_vars += [g for g in g_list if 'moving_variance' in g.name]  # BN moving variances
    var_list += bn_moving_vars  # add the BN statistics to the variable list
    loss = loss  # cross-entropy loss from above
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss, var_list=var_list,
                                                                    global_step=global_step)  # Adam: adaptive gradient descent that also uses second-moment (squared-gradient) estimates
    pass

Create a session, initialize the variables, and print results so the whole training process can be followed

init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())  # group the global and local initializers (the local ones back tf.metrics.accuracy)

# with tf.Session() as sess:
with tf.Session(
        config=tf.ConfigProto(gpu_options=gpu_options)) as sess:  # tf.ConfigProto configures the session at creation time (here: the GPU memory cap)
    sess.run(init)
    bast_score = 0.995  # threshold for keeping the best model (not used in the loop shown below)
    for epoch in range(FLAGS.global_epoch):
        for batch in trange(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            [__train_op, __total_loss, __accurary, __global_step] = sess.run([train_op, loss, accurary, global_step],
                                                                             feed_dict={inputs: batch_xs,
                                                                                        y_true: batch_ys,
                                                                                        is_training: True})
            pass
        acc_train = sess.run(accurary,
                             feed_dict={inputs: mnist.train.images, y_true: mnist.train.labels, is_training: False})
        acc_test = sess.run(accurary,
                            feed_dict={inputs: mnist.test.images, y_true: mnist.test.labels, is_training: False})
        print("Iter:{:3d}, loss: {}, TrainAcc: {}, TestAcc: {}".format(epoch, '%.4f' % __total_loss, '%.4f' % acc_train,
                                                                       '%.4f' % acc_test))

The output is as follows:

100%|██████████| 5500/5500 [01:52<00:00, 48.85it/s]

Iter:  0, loss: 0.0610, TrainAcc: 0.9682, TestAcc: 0.9695
100%|██████████| 5500/5500 [01:51<00:00, 49.32it/s]
2022-04-24 09:25:45.217825: W tensorflow/core/framework/allocator.cc:107] Allocation of 5519360000 exceeds 10% of system memory.
Iter:  1, loss: 0.0121, TrainAcc: 0.9778, TestAcc: 0.9783
100%|██████████| 5500/5500 [01:53<00:00, 48.46it/s]
Iter:  2, loss: 0.0564, TrainAcc: 0.9820, TestAcc: 0.9823
100%|██████████| 5500/5500 [01:52<00:00, 48.94it/s]
Iter:  3, loss: 0.0001, TrainAcc: 0.9847, TestAcc: 0.9848
100%|██████████| 5500/5500 [01:51<00:00, 49.36it/s]
Iter:  4, loss: 0.0008, TrainAcc: 0.9862, TestAcc: 0.9862
100%|██████████| 5500/5500 [01:51<00:00, 49.30it/s]
Iter:  5, loss: 0.0000, TrainAcc: 0.9875, TestAcc: 0.9876
100%|██████████| 5500/5500 [01:52<00:00, 48.97it/s]
Iter:  6, loss: 0.0002, TrainAcc: 0.9887, TestAcc: 0.9887
100%|██████████| 5500/5500 [01:54<00:00, 48.07it/s]
Iter:  7, loss: 0.0002, TrainAcc: 0.9896, TestAcc: 0.9897
100%|██████████| 5500/5500 [02:01<00:00, 45.44it/s]
Iter:  8, loss: 0.8875, TrainAcc: 0.9902, TestAcc: 0.9902
100%|██████████| 5500/5500 [01:57<00:00, 46.70it/s]
Iter:  9, loss: 0.0000, TrainAcc: 0.9909, TestAcc: 0.9909
100%|██████████| 5500/5500 [01:57<00:00, 46.75it/s]
Iter: 10, loss: 0.0000, TrainAcc: 0.9915, TestAcc: 0.9915
100%|██████████| 5500/5500 [01:57<00:00, 46.75it/s]
Iter: 11, loss: 0.0000, TrainAcc: 0.9920, TestAcc: 0.9920
100%|██████████| 5500/5500 [01:57<00:00, 46.65it/s]
Iter: 12, loss: 0.0000, TrainAcc: 0.9924, TestAcc: 0.9924
100%|██████████| 5500/5500 [01:57<00:00, 46.87it/s]
Iter: 13, loss: 0.0000, TrainAcc: 0.9928, TestAcc: 0.9927
100%|██████████| 5500/5500 [01:58<00:00, 46.40it/s]
Iter: 14, loss: 0.0000, TrainAcc: 0.9931, TestAcc: 0.9931
100%|██████████| 5500/5500 [01:57<00:00, 46.69it/s]
Iter: 15, loss: 0.0000, TrainAcc: 0.9934, TestAcc: 0.9934
100%|██████████| 5500/5500 [01:56<00:00, 47.10it/s]
Iter: 16, loss: 0.0000, TrainAcc: 0.9937, TestAcc: 0.9937
100%|██████████| 5500/5500 [01:53<00:00, 48.45it/s]
Iter: 17, loss: 0.0000, TrainAcc: 0.9939, TestAcc: 0.9939
100%|██████████| 5500/5500 [01:53<00:00, 48.29it/s]
Iter: 18, loss: 0.0000, TrainAcc: 0.9941, TestAcc: 0.9941
100%|██████████| 5500/5500 [01:54<00:00, 48.22it/s]
Iter: 19, loss: 0.0000, TrainAcc: 0.9943, TestAcc: 0.9943
100%|██████████| 5500/5500 [01:56<00:00, 47.09it/s]
Iter: 20, loss: 0.0000, TrainAcc: 0.9945, TestAcc: 0.9945
100%|██████████| 5500/5500 [01:55<00:00, 47.52it/s]
Iter: 21, loss: 0.0000, TrainAcc: 0.9947, TestAcc: 0.9947
100%|██████████| 5500/5500 [01:55<00:00, 47.80it/s]
Iter: 22, loss: 0.0000, TrainAcc: 0.9948, TestAcc: 0.9948
100%|██████████| 5500/5500 [01:57<00:00, 46.95it/s]
Iter: 23, loss: 0.0000, TrainAcc: 0.9950, TestAcc: 0.9950
100%|██████████| 5500/5500 [01:57<00:00, 46.65it/s]
Iter: 24, loss: 0.0000, TrainAcc: 0.9951, TestAcc: 0.9951
100%|██████████| 5500/5500 [01:56<00:00, 47.31it/s]
Iter: 25, loss: 0.0008, TrainAcc: 0.9952, TestAcc: 0.9952
100%|██████████| 5500/5500 [01:57<00:00, 46.85it/s]
Iter: 26, loss: 0.0000, TrainAcc: 0.9953, TestAcc: 0.9953
100%|██████████| 5500/5500 [01:57<00:00, 46.71it/s]
Iter: 27, loss: 0.0000, TrainAcc: 0.9954, TestAcc: 0.9954
100%|██████████| 5500/5500 [01:57<00:00, 46.71it/s]
Iter: 28, loss: 0.0000, TrainAcc: 0.9955, TestAcc: 0.9955
100%|██████████| 5500/5500 [01:56<00:00, 47.12it/s]
Iter: 29, loss: 0.0000, TrainAcc: 0.9956, TestAcc: 0.9956

Process finished with exit code 0
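
The os and convert_variables_to_constants imports, FLAGS.model_dir, and bast_score are never actually used in the loop above; they suggest saving the best model as a frozen graph. A hedged sketch of what that could look like inside the epoch loop, right after acc_test is computed (the file name frozen_model.pb is my own choice):

        if acc_test > bast_score:
            bast_score = acc_test
            frozen = convert_variables_to_constants(
                sess, sess.graph_def, output_node_names=["result_cls"])  # 'result_cls' was named via tf.identity above
            if not os.path.exists(FLAGS.model_dir):
                os.makedirs(FLAGS.model_dir)
            with tf.gfile.GFile(os.path.join(FLAGS.model_dir, "frozen_model.pb"), "wb") as f:
                f.write(frozen.SerializeToString())  # write the frozen GraphDef to disk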

Please contact the author before reposting.
