Image Classification Project (Part 2): Reading a Custom TFRecord Dataset, Training a Convolutional Network, and Saving a pb Model

Author: LTyyCFY (pinned post)

Categories: deep learning, python, tensorflow

Article link: https://blog.csdn.net/LTyyCFY/article/details/99978430

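An image classification project with a custom dataset and a custom model, in which a pb model exported from TensorFlow is called from Qt and C++.
Step 1: creating and reading (verifying) the TFRecord dataset
Step 2: this article
Step 3: reading the custom pb model with OpenCV, with hand-written softmax and argmax functions that output the classification result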
Flowchart:
[Figure: project flowchart]
Main references:
Calling a trained TensorFlow model from C++ on Windows
TensorFlow CNN image classification, with model saving and loading
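The pitfalls along the way nearly wore me out. Because my project is built on Qt, after a lot of searching there seemed to be three approaches:
1. Save a pb model and run it through the TensorFlow C++ API.
2. Call a .py file.
3. Save a pb model and load it with OpenCV's dnn module.
I went with the third: save a pb model and load it with the cv::dnn::readNetFromTensorflow() function from OpenCV's dnn module, so the Qt project only needs the OpenCV libraries added. The painful part is that, when saving the pb model, the path from the input node to the output node must not contain nodes such as div, softmax, argmax, or dropout; otherwise OpenCV reports an error like the one below when loading the model.
[Figure: screenshot of the OpenCV error]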
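
Because the exported graph has to stop before softmax and argmax, those two steps end up on the caller's side (in this series that happens in C++ in step 3). As a rough illustration of what that post-processing computes, here is a minimal pure-Python sketch. The logits values are made up for the example and are not output from the trained model.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical logits for one image over 5 flower classes
logits = [1.2, -0.3, 4.1, 0.0, 2.2]
probs = softmax(logits)
best = argmax(probs)
print("predicted class:", best, "probability:", round(probs[best], 3))
```

Since softmax is monotonic, argmax over the raw logits gives the same class index; softmax is only needed when a confidence value is wanted.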

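Supplementary notes

My laptop GPU is a 960M with tensorflow-gpu 1.4.0 installed. When reading the tfrecords file with batch_size set to 16, the machine's temperature climbs to around 73 °C. Earlier I ran the flower-dataset transfer learning code from the book "Tensorflow 实战Google深度学习框架", which takes an interesting approach: it reads each image, passes it through the Inception-v3 model to get a feature vector, and saves that feature vector to a txt file. With a large dataset, generating the txt files takes a long time, but with batch_size=100 (100 feature vectors per training step) training is fast and the CPU shows no obvious lag.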

The code is as follows:

import tensorflow as tf
from tensorflow.python.framework import graph_util
import matplotlib.pyplot as plt

tra_data_dir = 'flower_tra100.tfrecords'
val_data_dir = 'flower_val100.tfrecords'

W = 100  # original image width
H = 100  # original image height
Channels = 3 # number of channels in the original image
 
def read_and_decode2stand(tfrecords_file, batch_size):
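    '''Read and decode a TFRecord file and produce (image, label) batches.
    Relies on the module-level n_classes set in the __main__ block.
    Args:
        tfrecords_file: path to the TFRecord file
        batch_size: number of examples per batch
    Returns:
        image_batch: 4-D tensor - [batch_size, height, width, channel]
        label_batch: 2-D tensor - [batch_size, n_classes]
    '''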
    # tf.train.string_input_producer builds an input queue from the list of
    # files given at construction. The queue initially holds every file in
    # the list; the shuffle argument controls whether they are shuffled.
    filename_queue = tf.train.string_input_producer([tfrecords_file])
    # Create a reader for the examples in the TFRecord file
    reader = tf.TFRecordReader()
    # Read one example from the file; read_up_to reads several at once
    _, serialized_example = reader.read(filename_queue)  # returns (key, serialized record)
    # Parse the single example just read; parse_example handles several at once
    img_features = tf.parse_single_example(
            serialized_example,
            features={
                    # tf.FixedLenFeature parses into a tensor
                    'label': tf.FixedLenFeature([], tf.int64),
                    'image_raw': tf.FixedLenFeature([], tf.string),
                    })  # the parsed features holding image and label
    
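    # tf.decode_raw turns the byte string into the image's pixel array
    image = tf.decode_raw(img_features['image_raw'], tf.uint8)
    # Restore the image shape from the known image size
    image = tf.reshape(image, [H, W,Channels])
    # Convert the image to float and normalize it to [0, 1]
    # image = image.astype('float32');image /= 255
    image = tf.cast(image, tf.float32) * (1.0 /255)
    # Per-image standardization centers the data by removing the mean, which
    # tends to help generalization after training. It linearly scales the image
    # to zero mean and unit variance, computing (x - mean) / adjusted_stddev
    # image = tf.image.per_image_standardization(image)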
 
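    # When using another dataset, change the image size accordingly
    label = tf.cast(img_features['label'], tf.int32)
    # Grouping input examples into a batch makes training more efficient.
    # image and label are a training sample and its correct label.
    # batch_size: number of examples per batch
    # num_threads: number of threads running the enqueue op in parallel
    # capacity: maximum number of examples held in the batching queue. Too large
    # uses a lot of memory; too small and dequeue may block for lack of data,
    # which slows training down.
    image_batch, label_batch = tf.train.batch([image, label],
                                                batch_size= batch_size,
                                                num_threads= 4,
                                                capacity = 2000)
    # Map the class vector (integers in 0..n_classes-1) to a binary class
    # matrix, i.e. re-encode it as one-hot
    label_batch = tf.one_hot(label_batch, depth= n_classes)
    label_batch = tf.cast(label_batch, dtype=tf.int32)
    label_batch = tf.reshape(label_batch, [batch_size, n_classes])
    # A tensor stores a computation rather than a value; its three main
    # attributes are name, shape and dtype
    print(label_batch)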
    return image_batch, label_batch
 
def build_network(height, width, channel, n_classes, train, regularizer):
    """Build the convolutional neural network."""
    
    # Define two placeholders for the input data
    x = tf.placeholder(tf.float32, shape=[None, height, width, channel],
                       name="input")  # this name matters: the model is fed through it later
    y = tf.placeholder(tf.int32, shape=[None, n_classes], name="labels_placeholder")
   
    with tf.variable_scope('layer1-conv1'):
        conv1_weights = tf.get_variable(
                "weight",[5,5,3,32],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable(
                "bias", [32], initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(
                x, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))

    with tf.name_scope("layer2-pool1"):
        pool1 = tf.nn.max_pool(
                relu1, ksize = [1,2,2,1],strides=[1,2,2,1],padding="VALID")

    with tf.variable_scope("layer3-conv2"):
        conv2_weights = tf.get_variable(
                "weight",[5,5,32,64],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable(
                "bias", [64], initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(
                pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))

    with tf.name_scope("layer4-pool2"):
        pool2 = tf.nn.max_pool(
                relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                padding='VALID')

    with tf.variable_scope("layer5-conv3"):
        conv3_weights = tf.get_variable(
                "weight",[3,3,64,128],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv3_biases = tf.get_variable(
                "bias", [128], initializer=tf.constant_initializer(0.0))
        conv3 = tf.nn.conv2d(
                pool2, conv3_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu3 = tf.nn.relu(tf.nn.bias_add(conv3, conv3_biases))

    with tf.name_scope("layer6-pool3"):
        pool3 = tf.nn.max_pool(
                relu3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                padding='VALID')

    with tf.variable_scope("layer7-conv4"):
        conv4_weights = tf.get_variable(
                "weight",[3,3,128,128],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv4_biases = tf.get_variable(
                "bias", [128], initializer=tf.constant_initializer(0.0))
        conv4 = tf.nn.conv2d(pool3, conv4_weights, strides=[1, 1, 1, 1],
                             padding='SAME')
        relu4 = tf.nn.relu(tf.nn.bias_add(conv4, conv4_biases))

    with tf.name_scope("layer8-pool4"):
        pool4 = tf.nn.max_pool(relu4, ksize=[1, 2, 2, 1],
                               strides=[1, 2, 2, 1], padding='VALID')
        nodes = 6*6*128
        reshaped = tf.reshape(pool4,[-1,nodes])

    with tf.variable_scope('layer9-fc1'):
        fc1_weights = tf.get_variable(
                "weight", [nodes, 1024],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc1_weights))
        fc1_biases = tf.get_variable(
                "bias", [1024], initializer=tf.constant_initializer(0.1))
        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
        if train:
            fc1 = tf.nn.dropout(fc1, 0.5)

    with tf.variable_scope('layer10-fc2'):
        fc2_weights = tf.get_variable(
                "weight", [1024, 512],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [512], initializer=tf.constant_initializer(0.1))

        fc2 = tf.nn.relu(tf.matmul(fc1, fc2_weights) + fc2_biases)
        if train:
            fc2 = tf.nn.dropout(fc2, 0.5)

    with tf.variable_scope('layer11-fc3'):
        fc3_weights = tf.get_variable(
                "weight", [512, n_classes],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc3_weights))
        fc3_biases = tf.get_variable(
                "bias", [n_classes], initializer=tf.constant_initializer(0.1))
        logits = tf.matmul(fc2, fc3_weights) + fc3_biases
    
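    # Small trick: multiply logits by 1 and name the result "output", so the
    # output tensor can be fetched by name when the model is loaded later
    b = tf.constant(value=1, dtype=tf.float32)
    logits_eval = tf.multiply(logits,b,name='output')
                
    # Cross entropy via softmax_cross_entropy_with_logits (no longer used)
    # cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=finaloutput, labels=y))*1000
    # logits is a batch x classes matrix, where classes is the number of classes
    # labels is a 1-D array of length batch holding each example's class index,
    # so the one-hot y is converted back to indices with argmax
    cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=tf.argmax(y, 1)))
    # Add the L2 terms stored in the 'losses' collection so that the
    # regularizer actually contributes to the cost being minimized
    reg_losses = tf.get_collection('losses')
    if reg_losses:
        cost = cost + tf.add_n(reg_losses)
    # Define the backpropagation step that optimizes the network parameters
    optimize = tf.train.AdamOptimizer(0.001).minimize(cost)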
    
    finaloutput = tf.nn.softmax(logits)
    prediction_labels = tf.argmax(finaloutput, axis=1)
    read_labels = tf.argmax(y, axis=1)
    
    # Compare the two tensors element-wise: True where equal, False otherwise
    correct_prediction = tf.equal(prediction_labels, read_labels)
    # Cast the booleans to floats and take the mean,
    # which is the model's accuracy on this batch
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Number of correct predictions in the batch
    correct_times_in_batch = tf.reduce_sum(tf.cast(correct_prediction, tf.int32))
 
    return dict(
        x=x,
        y=y,
        optimize=optimize,
        correct_prediction=correct_prediction,
        correct_times_in_batch=correct_times_in_batch,
        cost=cost,
        accuracy=accuracy,
    )
 
 
def train_network(graph, batch_size, num_epochs, pb_file_path):
    # Batches from the training set
    tra_image_batch, tra_label_batch = read_and_decode2stand(
            tfrecords_file=tra_data_dir,batch_size= batch_size)
    # Batches from the validation set
    val_image_batch, val_label_batch = read_and_decode2stand(
            tfrecords_file=val_data_dir,batch_size= batch_size)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        # Initialize variables
        sess.run(init)
        # Create a tf.train.Coordinator to coordinate multiple threads
        coord = tf.train.Coordinator()
        # tf.train.start_queue_runners starts, by default, every QueueRunner
        # in the tf.GraphKeys.QUEUE_RUNNERS collection
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        epoch_delta = 20 # report accuracy and loss every 20 steps
        lost = []
        acc = []
        try:
            for epoch_index in range(num_epochs):
                tra_images,tra_labels = sess.run([tra_image_batch, tra_label_batch])
                # Train the network on the batch just fetched and update the parameters
                accuracy,mean_cost_in_batch,return_correct_times_in_batch,_=sess.run([graph['accuracy'],graph['cost'],graph['correct_times_in_batch'],graph['optimize']], feed_dict={
                    graph['x']: tra_images,
                    graph['y']: tra_labels
                })
                if epoch_index > 100:
                    lost.append(mean_cost_in_batch)
                    acc.append(accuracy)
                
                # Every epoch_delta steps, print results on the validation set
                if epoch_index % epoch_delta == 0:
                    # First report accuracy and loss on the training batch
                    print("index[%s]".center(50,'-')%epoch_index)
                    print("Train: cost_in_batch: {}, correct_in_batch: {}, accuracy: {}".format(mean_cost_in_batch,return_correct_times_in_batch,accuracy))
                    

                    # Then compute accuracy and loss on a validation batch
                    val_images, val_labels = sess.run([val_image_batch, val_label_batch])
                    mean_cost_in_batch,return_correct_times_in_batch = sess.run([graph['cost'],graph['correct_times_in_batch']], feed_dict={
                        graph['x']: val_images,
                        graph['y']: val_labels
                    })
                    print("***Val: cost_in_batch: {}, correct_in_batch: {}, accuracy: {}".format(mean_cost_in_batch,return_correct_times_in_batch,return_correct_times_in_batch/batch_size))
 
 
                if epoch_index % 50 == 0:
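                    # Convert the graph's variables and their values to constants
                    # and drop unneeded nodes. When only some of the computations
                    # matter, unrelated nodes need not be exported and saved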
                    constant_graph = graph_util.convert_variables_to_constants(
                            sess, sess.graph_def, ["output"])
                    with tf.gfile.FastGFile(pb_file_path, mode='wb') as f:
                        f.write(constant_graph.SerializeToString())
                
        except tf.errors.OutOfRangeError:  # raised when the input queue is exhausted
            print('Done training -- epoch limit reached')
        finally:
            plt.plot(lost)
            plt.savefig('ch_cost.png', bbox_inches='tight')
            plt.close() # close the figure after saving so the next plot starts clean
            plt.plot(acc)
            plt.savefig('ch_acc.png', bbox_inches='tight')
            # Call coord.request_stop() to stop all other threads
            coord.request_stop()
        # Wait for all threads to exit; the with block then closes the session
        coord.join(threads)
 
 
if __name__=="__main__":
    batch_size = 16    # number of examples per batch
    num_epochs = 3000 # number of training steps (one batch per step)
    n_classes = 5 # number of classes
 
    pb_file_path = "flower_cnn.pb"
    
    regularizer = tf.contrib.layers.l2_regularizer(0.0001)
    g = build_network(height=H, width=W, channel=3, n_classes=n_classes,
                      train=False, regularizer=regularizer)
    train_network(g, batch_size, num_epochs, pb_file_path)

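The nice thing about this code is that it is split into three modules: reading the tfrecords file to produce batches, building the network, and training the network while saving the pb model. Usually only the second part, the network definition, needs to change. It also uses exception handling.
Points to note:
1. The image height, width, number of classes, and so on must all match the data. They have to be changed by hand.
2. Note train=False in the second-to-last line. With train=True, dropout is applied after the fully connected layers, and OpenCV reports an error when reading the pb model.
3. Pay attention to where my output node is placed.
4. Before this, I had built my own VGG16 model in the network-building part, which was a pain: I could not find what was wrong, and training either converged quickly to 1.65 or kept oscillating. I was about to give up, then tried transfer learning, which reached an accuracy above 90%, but I could not work out how to save the model in the form I needed. Then I came across this convolutional network, tried it, and it worked. The recognition rate is decent, and it performs well after being moved into my actual project.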
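
Regarding point 1, one value that is easy to miss when changing the image size is the hard-coded `nodes = 6*6*128` before the first fully connected layer. The sketch below recomputes that number from the input size, assuming the same layout as the listing: four blocks of a stride-1 'SAME' convolution followed by a 2x2, stride-2 'VALID' max pool, with 128 channels at the end. The helper names are my own and not part of the original code.

```python
def conv_same(size, stride=1):
    # 'SAME' padding: output size = ceil(size / stride)
    return -(-size // stride)

def pool_valid(size, ksize=2, stride=2):
    # 'VALID' padding: output size = floor((size - ksize) / stride) + 1
    return (size - ksize) // stride + 1

def flattened_nodes(height, width, last_channels=128, n_blocks=4):
    """Return (h, w, h*w*channels) after n_blocks of conv + pool."""
    h, w = height, width
    for _ in range(n_blocks):
        h, w = conv_same(h), conv_same(w)
        h, w = pool_valid(h), pool_valid(w)
    return h, w, h * w * last_channels

print(flattened_nodes(100, 100))  # (6, 6, 4608), i.e. 6*6*128
```

For a 100x100 input the spatial size goes 100, 50, 25, 12, 6, which matches the value used in the listing. For any other input size, `nodes` has to be updated to the number this computation gives.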

Results

The figure below shows the training results on the flower dataset.
[Figure: training results on the flower dataset]

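Hmm, it looks like the model is overfitting, which is puzzling since regularization was clearly added. Two things worth checking are whether the L2 terms collected in 'losses' are actually summed into cost, and the fact that with train=False dropout is also switched off during training. I will keep testing and improving the model later.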
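The per-batch loss curve is shown below; the loss trends downward and converges.

The per-batch accuracy curve is shown below; the accuracy trends upward and converges.
[Figure: per-batch accuracy curve]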
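
On the overfitting question above: an L2 penalty only influences training if it is part of the value the optimizer minimizes. The pure-Python sketch below shows that sum on toy numbers. It uses the scale * sum(w**2) / 2 form, which as far as I know is the convention behind tf.contrib.layers.l2_regularizer; the weights and the data loss are made up for illustration.

```python
def l2_penalty(weights, scale):
    """scale * sum(w**2) / 2 for one group of weights."""
    return scale * sum(w * w for w in weights) / 2.0

def total_loss(data_loss, weight_groups, scale):
    """Data loss plus the L2 penalty of every weight group."""
    return data_loss + sum(l2_penalty(ws, scale) for ws in weight_groups)

# Hypothetical values: a cross-entropy of 0.5 and two tiny weight groups
groups = [[1.0, -2.0], [3.0]]
print(total_loss(0.5, groups, scale=0.0001))
```

With a scale as small as 0.0001 the penalty is tiny next to the data loss unless the weights are large, so on its own it may not be enough to stop a network of this size from overfitting a small dataset.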

Inspecting the nodes in the pb model

Here is a small script that lists the nodes in the pb model.

import tensorflow as tf
import os
 
model_name = 'flower_cnn.pb'
 
def create_graph():
    with tf.gfile.FastGFile(os.path.join(model_name), 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
 
create_graph()
tensor_name_list = [tensor.name for tensor in tf.get_default_graph().as_graph_def().node]
for tensor_name in tensor_name_list:
    print(tensor_name,'\n')
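
The script above prints node names only. To check for the problematic nodes mentioned earlier, it can help to look at each node's op type as well (every node in a GraphDef has both a `name` and an `op` field). The sketch below is a hypothetical helper that filters (name, op) pairs against a set of op types. The set itself is my assumption, derived from the div/softmax/argmax/dropout list in this article, and not an official list of what OpenCV rejects. It runs on a hand-written sample so that it does not need TensorFlow.

```python
# Assumed op types: Softmax, ArgMax, RealDiv (div), RandomUniform (used by dropout)
UNSUPPORTED_OPS = {"Softmax", "ArgMax", "RealDiv", "RandomUniform"}

def find_unsupported(nodes, unsupported=UNSUPPORTED_OPS):
    """Return the (name, op) pairs whose op type is in the unsupported set."""
    return [(name, op) for name, op in nodes if op in unsupported]

# Hand-written sample. With TensorFlow available, the pairs could be built as
# [(n.name, n.op) for n in graph_def.node]
sample = [
    ("input", "Placeholder"),
    ("layer1-conv1/Conv2D", "Conv2D"),
    ("Softmax", "Softmax"),
    ("output", "Mul"),
]
for name, op in find_unsupported(sample):
    print("check node:", name, "op:", op)
```

An empty result for the frozen graph would suggest that none of these op types sit between the input and output nodes.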