Help: converting a flower image dataset to TFRecord files for TensorFlow classification, but the program does nothing when run

Latest recommended article published 2020-12-21 13:04:45

Rudderless.


Tags: python, artificial intelligence, deep learning, tensorflow

Article link: https://blog.csdn.net/Reflect1on/article/details/104652486


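Help needed: I am using TensorFlow to classify flower images and want to convert the images in the dataset into TFRecord files. When I run the program nothing happens, and the status just keeps showing the service as idle. I can't tell where the problem is. My program is below; I would appreciate any pointers on what went wrong.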

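# glob.glob returns a list of all file paths that match a pattern
import glob
# os.path builds the paths that glob searches
import os.path
# numpy is used here mainly for random numbers
import numpy as np
# the TensorFlow framework (this script uses the TensorFlow 1.x API)
import tensorflow as tf
# gfile is used to read the image files
from tensorflow.python.platform import gfile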
# input image directory
INPUT_DATA = 'E:/code/jupyter/201806-github代码数据打包/201806-github代码数据打包/datasets/flower_photos'
# output file for the training set
OUTPUT_FILE = 'E:/code/jupyter/201806-github代码数据打包/201806-github代码数据打包/flowers train/output.tfrecords'
# output file for the test set
OUTPUT_TEST_FILE = 'E:/code/jupyter/201806-github代码数据打包/201806-github代码数据打包/flowers train/output_test.tfrecords'
# output file for the validation set
OUTPUT_VALIDATION_FILE = 'E:/code/jupyter/201806-github代码数据打包/201806-github代码数据打包/flowers train/output_validation.tfrecords'
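# percentage of images assigned to the test and validation sets
VALIDATION_PERCENTAGE = 10
TEST_PERCENTAGE = 10

# helpers that wrap a single value in a tf.train.Feature for use in tf.train.Example
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def create_image_lists(sess,testing_percentage,validation_percentage):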
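    # collect INPUT_DATA and every directory beneath it (the root comes first)
    sub_dirs = [x[0] for x in os.walk(INPUT_DATA)]
    # the root directory itself needs no processing
    is_root_dir = True
    # label for the current class; 0-4 each stand for a different kind of flower
    current_label = 0
    # writing TFRecord data requires a writer to be created first
    # three writers store the training, test, and validation data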
    writer = tf.python_io.TFRecordWriter(OUTPUT_FILE)
    writer_test = tf.python_io.TFRecordWriter(OUTPUT_TEST_FILE)
    writer_validation = tf.python_io.TFRecordWriter(OUTPUT_VALIDATION_FILE)
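    # loop over the directories
    for sub_dir in sub_dirs:
        if is_root_dir:
            # skip the root directory
            is_root_dir = False
            continue
        # empty list to hold the image paths
        file_list = []
        # build the search pattern
        dir_name = os.path.basename(sub_dir)
        file_glob = os.path.join(INPUT_DATA, dir_name, '*.' + "jpg")
        # extend appends the matched paths to the list
        # glob.glob returns a list of all file paths that match the pattern
        # for example, glob.glob(r'c:*.txt') matches .txt files on drive C
        file_list.extend(glob.glob(file_glob))
        # skip this directory if it contains no matching files
        if not file_list: continue
        # index tracks the current progress so it can be printed
        index = 0
        # file_list now holds the image paths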
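        for file_name in file_list:
            # read the raw image bytes from the path with gfile
            with gfile.FastGFile(file_name, 'rb') as f:
                image_raw_data = f.read()
            # decode the JPEG; the result is a tensor
            image = tf.image.decode_jpeg(image_raw_data)

            # normalize the image: convert_image_dtype scales pixel values to [0, 1]
            # the pixel matrix is stored in the TFRecord as a byte string,
            # so the dtype is fixed to tf.float32 here to allow it to be
            # converted back when the record is read
            if image.dtype != tf.float32:
                image = tf.image.convert_image_dtype(image, dtype=tf.float32)
            # resize the image to 299*299 so the model can process it
            image = tf.image.resize_images(image, [299, 299])
            # run the op in the session to get the actual pixel values
            # note: these ops are added to the graph for every image, so the loop slows down as the graph grows
            image_value = sess.run(image)
            pixels = image_value.shape[1]
            # an array cannot be stored in a TFRecord directly,
            # so the matrix is converted to a byte string with tobytes()
            # and then wrapped in tf.train.BytesList for storage
            image_raw = image_value.tobytes()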

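            # store the data in features
            # (the example is randomly assigned to a split further down)
            # each TFRecord example holds three values: the image size in pixels,
            # the image tensor, converted to a byte string,
            # and the label of the current image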
            example = tf.train.Example(features=tf.train.Features(feature={
                'pixels': _int64_feature(pixels),
                'label': _int64_feature(current_label),
                'image_raw': _bytes_feature(image_raw)
            }))
            chance = np.random.randint(100)
            # randomly assign the example to the validation, test, or training set
            if chance < validation_percentage:
                writer_validation.write(example.SerializeToString())
            elif chance < (testing_percentage+validation_percentage):
                writer_test.write(example.SerializeToString())
            else:
                writer.write(example.SerializeToString())
            # print('example',index)
            index = index + 1

        # all images in one folder belong to the same class,
        # so the label is incremented after each folder is processed
        current_label += 1

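    writer.close()
    writer_validation.close()
    writer_test.close()

# defining create_image_lists does not run it; the function has to be called
# inside a session for any TFRecord file to be written
if __name__ == '__main__':
    with tf.Session() as sess:
        create_image_lists(sess, TEST_PERCENTAGE, VALIDATION_PERCENTAGE)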
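
One thing that may be worth ruling out when a script like this finishes without writing anything: `os.walk` yields nothing for a directory that does not exist, and folders without matching files are skipped by `if not file_list: continue`, so a wrong INPUT_DATA path fails silently. The sketch below is a TensorFlow-free dry run under that assumption. It is not part of the program above; `count_images_per_class`, `choose_split`, and `dry_run` are hypothetical helper names, and the split uses the same thresholds as the script but Python's `random` module instead of numpy.

```python
import glob
import os
import random


def count_images_per_class(input_dir, extension="jpg"):
    """Return {class_folder_name: image_count} for each direct subfolder.

    os.listdir raises an error if input_dir does not exist, which makes
    a wrong path visible instead of silently producing no work.
    """
    counts = {}
    for entry in sorted(os.listdir(input_dir)):
        class_dir = os.path.join(input_dir, entry)
        if not os.path.isdir(class_dir):
            continue
        # glob.escape protects against special characters in the folder path
        pattern = os.path.join(glob.escape(class_dir), "*." + extension)
        counts[entry] = len(glob.glob(pattern))
    return counts


def choose_split(chance, testing_percentage, validation_percentage):
    """Map a draw in [0, 100) to a split name using the script's thresholds."""
    if chance < validation_percentage:
        return "validation"
    if chance < testing_percentage + validation_percentage:
        return "test"
    return "train"


def dry_run(input_dir, testing_percentage=10, validation_percentage=10, seed=0):
    """Simulate the random split and return how many images land in each set."""
    rng = random.Random(seed)
    totals = {"train": 0, "test": 0, "validation": 0}
    for n_images in count_images_per_class(input_dir).values():
        for _ in range(n_images):
            split = choose_split(rng.randrange(100),
                                 testing_percentage, validation_percentage)
            totals[split] += 1
    return totals
```

Calling `count_images_per_class(INPUT_DATA)` before the real conversion shows whether each flower folder is found and how many `.jpg` files it holds. An empty dictionary or all-zero counts would mean the conversion loop has nothing to process.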