Pitfalls I hit while learning the TensorFlow API (1): converting a CSV file to a TFRecords file

weixin_41551411

Tags: tensorflow, deep learning

Article link: https://blog.csdn.net/weixin_41551411/article/details/105551374

I followed [this blog post](https://blog.csdn.net/dy_guox/article/details/79111949) to build my own object detection model.

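Every earlier step in that post worked without trouble, but I ran into problems when converting the CSV file to a TFRecords file.
The code is as follows: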

# -*- coding: utf-8 -*-
"""
Created on Tue Jan 16 01:04:55 2018
@author: Xiang Guo
Generate a TFRecord file from a CSV file
"""
 
"""
Usage:
  # From tensorflow/models/
  # Create train data:
  python generate_tfrecord.py --csv_input=data/tv_vehicle_labels.csv  --output_path=train.record
  # Create test data:
  python generate_tfrecord.py --csv_input=data/test_labels.csv  --output_path=test.record
"""
 
 
 
import os
import io
import pandas as pd
import tensorflow as tf
 
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict
 
os.chdir('D:\\tensorflow-model\\models\\research\\object_detection\\')
 
flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
 
 
# TODO: replace this with a label map
# Remember to change the labels below to your own classes!
def class_text_to_int(row_label):
    if row_label == 'tv':
        return 1
    elif row_label == 'vehicle':
        return 2
    else:
        return None
 
 
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
 
 
def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
 
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
 
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
 
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example
 
 
def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'images')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
 
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
 
 
if __name__ == '__main__':
    tf.app.run()
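
The TODO in `class_text_to_int` asks for a label map. A minimal sketch of one way to do that, assuming a plain dictionary is enough for a handful of classes; the name `LABEL_MAP` is hypothetical and not part of the original script. Unlike the original, it raises on an unknown label instead of returning `None`, so a typo in the CSV is reported at the row where it occurs.

```python
# Hypothetical label map; replace the names and ids with your own classes.
LABEL_MAP = {'tv': 1, 'vehicle': 2}


def class_text_to_int(row_label, label_map=LABEL_MAP):
    """Return the integer id for a class name, failing loudly on unknown names."""
    try:
        return label_map[row_label]
    except KeyError:
        # The original script returns None here; raising makes the bad row obvious.
        raise ValueError('Unknown class label: {!r}'.format(row_label))
```

The ids should match the ones in the label map file used for training.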

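After fixing the paths I ran the script, but it kept failing with the following error:
[screenshot of the error message]
A Baidu search suggested that the kernel needed to be restarted. I use Spyder (Python 3.6).
To restart the kernel, click the gear icon (the spiky circle) and choose "restart kernel".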

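Right after restarting the kernel, another problem appeared:

[screenshot of the second error message]
Every Baidu result said that the path was wrong, but I checked it several times and could not find anything wrong with it.
Eventually I found that someone abroad had hit the same problem, and a comment under their post said: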
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
Replace the 'Path to the CSV input' and 'Path to output TFRecord' with actual path.
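So I simply commented out this part of the source code:
[screenshot of the code]

Then, in the following part:
[screenshot of the code]
I replaced every path involved with the actual path.

That solved the problem.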
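
A note on the quoted advice: in `flags.DEFINE_string(name, default, help)` the third argument is only the help text, and the second is the default value, which is an empty string in the script. When the script is run from an IDE with no command-line arguments, both flags therefore stay empty, which may be why hard-coding the paths helped. A minimal sketch of an alternative, assuming the standard-library `argparse` in place of `tf.app.flags` (a swapped-in technique, not what the original script uses); the default paths are placeholders, not paths from this post.

```python
import argparse


def parse_args(argv=None):
    """Parse the two paths; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(
        description='Generate a TFRecord file from a CSV file')
    # Placeholder defaults (assumptions): used when no option is given,
    # for example when the script is run from an IDE.
    parser.add_argument('--csv_input', default='data/train_labels.csv',
                        help='Path to the CSV input')
    parser.add_argument('--output_path', default='train.record',
                        help='Path to output TFRecord')
    # parse_known_args ignores extra arguments that the environment may add.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Run from an IDE: pass an empty list to fall back on the defaults.
ide_args = parse_args([])
# Run from a shell: the same values arrive as command-line options.
cli_args = parse_args(['--csv_input=data/test_labels.csv',
                       '--output_path=test.record'])
```

With defaults in place, the same file works both from Spyder and from the command line, and nothing has to be commented out.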

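Important: if you follow my approach, the pictures in the train and test subfolders of images must be moved up into images itself, because my path only goes as far as images. For example, if you generate train.tfrecords first, move the image files from the train folder into the top-level images folder. You can also skip that step by adding one more directory level to the path, that is, images\train and images\test.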
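
Because a wrong image folder is easy to miss, it can help to check the paths before writing any records. A minimal sketch, assuming the CSV has the `filename` column that the script groups by; the helper name `missing_images` and the paths in the usage comment are hypothetical.

```python
import csv
import os


def missing_images(csv_path, image_dir):
    """Return the sorted filenames listed in the CSV that are absent from image_dir."""
    with open(csv_path, newline='') as f:
        # A set removes duplicates: one image has one CSV row per labelled box.
        names = {row['filename'] for row in csv.DictReader(f)}
    return sorted(n for n in names
                  if not os.path.isfile(os.path.join(image_dir, n)))


# Hypothetical usage with the extra directory level described above:
# image_dir = os.path.join(os.getcwd(), 'images', 'train')
# print(missing_images('data/train_labels.csv', image_dir))
```

An empty list means every file named in the CSV was found in that folder.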
