TensorFlow Object Detection API: adding background images for training

Latest recommended article published on 2020-12-02 10:29:48

northeastsqure

Tags: tensorflow, background

Original link: https://github.com/tensorflow/models/issues/3365

OK, for future reference, this is how I add background images to the dataset so that the model can train on them.
The scripts used are from datitran/raccoon_dataset:

Generate CSV file -> xml_to_csv.py
Generate TFRecord from CSV file -> generate_tfrecord.py

First Step - Creating an XML file for each background image

Example of background image XML file

<annotation>
    <folder>test/</folder>
    <filename>XXXXXX.png</filename>
    <path>your_path/test/XXXXXX.png</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>640</width>
        <height>640</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
</annotation>

Basically, you remove every <object> element (i.e. the file contains no annotations).
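If there are many background images, writing these files by hand gets tedious. Below is a minimal sketch of generating one with Python's standard xml.etree.ElementTree. The helper name make_background_xml and its parameters are hypothetical and not part of raccoon_dataset; the layout simply mirrors the example above.

```python
import xml.etree.ElementTree as ET


def make_background_xml(filename, folder, path, width, height, depth=3):
    """Build a Pascal VOC style annotation with no <object> elements.

    Hypothetical helper for illustration; not part of raccoon_dataset.
    """
    root = ET.Element('annotation')
    ET.SubElement(root, 'folder').text = folder
    ET.SubElement(root, 'filename').text = filename
    ET.SubElement(root, 'path').text = path
    source = ET.SubElement(root, 'source')
    ET.SubElement(source, 'database').text = 'Unknown'
    size = ET.SubElement(root, 'size')
    # Child order matters: the CSV step reads size[0] as width, size[1] as height.
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = str(depth)
    ET.SubElement(root, 'segmented').text = '0'
    return ET.tostring(root, encoding='unicode')


xml_text = make_background_xml(
    'XXXXXX.png', 'test/', 'your_path/test/XXXXXX.png', 640, 640)
print(xml_text)
```

The returned string can be written next to the image as XXXXXX.xml, so that the CSV step picks it up like any other annotation file.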

Second Step - Generate CSV file

In xml_to_csv.py I made one small change so that XML files without any annotation (the background images) are also handled.
From the original:
https://github.com/datitran/raccoon_dataset/blob/93938849301895fb73909842ba04af9b602f677a/xml_to_csv.py#L12-L22

I add:

        # Goes inside the per-file loop of xml_to_csv(), so value is reset for every XML file.
        value = None
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
        if value is None:
            # No <object> was found: background image, so write -1 placeholders
            # for the class and the four box coordinates.
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     '-1',
                     '-1',
                     '-1',
                     '-1',
                     '-1'
                     )
            xml_list.append(value)

I'm just writing negative values for the bounding box coordinates if there is no <object> in the XML file, which is the case for the background images. This will be useful when generating the TFRecords.
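To see the effect end to end, here is a self-contained sketch of the same idea. It is an illustration under assumptions, not the raccoon_dataset script itself: the function name xml_dir_to_rows is hypothetical, it returns plain tuples to stay free of third-party dependencies, and it assumes the labelImg layout in which the first child of <object> is the class name and the fifth is <bndbox>.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET


def xml_dir_to_rows(path):
    """Return one row per object, or one placeholder row per background image."""
    rows = []
    for xml_file in sorted(glob.glob(os.path.join(path, '*.xml'))):
        root = ET.parse(xml_file).getroot()
        filename = root.find('filename').text
        width = int(root.find('size')[0].text)
        height = int(root.find('size')[1].text)
        objects = root.findall('object')
        for member in objects:
            # Assumed layout: member[0] is <name>, member[4] is <bndbox>.
            rows.append((filename, width, height, member[0].text,
                         int(member[4][0].text), int(member[4][1].text),
                         int(member[4][2].text), int(member[4][3].text)))
        if not objects:
            # Background image: class and coordinates are all -1.
            rows.append((filename, width, height, '-1', -1, -1, -1, -1))
    return rows


BACKGROUND = """<annotation>
    <filename>bg.png</filename>
    <size><width>640</width><height>640</height><depth>3</depth></size>
</annotation>"""

LABELLED = """<annotation>
    <filename>obj.png</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>raccoon</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
    </object>
</annotation>"""

with tempfile.TemporaryDirectory() as tmp:
    for name, text in (('bg.xml', BACKGROUND), ('obj.xml', LABELLED)):
        with open(os.path.join(tmp, name), 'w') as handle:
            handle.write(text)
    rows = xml_dir_to_rows(tmp)

for row in rows:
    print(row)
```

The background file contributes exactly one row, which is what lets the TFRecord step recognise it later by its negative xmin.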

Third and Final Step - Generating the TFRecords

Now, when creating the TFRecords, if the corresponding row/image has negative coordinates, I just add zero values to the record (before, this would not even be possible).

So from the original:
https://github.com/datitran/raccoon_dataset/blob/93938849301895fb73909842ba04af9b602f677a/generate_tfrecord.py#L60-L66

I add:

    for index, row in group.object.iterrows():
        if int(row['xmin']) > -1:
            # Regular annotation: normalize the box to the [0, 1] range.
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))
        else:
            # Background image (xmin is -1): store an all-zero box.
            xmins.append(0)
            xmaxs.append(0)
            ymins.append(0)
            ymaxs.append(0)
            classes_text.append('something'.encode('utf8'))  # this does not matter for the background
            classes.append(5000)

Note that the classes_text string in the else branch can be whatever you like: background images have no bounding boxes, so this string will not appear anywhere.

And lastly, for classes (in the else branch), you just need to add a numeric label that does not belong to any of your own classes.
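The branching logic can be tried without TensorFlow or pandas. The sketch below is a simplified stand-in, not the generate_tfrecord.py code: build_box_lists, label_map and background_id are hypothetical names, rows are plain dicts keyed by the CSV columns, and the check that the background label is unused is an addition for illustration.

```python
def build_box_lists(rows, width, height, label_map, background_id=5000):
    """Build the per-image lists that would go into the TFRecord features."""
    if background_id in label_map.values():
        # The background label must not collide with a real class id.
        raise ValueError('background_id clashes with an existing class id')
    out = {'xmins': [], 'xmaxs': [], 'ymins': [], 'ymaxs': [],
           'classes_text': [], 'classes': []}
    for row in rows:
        if int(row['xmin']) > -1:
            # Regular annotation: normalize coordinates to [0, 1].
            out['xmins'].append(row['xmin'] / width)
            out['xmaxs'].append(row['xmax'] / width)
            out['ymins'].append(row['ymin'] / height)
            out['ymaxs'].append(row['ymax'] / height)
            out['classes_text'].append(row['class'].encode('utf8'))
            out['classes'].append(label_map[row['class']])
        else:
            # Background row: all-zero box and the reserved label.
            for key in ('xmins', 'xmaxs', 'ymins', 'ymaxs'):
                out[key].append(0.0)
            out['classes_text'].append(b'background')
            out['classes'].append(background_id)
    return out


label_map = {'raccoon': 1}  # hypothetical class mapping
labelled = build_box_lists(
    [{'class': 'raccoon', 'xmin': 64, 'ymin': 32, 'xmax': 320, 'ymax': 160}],
    width=640, height=320, label_map=label_map)
background = build_box_lists(
    [{'class': '-1', 'xmin': -1, 'ymin': -1, 'xmax': -1, 'ymax': -1}],
    width=640, height=320, label_map=label_map)
print(labelled)
print(background)
```

Raising an error when the reserved label collides with a real class is one way to enforce the rule above; the value 5000 is simply the one used in the snippet.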

For those who are wondering, I've used this procedure many times, and it currently works for my use cases.

Hope this helps in some way.