TensorFlow Object Detection API: adding background images for training

https://github.com/tensorflow/models/issues/3365

OK, for future reference, this is how I add background images to the dataset so that the model can train on them.
The scripts used are from datitran/raccoon_dataset:

  1. Generate CSV file -> xml_to_csv.py
  2. Generate TFRecord from CSV file -> generate_tfrecord.py

First Step - Creating an XML file for each background image

Example of background image XML file

<annotation>
    <folder>test/</folder>
    <filename>XXXXXX.png</filename>
    <path>your_path/test/XXXXXX.png</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>640</width>
        <height>640</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
</annotation>

Basically, you remove every <object> element (i.e. the file carries no annotation).
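If there are many background images, writing these files by hand gets tedious. Below is a minimal sketch of how such a file could be generated with Python's standard library. The helper name make_background_annotation and its parameters are my own assumptions for illustration; they are not part of raccoon_dataset.

```python
import xml.etree.ElementTree as ET


def make_background_annotation(filename, folder, path, width, height, depth=3):
    """Build an annotation like the example above, with no <object> element.

    Hypothetical helper, not part of raccoon_dataset.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = folder
    ET.SubElement(root, "filename").text = filename
    ET.SubElement(root, "path").text = path
    source = ET.SubElement(root, "source")
    ET.SubElement(source, "database").text = "Unknown"
    # Keep width and height as the first two children of <size>,
    # because xml_to_csv.py reads them by position.
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)
    ET.SubElement(root, "segmented").text = "0"
    return ET.tostring(root, encoding="unicode")


xml_text = make_background_annotation(
    "XXXXXX.png", "test/", "your_path/test/XXXXXX.png", 640, 640
)
print(xml_text)
```

The returned string can then be written to a .xml file next to the image, one file per background image.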

Second Step - Generate CSV file

I made a small change to xml_to_csv.py so that it also handles the XML files that do not have any annotation (the background images).
From the original:
https://github.com/datitran/raccoon_dataset/blob/93938849301895fb73909842ba04af9b602f677a/xml_to_csv.py#L12-L22

I add:

        value = None  # reset for each XML file
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
        if value is None:
            # no <object> found: this is a background image
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     '-1',
                     '-1',
                     '-1',
                     '-1',
                     '-1'
                     )
            xml_list.append(value)

I'm just writing -1 in place of the class and the bounding box coordinates when the XML file has no <object>, which is the case for the background images. This will be useful when generating the TFRecords.
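To see what rows this produces, here is a small self-contained sketch that applies the same fallback to two in-memory XML strings. It is a variation, not the original script: it looks elements up by tag name instead of by position, the function name annotation_to_rows is hypothetical, and the sample annotation values are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample data, for illustration only.
ANNOTATED = """<annotation>
    <filename>sample.jpg</filename>
    <size><width>800</width><height>600</height><depth>3</depth></size>
    <object>
        <name>raccoon</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
    </object>
</annotation>"""

BACKGROUND = """<annotation>
    <filename>XXXXXX.png</filename>
    <size><width>640</width><height>640</height><depth>3</depth></size>
    <segmented>0</segmented>
</annotation>"""


def annotation_to_rows(xml_text):
    """Return one CSV row per <object>, or a single -1 row for a background image."""
    root = ET.fromstring(xml_text)
    filename = root.find("filename").text
    width = int(root.find("size/width").text)
    height = int(root.find("size/height").text)
    rows = []
    for member in root.findall("object"):
        box = member.find("bndbox")
        rows.append((filename, width, height, member.find("name").text,
                     int(box.find("xmin").text), int(box.find("ymin").text),
                     int(box.find("xmax").text), int(box.find("ymax").text)))
    if not rows:
        # Background image: placeholder class and coordinates, as in the snippet above.
        rows.append((filename, width, height, "-1", "-1", "-1", "-1", "-1"))
    return rows


for row in annotation_to_rows(ANNOTATED) + annotation_to_rows(BACKGROUND):
    print(row)
```

The background image ends up as exactly one row whose class and coordinates are all -1, which is what the TFRecord step checks for.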

Third and Final Step - Generating the TFRecords

Now, when creating the TFRecords, if the corresponding row/image has negative coordinates, I just add zero values to the record (before, this would not even be possible).

So from the original:
https://github.com/datitran/raccoon_dataset/blob/93938849301895fb73909842ba04af9b602f677a/generate_tfrecord.py#L60-L66

I add:

for index, row in group.object.iterrows():
    if int(row['xmin']) > -1:
        # annotated image: store the normalized box and its class
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    else:
        # background image: zero-sized box and a label outside your own classes
        xmins.append(0)
        xmaxs.append(0)
        ymins.append(0)
        ymaxs.append(0)
        classes_text.append('something'.encode('utf8'))  # this does not matter for the background
        classes.append(5000)

Note that for classes_text (in the else branch), since the background images have no bounding boxes, you can replace the string with whatever you like; for the background cases it will not appear anywhere.

And lastly, for classes (in the else branch), you just need to use a numeric label that does not belong to any of your own classes.
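The branching can be tried without TensorFlow or pandas. The sketch below applies the same logic to plain dicts; build_box_lists, BACKGROUND_CLASS_ID and the one-class label map are assumptions made for illustration, and the sample rows are invented.

```python
BACKGROUND_CLASS_ID = 5000  # any id that is not one of your own classes


def class_text_to_int(label):
    # Hypothetical label map with a single class.
    return {"raccoon": 1}[label]


def build_box_lists(rows, width, height):
    """Collect normalized boxes; rows with negative coordinates become zero boxes."""
    out = {"xmins": [], "xmaxs": [], "ymins": [], "ymaxs": [],
           "classes_text": [], "classes": []}
    for row in rows:
        if int(row["xmin"]) > -1:
            out["xmins"].append(row["xmin"] / width)
            out["xmaxs"].append(row["xmax"] / width)
            out["ymins"].append(row["ymin"] / height)
            out["ymaxs"].append(row["ymax"] / height)
            out["classes_text"].append(row["class"].encode("utf8"))
            out["classes"].append(class_text_to_int(row["class"]))
        else:
            # Background row: zero box, placeholder text, out-of-range class id.
            for key in ("xmins", "xmaxs", "ymins", "ymaxs"):
                out[key].append(0)
            out["classes_text"].append("something".encode("utf8"))
            out["classes"].append(BACKGROUND_CLASS_ID)
    return out


annotated = build_box_lists(
    [{"class": "raccoon", "xmin": 200, "ymin": 150, "xmax": 400, "ymax": 300}],
    width=800, height=600)
background = build_box_lists(
    [{"class": "-1", "xmin": -1, "ymin": -1, "xmax": -1, "ymax": -1}],
    width=640, height=640)
print(annotated)
print(background)
```

In the real script these lists are what get passed into the tf.train.Example features, so a background image is stored with one zero-sized box and the placeholder label.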

For those who are wondering, I've used this procedure many times, and it currently works for my use cases.

Hope it helped in some way.
