[TensorFlow Study Notes] Object Recognition, Part 4: Building Your Own Object Detection Model with the TensorFlow Object Detection API (Pitfalls and Fixes)

Latest recommended article published on 2024-05-15 16:33:35

Jarvis_lele

Category: # Deep Learning

Article link: https://blog.csdn.net/Jarvis_lele/article/details/107992973


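1. Dataset preparation

1.1 Prepare the training and validation datasets

1) This example detects water cups. I photographed over a hundred cups at desks in my company's office; large datasets for many other objects are available online, so feel free to download and use one of those instead.

Once you have a training set and a test set, create train and test folders under \models-master\research\object_detection\test_images and copy the corresponding images into them.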

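1.2 Data augmentation

We only collected roughly 800 images, which is not enough for training, so we use data augmentation (essentially rotation, added noise, and brightness adjustment) to enlarge the training set.

Compared with Augmentor, imgaug offers more features; for example, while augmenting an image it can apply the same transformation to keypoints and bounding boxes. In object detection the training set consists of images plus their bounding box files, so when an image is augmented, the transformed bounding box coordinates are computed at the same time and written to a matching bounding box file.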
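To make the idea concrete, here is a minimal sketch of what "transforming the bounding box together with the image" means in the simplest case, a horizontal flip. It is plain Python and not imgaug's implementation; the helper name `flip_box_horizontally` and the sample numbers are made up for illustration, and coordinates are treated as continuous positions between 0 and the image width.

```python
def flip_box_horizontally(box, image_width):
    """Mirror an [xmin, ymin, xmax, ymax] box across the vertical centre line.

    Hypothetical helper for illustration only.
    """
    xmin, ymin, xmax, ymax = box
    # After mirroring, the old right edge becomes the new left edge and
    # vice versa, so the two x values swap roles.
    return [image_width - xmax, ymin, image_width - xmin, ymax]


box = [100, 50, 180, 120]  # a cup near the left side of a 640px-wide photo
flipped = flip_box_horizontally(box, image_width=640)
print(flipped)  # [460, 50, 540, 120]
```

The y coordinates are untouched and the box keeps its width of 80 pixels; only its horizontal position changes, exactly as the cup itself moves in the flipped image.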

Install the dependencies

pip install six numpy scipy matplotlib scikit-image opencv-python imageio

Install imgaug

Option 1 (latest version from GitHub):

pip install git+https://github.com/aleju/imgaug

Option 2 (PyPI release):

pip install imgaug

Bounding box implementation
Reading the original image's bounding box coordinates
Read the XML file, parse it with ElementTree, and locate the coordinates of each object.

def change_xml_annotation(root, image_id, new_target):
    new_xmin = new_target[0]
    new_ymin = new_target[1]
    new_xmax = new_target[2]
    new_ymax = new_target[3]

    # `root` is the annotation directory here, not the XML root element
    xml_path = os.path.join(root, str(image_id) + '.xml')
    tree = ET.parse(xml_path)
    xmlroot = tree.getroot()
    obj = xmlroot.find('object')  # only the first <object> is updated
    bndbox = obj.find('bndbox')
    xmin = bndbox.find('xmin')
    xmin.text = str(new_xmin)
    ymin = bndbox.find('ymin')
    ymin.text = str(new_ymin)
    xmax = bndbox.find('xmax')
    xmax.text = str(new_xmax)
    ymax = bndbox.find('ymax')
    ymax.text = str(new_ymax)
    tree.write(xml_path)  # overwrite the annotation that was read
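The function above writes coordinates back into an annotation. The reading half described in this step can be sketched as follows. The XML string is a made-up annotation in PASCAL VOC style, and `boxes_from_xml` is a name I chose for this illustration; it parses a string so that the example runs without any files on disk.

```python
import xml.etree.ElementTree as ET

# A made-up PASCAL VOC style annotation containing two objects.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>cup</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>cup</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def boxes_from_xml(xml_text):
    """Return [[xmin, ymin, xmax, ymax], ...] for every <object> element."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall('object'):
        bndbox = obj.find('bndbox')
        boxes.append([int(bndbox.find(tag).text)
                      for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
    return boxes


print(boxes_from_xml(SAMPLE_XML))  # [[48, 240, 195, 371], [8, 12, 352, 498]]
```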

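Building the transformation sequence
Create a Sequential that processes the images.

# Image augmentation
seq = iaa.Sequential([
    iaa.Flipud(0.5),  # vertically flip 50% of all images
    iaa.Fliplr(0.5),  # horizontally flip (mirror) 50% of all images
    iaa.Multiply((1.2, 1.5)),  # change brightness, doesn't affect BBs
    iaa.GaussianBlur(sigma=(0, 3.0)),  # alternative: iaa.GaussianBlur(0.5)
    iaa.Affine(
        translate_px={"x": 15, "y": 15},
        scale=(0.8, 0.95),
        rotate=(-30, 30)
    )  # shift by 15px on x/y, scale to 80-95%, rotate by -30 to +30 degrees; affects BBs
])
Computing the bounding box coordinates after the transformation
First read the image's XML file to get the bounding boxes of all targets, then compute the transformed coordinates of each box in turn.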

seq_det = seq.to_deterministic()  # freeze the random parameters so the image and its boxes get the same transform
# Read the image
img = Image.open(os.path.join(IMG_DIR, name[:-4] + '.jpg'))
# sp = img.size
img = np.asarray(img)
# Augment the bounding box coordinates
for i in range(len(bndbox)):
    bbs = ia.BoundingBoxesOnImage([
        ia.BoundingBox(x1=bndbox[i][0], y1=bndbox[i][1], x2=bndbox[i][2], y2=bndbox[i][3]),
    ], shape=img.shape)

    bbs_aug = seq_det.augment_bounding_boxes([bbs])[0]
    boxes_img_aug_list.append(bbs_aug)

    # max() keeps each coordinate from dropping below 1 and min() keeps it inside the image.
    # Faster R-CNN training subtracts 1 from the box coordinates, so a coordinate below 1
    # raises an error, and so does a coordinate outside the image.
    n_x1 = int(max(1, min(img.shape[1], bbs_aug.bounding_boxes[0].x1)))
    n_y1 = int(max(1, min(img.shape[0], bbs_aug.bounding_boxes[0].y1)))
    n_x2 = int(max(1, min(img.shape[1], bbs_aug.bounding_boxes[0].x2)))
    n_y2 = int(max(1, min(img.shape[0], bbs_aug.bounding_boxes[0].y2)))
    if n_x1 == 1 and n_x1 == n_x2:
        n_x2 += 1
    if n_y1 == 1 and n_y2 == n_y1:
        n_y2 += 1
    if n_x1 >= n_x2 or n_y1 >= n_y2:
        print('error', name)
    new_bndbox_list.append([n_x1, n_y1, n_x2, n_y2])

# Save the augmented image
image_aug = seq_det.augment_images([img])[0]
path = os.path.join(AUG_IMG_DIR,
                    str("%06d" % (len(files) + int(name[:-4]) + epoch * 250)) + '.jpg')
image_auged = bbs.draw_on_image(image_aug, thickness=0)
Image.fromarray(image_auged).save(path)

# Save the updated XML; change the file naming here if needed
change_xml_list_annotation(XML_DIR, name[:-4], new_bndbox_list, AUG_XML_DIR,
                           len(files) + int(name[:-4]) + epoch * 250)
print(str("%06d" % (len(files) + int(name[:-4]) + epoch * 250)) + '.jpg')
new_bndbox_list = []
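The clamping step in the loop is easy to misread, so here it is pulled out into a standalone sketch. `clamp_box` and the `is_valid` flag are names introduced for this illustration; the arithmetic is the same max/min logic as in the loop, applied to made-up coordinates for a 640x480 image.

```python
def clamp_box(x1, y1, x2, y2, img_width, img_height):
    """Clamp a box to [1, width] x [1, height] and report whether it is still usable."""
    n_x1 = int(max(1, min(img_width, x1)))
    n_y1 = int(max(1, min(img_height, y1)))
    n_x2 = int(max(1, min(img_width, x2)))
    n_y2 = int(max(1, min(img_height, y2)))
    # A box collapsed onto the left or top border is widened by one pixel.
    if n_x1 == 1 and n_x1 == n_x2:
        n_x2 += 1
    if n_y1 == 1 and n_y1 == n_y2:
        n_y2 += 1
    # Anything still degenerate corresponds to the 'error' print in the loop.
    is_valid = n_x1 < n_x2 and n_y1 < n_y2
    return [n_x1, n_y1, n_x2, n_y2], is_valid


# A box that was rotated partly out of the image is pulled back inside.
print(clamp_box(-12.7, 30.2, 655.9, 470.5, 640, 480))  # ([1, 30, 640, 470], True)
# A box pushed completely past the right edge collapses and is flagged.
print(clamp_box(700.0, 30.0, 720.0, 60.0, 640, 480))   # ([640, 30, 640, 60], False)
```

Note that the loop only prints a message for a degenerate box and still appends it to the list, so it is worth checking the console output for 'error' lines before training.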
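Usage example
Data preparation
The input is two folders: one holds the images to augment (JPEGImages) and the other holds the matching XML files (Annotations). Note: each image file name must match its XML file name!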

Annotations

JPEGImages
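Because the script derives each image name from the XML name, a mismatch only shows up as a crash halfway through a run. A small pre-flight check like the sketch below can catch it earlier. `find_unpaired` is a helper I made up for this purpose, and the demonstration uses throwaway folders with empty placeholder files rather than a real dataset.

```python
import os
import tempfile


def find_unpaired(img_dir, xml_dir):
    """Return (XML base names without a .jpg, JPG base names without an .xml)."""
    img_stems = {os.path.splitext(f)[0] for f in os.listdir(img_dir)
                 if f.lower().endswith('.jpg')}
    xml_stems = {os.path.splitext(f)[0] for f in os.listdir(xml_dir)
                 if f.lower().endswith('.xml')}
    return sorted(xml_stems - img_stems), sorted(img_stems - xml_stems)


# Demonstration on temporary folders with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    img_dir = os.path.join(tmp, 'JPEGImages')
    xml_dir = os.path.join(tmp, 'Annotations')
    os.makedirs(img_dir)
    os.makedirs(xml_dir)
    for name in ('000001.jpg', '000002.jpg'):
        open(os.path.join(img_dir, name), 'w').close()
    for name in ('000001.xml', '000003.xml'):
        open(os.path.join(xml_dir, name), 'w').close()
    missing_images, missing_xml = find_unpaired(img_dir, xml_dir)

print(missing_images)  # ['000003']
print(missing_xml)     # ['000002']
```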

Set the file paths
IMG_DIR = "…/create-pascal-voc-dataset/examples/VOC2007/JPEGImages"
XML_DIR = "…/create-pascal-voc-dataset/examples/VOC2007/Annotations"

AUG_XML_DIR = "./Annotations"  # folder for the augmented XML files
try:
    shutil.rmtree(AUG_XML_DIR)  # start from an empty folder on every run
except FileNotFoundError:
    pass
os.makedirs(AUG_XML_DIR)

AUG_IMG_DIR = "./JPEGImages"  # folder for the augmented images
try:
    shutil.rmtree(AUG_IMG_DIR)
except FileNotFoundError:
    pass
os.makedirs(AUG_IMG_DIR)
Set the number of augmentations
AUGLOOP = 10  # number of augmented copies generated per image
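The augmentation count also determines how the output files are numbered: the loop names each file `len(files) + int(name[:-4]) + epoch * 250`. The sketch below reproduces that arithmetic so the resulting names can be checked before a long run. `augmented_file_id` and its `stride` parameter are names introduced here, and the figure of 100 source images numbered 000001 to 000100 is an assumption for the example.

```python
def augmented_file_id(num_files, original_id, epoch, stride=250):
    """Reproduce the numbering used when saving augmented files.

    `stride` stands for the hard-coded 250 in the loop.
    """
    return "%06d" % (num_files + original_id + epoch * stride)


AUGLOOP = 10
num_files = 100  # assumed: 100 annotated images named 000001 ... 000100

ids_for_first_image = [augmented_file_id(num_files, 1, epoch) for epoch in range(AUGLOOP)]
print(ids_for_first_image[:3])  # ['000101', '000351', '000601']

# Every (image, epoch) pair should map to a different file name.
all_ids = {augmented_file_id(num_files, i, epoch)
           for i in range(1, num_files + 1) for epoch in range(AUGLOOP)}
print(len(all_ids))  # 1000
```

The names stay unique only while the original IDs span fewer than 250 numbers. With more images than that, image 251 at epoch 0 would get the same name as image 1 at epoch 1, so the constant 250 would need to be raised.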
Set the augmentation parameters
Adjust the arguments passed to Sequential; see the imgaug documentation for the available options.

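seq = iaa.Sequential([
    iaa.Flipud(0.5),  # vertical flip, applied to 50% of images
    iaa.Fliplr(0.5),  # horizontal flip (mirror), applied to 50% of images
    iaa.Multiply((1.2, 1.5)),  # change brightness
    iaa.GaussianBlur(sigma=(0, 3.0)),  # Gaussian blur
    iaa.Affine(
        translate_px={"x": 15, "y": 15},
        scale=(0.8, 0.95),
        rotate=(-30, 30)
    )  # shift by 15px on x/y, scale to 80-95%, rotate by -30 to +30 degrees; affects BBs
])
Output
Run augmentation.py. When it finishes, you get the augmented images and the folder of corresponding XML files.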

import xml.etree.ElementTree as ET
import pickle
import os
from os import getcwd
import numpy as np
from PIL import Image
import shutil
import matplotlib.pyplot as plt

import imgaug as ia
from imgaug import augmenters as iaa


ia.seed(1)


def read_xml_annotation(root, image_id):
    # `root` is the annotation directory; `image_id` is the XML file name including its extension
    tree = ET.parse(os.path.join(root, image_id))
    root = tree.getroot()
    bndboxlist = []

    for obj in root.findall('object'):  # every <object> node under the root
        bndbox = obj.find('bndbox')  # the object's <bndbox> child

        xmin = int(bndbox.find('xmin').text)
        xmax = int(bndbox.find('xmax').text)
        ymin = int(bndbox.find('ymin').text)
        ymax = int(bndbox.find('ymax').text)
        # print(xmin,ymin,xmax,ymax)
        bndboxlist.append([xmin, ymin, xmax, ymax])
        # print(bndboxlist)

    return bndboxlist


# (506.0000, 330.0000, 528.0000, 348.0000) -> (520.4747, 381.5080, 540.5596, 398.6603)
def change_xml_annotation(root, image_id, new_target):
    new_xmin = new_target[0]
    new_ymin = new_target[1]
    new_xmax = new_target[2]
    new_ymax = new_target[3]

    # `root` is the annotation directory here, not the XML root element
    xml_path = os.path.join(root, str(image_id) + '.xml')
    tree = ET.parse(xml_path)
    xmlroot = tree.getroot()
    obj = xmlroot.find('object')  # only the first <object> is updated
    bndbox = obj.find('bndbox')
    xmin = bndbox.find('xmin')
    xmin.text = str(new_xmin)
    ymin = bndbox.find('ymin')
    ymin.text = str(new_ymin)
    xmax = bndbox.find('xmax')
    xmax.text = str(new_xmax)
    ymax = bndbox.find('ymax')
    ymax.text = str(new_ymax)
    tree.write(xml_path)  # overwrite the annotation that was read


def change_xml_list_annotation(root, image_id, new_target, saveroot, id):
    # `root` is the source annotation directory here, not the XML root element
    tree = ET.parse(os.path.join(root, str(image_id) + '.xml'))
    elem = tree.find('filename')
    elem.text = (str("%06d" % int(id)) + '.jpg')
    tree.find('path').text = (saveroot + '/' + str("%06d" % int(id)) + '.jpg')  # point <path> at the augmented file
    xmlroot = tree.getroot()
    index = 0

    for obj in xmlroot.findall('object'):  # every <object> node under the root
        bndbox = obj.find('bndbox')  # the object's <bndbox> child

        # xmin = int(bndbox.find('xmin').text)
        # xmax = int(bndbox.find('xmax').text)
        # ymin = int(bndbox.find('ymin').text)
        # ymax = int(bndbox.find('ymax').text)

        new_xmin = new_target[index][0]
        new_ymin = new_target[index][1]
        new_xmax = new_target[index][2]
        new_ymax = new_target[index][3]
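The listing breaks off at this point, so the remainder of the function is not shown. As a rough guide to what it has to do, here is a minimal self-contained sketch that assumes it mirrors change_xml_annotation: write the new coordinates into each object in order and save the tree under the new six-digit name. `write_augmented_xml` is a hypothetical stand-in, not the original code, and it skips the `path` element so that it also works on annotations without one.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_augmented_xml(src_xml_path, new_boxes, save_dir, new_id):
    """Copy an annotation, replacing each object's box and the file name.

    Hypothetical stand-in for the truncated function. `new_boxes` holds one
    [xmin, ymin, xmax, ymax] entry per <object>, in document order.
    """
    tree = ET.parse(src_xml_path)
    new_name = "%06d" % int(new_id)
    tree.find('filename').text = new_name + '.jpg'
    for obj, box in zip(tree.getroot().findall('object'), new_boxes):
        bndbox = obj.find('bndbox')
        for tag, value in zip(('xmin', 'ymin', 'xmax', 'ymax'), box):
            bndbox.find(tag).text = str(value)
    out_path = os.path.join(save_dir, new_name + '.xml')
    tree.write(out_path)
    return out_path


# Round trip on a throwaway annotation file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, '000001.xml')
    with open(src, 'w') as f:
        f.write('<annotation><filename>000001.jpg</filename><object><name>cup</name>'
                '<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>30</xmax><ymax>40</ymax></bndbox>'
                '</object></annotation>')
    out = write_augmented_xml(src, [[11, 22, 33, 44]], tmp, 101)
    result = ET.parse(out).getroot()
    new_filename = result.find('filename').text
    new_xmax = result.find('object/bndbox/xmax').text

print(new_filename, new_xmax)  # 000101.jpg 33
```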
