[Deep Learning Annotation Data Processing] A summary of handling the xml and txt files that store BoundingBox annotations

钱多多先森

Last modified 2024-07-11 19:57:24

Category: Deep Learning Annotation Data Processing. Tags: object detection, xml, python

First published 2019-07-17 11:10:28

Article link: https://blog.csdn.net/wsLJQian/article/details/92784131

1. BoundingBox in labelImg form: xml files

1.1 Installing labelImg

1.2 Annotated contents of the xml file

1.3 Reading the xml file

2. BoundingBox in VOC form: txt files

3. Computing the IoU of two BoundingBoxes

3.1 Method 1

3.2 Method 2

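1. BoundingBox in labelImg form: xml files

Reading the rectangle coordinates out of an xml file is fairly simple.

The xml file can be generated with the annotation tool labelImg.
The xml file records information about the annotated image as well as the annotations themselves.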

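1.1 Installing labelImg

(This section focuses on installation under Ubuntu; for other platforms, refer to the labelImg blog post mentioned above.)

Once installed, labelImg can be launched from the Terminal. Click Open to load an annotated image.

The English labels in the interface make the usage fairly self-explanatory. A few habits that save time and effort:

First give the images to be labeled sensible names and put them in the same folder;
Then point Open Dir at that folder and Change Save Dir at the folder where the .xml files should be saved;
If there is only one class, enable Use default label to speed up annotation.

For other installation methods, see this article, which I have tested and found to work: 【数据准备001】标注工具Labelimg安装与使用（附txt与xml文件相互转化代码）-CSDN博客 (installing and using the annotation tool Labelimg, with code for converting between txt and xml files).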

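1.2 Annotated contents of the xml file

Below is a brief annotated explanation of the contents of a saved xml file.

The values come from the image and map back onto the image, so let's trace how the fields in the xml correspond to the picture.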
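To make the correspondence concrete, the sketch below shows the typical layout of an xml file as labelImg writes it in Pascal VOC mode and reads each field back. The file name, image size, class name, and coordinates are all made-up values for illustration, not taken from the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: every value below is invented for illustration.
SAMPLE_XML = """
<annotation>
    <folder>images</folder>
    <filename>example.jpg</filename>
    <path>/data/images/example.jpg</path>
    <source><database>Unknown</database></source>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>face</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>80</ymin>
            <xmax>360</xmax>
            <ymax>400</ymax>
        </bndbox>
    </object>
</annotation>
"""

root = ET.fromstring(SAMPLE_XML.strip())

# <size> describes the image; each <object> describes one labeled box.
size = root.find('size')
summary = {
    'filename': root.findtext('filename'),
    'width': int(size.findtext('width')),
    'height': int(size.findtext('height')),
    'depth': int(size.findtext('depth')),
    'objects': [],
}
for obj in root.iter('object'):
    box = obj.find('bndbox')
    summary['objects'].append({
        'name': obj.findtext('name'),
        # (xmin, ymin) is the top-left corner, (xmax, ymax) the bottom-right, in pixels
        'box': tuple(int(box.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax')),
    })

print(summary)
```

Here `size` holds the image width, height, and channel count, while each `object` node pairs a class name with the pixel corners of its box.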

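1.3 Reading the xml file

# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET


def readXML(xml_file_path):
    """Read the image size and all bounding boxes from a labelImg xml file."""
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    # Image size recorded in the <size> node
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)

    # One <object> node per annotated box: class name plus pixel corners
    boxes = []
    for obj in root.iter('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        xmin = int(box.find('xmin').text)
        ymin = int(box.find('ymin').text)
        xmax = int(box.find('xmax').text)
        ymax = int(box.find('ymax').text)
        boxes.append((name, xmin, ymin, xmax, ymax))

    return width, height, boxes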
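A common follow-up to reading a single file is a quick sanity check over a whole annotation folder. The sketch below counts how many boxes each class has across all .xml files in a directory; the helper name `count_labels` and the idea of a flat folder of xml files are assumptions made for this example.

```python
import glob
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_labels(xml_dir):
    """Count annotated boxes per class over every .xml file in xml_dir."""
    counts = Counter()
    for xml_path in sorted(glob.glob(os.path.join(xml_dir, '*.xml'))):
        root = ET.parse(xml_path).getroot()
        # Each <object> node is one labeled box; <name> is its class
        for obj in root.iter('object'):
            counts[obj.findtext('name')] += 1
    return counts
```

Calling `count_labels` on the folder chosen with Change Save Dir returns a `Counter` mapping class name to box count, which makes typos in class names or missing classes easy to spot.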

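2. BoundingBox in VOC form: txt files

Each line of the txt file stores one box, in the order label, x_center, y_center, x_relative, y_relative. The centre coordinates are relative to the image size, and the last two values are read here as the box width and height, also relative to the image size (the layout commonly called the YOLO txt format).

Converted back to pixel coordinates, these values give the rectangle drawn on the image.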
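The conversion between the relative txt values and pixel corners can be sketched as follows. This assumes the five fields are whitespace-separated and that the last two are the box width and height divided by the image width and height; the function names and the sample numbers are hypothetical.

```python
def txt_line_to_box(line, img_w, img_h):
    """Convert 'label x_center y_center w h' (relative values) to pixel corners."""
    label, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    xmin = int(round(xc - w / 2))
    ymin = int(round(yc - h / 2))
    xmax = int(round(xc + w / 2))
    ymax = int(round(yc + h / 2))
    return label, xmin, ymin, xmax, ymax


def box_to_txt_line(label, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to a 'label x_center y_center w h' line."""
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return f"{label} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Made-up example: an 800 x 600 image with one box of class 0
line = box_to_txt_line(0, 120, 80, 360, 400, img_w=800, img_h=600)
print(line)
print(txt_line_to_box(line, img_w=800, img_h=600))
```

Because the txt values are relative, the image size must be known to recover pixel coordinates; rounding to six decimals and back can shift a corner by at most a pixel.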

3. Computing the IoU of two BoundingBoxes

3.1 Method 1

Computes the IoU between one box and one other box.
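For reference, the quantity being computed is the usual intersection over union. Writing each box as (xmin, ymin, xmax, ymax) and taking widths and heights as plain coordinate differences (an assumption; some code bases add 1 to count inclusive pixels), it reads:

```latex
\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}
                   = \frac{S_{\cap}}{S_A + S_B - S_{\cap}},
\qquad
S_{\cap} = \max(0,\ x_2 - x_1) \cdot \max(0,\ y_2 - y_1)
```

where $x_1 = \max(x^{A}_{\min}, x^{B}_{\min})$, $y_1 = \max(y^{A}_{\min}, y^{B}_{\min})$, $x_2 = \min(x^{A}_{\max}, x^{B}_{\max})$ and $y_2 = \min(y^{A}_{\max}, y^{B}_{\max})$. For the boxes [321, 296, 387, 342] and [328, 313, 359, 332], the second lies entirely inside the first, so $S_{\cap} = S_B = 31 \times 19 = 589$, $S_A = 66 \times 46 = 3036$, and the IoU is $589 / 3036 \approx 0.194$.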


# ############################################################
# # IOU
# ############################################################
def two_Box_iou(list_a, list_b):
    """Compute the IoU of two boxes given as [xmin, ymin, xmax, ymax]."""
    xmin1, ymin1, xmax1, ymax1 = (int(v) for v in list_a[:4])
    xmin2, ymin2, xmax2, ymax2 = (int(v) for v in list_b[:4])

    # Corners of the intersection rectangle
    xx1 = max(xmin1, xmin2)
    yy1 = max(ymin1, ymin2)
    xx2 = min(xmax1, xmax2)
    yy2 = min(ymax1, ymax2)

    # Areas of the two boxes. Widths and heights are plain coordinate
    # differences, the same convention used for the intersection below,
    # so a box compared with itself gives an IoU of (almost exactly) 1.
    area1 = (xmax1 - xmin1) * (ymax1 - ymin1)
    area2 = (xmax2 - xmin2) * (ymax2 - ymin2)

    # Intersection area; zero when the boxes do not overlap
    inter_area = max(0, xx2 - xx1) * max(0, yy2 - yy1)
    # Intersection over union; the small constant avoids division by zero
    iou = inter_area / (area1 + area2 - inter_area + 1e-6)
    return iou


list_a = [321, 296, 387, 342]
list_b = [328, 313, 359, 332]
rst_IOU = two_Box_iou(list_a, list_b)
print(rst_IOU)

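3.2 Method 2

Unlike method 1, this takes one array of boxes and another array of boxes and computes the IoU of every pair between them, which makes it useful in more scenarios.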


import numpy as np
import cv2  # only needed for drawing the boxes further below


def get_iou_arr(arr_box_a, arr_box_b):
    """
    :param arr_box_a: (n, 4) boxes as [xmin, ymin, xmax, ymax]
    :param arr_box_b: (m, 4) boxes as [xmin, ymin, xmax, ymax]
    :return: (n, m) array; element [i, j] is the IoU of a[i] and b[j]
    """
    assert arr_box_a.shape[-1] == 4 and arr_box_b.shape[-1] == 4, "a box should be described by 4 nums"
    if arr_box_a.ndim < 2:
        arr_box_a = arr_box_a.reshape([1, -1])
    if arr_box_b.ndim < 2:
        arr_box_b = arr_box_b.reshape([1, -1])

    # Scale all coordinates into [0, 1] so the clip below is valid,
    # and add an axis to each array so they broadcast to (n, m, ...)
    max_v = max(arr_box_a.max(), arr_box_b.max())
    arr_box_a = np.expand_dims(arr_box_a, 1) / max_v   # (n, 1, 4)
    arr_box_b = np.expand_dims(arr_box_b, 0) / max_v   # (1, m, 4)

    # Intersection rectangle of every pair
    left_top = np.maximum(arr_box_a[..., :2], arr_box_b[..., :2])
    right_down = np.minimum(arr_box_a[..., 2:], arr_box_b[..., 2:])
    cover_wh = np.clip(right_down - left_top, 0, 1)
    s_cover = cover_wh[..., 0] * cover_wh[..., 1]

    # Areas of the boxes in a and b
    wh_a = arr_box_a[..., 2:] - arr_box_a[..., :2]
    s_a = wh_a[..., 0] * wh_a[..., 1]
    wh_b = arr_box_b[..., 2:] - arr_box_b[..., :2]
    s_b = wh_b[..., 0] * wh_b[..., 1]
    iou = s_cover / (s_a + s_b - s_cover)
    return iou


def aibox_match_gt(ai_box, gt_box, gt_class, iou_threshold):
    """For each ai box, return the class of its best-overlapping gt box
    and whether that best IoU exceeds iou_threshold."""
    if gt_box.size == 0:
        # No annotation: use a dummy class and an empty box so every IoU is 0
        gt_class = np.array(['-1'])
        gt_box = np.array([[0, 0, 0, 0]])

    print('ai_box:', ai_box)
    print('gt_box:', gt_box)
    iou_arr = get_iou_arr(ai_box, gt_box)
    print('iou_arr:', iou_arr)
    # axis=1 reduces along each row: the best IoU for every ai box
    iou_max = iou_arr.max(axis=1)

    ai_loc_res = iou_max > iou_threshold
    # Index of the gt box with the highest IoU for every ai box
    select_idx = iou_arr.argmax(axis=1)
    gt_class_tiled = gt_class[select_idx]

    return gt_class_tiled, ai_loc_res


if __name__ == '__main__':
    ai_box = np.array([[210, 67, 682, 305], [318, 307, 627, 540]])
    gt_box = np.array([[229, 77, 662, 275], [318, 297, 617, 560]])
    gt_class = np.array(['glass', 'face'])
    gt_class_matched, loc_res = aibox_match_gt(ai_box, gt_box, gt_class, iou_threshold=0.3)
    print(gt_class_matched, loc_res)

Printed output:

ai_box: [[210  67 682 305]
 [318 307 627 540]]
gt_box: [[229  77 662 275]
 [318 297 617 560]]
iou_arr: [[0.76319257 0.0126842 ]
 [0.         0.86043697]]
['glass' 'face'] [ True  True]
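The (n, m) shape of `iou_arr` comes from NumPy broadcasting: one array is viewed as (n, 1, 4) and the other as (1, m, 4), so every elementwise operation produces an n-by-m grid. The following is a minimal sketch of that idea only, with a hypothetical helper `pairwise_iou` and made-up boxes, chosen so that the two arrays have different lengths:

```python
import numpy as np


def pairwise_iou(boxes_a, boxes_b):
    """IoU of every box in boxes_a (n, 4) against every box in boxes_b (m, 4).

    Assumes boxes are [xmin, ymin, xmax, ymax] with non-zero area.
    """
    a = boxes_a[:, None, :].astype(float)   # (n, 1, 4)
    b = boxes_b[None, :, :].astype(float)   # (1, m, 4)

    # Broadcasting (n, 1, 2) against (1, m, 2) gives (n, m, 2)
    left_top = np.maximum(a[..., :2], b[..., :2])
    right_down = np.minimum(a[..., 2:], b[..., 2:])
    wh = np.clip(right_down - left_top, 0, None)    # no overlap -> 0
    inter = wh[..., 0] * wh[..., 1]                 # (n, m)

    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])  # (n, 1)
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])  # (1, m)
    return inter / (area_a + area_b - inter)


# Made-up boxes: 3 in the first array, 2 in the second
boxes_a = np.array([[0, 0, 10, 10], [0, 0, 4, 4], [20, 20, 30, 30]])
boxes_b = np.array([[0, 0, 10, 10], [5, 5, 15, 15]])
iou = pairwise_iou(boxes_a, boxes_b)
print(iou.shape)
print(iou)
```

Row i of the result holds the IoU of `boxes_a[i]` with every box in `boxes_b`, which is why taking `max(axis=1)` and `argmax(axis=1)` picks the best match for each box in the first array.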

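To inspect the predicted boxes against the annotated boxes, draw both on the original image with the following code:

def draw():
    ai_box = np.array([[210, 67, 682, 305], [318, 307, 627, 540]])
    gt_box = np.array([[229, 77, 662, 275], [318, 297, 617, 560]])

    image = cv2.imread(r'E:\temp\jpeg\girl.jpg')
    if image is None:
        # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError('could not read the input image')

    # Predicted boxes in red (OpenCV colours are in BGR order)
    for e in ai_box.tolist():
        x, y, x2, y2 = e
        cv2.rectangle(image, (int(x), int(y)), (int(x2), int(y2)), (0, 0, 255), thickness=2)

    # Annotated (gt) boxes in blue
    for e in gt_box.tolist():
        x, y, x2, y2 = e
        cv2.rectangle(image, (int(x), int(y)), (int(x2), int(y2)), (255, 0, 0), thickness=2)
    cv2.imwrite(r'E:\temp\jpeg\girl_ai_gt.jpg', image)

In the resulting image, the red boxes are the ai results and the blue boxes are the gt annotations.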