Weighted Boxes Fusion (WBF) for Object Detection

兔子Code

Last modified: 2023-08-30 07:52:15

First published: 2023-08-30 07:51:11

Article link: https://blog.csdn.net/YXD0514/article/details/132574588

1. Introduction to WBF

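WBF (Weighted Boxes Fusion) is an ensembling method for object detection, proposed by Roman Solovyev, Weimin Wang, and Tatiana Gabruseva in the paper "Weighted boxes fusion: Ensembling boxes from different object detection models". It fuses the predictions of several trained detectors to improve detection accuracy. Unlike NMS, which keeps one box from each group of overlapping boxes and discards the rest, WBF uses the coordinates and confidences of all overlapping boxes to build the fused box.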

2. How WBF works

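(1) Suppose $N$ trained models predict on the same image. The predicted boxes of all models are put into a single list B, which is then sorted in descending order of confidence $\mathbf{C}$.
(2) Create two empty lists. List L holds the clusters: each position stores all the boxes that belong to one object. List F holds the fused boxes: each position stores the single box obtained by fusing all the boxes in the cluster at the same position in L.
(3) Iterate over the boxes in B and, for each one, look for a matching box in F (a fused box whose IoU with the current box is greater than the threshold).
If no match is found, append the box to the end of both L and F as a new entry, then move on to the next box in B.
If a match is found, add the box to the cluster in L at the same position as the matching box in F.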
(4) Using the fusion formulas below, recompute the coordinates and confidence of the fused box from all $T$ boxes in its cluster. The confidence of the fused box is the mean confidence of the $T$ boxes. Each coordinate of the fused box is a weighted average of the corresponding coordinates of the $T$ boxes, with each box's confidence as its weight, so boxes with higher confidence contribute more to the fused coordinates.
$\begin{gathered} \mathbf{C}=\frac{\sum_{i=1}^{T}\mathbf{C}_i}{T} \\ \mathbf{X}_{1,2}=\frac{\sum_{i=1}^{T}\mathbf{C}_i \cdot \mathbf{X}_{1,2}^{(i)}}{\sum_{i=1}^{T}\mathbf{C}_i} \\ \mathbf{Y}_{1,2}=\frac{\sum_{i=1}^{T}\mathbf{C}_i \cdot \mathbf{Y}_{1,2}^{(i)}}{\sum_{i=1}^{T}\mathbf{C}_i} \end{gathered}$
Here $\mathbf{C}_i$ is the confidence of the $i$-th box in the cluster, $\mathbf{X}_{1,2}$ stands for the x-coordinates $x_1, x_2$ of a box, and $\mathbf{Y}_{1,2}$ stands for its y-coordinates $y_1, y_2$.
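(5) After all boxes in B have been processed, rescale the confidence of every fused box in F: multiply it by the number of boxes $T$ in its cluster and divide by the number of models $N$. A cluster with few boxes means that only a few models detected the object, so its confidence should be lowered. Either of the two forms below can be used. The first caps the factor at 1 when a cluster holds more boxes than there are models.
$\begin{aligned}\mathbf{C}&=\mathbf{C}\cdot\frac{\min(T,N)}{N},\\\\\mathbf{C}&=\mathbf{C}\cdot\frac{T}{N}\end{aligned}$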
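
Steps (1) to (5) can be traced on a tiny example. The sketch below is a simplified, pure-Python illustration and not the implementation walked through in the next section: it assumes a single class and equal model weights, and the names `iou`, `fuse`, and `toy_wbf` are hypothetical helpers introduced only for this example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Step (4): cluster is a list of (confidence, box) pairs."""
    total = sum(c for c, _ in cluster)
    # Confidence-weighted average of each of the four coordinates
    box = [sum(c * b[k] for c, b in cluster) / total for k in range(4)]
    # Fused confidence is the mean confidence of the cluster
    return total / len(cluster), box


def toy_wbf(preds_per_model, iou_thr=0.55):
    """preds_per_model: one list of (confidence, box) pairs per model."""
    n_models = len(preds_per_model)
    # Step (1): pool all boxes into B, sorted by descending confidence
    B = sorted((p for preds in preds_per_model for p in preds),
               key=lambda p: p[0], reverse=True)
    # Step (2): L holds the clusters, F the fused boxes
    L, F = [], []
    # Step (3): match every box against the fused boxes
    for conf, box in B:
        best, best_iou = -1, iou_thr
        for idx, (_, fused_box) in enumerate(F):
            value = iou(fused_box, box)
            if value > best_iou:
                best, best_iou = idx, value
        if best == -1:
            # No match: the box starts a new cluster
            L.append([(conf, box)])
            F.append((conf, list(box)))
        else:
            # Match: extend the cluster and refresh its fused box
            L[best].append((conf, box))
            F[best] = fuse(L[best])
    # Step (5): rescale confidence by min(T, N) / N
    return [(c * min(len(cluster), n_models) / n_models, b)
            for (c, b), cluster in zip(F, L)]


# Two models; model_a sees two objects, model_b only the first one
model_a = [(0.9, (0.10, 0.10, 0.50, 0.50)), (0.6, (0.60, 0.60, 0.90, 0.90))]
model_b = [(0.7, (0.12, 0.12, 0.52, 0.52))]
fused = toy_wbf([model_a, model_b])
for conf, box in fused:
    print(round(conf, 4), [round(v, 5) for v in box])
```

In this example the two overlapping boxes (IoU about 0.82) fuse into one box with confidence 0.8 whose corners sit closer to the 0.9-confidence box, while the box seen by only one of the two models keeps its coordinates and has its confidence halved from 0.6 to 0.3.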

3. Code walkthrough


import warnings
import numpy as np

def weighted_boxes_fusion(
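        boxes_list, # list of boxes from each model, 4 numbers per box; shape (models_number, model_preds, 4)
        scores_list, # list of box scores from each model; shape (models_number, model_preds)
        labels_list, # list of box labels from each model; shape (models_number, model_preds)
        weights=None, # per-model weights; default None gives every model a weight of 1
        iou_thr=0.55, # a box is fused into an existing cluster when its IoU with the fused box exceeds iou_thr
        skip_box_thr=0.0, # discard boxes whose score is below this value
        conf_type='avg', # how to compute the confidence of a fused box (see the docstring)
        allows_overflow=False # set to False if the confidence should not exceed 1.0
):
    '''
    conf_type options:
    'avg': average confidence,
    'max': maximum confidence,
    'box_and_model_avg': average weighted by both the boxes and the models,
    'absent_model_aware_avg': weighted average that also accounts for models absent from the cluster.
    '''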
    # If no weights are given, give every model a weight of 1
    if weights is None:
        weights = np.ones(len(boxes_list))
    # The number of weights must equal the number of models
    if len(weights) != len(boxes_list):
        print('Warning: incorrect number of weights {}. Must be: {}. Set weights equal to 1.'.format(len(weights), len(boxes_list)))
        weights = np.ones(len(boxes_list))
    weights = np.array(weights)
    # Check that conf_type is one of the supported options
    if conf_type not in ['avg', 'max', 'box_and_model_avg', 'absent_model_aware_avg']:
        print('Unknown conf_type: {}. Must be "avg", "max" or "box_and_model_avg", or "absent_model_aware_avg"'.format(conf_type))
        exit()
    # Drop boxes scoring below skip_box_thr and sanitize the coordinates. The result is a dict
    # {label: [[label, score*weight, weight, model index, x1, y1, x2, y2], ...]}
    filtered_boxes = prefilter_boxes(boxes_list, scores_list, labels_list, weights, skip_box_thr)
    # No box survived the prefilter: return empty arrays
    if len(filtered_boxes) == 0:
        return np.zeros((0, 4)), np.zeros((0,)), np.zeros((0,))

    overall_boxes = []
    for label in filtered_boxes:
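        # Take the boxes stored under this label (already sorted by descending score)
        boxes = filtered_boxes[label]
        new_boxes = []
        # Start with an empty (0, 8) array of fused boxes. Each incoming box is either merged
        # into the fused box it overlaps most, or appended as a new fused box.
        weighted_boxes = np.empty((0, 8))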

        # Clusterize boxes
        for j in range(0, len(boxes)):
            index, best_iou = find_matching_box_fast(weighted_boxes, boxes[j], iou_thr)

            if index != -1:
                # new_boxes[index] lists the predicted boxes that make up fused box `index`
                new_boxes[index].append(boxes[j])
                # Recompute the fused box from its updated cluster
                weighted_boxes[index] = get_weighted_box(new_boxes[index], conf_type)
            else:
                new_boxes.append([boxes[j].copy()])
                weighted_boxes = np.vstack((weighted_boxes, boxes[j].copy()))

        # Rescale the confidence based on the number of models and boxes
        for i in range(len(new_boxes)):
            # new_boxes[i] holds the predicted boxes fused into box i; call them the clustered boxes
            clustered_boxes = new_boxes[i]
            if conf_type == 'box_and_model_avg':
                clustered_boxes = np.array(clustered_boxes)
                # weighted average over the boxes
                weighted_boxes[i, 1] = weighted_boxes[i, 1] * len(clustered_boxes) / weighted_boxes[i, 2]
                # identify unique model index by model index column
                # np.unique returns the sorted unique values; with return_index=True it also
                # returns the index of the first occurrence of each unique value in the input
                _, idx = np.unique(clustered_boxes[:, 3], return_index=True)
                # rescale by the total weight of the distinct models in the cluster
                weighted_boxes[i, 1] = weighted_boxes[i, 1] *  clustered_boxes[idx, 2].sum() / weights.sum()
            elif conf_type == 'absent_model_aware_avg':
                clustered_boxes = np.array(clustered_boxes)
                # indices of the distinct models in the cluster (np.unique returns them sorted)
                models = np.unique(clustered_boxes[:, 3]).astype(int)
                # mask that selects the weights of the models absent from the cluster
                mask = np.ones(len(weights), dtype=bool)
                # models that contributed a box to the cluster are not absent
                mask[models] = False
                # absent model aware weighted average:
                # the weights of the absent models are added to the denominator
                weighted_boxes[i, 1] = weighted_boxes[i, 1] * len(clustered_boxes) / (weighted_boxes[i, 2] + weights[mask].sum())
            elif conf_type == 'max':
                weighted_boxes[i, 1] = weighted_boxes[i, 1] / weights.max()
            elif not allows_overflow:
                weighted_boxes[i, 1] = weighted_boxes[i, 1] * min(len(weights), len(clustered_boxes)) / weights.sum()
            else:
                weighted_boxes[i, 1] = weighted_boxes[i, 1] * len(clustered_boxes) / weights.sum()
        overall_boxes.append(weighted_boxes)
    overall_boxes = np.concatenate(overall_boxes, axis=0)
    overall_boxes = overall_boxes[overall_boxes[:, 1].argsort()[::-1]]
    boxes = overall_boxes[:, 4:]
    scores = overall_boxes[:, 1]
    labels = overall_boxes[:, 0]
    return boxes, scores, labels



def prefilter_boxes(boxes, scores, labels, weights, thr):
    # Create dict with boxes stored by its label
    new_boxes = dict()
    # Loop over the models
    for t in range(len(boxes)):
        # Each model must provide exactly one score per box
        if len(boxes[t]) != len(scores[t]):
            print('Error. Length of boxes arrays not equal to length of scores array: {} != {}'.format(len(boxes[t]), len(scores[t])))
            exit()
        # Each model must provide exactly one label per box
        if len(boxes[t]) != len(labels[t]):
            print('Error. Length of boxes arrays not equal to length of labels array: {} != {}'.format(len(boxes[t]), len(labels[t])))
            exit()
        # Loop over this model's boxes
        for j in range(len(boxes[t])):
            # Score of the current box
            score = scores[t][j]
            # Skip boxes scoring below the threshold
            if score < thr:
                continue
            label = int(labels[t][j])
            box_part = boxes[t][j]
            x1 = float(box_part[0])
            y1 = float(box_part[1])
            x2 = float(box_part[2])
            y2 = float(box_part[3])

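            # Sanitize the box: swap reversed corners and clip coordinates to [0, 1].
            # If your boxes are not normalized, comment out the clipping checks below.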
            if x2 < x1:
                warnings.warn('X2 < X1 value in box. Swap them.')
                x1, x2 = x2, x1
            if y2 < y1:
                warnings.warn('Y2 < Y1 value in box. Swap them.')
                y1, y2 = y2, y1
            if x1 < 0:
                warnings.warn('X1 < 0 in box. Set it to 0.')
                x1 = 0
            if x1 > 1:
                warnings.warn('X1 > 1 in box. Set it to 1. Check that you normalize boxes in [0, 1] range.')
                x1 = 1
            if x2 < 0:
                warnings.warn('X2 < 0 in box. Set it to 0.')
                x2 = 0
            if x2 > 1:
                warnings.warn('X2 > 1 in box. Set it to 1. Check that you normalize boxes in [0, 1] range.')
                x2 = 1
            if y1 < 0:
                warnings.warn('Y1 < 0 in box. Set it to 0.')
                y1 = 0
            if y1 > 1:
                warnings.warn('Y1 > 1 in box. Set it to 1. Check that you normalize boxes in [0, 1] range.')
                y1 = 1
            if y2 < 0:
                warnings.warn('Y2 < 0 in box. Set it to 0.')
                y2 = 0
            if y2 > 1:
                warnings.warn('Y2 > 1 in box. Set it to 1. Check that you normalize boxes in [0, 1] range.')
                y2 = 1
            if (x2 - x1) * (y2 - y1) == 0.0:
                warnings.warn("Zero area box skipped: {}.".format(box_part))
                continue
            # Store label, weighted score, model weight, model index and coordinates
            # in the dict, keyed by label:
            # [label, score * weight, weight, model index, x1, y1, x2, y2]
            b = [int(label), float(score) * weights[t], weights[t], t, x1, y1, x2, y2]
            if label not in new_boxes:
                new_boxes[label] = []
            new_boxes[label].append(b)

    # Convert each list in the dict to a numpy array sorted by descending score
    for k in new_boxes:
        current_boxes = np.array(new_boxes[k])
        new_boxes[k] = current_boxes[current_boxes[:, 1].argsort()[::-1]]

    return new_boxes


def get_weighted_box(boxes, conf_type='avg'):
    """
    Create weighted box for set of boxes
    :param boxes: set of boxes to fuse
    :param conf_type: type of confidence, one of 'avg', 'max', 'box_and_model_avg' or 'absent_model_aware_avg'
    :return: weighted box (label, score, weight, model index, x1, y1, x2, y2)
    """

    box = np.zeros(8, dtype=np.float32)
    conf = 0
    conf_list = []
    w = 0
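    # Loop over the boxes in the cluster
    for b in boxes:
        # accumulate the coordinates weighted by score
        box[4:] += (b[1] * b[4:])
        # accumulate the scores
        conf += b[1]
        # keep every score for the 'max' option
        conf_list.append(b[1])
        # accumulate the model weights
        w += b[2]
    # class label
    box[0] = boxes[0][0]
    if conf_type in ('avg', 'box_and_model_avg', 'absent_model_aware_avg'):
        # fused score is the mean score of the cluster
        box[1] = conf / len(boxes)
    elif conf_type == 'max':
        # fused score is the highest score in the cluster
        box[1] = np.array(conf_list).max()
    # fused model weight is the sum of the model weights of the clustered boxes
    box[2] = w
    box[3] = -1 # model index field is retained for consistency but is not used.
    # fused coordinates: score-weighted coordinate sum divided by the sum of scores
    box[4:] /= conf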
    return box


def find_matching_box_fast(boxes_list, new_box, match_iou):
    """
        Reimplementation of find_matching_box with numpy instead of loops. Gives significant speed up for larger arrays
        (~100x). This was previously the bottleneck since the function is called for every entry in the array.
    """

    def bb_iou_array(boxes, new_box):
        # bounding-box intersection over union
        xA = np.maximum(boxes[:, 0], new_box[0])
        yA = np.maximum(boxes[:, 1], new_box[1])
        xB = np.minimum(boxes[:, 2], new_box[2])
        yB = np.minimum(boxes[:, 3], new_box[3])

        interArea = np.maximum(xB - xA, 0) * np.maximum(yB - yA, 0)

        # compute the area of each fused box and of the new box
        boxAArea = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        boxBArea = (new_box[2] - new_box[0]) * (new_box[3] - new_box[1])

        iou = interArea / (boxAArea + boxBArea - interArea)

        return iou
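    # No fused boxes yet (the array starts empty), so there is nothing to match against.
    # Returning -1 makes the caller start a new cluster with this box.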
    if boxes_list.shape[0] == 0:
        return -1, match_iou

    # boxes = np.array(boxes_list)
    boxes = boxes_list
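    # IoU between every fused box and the new box
    ious = bb_iou_array(boxes[:, 4:], new_box[4:])
    # A fused box of a different class can never match: set its IoU to -1
    ious[boxes[:, 0] != new_box[0]] = -1

    # Pick the fused box with the highest IoU
    best_idx = np.argmax(ious)
    best_iou = ious[best_idx]
    # If the best IoU does not exceed the threshold, report no match (index -1, IoU set to
    # the threshold); the caller then adds this box as a new fused box
    if best_iou <= match_iou:
        best_iou = match_iou
        best_idx = -1
    # Return the index of the fused box to merge with (or -1) and the corresponding IoU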
    return best_idx, best_iou
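
The four `conf_type` options differ only in how the confidence of a fused box is rescaled at the end of `weighted_boxes_fusion`. The sketch below reproduces that arithmetic for one cluster in plain Python so the options can be compared side by side. It is an illustration under stated assumptions, not part of the listing above: `rescaled_confidence` is a hypothetical helper, and a cluster is represented as `(raw_score, model_index)` pairs instead of the 8-column rows used by the real code. As in `prefilter_boxes`, every score is first multiplied by the weight of the model that produced it.

```python
def rescaled_confidence(cluster, weights, conf_type="avg", allows_overflow=False):
    """cluster: list of (raw_score, model_index) for the boxes fused into one box."""
    # Scores are stored pre-multiplied by their model's weight
    weighted = [score * weights[m] for score, m in cluster]
    t = len(cluster)
    # Sum of model weights over the boxes (column 2 of the fused box)
    box_weight_sum = sum(weights[m] for _, m in cluster)
    # Distinct models that contributed at least one box
    present = {m for _, m in cluster}

    if conf_type == "max":
        return max(weighted) / max(weights)

    conf = sum(weighted) / t  # mean weighted score of the cluster
    if conf_type == "box_and_model_avg":
        conf = conf * t / box_weight_sum
        return conf * sum(weights[m] for m in present) / sum(weights)
    if conf_type == "absent_model_aware_avg":
        absent = sum(w for m, w in enumerate(weights) if m not in present)
        return conf * t / (box_weight_sum + absent)
    if not allows_overflow:
        return conf * min(len(weights), t) / sum(weights)
    return conf * t / sum(weights)


# Three models with weights 2, 1, 1. Model 0 contributes two boxes to the
# cluster, model 1 contributes one, and model 2 is absent.
weights = [2, 1, 1]
cluster = [(0.9, 0), (0.8, 0), (0.6, 1)]
results = {
    name: rescaled_confidence(cluster, weights, name)
    for name in ("avg", "box_and_model_avg", "absent_model_aware_avg", "max")
}
for name, value in results.items():
    print(name, round(value, 4))
```

For this cluster, `'avg'` gives 1.0, `'box_and_model_avg'` gives 0.6, `'absent_model_aware_avg'` gives about 0.667, and `'max'` gives 0.9. The two model-aware options lower the score because model 2 did not detect the object, whereas `'avg'` counts only the number of boxes, so the two boxes from model 0 compensate for the missing model.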