Speeding up the overlap computation between anchors and ground truth boxes with Cython

WeissSama

Categories: Algorithms, Deep Learning

Article link: https://blog.csdn.net/Bismarckczy/article/details/85249143

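When training RetinaNet, the work done for each image can be split into roughly these steps:
1 Take the four corner coordinates of each ground truth box, express them as a fraction of the original image size, and multiply by the size we resize the image to (for example 512). This gives the coordinates in the resized image.
2 Generate the anchors and compute the overlap between the N anchors and the M gt_boxes of the image, which gives an (N, M) matrix. Based on the size of the overlap, assign one gt_box to each anchor, and assign the class of that gt_box to the anchor as well. The final output of this step is (N, 5): the gt_box coordinates and the class for each anchor.
3 Perform bbox regression, i.e. use the bbox_transform function to compute the four transform parameters.
4 Feed these four transform parameters and the class information into RetinaNet, where they are combined with the P3 to P5 outputs to compute the smooth L1 loss and the cross entropy loss.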
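
The assignment in step 2 can be sketched in NumPy as follows. This is a minimal illustration and not the author's code: the function name `assign_gt_to_anchors` and the (M, 5) layout `[x1, y1, x2, y2, class]` for `gt_boxes` are assumptions, and the IoU thresholds that real training uses to mark anchors as positive, negative or ignored are left out.

```python
import numpy as np

def assign_gt_to_anchors(overlaps, gt_boxes):
    """Hypothetical helper: give every anchor the gt box it overlaps most.

    overlaps: (N, M) overlap matrix between N anchors and M gt boxes
    gt_boxes: (M, 5) array, assumed layout [x1, y1, x2, y2, class]
    returns:  (N, 5) array with the assigned gt box and class per anchor
    """
    best_gt = np.argmax(overlaps, axis=1)  # (N,) index of the best gt box per anchor
    return gt_boxes[best_gt]               # integer indexing yields (N, 5)

# Made-up overlap values for 3 anchors and 2 gt boxes
overlaps = np.array([[0.1, 0.7],
                     [0.6, 0.2],
                     [0.0, 0.0]])
gt_boxes = np.array([[10, 10, 50, 50, 1],
                     [60, 60, 90, 90, 2]], dtype=np.float64)

assigned = assign_gt_to_anchors(overlaps, gt_boxes)
print(assigned.shape)
print(assigned[:, 4])  # class assigned to each anchor
```

An anchor that overlaps nothing (the third row) still receives index 0 from `argmax`, which is why a threshold on the overlap is normally applied afterwards.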

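Training ran on two Titan X GPUs.
For a single image, the four steps above take roughly 0.05 + 1.5 + 0.7 + 0.1 ≈ 2.4 s.
At 2.4 s per image, a dataset of 10,000 images needs about 6.7 hours for a single epoch.
Unacceptable!
So steps 2 and 3, the overlap computation and the bbox computation, have to become more efficient, especially the overlap.
First, here is the Python implementation of overlap: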

import numpy as np

def compute_overlap(a, b):
    # a: (N, 4) anchors, b: (M, 4) ground truth boxes
    area = (b[:, 2] - b[:, 0] + 1) * (b[:, 3] - b[:, 1] + 1)
    # np.expand_dims turns an (N,) column of a into (N, 1); comparing it with an
    # (M,) column of b broadcasts to (N, M), i.e. every anchor against every gt box.
    # The smaller x2/y2 minus the larger x1/y1 gives the intersection width and height.
    iw = np.minimum(np.expand_dims(a[:, 2], axis=1), b[:, 2]) - np.maximum(np.expand_dims(a[:, 0], axis=1), b[:, 0]) + 1
    ih = np.minimum(np.expand_dims(a[:, 3], axis=1), b[:, 3]) - np.maximum(np.expand_dims(a[:, 1], axis=1), b[:, 1]) + 1
    # iw and ih both have shape (N, M); negative values mean no intersection, so clamp to 0
    iw = np.maximum(iw, 0)
    ih = np.maximum(ih, 0)

    # union = S_a + S_b - intersection_ab
    ua = np.expand_dims((a[:, 2] - a[:, 0] + 1) * (a[:, 3] - a[:, 1] + 1), axis=1) + area - iw * ih
    ua = np.maximum(ua, np.finfo(float).eps)  # guard against division by zero

    intersection = iw * ih
    return intersection / ua  # (N, M)
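
To make the broadcasting concrete, here is a small worked example with made-up boxes (one anchor, two gt boxes). It repeats the same arithmetic step by step so the intermediate shapes and values can be inspected; the box coordinates are assumptions chosen for easy mental arithmetic.

```python
import numpy as np

a = np.array([[0, 0, 9, 9]], dtype=np.float64)       # N = 1 anchor, 10 x 10 pixels
b = np.array([[5, 5, 14, 14],
              [20, 20, 29, 29]], dtype=np.float64)    # M = 2 gt boxes, 10 x 10 pixels each

# (N, 1) compared with (M,) broadcasts to (N, M)
iw = np.minimum(np.expand_dims(a[:, 2], axis=1), b[:, 2]) - np.maximum(np.expand_dims(a[:, 0], axis=1), b[:, 0]) + 1
ih = np.minimum(np.expand_dims(a[:, 3], axis=1), b[:, 3]) - np.maximum(np.expand_dims(a[:, 1], axis=1), b[:, 1]) + 1
iw = np.maximum(iw, 0)  # first gt box: 5 pixels wide, second: no overlap
ih = np.maximum(ih, 0)

intersection = iw * ih                                                          # (N, M)
area_a = np.expand_dims((a[:, 2] - a[:, 0] + 1) * (a[:, 3] - a[:, 1] + 1), 1)   # (N, 1)
area_b = (b[:, 2] - b[:, 0] + 1) * (b[:, 3] - b[:, 1] + 1)                      # (M,)
union = area_a + area_b - intersection
iou = intersection / union

print(iw.shape)  # (1, 2)
print(iou)       # first entry is 25 / 175, second is 0
```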

Now the Cython implementation:

# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Sergey Karayev
# --------------------------------------------------------

cimport cython
import numpy as np
cimport numpy as np


def compute_overlap(
    np.ndarray[double, ndim=2] boxes,
    np.ndarray[double, ndim=2] query_boxes
):
    """
    Args
        boxes: (N, 4) ndarray of float
        query_boxes: (K, 4) ndarray of float

    Returns
        overlaps: (N, K) ndarray of overlap between boxes and query_boxes
    """
    cdef unsigned int N = boxes.shape[0]
    cdef unsigned int K = query_boxes.shape[0]
    cdef np.ndarray[double, ndim=2] overlaps = np.zeros((N, K), dtype=np.float64)
    cdef double iw, ih, box_area
    cdef double ua
    cdef unsigned int k, n
    for k in range(K):
        box_area = (
            (query_boxes[k, 2] - query_boxes[k, 0] + 1) *
            (query_boxes[k, 3] - query_boxes[k, 1] + 1)
        )
        for n in range(N):
            iw = (
                min(boxes[n, 2], query_boxes[k, 2]) -
                max(boxes[n, 0], query_boxes[k, 0]) + 1
            )
            if iw > 0:
                ih = (
                    min(boxes[n, 3], query_boxes[k, 3]) -
                    max(boxes[n, 1], query_boxes[k, 1]) + 1
                )
                if ih > 0:
                    ua = np.float64(
                        (boxes[n, 2] - boxes[n, 0] + 1) *
                        (boxes[n, 3] - boxes[n, 1] + 1) +
                        box_area - iw * ih
                    )
                    overlaps[n, k] = iw * ih / ua
    return overlaps
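
One practical consequence of the typed signature `np.ndarray[double, ndim=2]` is that the compiled function only accepts 2-D float64 arrays; float32 or integer boxes are rejected with a buffer dtype mismatch error. A small wrapper like the sketch below can normalise the inputs first. The name `to_float64_boxes` is hypothetical and not part of the original code.

```python
import numpy as np

def to_float64_boxes(boxes):
    """Hypothetical helper: return boxes as a 2-D float64 array of shape (K, 4)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    if boxes.ndim != 2 or boxes.shape[1] != 4:
        raise ValueError("expected an array of shape (K, 4), got %r" % (boxes.shape,))
    return boxes

# float32 anchors, as they might come out of an anchor generator
anchors32 = np.array([[0, 0, 9, 9]], dtype=np.float32)
anchors = to_float64_boxes(anchors32)
print(anchors.dtype, anchors.shape)
```

Calling `to_float64_boxes` on both arguments before `compute_overlap` keeps the dtype handling in one place instead of scattering `astype` calls through the training code.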

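Using Cython is fairly simple. We write a setup.py file that translates the .pyx file into a .c file and also builds a .so file. The .so file is what we actually import: it is the bridge between the C code and our Python code and lets us call the C code directly.
There are two ways to write the setup file; I only show one of them.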

from setuptools import setup  # distutils was removed in Python 3.12
from Cython.Build import cythonize
import numpy as np

# The NumPy headers are needed because the .pyx file uses `cimport numpy`
setup(ext_modules=cythonize("compute_overlap.pyx"), include_dirs=[np.get_include()])
# setup(ext_modules=cythonize("bbox_transform.pyx"), include_dirs=[np.get_include()])

Then run on the command line:

python setup.py build

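After that, the folder containing the .pyx file has a new compute_overlap.c file and a build folder. Inside the build folder there is a lib directory, which holds the Python library we just built.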

drwxrwxr-x 3 zhaoyang zhaoyang 4096 Dec 25 12:09 lib.linux-x86_64-3.6

At the bottom of this lib directory is the .so file we need:

-rwxrwxr-x 1 zhaoyang zhaoyang 180328 Dec 25 12:09 compute_overlap.cpython-36m-x86_64-linux-gnu.so

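Copy this file (cp) into the same folder as the .c file.
We can then use import compute_overlap to call the C code generated from the .pyx file.
Note that this import refers to the .pyx file. For example, our .pyx file is named compute_overlap.pyx and may contain many functions, one of which is compute_overlap(a, b).
So the call is compute_overlap.compute_overlap(a, b): the first name is the file (module), the second is the function.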
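
A minimal sketch of how the import might look in calling code, assuming we want the script to keep working on machines where the extension has not been built. The fallback function `compute_overlap_numpy` is an assumption added for illustration; it uses the same +1 pixel convention as the listings above.

```python
import numpy as np

def compute_overlap_numpy(a, b):
    """Pure NumPy fallback (illustrative), same +1 pixel convention as above."""
    iw = np.minimum(a[:, None, 2], b[None, :, 2]) - np.maximum(a[:, None, 0], b[None, :, 0]) + 1
    ih = np.minimum(a[:, None, 3], b[None, :, 3]) - np.maximum(a[:, None, 1], b[None, :, 1]) + 1
    inter = np.maximum(iw, 0) * np.maximum(ih, 0)
    area_a = (a[:, 2] - a[:, 0] + 1) * (a[:, 3] - a[:, 1] + 1)
    area_b = (b[:, 2] - b[:, 0] + 1) * (b[:, 3] - b[:, 1] + 1)
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, np.finfo(float).eps)

try:
    # module name = name of the .pyx file, attribute = function defined inside it
    from compute_overlap import compute_overlap
except ImportError:
    # compiled .so not found on the import path: use the slower NumPy version
    compute_overlap = compute_overlap_numpy

a = np.array([[0, 0, 9, 9]], dtype=np.float64)
b = np.array([[0, 0, 9, 9], [5, 5, 14, 14]], dtype=np.float64)
print(compute_overlap(a, b))
```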

The Cython version of compute_overlap is about five times faster than the Python version: the overlap computation for a single image drops from 1.5 s to about 0.3 s.
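
Timings like these can be reproduced with a small harness such as the sketch below. The helper `best_time`, the array sizes and the stand-in `workload` function are all assumptions for illustration; in practice one would pass the NumPy and the Cython `compute_overlap` in turn and compare the two results.

```python
import time
import numpy as np

def best_time(fn, *args, repeats=5):
    """Hypothetical helper: smallest wall-clock time of fn(*args) over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

def workload(a, b):
    # Stand-in for compute_overlap: one of its pairwise (N, M) comparisons
    return np.minimum(np.expand_dims(a[:, 2], axis=1), b[:, 2])

rng = np.random.default_rng(0)
anchors = rng.random((5000, 4)) * 512   # made-up number of anchors
gt_boxes = rng.random((20, 4)) * 512    # made-up number of gt boxes

elapsed = best_time(workload, anchors, gt_boxes)
print("best of 5 runs: %.6f s" % elapsed)
```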

bbox_transform.pyx and bbox_transform_inv.pyx are listed below.

These two functions gain less than compute_overlap, but their time is still cut roughly in half: bbox_transform goes from 0.7 s to 0.3 s.

So training on a single image used to take 2.4 s and now takes 2.4 - 1.2 - 0.4 ≈ 0.8 s.
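
The arithmetic behind these figures can be checked with a few lines. The step labels in the dictionaries are my own names for the four steps listed at the top, and the epoch estimate after the speed-up is derived here from the per-image numbers rather than measured.

```python
# Per-image timings in seconds, taken from the text; the key names are illustrative
before = {"scale_gt": 0.05, "overlap": 1.5, "bbox_transform": 0.7, "network": 0.1}
after = dict(before, overlap=0.3, bbox_transform=0.3)

def epoch_hours(per_image_seconds, n_images=10000):
    """Hours for one pass over n_images at the given per-image cost."""
    return per_image_seconds * n_images / 3600.0

total_before = sum(before.values())  # about 2.35 s, rounded to 2.4 s in the text
total_after = sum(after.values())    # about 0.75 s, rounded to 0.8 s in the text

print("before: %.2f s/image, %.1f h/epoch" % (total_before, epoch_hours(total_before)))
print("after:  %.2f s/image, %.1f h/epoch" % (total_after, epoch_hours(total_after)))
```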

It feels like my life before discovering Cython was wasted.

# bbox_transform.pyx
import numpy as np

def bbox_transform(ex_rois, gt_rois):
    '''
    Receives two sets of bounding boxes, denoted by two opposite corners
    (x1,y1,x2,y2), and returns the target deltas that Faster R-CNN should aim
    for.
    '''
    ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
    ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights

    gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
    gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights

    targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
    targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
    targets_dw = np.log(gt_widths / ex_widths)
    targets_dh = np.log(gt_heights / ex_heights)

    targets = np.vstack(
        (targets_dx, targets_dy, targets_dw, targets_dh)).transpose()
    mean = np.array([0, 0, 0, 0])
    std = np.array([0.1, 0.1, 0.2, 0.2])

    return (targets-mean)/std
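
As a worked example of the formulas in bbox_transform, the sketch below computes the targets for a single anchor and gt box with plain scalar arithmetic. The two boxes are made up; the std values are the ones used in the listing above, and since mean is zero it is omitted.

```python
import math

# Made-up boxes in (x1, y1, x2, y2) form
anchor = (0.0, 0.0, 9.0, 9.0)
gt = (2.0, 2.0, 13.0, 13.0)

ex_w = anchor[2] - anchor[0] + 1.0   # 10
ex_h = anchor[3] - anchor[1] + 1.0   # 10
ex_cx = anchor[0] + 0.5 * ex_w       # 5
ex_cy = anchor[1] + 0.5 * ex_h       # 5

gt_w = gt[2] - gt[0] + 1.0           # 12
gt_h = gt[3] - gt[1] + 1.0           # 12
gt_cx = gt[0] + 0.5 * gt_w           # 8
gt_cy = gt[1] + 0.5 * gt_h           # 8

dx = (gt_cx - ex_cx) / ex_w          # centre shift relative to the anchor width
dy = (gt_cy - ex_cy) / ex_h
dw = math.log(gt_w / ex_w)           # log of the width ratio
dh = math.log(gt_h / ex_h)

std = (0.1, 0.1, 0.2, 0.2)
targets = (dx / std[0], dy / std[1], dw / std[2], dh / std[3])
print(targets)
```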

# bbox_transform_inv.pyx
import numpy as np


def bbox_transform_inv(boxes, deltas, mean=None, std=None):
    if mean is None:
        mean = np.array([0, 0, 0, 0], dtype=np.float32)
    if std is None:
        std = np.array([0.1, 0.1, 0.2, 0.2], dtype=np.float32)

    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights

    dx = deltas[:, :, 0] * std[0] + mean[0]
    dy = deltas[:, :, 1] * std[1] + mean[1]
    dw = deltas[:, :, 2] * std[2] + mean[2]
    dh = deltas[:, :, 3] * std[3] + mean[3]

    pred_ctr_x = ctr_x + dx * widths
    pred_ctr_y = ctr_y + dy * heights
    pred_w = np.exp(dw) * widths
    pred_h = np.exp(dh) * heights

    pred_boxes_x1 = pred_ctr_x - 0.5 * pred_w
    pred_boxes_y1 = pred_ctr_y - 0.5 * pred_h
    pred_boxes_x2 = pred_ctr_x + 0.5 * pred_w
    pred_boxes_y2 = pred_ctr_y + 0.5 * pred_h

    pred_boxes = np.stack([pred_boxes_x1, pred_boxes_y1,
                           pred_boxes_x2, pred_boxes_y2], axis=2)

    return pred_boxes
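
A quick round-trip check shows how the two functions fit together. The sketch below re-implements the same arithmetic in compact form under the names `encode` and `decode` (both hypothetical, as are the sample boxes) so that it runs on its own. With the +1 width convention used in these listings, decoding recovers x1 and y1 exactly but returns x2 + 1 and y2 + 1.

```python
import numpy as np

STD = np.array([0.1, 0.1, 0.2, 0.2])  # same std as the listings above, mean = 0

def encode(anchors, gt):
    """Same arithmetic as bbox_transform: (N, 4) anchors and gt boxes to (N, 4) targets."""
    aw = anchors[:, 2] - anchors[:, 0] + 1.0
    ah = anchors[:, 3] - anchors[:, 1] + 1.0
    gw = gt[:, 2] - gt[:, 0] + 1.0
    gh = gt[:, 3] - gt[:, 1] + 1.0
    dx = (gt[:, 0] + 0.5 * gw - (anchors[:, 0] + 0.5 * aw)) / aw
    dy = (gt[:, 1] + 0.5 * gh - (anchors[:, 1] + 0.5 * ah)) / ah
    return np.stack([dx, dy, np.log(gw / aw), np.log(gh / ah)], axis=1) / STD

def decode(anchors, deltas):
    """Same arithmetic as bbox_transform_inv: deltas has shape (batch, N, 4)."""
    w = anchors[:, 2] - anchors[:, 0] + 1.0
    h = anchors[:, 3] - anchors[:, 1] + 1.0
    d = deltas * STD
    cx = anchors[:, 0] + 0.5 * w + d[:, :, 0] * w
    cy = anchors[:, 1] + 0.5 * h + d[:, :, 1] * h
    pw = np.exp(d[:, :, 2]) * w
    ph = np.exp(d[:, :, 3]) * h
    return np.stack([cx - 0.5 * pw, cy - 0.5 * ph, cx + 0.5 * pw, cy + 0.5 * ph], axis=2)

anchors = np.array([[0, 0, 9, 9], [10, 10, 29, 19]], dtype=np.float64)
gt = np.array([[2, 2, 13, 13], [8, 12, 31, 25]], dtype=np.float64)

deltas = encode(anchors, gt)[np.newaxis]  # add the batch axis that decode expects
decoded = decode(anchors, deltas)[0]
print(decoded - gt)  # zeros for x1, y1 and ones for x2, y2
```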
