Selective Search: A Python Implementation


yyf7329081


Category: object detection. Tags: selective search, object detection

Article link: https://blog.csdn.net/u014796085/article/details/83478583


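This article walks through a Python implementation of selective search: initial image segmentation, region similarity computation, building the neighbour list, region merging, and the selective search routine itself. The code helps in understanding and applying candidate box generation for object detection.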


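Preface

While studying region-based convolutional neural networks (R-CNN), I found that the candidate boxes are generated with selective search. To understand how R-CNN works more thoroughly, I decided to implement selective search in Python.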

Introduction

For the basic principles of selective search and a first introduction, see the following blog post:
https://blog.csdn.net/mao_kun/article/details/50576003

Here is a brief summary based on my own understanding:

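1. Obtain the initial regions R = {r1, r2, …, rn} with Efficient Graph-Based Image Segmentation; see my other post for details:
   https://blog.csdn.net/u014796085/article/details/83449972
2. Initialize the similarity set S = ∅.
3. Compute the similarity between every pair of neighbouring regions and add it to S.
4. Find the two regions ri and rj with the highest similarity in S and merge them into a new region rt. Remove from S every similarity that involves ri or rj, compute the similarity between rt and its neighbours (the regions that were adjacent to ri or rj), and add the results to S. Also add rt to the region set R.
5. Repeat step 4 until S = ∅, at which point the last new region rt covers the whole image.
6. Take the bounding box of each region in R and discard those with fewer than 2000 pixels or with an aspect ratio greater than 1.2; the remaining boxes are the candidate object locations L.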
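The merging loop summarized above can be sketched in a few dozen lines. This is a toy illustration under stated assumptions, not the code from this article: regions are hypothetical dicts holding a bounding box and a pixel count, adjacency is approximated by bounding-box contact, and `toy_similarity` (which simply favors merging small regions) stands in for the real colour, texture, size, and fill measures. The aspect-ratio test in `filter_boxes` checks both w/h and h/w, which is one plausible reading of the 1.2 threshold.

```python
import itertools


def boxes_touch(a, b):
    """True when two region bounding boxes overlap or share an edge/corner."""
    return (a["min_x"] <= b["max_x"] and b["min_x"] <= a["max_x"]
            and a["min_y"] <= b["max_y"] and b["min_y"] <= a["max_y"])


def toy_similarity(a, b, im_size):
    """Placeholder similarity (an assumption): prefers small regions."""
    return 1.0 - (a["size"] + b["size"]) / im_size


def merge(a, b):
    """Union of two regions: enclosing box and summed pixel count."""
    return {
        "min_x": min(a["min_x"], b["min_x"]),
        "min_y": min(a["min_y"], b["min_y"]),
        "max_x": max(a["max_x"], b["max_x"]),
        "max_y": max(a["max_y"], b["max_y"]),
        "size": a["size"] + b["size"],
    }


def hierarchical_grouping(R, im_size):
    """R maps region id -> region dict; returns every region ever created."""
    R = dict(R)
    # similarity for every pair of neighbouring initial regions
    S = {(i, j): toy_similarity(R[i], R[j], im_size)
         for i, j in itertools.combinations(sorted(R), 2)
         if boxes_touch(R[i], R[j])}
    while S:  # stop once the similarity set is empty
        # pick the most similar pair and merge it into a new region t
        i, j = max(S, key=S.get)
        t = max(R) + 1
        R[t] = merge(R[i], R[j])
        # remove every similarity that involves the two merged regions
        stale = [pair for pair in S if i in pair or j in pair]
        for pair in stale:
            del S[pair]
        # recompute similarities between t and the former neighbours
        for pair in stale:
            if pair == (i, j):
                continue
            n = pair[0] if pair[1] in (i, j) else pair[1]
            S[(n, t)] = toy_similarity(R[n], R[t], im_size)
    return R


def filter_boxes(R, min_pixels=2000, max_ratio=1.2):
    """Drop small regions and boxes that are too elongated."""
    kept = []
    for r in R.values():
        w = r["max_x"] - r["min_x"]
        h = r["max_y"] - r["min_y"]
        if r["size"] < min_pixels or w == 0 or h == 0:
            continue
        if w / h > max_ratio or h / w > max_ratio:
            continue
        kept.append((r["min_x"], r["min_y"], w, h))
    return kept
```

Starting from n connected regions, the loop performs n - 1 merges, so the returned dict holds 2n - 1 regions; the last one spans the whole image. Keeping all intermediate regions, rather than only the final one, is what yields candidate boxes at multiple scales.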

Code implementation and walkthrough

Initial image segmentation

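import numpy
import skimage.io
import skimage.feature  # used by the texture functions further below

# graphbased_segmentation comes from the graph-based segmentation post
# linked above; it is assumed to return a 2-D array of region labels.


def _generate_segments(img_path, neighbor, sigma, scale, min_size):
    # run the initial segmentation to get a per-pixel region label mask
    im_mask = graphbased_segmentation(img_path, neighbor, sigma, scale, min_size)
    im_orig = skimage.io.imread(img_path)
    # append an empty 4th channel, then store the label mask in it
    im_orig = numpy.append(
        im_orig, numpy.zeros(im_orig.shape[:2])[:, :, numpy.newaxis], axis=2)
    im_orig[:, :, 3] = im_mask

    return im_orig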

Segment the original image and store each pixel's region label as the 4th channel of the image.
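To make the 4th-channel convention concrete, here is a minimal sketch on a synthetic image, so no image file or segmentation routine is needed. The array sizes, colours, and the two-region mask are made up for illustration; `numpy.dstack` is used as a shorter equivalent of the append-then-assign in `_generate_segments`.

```python
import numpy

# Synthetic 4x6 RGB image and a hypothetical label mask with two regions.
rgb = numpy.zeros((4, 6, 3), dtype=numpy.uint8)
rgb[:, 3:] = (200, 50, 50)   # right half is reddish
mask = numpy.zeros((4, 6))
mask[:, 3:] = 1              # left half = region 0, right half = region 1

# Stack the label mask behind the colour channels as channel 3.
img = numpy.dstack([rgb, mask])

# All pixels of region 1 as an (n_pixels, 4) array.
region = img[img[:, :, 3] == 1]

# Bounding box of region 1 as (min_x, min_y, max_x, max_y).
ys, xs = numpy.nonzero(img[:, :, 3] == 1)
bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

Boolean indexing on the label channel is what turns a region into a flat list of pixels, which is the shape the per-region histogram functions operate on.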

Defining region similarity

def _calc_colour_hist(img):
    """
        Calculate the colour histogram of one region.

        img is the (n_pixels, channels) array of that region's pixels,
        expected in HSV. The output has BINS * COLOUR_CHANNELS(3) entries;
        25 bins are used, as in [uijlings_ijcv2013_draft.pdf].
    """

    BINS = 25
    hist = numpy.array([])

    for colour_channel in (0, 1, 2):

        # extracting one colour channel
        c = img[:, colour_channel]

        # calculate histogram for each colour and join to the result
        hist = numpy.concatenate(
            [hist] + [numpy.histogram(c, BINS, (0.0, 255.0))[0]])

    # L1 normalise per channel: divide by the region's pixel count
    hist = hist / len(img)

    return hist


def _calc_texture_gradient(img):
    """
        Calculate the texture gradient for the entire image.

        The original selective search algorithm proposed Gaussian
        derivatives in 8 orientations; LBP is used here instead.

        The output has the same shape as img; channels 0-2 hold the LBP codes.
    """
    ret = numpy.zeros((img.shape[0], img.shape[1], img.shape[2]))

    for colour_channel in (0, 1, 2):
        ret[:, :, colour_channel] = skimage.feature.local_binary_pattern(
            img[:, :, colour_channel], 8, 1.0)

    return ret


def _calc_texture_hist(img):
    """
        calculate texture histogram for