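Preface
I recently studied region-based convolutional neural networks (R-CNN), which use selective search to generate region proposals. To understand how R-CNN works more thoroughly, I decided to implement selective search in Python.
Overview
For the basic principles of selective search and a first introduction, see the following blog post: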
https://blog.csdn.net/mao_kun/article/details/50576003
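Here is a brief summary based on my own understanding:
- Obtain the initial regions R={r1,r2,…,rn} with the Efficient Graph-Based Image Segmentation method; see my other post for details:
  https://blog.csdn.net/u014796085/article/details/83449972
- Initialize the similarity set S=∅.
- Compute the similarity of every pair of adjacent regions and add it to S.
- Find the two regions ri and rj with the highest similarity in S and merge them into a new region rt. Remove from S all similarities that involve ri or rj, compute the similarity between rt and each of its neighbours (the regions that were adjacent to ri or rj), and add the results to S. Also add the new region rt to the region set R.
- Repeat the previous step until S=∅, that is, until the last new region rt covers the whole image.
- Take the bounding box of each region in R and discard boxes with fewer than 2000 pixels as well as boxes whose aspect ratio exceeds 1.2; the remaining boxes form the set L of candidate object locations.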
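The merging loop in the steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the implementation discussed later: the function `hierarchical_grouping`, its arguments, and the toy `(size, mean)` regions with their stand-in similarity are all hypothetical names made up for this example.

```python
def hierarchical_grouping(regions, adjacent_pairs, similarity, merge):
    """Greedy grouping loop; all argument names are illustrative.

    regions        : dict {region_id: region}
    adjacent_pairs : iterable of (i, j) ids of neighbouring regions
    similarity     : callable (region_a, region_b) -> float
    merge          : callable (region_a, region_b) -> merged region
    """
    R = dict(regions)
    # S holds one similarity per pair of adjacent regions
    S = {(i, j): similarity(R[i], R[j]) for i, j in adjacent_pairs}
    next_id = max(R) + 1
    while S:
        # most similar pair (ri, rj)
        i, j = max(S, key=S.get)
        # merge them into a new region rt and add it to R
        t, next_id = next_id, next_id + 1
        R[t] = merge(R[i], R[j])
        # remove every similarity involving ri or rj, collecting their neighbours
        neighbours = set()
        for pair in [p for p in S if i in p or j in p]:
            neighbours.update(pair)
            del S[pair]
        neighbours -= {i, j}
        # similarity between rt and each region that touched ri or rj
        for n in neighbours:
            S[(n, t)] = similarity(R[n], R[t])
    return R


# Toy regions as (size, mean grey value); four regions in a row.
toy_regions = {0: (1, 10.0), 1: (1, 12.0), 2: (1, 50.0), 3: (1, 52.0)}
toy_pairs = [(0, 1), (1, 2), (2, 3)]


def toy_similarity(a, b):
    # closer mean values count as more similar (stand-in measure)
    return -abs(a[1] - b[1])


def toy_merge(a, b):
    size = a[0] + b[0]
    return (size, (a[0] * a[1] + b[0] * b[1]) / size)


all_regions = hierarchical_grouping(
    toy_regions, toy_pairs, toy_similarity, toy_merge)
print(len(all_regions))                 # 7 regions: 4 initial + 3 merged
print(all_regions[max(all_regions)])    # (4, 31.0), the region covering everything
```

Note that R keeps the initial regions as well as every merged one, so n initial regions yield 2n-1 regions in total, each of which later contributes one bounding box.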
Code implementation and walkthrough
Initial image segmentation
import numpy
import skimage.io
import skimage.feature


def _generate_segments(img_path, neighbor, sigma, scale, min_size):
    # graphbased_segmentation is the graph-based segmentation step
    # described above; its result is used as one region id per pixel
    im_mask = graphbased_segmentation(img_path, neighbor, sigma, scale, min_size)
    # open the image
    im_orig = skimage.io.imread(img_path)
    # merge the mask into the image as a 4th channel
    im_orig = numpy.append(
        im_orig, numpy.zeros(im_orig.shape[:2])[:, :, numpy.newaxis], axis=2)
    im_orig[:, :, 3] = im_mask
    return im_orig
The original image is segmented, and the id of the region each pixel belongs to is stored as the 4th channel of the image.
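To see what the channel merge does, here is a small sketch on a made-up 2x2 image. The array `labels` is a hypothetical stand-in for the segmentation mask; only the `numpy.append` pattern is taken from the function above.

```python
import numpy

# toy 2x2 RGB image and a hypothetical label mask (one region id per pixel)
img = numpy.array([[[255, 0, 0], [250, 5, 5]],
                   [[0, 0, 255], [5, 5, 250]]], dtype=numpy.uint8)
labels = numpy.array([[0, 0],
                      [1, 1]])

# append an empty 4th channel, then fill it with the region ids
empty = numpy.zeros(img.shape[:2])[:, :, numpy.newaxis]
img4 = numpy.append(img, empty, axis=2)
img4[:, :, 3] = labels

print(img4.shape)   # (2, 2, 4)
print(img4.dtype)   # float64: appending a float array upcasts the uint8 image
```

One side effect worth knowing: because the appended zeros are floating point, the whole array becomes float64, so the colour channels are no longer uint8 after this step.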
Definition of region similarity
def _calc_colour_hist(img):
    """
    calculate colour histogram for each region
    the size of output histogram will be BINS * COLOUR_CHANNELS(3)
    number of bins is 25 as same as [uijlings_ijcv2013_draft.pdf]
    extract HSV
    """
    BINS = 25
    hist = numpy.array([])
    for colour_channel in (0, 1, 2):
        # extracting one colour channel
        c = img[:, colour_channel]
        # calculate histogram for each colour and join to the result
        hist = numpy.concatenate(
            [hist] + [numpy.histogram(c, BINS, (0.0, 255.0))[0]])
    # L1 normalize
    hist = hist / len(img)
    return hist
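A quick way to check the shape and normalisation of this histogram is to run the same computation on random pixels. The sketch below restates the function in compact form as `calc_colour_hist` (a name used only here) and assumes the input is an array of shape (n_pixels, 3) with values in 0..255. The `intersection` helper is also an assumption of this example: it shows histogram intersection, which is how the selective search paper compares two colour histograms.

```python
import numpy


def calc_colour_hist(pixels, bins=25):
    # one 25-bin histogram per channel, concatenated into 75 values
    hist = numpy.concatenate([
        numpy.histogram(pixels[:, ch], bins, (0.0, 255.0))[0]
        for ch in range(3)])
    # divide by the number of pixels in the region
    return hist / len(pixels)


def intersection(hist_a, hist_b):
    # histogram intersection: sum of element-wise minima
    return numpy.minimum(hist_a, hist_b).sum()


rng = numpy.random.default_rng(0)
region = rng.integers(0, 256, size=(100, 3)).astype(float)
hist = calc_colour_hist(region)

print(hist.shape)                 # (75,)
print(hist.sum())                 # close to 3.0: each channel sums to 1
print(intersection(hist, hist))   # a histogram intersected with itself gives its sum
```

Since the division is by the pixel count rather than by the total of all three channels, the full vector sums to about 3, not 1.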
def _calc_texture_gradient(img):
    """
    calculate texture gradient for entire image
    The original SelectiveSearch algorithm proposed Gaussian derivative
    for 8 orientations, but we use LBP instead.
    output will be [height(*)][width(*)]
    """
    ret = numpy.zeros((img.shape[0], img.shape[1], img.shape[2]))
    for colour_channel in (0, 1, 2):
        ret[:, :, colour_channel] = skimage.feature.local_binary_pattern(
            img[:, :, colour_channel], 8, 1.0)
    return ret
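For readers unfamiliar with LBP (local binary patterns), the idea can be sketched with plain NumPy. This is a deliberately simplified version, not what `skimage.feature.local_binary_pattern` computes exactly: it compares each pixel with its 8 immediate grid neighbours, whereas scikit-image samples points on a circle and may order the bits differently. The function name `lbp_8_neighbours` is made up for this example.

```python
import numpy


def lbp_8_neighbours(channel):
    """Simplified LBP: one bit per grid neighbour that is >= the centre pixel."""
    h, w = channel.shape
    # replicate the border so that every pixel has 8 neighbours
    padded = numpy.pad(channel, 1, mode="edge")
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = numpy.zeros((h, w), dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        # view of the image shifted by (dy, dx)
        neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes += (neighbour >= channel) * (1 << bit)
    return codes


# a single bright pixel on a dark background
patch = numpy.zeros((3, 3))
patch[1, 1] = 9
codes = lbp_8_neighbours(patch)
print(codes[1, 1])   # 0: no neighbour reaches the centre value
print(codes[0, 0])   # 255: every neighbour is >= the dark corner pixel
```

Each pixel thus gets a code in 0..255 that describes its local texture, and these codes can be histogrammed per region in the same way as the colour values.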
def _calc_texture_hist(img):
    """
    calculate texture histogram for