Bags of Local Convolutional Features for Scalable Instance Search: Paper Notes
Mohedano E, Mcguinness K, O’Connor N E, et al. Bags of Local Convolutional Features for Scalable Instance Search[C]// ACM on International Conference on Multimedia Retrieval. ACM, 2016:327-331.
1 Motivation
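- 1 Current TRECVid instance search systems still rely on aggregated local features such as SIFT, because the resulting high-dimensional representations are sparse and therefore easy to separate linearly.
- 2 Convolutional neural networks are currently very popular and already achieve good results on image retrieval.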
2 Contribution
1 Sparse visual representation based on a Bags of Convolutional Features, which allows fast retrieval by means of an inverted index
2 Assignment map as a new compact representation of the image
3 Local analysis of multiple image regions for reranking followed by query expansion using the obtained object locations
3 Pipeline
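Image feature description:
An input image is passed through a given CNN, and the feature map of one convolutional layer is taken. The local CNN features are extracted from this feature map and clustered with K-means to obtain the cluster centers (the visual words). Each local CNN feature is then mapped to its cluster center, which produces the assignment map. Finally, following the BoW idea, the occurrences of each cluster center are counted to give the final image descriptor.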
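The pipeline above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`kmeans`, `assignment_map`, `bow_histogram`) are hypothetical, the features are toy 2-D vectors instead of real convolutional activations, and a small pure-Python Lloyd iteration stands in for whatever K-means implementation was actually used.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(feature, centroids):
    # Index of the closest centroid, i.e. the visual word of this feature.
    return min(range(len(centroids)), key=lambda j: sq_dist(feature, centroids[j]))

def kmeans(features, k, iters=10, seed=0):
    # Plain Lloyd iterations; centroids start from k randomly chosen features.
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            groups[nearest(f, centroids)].append(f)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def assignment_map(feature_map, centroids):
    # feature_map[y][x] is the local feature at spatial position (x, y).
    return [[nearest(f, centroids) for f in row] for row in feature_map]

def bow_histogram(assign_map, k):
    # Count how often each visual word appears in the assignment map.
    hist = [0] * k
    for row in assign_map:
        for word in row:
            hist[word] += 1
    return hist

# Toy 2 x 3 feature map with 2-D local features (assumed data).
feature_map = [
    [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]],
    [[5.1, 5.0], [0.0, 0.0], [4.9, 5.0]],
]
local_features = [f for row in feature_map for f in row]
codebook = kmeans(local_features, k=2)
amap = assignment_map(feature_map, codebook)
bow = bow_histogram(amap, k=2)
```

In this toy example the codebook is learned from a single feature map; in a real system it would be learned once from the local features of many images and then reused for every image and query.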
4 Instance retrieval
4.1 Initial Search
1 Global search(GS) :The BoW vector of the query is built with the visual words of all the local CNN features in the convolutional layer extracted for the query image.
2 Local search(LS) :The BoW vector of the query contains only the visual words of the local CNN features that fall inside the query bounding box.
Note:
GS means that the BoW feature of the whole query image is used as the query.
LS means that only the BoW feature inside the bounding box of the query image is used as the query.
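The difference between the two query types comes down to which cells of the query assignment map are counted. The sketch below assumes that the bounding box has already been scaled from image pixels to assignment-map cells and is given as end-exclusive coordinates; `query_bow` and the toy map are hypothetical.

```python
def query_bow(assign_map, k, bbox=None):
    """Build the query BoW vector.

    bbox = (x0, y0, x1, y1) in assignment-map cells, end-exclusive.
    bbox=None counts every cell (global search); otherwise only the
    cells inside the box are counted (local search).
    """
    height, width = len(assign_map), len(assign_map[0])
    x0, y0, x1, y1 = bbox if bbox is not None else (0, 0, width, height)
    hist = [0] * k
    for y in range(y0, y1):
        for x in range(x0, x1):
            hist[assign_map[y][x]] += 1
    return hist

# Toy 3 x 4 assignment map over a codebook of 4 visual words (assumed data).
amap = [
    [0, 0, 1, 2],
    [0, 3, 3, 2],
    [1, 3, 3, 2],
]
gs = query_bow(amap, k=4)                     # [3, 2, 3, 4]
ls = query_bow(amap, k=4, bbox=(1, 1, 3, 3))  # [0, 0, 0, 4]
```

The local query keeps only the words of the object itself, while the global query also carries the context around it.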
4.2 Local reranking
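The paper generates candidate regions from all combinations of window width $W_w \in \{W, W/2, W/4\}$ and window height $H_w \in \{H, H/2, H/4\}$, and then filters these regions by aspect ratio. With $AR_q = W_q / H_q$ for the query bounding box and $AR_w = W_w / H_w$ for a window, each window receives a score, and a window is kept only when its score exceeds a threshold.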
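A rough sketch of the window generation and filtering step follows. Several things here are assumptions rather than facts taken from these notes: the score is taken to be the ratio of the smaller aspect ratio to the larger one (so that 1.0 means identical shape), the windows slide with a stride of one cell, and the threshold `th=0.4` is an arbitrary illustrative value. All function names are hypothetical.

```python
def candidate_windows(W, H, stride=1):
    # All (x, y, w, h) windows with w in {W, W/2, W/4} and h in {H, H/2, H/4}.
    widths = sorted({W, W // 2, W // 4} - {0}, reverse=True)
    heights = sorted({H, H // 2, H // 4} - {0}, reverse=True)
    windows = []
    for w in widths:
        for h in heights:
            for y in range(0, H - h + 1, stride):
                for x in range(0, W - w + 1, stride):
                    windows.append((x, y, w, h))
    return windows

def aspect_score(w_w, h_w, w_q, h_q):
    # Assumed score: smaller aspect ratio divided by the larger one.
    ar_w, ar_q = w_w / h_w, w_q / h_q
    return min(ar_w, ar_q) / max(ar_w, ar_q)

def filter_windows(windows, w_q, h_q, th):
    # Keep windows whose shape is close enough to the query bounding box.
    return [win for win in windows
            if aspect_score(win[2], win[3], w_q, h_q) > th]

# 8 x 4 assignment map, query box of 4 x 2 cells (assumed sizes).
windows = candidate_windows(W=8, H=4)
kept = filter_windows(windows, w_q=4, h_q=2, th=0.4)
```

With these toy sizes, the very elongated 8 x 1 windows and the tall 2 x 4 windows are discarded, since their aspect ratio is four times away from that of the query.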
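In addition, the paper borrows from spatial pyramid matching to subdivide each retained window, using L = 2 resolution levels: the whole window and its 4 sub-regions. A BoW feature is computed for each sub-window, and the BoW features of different sub-windows receive different weights. The weighting function is taken directly from the spatial pyramid matching paper, where $w_r$ is the weight, $L = 2$, and $l_r$ is the resolution level of the current sub-window.
For spatial pyramid matching, see: http://blog.csdn.net/chlele0105/article/details/16972695
Once the BoW feature of each window is available, its cosine similarity to the query feature is computed, and the highest-scoring window is taken as the final location of the target.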
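The window descriptor and the final matching step might look like the sketch below. It rests on assumptions: the weight is taken as $w_r = 1 / 2^{(L - l_r)}$ with levels numbered 1 (whole window) and 2 (four sub-regions), the weighted histograms are simply concatenated, and the query box is described with the same pyramid so that both vectors have the same length. The function names and the toy maps are hypothetical.

```python
import math

L = 2  # number of pyramid levels, as stated in the notes

def level_weight(level):
    # Assumed weighting: finer levels count more, w_r = 1 / 2**(L - l_r).
    return 1.0 / 2 ** (L - level)

def region_bow(amap, k, x, y, w, h):
    # BoW histogram of the w x h region whose top-left cell is (x, y).
    hist = [0.0] * k
    for yy in range(y, y + h):
        for xx in range(x, x + w):
            hist[amap[yy][xx]] += 1.0
    return hist

def pyramid_bow(amap, k, window):
    x, y, w, h = window
    # Level 1: the whole window.
    desc = [level_weight(1) * v for v in region_bow(amap, k, x, y, w, h)]
    # Level 2: the four quadrants of the window.
    half_w, half_h = w // 2, h // 2
    for sy, sh in ((y, half_h), (y + half_h, h - half_h)):
        for sx, sw in ((x, half_w), (x + half_w, w - half_w)):
            desc += [level_weight(2) * v for v in region_bow(amap, k, sx, sy, sw, sh)]
    return desc

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def best_window(amap, k, windows, query_desc):
    # Returns (score, window) for the window most similar to the query.
    return max((cosine(pyramid_bow(amap, k, win), query_desc), win) for win in windows)

# Toy data: the query object consists only of visual word 3.
query_amap = [[3, 3], [3, 3]]
query_desc = pyramid_bow(query_amap, 4, (0, 0, 2, 2))
amap = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
windows = [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2), (2, 2, 2, 2), (0, 0, 4, 4)]
best_score, best_win = best_window(amap, 4, windows, query_desc)
```

Because cosine similarity ignores the overall scale of the vectors, only the ratio between the level weights matters here, not their absolute values.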
4.3 Query expansion
- 1 Global query expansion(GQE)
The BoW vectors of the N images at the top of the ranking are averaged together with the BoW of the query to form the new representation for the query.
GQE averages the global features of the top five images returned by local reranking with the query feature to produce a new query feature.
- 2 Local query expansion(LQE)
Locations obtained in the local reranking step are used to mask out the background and build the BoW descriptor of only the region of interest of the N-top images in the ranking.
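LQE averages the local features located in the top five images returned by local reranking (for example, the BoW features inside the red box in the figure above) with the query feature to produce a new query feature.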
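Both expansion variants reduce to the same averaging step and differ only in which BoW vectors are fed in. The sketch below assumes plain element-wise averaging without any renormalization; `expand_query` and the toy vectors are hypothetical.

```python
def expand_query(query_bow, top_bows):
    # Element-wise mean of the query BoW and the BoW vectors of the top-N images.
    vectors = [query_bow] + list(top_bows)
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

query = [0, 0, 0, 6]

# GQE: BoW vectors computed over the whole top-ranked images (assumed values).
global_bows = [[3, 3, 0, 6], [0, 3, 3, 0]]
gqe = expand_query(query, global_bows)   # [1.0, 2.0, 1.0, 4.0]

# LQE: BoW vectors computed only inside the located windows (assumed values).
local_bows = [[0, 0, 0, 6], [0, 0, 3, 3]]
lqe = expand_query(query, local_bows)    # [0.0, 0.0, 1.0, 5.0]
```

In this toy case the LQE result stays concentrated on the words of the object, whereas the GQE result also absorbs background words from the top-ranked images.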
5 Experiments
R denotes local reranking.
Comparison with the state of the art