Neural-Network-Based Image Retrieval: Neural Codes for Image Retrieval


An example of retrieval with neural codes. Each row shows the query on the left and, to its right, the images from the INRIA dataset with the most similar neural codes. The three rows correspond to three layers of the convolutional neural network.
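
Retrieval of this kind amounts to ranking database images by how similar their codes are to the query's code. Below is a minimal sketch, assuming cosine similarity as the measure (the caption does not name one). The function name `rank_by_similarity` and the random toy codes are illustrative only.

```python
import numpy as np

def rank_by_similarity(query_code, database_codes, top_k=5):
    """Return indices and scores of the top_k database codes closest to the query.

    Assumption: similarity is cosine similarity. Codes are L2-normalized
    here, so a dot product equals the cosine.
    """
    q = query_code / np.linalg.norm(query_code)
    db = database_codes / np.linalg.norm(database_codes, axis=1, keepdims=True)
    scores = db @ q                      # one cosine similarity per database image
    order = np.argsort(-scores)[:top_k]  # highest similarity first
    return order, scores[order]

# Toy data standing in for real neural codes (hypothetical values).
rng = np.random.default_rng(0)
database = rng.standard_normal((100, 128))
query = database[42] + 0.01 * rng.standard_normal(128)  # near-duplicate of image 42

indices, scores = rank_by_similarity(query, database)
print(indices, scores)
```

With real data, `database` would hold one descriptor per indexed image and `query` the descriptor of the query image.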

In this project, we investigate the best ways to obtain global (holistic) image descriptors from deep neural networks. We aim for relatively low-dimensional descriptors (e.g. 128-256 dimensions per image).

The current “best” approach that we have found is summarized in the ICCV 2015 paper. In short, the descriptors are computed from the last convolutional layer of a pretrained deep network, using sum-pooling and simple post-processing. We call the resulting descriptor SPoC (for Sum-Pooling of Convolutional features).
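
The pipeline described above can be sketched roughly as follows. This is an illustration and not the authors' code (see the Code section for that). The text does not spell out the "simple post-processing", so the sketch assumes L2 normalization, PCA whitening, and a second L2 normalization. All function names are hypothetical, and random arrays stand in for real convolutional activations.

```python
import numpy as np

def sum_pool(feature_map):
    """Sum-pool a (C, H, W) convolutional feature map into a C-dim vector."""
    return feature_map.sum(axis=(1, 2))

def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def fit_pca_whitening(descriptors, out_dim):
    """Learn a mean and a whitening projection from (N, C) descriptors."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # SVD of the centered data: rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    std = s[:out_dim] / np.sqrt(len(descriptors) - 1)  # per-component std
    projection = vt[:out_dim].T / (std + 1e-12)        # shape (C, out_dim)
    return mean, projection

def spoc_like_descriptor(feature_map, mean, projection):
    """Sum-pool, normalize, PCA-whiten, normalize again (assumed steps)."""
    pooled = l2_normalize(sum_pool(feature_map))
    return l2_normalize((pooled - mean) @ projection)

# Stand-in activations: 500 images, 64 channels, 7x7 spatial grid.
rng = np.random.default_rng(0)
train_maps = rng.random((500, 64, 7, 7))
train_pooled = l2_normalize(train_maps.sum(axis=(2, 3)))
mean, projection = fit_pca_whitening(train_pooled, out_dim=32)

descriptor = spoc_like_descriptor(train_maps[0], mean, projection)
print(descriptor.shape)
```

In practice the feature maps would come from a pretrained network's last convolutional layer, and `out_dim` would be chosen in the 128-256 range mentioned above.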

In the ICCV paper, we obtained very competitive retrieval accuracy while using a network pretrained on ImageNet and not fine-tuned for buildings/landmarks. However, in our earlier ECCV 2014 paper, we showed that such fine-tuning is beneficial, and we collected a special dataset suitable for it (the Landmarks dataset). Note that at the moment we recommend SPoC (and similar) descriptors over the outputs of the fully-connected layers (as used in the ECCV 2014 paper) when generic image descriptors are required. Fine-tuning on a related dataset should be used if maximal performance is desired: our preliminary experiments suggest that SPoC also benefits from fine-tuning, though to a lesser degree than fully-connected features.

Papers:
A. Babenko and V. Lempitsky. Aggregating local deep features for image retrieval. IEEE International Conference on Computer Vision (ICCV), Santiago de Chile, 2015.
arXiv:1510.07493

A. Babenko, A. Slesarev, A. Chigorin and V. Lempitsky. Neural Codes for Image Retrieval. European Conference on Computer Vision (ECCV), Zurich, 2014.

Code:
The code and a usage example for the SPoC descriptor: https://github.com/arbabenko/Spoc

The Landmarks Dataset: 
Tab-separated file
Format of each line: url, image_id, class_name (class names are in Russian), separated by tabs.
Note: Oxford-related classes were manually removed for our experiments in the ECCV 2014 paper.
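
A file in this format can be read with the standard library alone. The sketch below assumes UTF-8 text with exactly three tab-separated fields per line. The helper name `read_landmarks_index` and the sample rows (including the URLs) are made up for illustration and are not taken from the actual dataset.

```python
import csv
import io
from collections import Counter

def read_landmarks_index(lines):
    """Parse rows of url<TAB>image_id<TAB>class_name into dictionaries."""
    # QUOTE_NONE keeps any quote characters inside URLs as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if len(row) != 3:  # skip blank or malformed lines
            continue
        url, image_id, class_name = row
        records.append({"url": url, "image_id": image_id, "class_name": class_name})
    return records

# Made-up sample rows in the documented format (not real dataset entries).
sample = io.StringIO(
    "http://example.com/a.jpg\t1\tКремль\n"
    "http://example.com/b.jpg\t2\tКремль\n"
    "http://example.com/c.jpg\t3\tЭрмитаж\n"
)
records = read_landmarks_index(sample)
class_counts = Counter(r["class_name"] for r in records)
print(class_counts.most_common())
```

For the real file, pass a file object opened with `encoding="utf-8"` and `newline=""` instead of the in-memory sample.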

Related work:
Several other groups have been investigating similar directions recently and in parallel. Some very interesting results have been presented by the KTH Stockholm group:
http://www.csc.kth.se/cvap/cvg/DL/ots/

Source: http://sites.skoltech.ru/compvision/projects/neuralcodes/
