[image retrieval] ranking - geometric verification - re-ranking

Bag-of-features processing + tf-idf weighting -> sparse frequency vector ->
querying (inverted file index) (searching and ranking) ->
geometric verification for the short list -> re-ranked list

overview of image retrieval:
- recent large-scale image retrieval algorithms may be categorized into two lines:
- 1) compact hashing of global features such as GIST/ color histogram;
- 2) efficient indexing of local features by a vocabulary tree - BOW
- advantage: capable of finding images of the same objects or scenes captured under different
conditions
- disadvantages: 1. high memory usage for the inverted indexes of a large number of visual words
- 2. the geometric relations of the local features, i.e. their spatial layout, are largely ignored. Therefore, a post re-ranking procedure is needed
- conventional re-ranking approach
- first match the local feature descriptors by GHT (generalized Hough transform; refer to Lowe 2004); second, a RANSAC procedure fits a global affine transform
- re-rank according to the number of inliers in the RANSAC or the fitting errors
- The problems: first, high-dimensional descriptors lead to complex computation in this procedure (really?…). Second, the assumption of a global affine transform between two images may not hold, e.g., for images of a 3D object from different view angles. (what's that!)
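
As a rough illustration of the conventional re-ranking step above, the short list can simply be re-sorted by inlier count. This is only a sketch: `rerank_by_inliers` and the `inliers` dict are made-up names, and the counts are invented example values, not results from any experiment.

```python
def rerank_by_inliers(short_list, inlier_counts):
    """Re-order a short list of image ids by geometric-verification inliers.

    short_list: image ids in their initial (e.g. tf-idf) ranking order.
    inlier_counts: hypothetical dict image_id -> number of RANSAC inliers.
    """
    # sorted() is stable, so images with equal counts keep their initial order.
    return sorted(short_list, key=lambda img: -inlier_counts.get(img, 0))


# Invented example values, for illustration only.
initial = ["img_a", "img_b", "img_c", "img_d"]
inliers = {"img_a": 4, "img_b": 31, "img_c": 4, "img_d": 12}
reranked = rerank_by_inliers(initial, inliers)
print(reranked)
```

Sorting by fitting error instead would only change the sort key.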

——–

ranking

tf-idf weighting - find well-representing words - in image retrieval it is probably used to weight the feature vector during matching?
  • def : a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus, used as a weighting factor in information retrieval
    • summing tf-idf -> the simplest ranking function
    • tf-idf is the product of two statistics, term frequency (TF) and inverse document frequency (IDF). Various ways for determining the exact values of both statistics exist.
    • The main idea of TF-IDF: if a word or phrase occurs with high frequency TF in one article (say m times) and rarely occurs in other articles (the fewer documents contain term t (say n times in total), i.e. the smaller n = m + k, the larger the IDF), then the word or phrase is considered to discriminate well between categories and is suitable for classification.
    • In short, we want a word that occurs often in one article but not often in the other articles.
  • drawback: when m becomes large, n is also large, which makes the IDF small. The word then looks bad, but since m is large it can actually represent the document well
  • if n is large, the word is called a stopword (a word that should be removed) - one of the roles of TF-IDF is to filter out stopwords, for example words like "的" ("of") and "是" ("is")
  • application in the vector space model
    TF-IDF weighting is often used together with cosine similarity in the vector space model to judge the similarity between two documents.
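
To make the weighting concrete, here is a minimal sketch of one common tf-idf variant (tf = count / document length, idf = log(N / document frequency)). As noted above, other definitions of both statistics exist; the function name `tfidf_vectors` and the toy documents are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (per-document tf-idf dicts, idf dict) for non-empty token lists."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {word: math.log(n_docs / count) for word, count in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({w: (c / total) * idf[w] for w, c in counts.items()})
    return vectors, idf


# Toy corpus (illustrative only).
docs = [
    ["the", "cat", "sat", "the"],
    ["the", "dog", "sat"],
    ["the", "bird", "flew"],
]
vectors, idf = tfidf_vectors(docs)
print(vectors[0])
```

In this toy corpus "the" occurs in every document, so its idf is log(3/3) = 0 and its weight vanishes even though its term frequency is the highest. That is the stopword-filtering effect described above.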
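cosine similarity - computes similarity (one of a hundred ways to compute similarity)
  • Measures the similarity between two vectors of an inner product space by the cosine of the angle between them. When the two vectors point in the same direction, the cosine similarity is 1; when the angle between them is 90°, it is 0; when they point in exactly opposite directions, it is -1. That is, similarity = cos(θ)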
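
A small sketch of the three cases just listed, in plain Python. Returning 0.0 for a zero vector is a convention chosen here, not part of the definition.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


same = cosine_similarity([1.0, 2.0], [2.0, 4.0])        # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])  # 90 degrees apart
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])  # opposite direction
print(same, orthogonal, opposite)
```

tf-idf vectors have no negative components, so for them the value stays between 0 and 1.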
By the way, the magical tf-idf weighting is also one of the ways to convert the textual representation of information into a Vector Space Model (VSM), or into sparse features.
The Vector Space Model (VSM) is mentioned as a way to represent documents by vectors: an algebraic model representing textual information as a vector; the components of this vector could represent the importance of a term (tf-idf) or even the absence or presence (Bag of Words) of it in a document - but I don't really understand it yet, see http://blog.christianperone.com/?p=1589


geo verification

method 1: generalized Hough transform;
method 2: RANSAC

Hough transform
http://blog.csdn.net/marvin521/article/details/9071405
generalized Hough transform
http://blog.csdn.net/u010278305/article/details/42741315
RANSAC
RANSAC (RANdom SAmple Consensus), see:
http://www.micc.unifi.it/delbimbo/wp-content/uploads/2011/10/slide_corso/A34%20Geometric%20verification.pdf
potential drawback of using RANSAC in image retrieval: RANSAC-based geometric verification or match verification has high computational complexity
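spatial geometric verification:

The idea is that if two images show the same object or scene, then, because of differences in image size, position, and angle, the locations of corresponding interest points in the two images should satisfy some affine transformation. So we use the feature matches between the two images as data to estimate the affine transformation matrix by RANSAC (or Hough). If an affine transformation can be found that enough feature matches agree with, the current match is considered satisfactory.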
http://blog.csdn.net/giser_whu/article/details/25424885
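
The idea above can be sketched in plain Python as a bare-bones RANSAC loop: sample 3 matches, solve the affine transform they define, and count how many matches agree with it. This is a simplified sketch, not a production implementation: the function names, the tolerance `tol`, the iteration count, and the synthetic data are all assumptions, and there is no final least-squares refit on the inliers.

```python
import random


def affine_from_3(src, dst):
    """Exact affine transform from 3 correspondences, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if abs(det) < 1e-9:
        return None
    rows = []
    for k in (0, 1):  # solve u = a*x + b*y + c separately for x' and y'
        u1, u2, u3 = dst[0][k], dst[1][k], dst[2][k]
        a = ((u2 - u1) * (y3 - y1) - (u3 - u1) * (y2 - y1)) / det
        b = ((x2 - x1) * (u3 - u1) - (x3 - x1) * (u2 - u1)) / det
        rows.append((a, b, u1 - a * x1 - b * y1))
    return rows


def inlier_flags(model, src, dst, tol):
    """True for every match whose reprojection error is below tol."""
    (a, b, c), (d, e, f) = model
    flags = []
    for (x, y), (u, v) in zip(src, dst):
        err = ((a * x + b * y + c - u) ** 2 + (d * x + e * y + f - v) ** 2) ** 0.5
        flags.append(err < tol)
    return flags


def ransac_affine(src, dst, n_iters=200, tol=3.0, seed=0):
    """Return inlier flags of the best affine model found by random sampling."""
    rng = random.Random(seed)
    best = [False] * len(src)
    for _ in range(n_iters):
        idx = rng.sample(range(len(src)), 3)
        model = affine_from_3([src[i] for i in idx], [dst[i] for i in idx])
        if model is None:
            continue
        flags = inlier_flags(model, src, dst, tol)
        if sum(flags) > sum(best):
            best = flags
    return best


# Synthetic matches: 20 follow one affine transform, the first 10 are outliers.
rng = random.Random(1)
src = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(30)]
dst = [(0.9 * x - 0.2 * y + 5.0, 0.2 * x + 0.9 * y - 3.0) for x, y in src]
for i in range(10):
    dst[i] = (rng.uniform(0, 100), rng.uniform(0, 100))

inliers = ransac_affine(src, dst)
n_inliers = sum(inliers)
print(n_inliers)
```

The inlier count `n_inliers` is the quantity that a re-ranking step can sort on, and a threshold on it decides whether the match counts as verified.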

But how exactly is the re-ranking done after that!



BTW:
SIFT after matching: in Lowe's SIFT paper, he uses the Hough transform for
http://www.cs.toronto.edu/~jepson/csc2503/tutSIFT04.pdf
Recognition using SIFT features
- Compute SIFT features on the input image
- Match these features to the SIFT feature database
- Each keypoint specifies 4 parameters: 2D location, scale, and orientation.
- To increase recognition robustness: generalized Hough transform to identify clusters of matches that vote for the same object pose.
- Each keypoint votes for the set of object poses that are consistent with the keypoint’s location, scale, and orientation.
- Locations in the Hough accumulator that accumulate at least 3 votes are selected as candidate object/pose matches.
- A verification step matches the training image for the hypothesized object/pose to the image using a least-squares fit to the hypothesized location, scale, and orientation of the object.
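
The voting step in the list above can be sketched as follows. Each match between a model keypoint and an image keypoint implies a similarity pose (translation, scale, rotation), which is quantized into an accumulator cell; cells with at least 3 votes become candidates. This is a simplified sketch: the bin sizes are illustrative choices, every match votes into a single bin only, and the function names and sample keypoints are made up.

```python
import math
from collections import defaultdict


def hough_pose_clusters(matches, loc_bin=32.0, ori_bin=math.radians(30), min_votes=3):
    """matches: list of (model_kp, image_kp), each kp = (x, y, scale, orientation_rad).

    Scales are assumed to be positive. Returns {pose_bin: [match indices]}.
    """
    accumulator = defaultdict(list)
    for i, (m, q) in enumerate(matches):
        mx, my, ms, mo = m
        qx, qy, qs, qo = q
        scale = qs / ms
        rot = (qo - mo) % (2 * math.pi)
        # Image position that the model origin would map to under this pose.
        cos_r, sin_r = math.cos(rot), math.sin(rot)
        tx = qx - scale * (cos_r * mx - sin_r * my)
        ty = qy - scale * (sin_r * mx + cos_r * my)
        key = (
            math.floor(tx / loc_bin),
            math.floor(ty / loc_bin),
            math.floor(math.log2(scale)),  # factor-of-2 scale bins
            math.floor(rot / ori_bin),
        )
        accumulator[key].append(i)
    return {k: v for k, v in accumulator.items() if len(v) >= min_votes}


def apply_pose(kp, scale, rot, tx, ty):
    """Helper for the demo: move a keypoint by a similarity transform."""
    x, y, s, o = kp
    c, sn = math.cos(rot), math.sin(rot)
    return (scale * (c * x - sn * y) + tx, scale * (sn * x + c * y) + ty, s * scale, o + rot)


# Four matches that agree on one pose, plus two unrelated (outlier) matches.
model_kps = [(10, 10, 2.0, 0.1), (40, 15, 1.0, 1.0), (25, 60, 3.0, 2.0), (70, 30, 1.5, 0.5)]
matches = [(kp, apply_pose(kp, 1.5, math.radians(40), 50.0, 20.0)) for kp in model_kps]
matches.append(((5, 5, 1.0, 0.0), (300, 400, 4.0, 3.0)))
matches.append(((60, 60, 2.0, 1.0), (10, 200, 0.5, 5.0)))

clusters = hough_pose_clusters(matches)
print(clusters)
```

Only the matches left in a surviving cell need to go on to the least-squares verification step.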



possible modification – no idea

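In image retrieval, the bag-of-words model (BagOfWords, BOW) is usually used to describe images, and once the retrieval results are obtained, RANSAC (RANdom SAmple Consensus) is used for geometric verification or match verification to re-rank them. This retrieval framework has three shortcomings: 1) the bag-of-words model completely ignores the spatial structure of the image, so the feature representation does not make full use of spatial information to strengthen its discriminative power; 2) large-scale image retrieval requires a correspondingly large visual vocabulary, and measures computed directly on visual words have high computational complexity; 3) RANSAC-based geometric verification or match verification has high computational complexity. The latter two make retrieval inefficient.
To address these three shortcomings, this work mainly studies how to use spatial information to improve the discriminative representation of images, how to use hashing to speed up image retrieval, and how to use coarse matching of spatial positions to speed up image verification. It further studies how to use hashing to solve Chinese character recognition in natural scenes.
The work focuses on the following two aspects:
(1) Designing an image retrieval framework that incorporates spatially discriminative information. In the first layer, coarse-grained geometric information is used and a spatial min-hash method is designed. The hash representation is a zeroth-order approximation of the bag-of-words model: it randomly samples part of the visual words of the bag-of-words representation for comparison, which increases computation speed but loses part of the discriminative information. To make the hash representation more discriminative, the image is first given a spatial pyramid representation and min-hash is then applied within each local region, which improves retrieval performance. In the second layer, the image verification layer, fine-grained spatial information (a local spatial pyramid representation) is used for registration verification between images. The spatial relations between Maximally Stable Extremal Regions (MSER) and corner points are used for this registration verification. It avoids full matching verification of all points between the images; verifying features layer by layer reduces the computation and speeds up verification.
(2) Chinese character recognition in natural scenes suffers from inconsistent fonts, imbalanced datasets, a large number of common character classes, and few samples per class. To address these problems, this work applies image retrieval techniques to natural-scene Chinese character recognition. The iterative quantization algorithm is used for Chinese character recognition, combined with edit distance to correct the recognition results.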


https://www.robots.ox.ac.uk/~vgg/publications/papers/philbin07.pdf
Object retrieval with large vocabularies and fast spatial matching
It seems to describe the details of re-ranking, but I haven't read it yet

https://lear.inrialpes.fr/pubs/2010/JDS10a/jegou_improvingbof_preprint.pdf
The authors say it significantly speeds up the assignment of the descriptors to visual words.
It seems to be an optimization of BOW, but I haven't read it either (……)

An inverted file index is used because there are millions of visual words, so the BOW histogram is very sparse, and inverted indexes are well suited to implementing indexing and searching efficiently.
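
A minimal sketch of why sparsity helps: the index maps each visual word to the images containing it, so a query only touches the posting lists of its own words instead of every image. The names `build_inverted_index` and `query`, the word ids, and the weights are made up for illustration; the score here is a plain dot product of weights.

```python
from collections import defaultdict


def build_inverted_index(image_words):
    """image_words: dict image_id -> dict visual_word_id -> weight (e.g. tf-idf)."""
    index = defaultdict(list)
    for image_id, words in image_words.items():
        for word, weight in words.items():
            index[word].append((image_id, weight))
    return index


def query(index, query_words, top_k=10):
    """Score only the images that share at least one visual word with the query."""
    scores = defaultdict(float)
    for word, q_weight in query_words.items():
        for image_id, weight in index.get(word, ()):
            scores[image_id] += q_weight * weight
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]


# Toy database with invented word ids and weights.
database = {
    "img_a": {3: 0.5, 17: 0.2},
    "img_b": {17: 0.9, 42: 0.1},
    "img_c": {99: 1.0},
}
index = build_inverted_index(database)
results = query(index, {17: 1.0, 42: 0.5})
print(results)
```

`img_c` shares no visual word with the query, so it is never visited. The top of `results` is the short list that would go on to geometric verification.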
