Paper Reading Notes: Image Retrieval Using Scene Graphs

Latest recommended article published 2022-06-27 13:17:08

古承风

Category: Paper Reading Notes. Tags: computer vision, information retrieval

Article link: https://blog.csdn.net/qq_34271349/article/details/112563134

Contents

Abstract
Introduction
- Background
- Problem and Solution
Main Body
Conclusion

Johnson J, Krishna R, Stark M et al. Image retrieval using scene graphs[A]. Proceedings of the IEEE conference on computer vision and pattern recognition[C]. 2015: 3668–3678.

Keywords

scene graphs
image retrieval

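Abstract

The paper uses scene graphs for semantic image retrieval.
A scene graph is built from objects, their attributes, and the relationships between objects.
The authors design a conditional random field (CRF) model that uses a scene graph to perform semantic image retrieval.
The model scores candidate images against the query graph and ranks the retrieval results by that score.
A new dataset is introduced (5000 images annotated with scene graphs).
Experiments cover both small (partial) scene graphs and full scene graphs.

The results show that the method outperforms retrieval methods that use only image features.

The method can also be used to improve object localization.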
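
To make the "objects + attributes + relationships" structure concrete, here is a minimal Python sketch of a scene graph container. The class name `SceneGraph` and its methods are hypothetical names chosen for illustration; they are not taken from the paper or from any released code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneGraph:
    """Toy scene graph (illustrative only): objects, attributes, relationships."""
    objects: List[str] = field(default_factory=list)
    # Each attribute is (object index, attribute name).
    attributes: List[Tuple[int, str]] = field(default_factory=list)
    # Each relationship is (subject index, predicate, object index).
    relationships: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, name: str, *attrs: str) -> int:
        """Add an object node with optional attributes and return its index."""
        self.objects.append(name)
        index = len(self.objects) - 1
        for attr in attrs:
            self.attributes.append((index, attr))
        return index

    def relate(self, subject: int, predicate: str, obj: int) -> None:
        """Add a directed relationship edge between two existing objects."""
        self.relationships.append((subject, predicate, obj))

    def triples(self) -> List[Tuple[str, str, str]]:
        """Return relationships with indices resolved to object names."""
        return [(self.objects[s], p, self.objects[o])
                for s, p, o in self.relationships]


# Build the graph for "a man on a white boat".
graph = SceneGraph()
man = graph.add_object("man")
boat = graph.add_object("boat", "white")
graph.relate(man, "on", boat)
print(graph.triples())
```

Objects are referenced by index rather than by name so that a graph can hold two distinct instances of the same class (for example, two different men).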

Introduction

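Background

An ideal semantic image retrieval system should not stop at co-occurring objects such as $(\text{man}, \text{boat})$. It should also capture relationships between objects, such as $(\text{man on boat})$, and attributes of objects, such as $(\text{boat is white})$.
[Figure: results returned by a current image retrieval system]
The figure shows what a current image retrieval system returns. It does not fully account for the relationships between objects, so the results are unsatisfactory.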
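
The gap between an object-only query and a structured query can be shown with a toy example. The two image annotations below are made up for illustration, and the function names `match_objects` and `match_structured` are hypothetical. Exact set matching is used here only to show the idea; it is not how the paper scores images.

```python
# Hypothetical annotations for two images that contain the same objects.
images = {
    "img_a": {
        "objects": {"man", "boat"},
        "relations": {("man", "on", "boat")},
        "attributes": {("boat", "white")},
    },
    "img_b": {
        "objects": {"man", "boat"},
        "relations": {("man", "next to", "boat")},
        "attributes": {("boat", "red")},
    },
}


def match_objects(query_objects):
    """Return images that contain every queried object, ignoring structure."""
    return sorted(name for name, ann in images.items()
                  if query_objects <= ann["objects"])


def match_structured(relations, attributes):
    """Return images that contain every queried relationship and attribute."""
    return sorted(name for name, ann in images.items()
                  if relations <= ann["relations"]
                  and attributes <= ann["attributes"])


# An object-only query cannot tell the two images apart.
print(match_objects({"man", "boat"}))
# Adding the relationship and the attribute narrows the result.
print(match_structured({("man", "on", "boat")}, {("boat", "white")}))
```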

[71] C. L. Zitnick and D. Parikh. Bringing semantics into focus using visual abstraction. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3009–3016. IEEE, 2013.
[72] C. L. Zitnick, D. Parikh, and L. Vanderwende. Learning the visual interpretation of sentences. In Computer Vision (ICCV), 2013 IEEE International Conference on, pages 1681–1688. IEEE, 2013.
[22] A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

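There should be a way to reason jointly about objects, relationships, and attributes. The three references above took an important step toward this goal by learning from **abstract scenes**.

Applying this kind of reasoning benefits both image understanding and image retrieval.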

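Problem and Solution

Problem
Bringing this kind of semantic reasoning to real-world scenes raises two problems:

Establishing the relationships between objects in a scene is hard, and far harder than simple pairwise image matching.
Scene graphs are open-ended: they can keep growing with no fixed bound.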

[36] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, 2001.

Solution: the authors propose a new framework for semantic image retrieval that reasons about visual scenes with a CRF [36].

CRF: conditional random field
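
These notes do not go into the model itself, so the following is only a rough sketch of how a CRF-style score could rank images against a query graph. It assumes that each query object is assigned ("grounded") to one candidate box in the image, and that the score of an assignment is a product of per-object terms and per-relationship terms. The function names, the score tables, and the brute-force search are all assumptions made for illustration; the search is exponential in the number of objects and is not the paper's inference procedure.

```python
from itertools import product


def best_grounding(unary, pairwise):
    """Brute-force search for the highest-scoring object-to-box assignment.

    unary[o][b]: assumed score for placing query object o on box b.
    pairwise[(s, t)][bs][bt]: assumed score for the relationship between
    objects s and t when they sit on boxes bs and bt.
    """
    num_objects = len(unary)
    num_boxes = len(unary[0])
    best_score, best_assignment = float("-inf"), ()
    for assignment in product(range(num_boxes), repeat=num_objects):
        score = 1.0
        for obj, box in enumerate(assignment):
            score *= unary[obj][box]
        for (subj, obj), table in pairwise.items():
            score *= table[assignment[subj]][assignment[obj]]
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score


def rank_images(candidates):
    """Rank images by the score of their best grounding, highest first."""
    scored = [(name, best_grounding(unary, pairwise)[1])
              for name, (unary, pairwise) in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Query: object 0 = man, object 1 = boat, relationship (0 on 1).
# All numbers below are made up; each image has two candidate boxes.
candidates = {
    "img_a": ([[0.9, 0.1], [0.2, 0.8]],
              {(0, 1): [[0.1, 0.7], [0.3, 0.1]]}),
    "img_b": ([[0.6, 0.4], [0.5, 0.5]],
              {(0, 1): [[0.2, 0.2], [0.2, 0.2]]}),
}
ranking = rank_images(candidates)
print([name for name, _ in ranking])
```

In this toy data, `img_a` ranks first because one assignment places the man and the boat on boxes where both the per-object scores and the relationship score are high.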

[20] M. Fisher, M. Savva, and P. Hanrahan. Characterizing structural relationships in scenes using graph kernels. In ACM SIGGRAPH 2011 papers, SIGGRAPH ’11, pages 34:1–34:12. ACM, 2011.
[7] A. X. Chang, M. Savva, and C. D. Manning. Learning spatial knowledge for text to 3D scene generation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, 2014.

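Inspiration: two recent graph-based formulations for comparing and generating scenes [20, 7].

Core idea of the method:

Use a scene graph as the query instead of plain text. A graph query expresses the semantic relationships between objects more precisely.

Main contributions:

A CRF model for scene-graph-based semantic retrieval, which achieves state-of-the-art (SOTA) results
A new dataset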

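Main Body

I will fill this section in later if needed. For now, these notes only give a high-level view of what the paper does.

Conclusion

Scene graphs are used as a new representation of visual scenes.
A new dataset is introduced.
A CRF model is built for semantic image retrieval.
The results are state of the art (SOTA).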
