Paper Reading - Scene Graphs - Graph Generation: Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation

古承风


Article link: https://blog.csdn.net/qq_34271349/article/details/112574047


[1] Li Y, Ouyang W, Zhou B, et al. Factorizable Net: An Efficient Subgraph-Based Framework for Scene Graph Generation[J]. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, 11205 LNCS: 346–363.

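Abstract

Previous scene graph generation methods are complex, computationally expensive, and dependent on external data, which limits their application.
This paper focuses on improving efficiency.
It proposes a subgraph-based connection graph that represents the scene graph concisely during inference.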

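Method overview

Subgraph clustering: object pairs are clustered and a representative is selected for each cluster, so that a small number of subgraphs stands in for the whole scene.
SMP (Spatial-weighted Message Passing) exploits the spatial information kept in the subgraphs. SRI (Spatial-sensitive Relation Inference) uses that information to improve relationship recognition.
The method achieves state-of-the-art (SOTA) results.

Code: code link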

Introduction

Recent methods for modeling the relationships between objects in an image

[6] Dai, B., Zhang, Y., Lin, D.: Detecting visual relationships with deep relational networks. CVPR (2017)
[28] Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li, L.J., Shamma, D.A., et al.: Visual genome: Connecting language and vision using crowdsourced dense image annotations. IJCV (2017)
[34] Li, Y., Ouyang, W., Wang, X., Tang, X.: Vip-cnn: Visual phrase guided convolutional neural network. CVPR (2017)
[35] Li, Y., Ouyang, W., Zhou, B., Wang, K., Wang, X.: Scene graph generation from objects, phrases and region captions. In: ICCV (2017)
[37] Lu, C., Krishna, R., Bernstein, M., Fei-Fei, L.: Visual relationship detection with language priors. In: ECCV (2016)
[58] Xu, D., Zhu, Y., Choy, C.B., Fei-Fei, L.: Scene graph generation by iterative message passing. CVPR (2017)
[64] Zhuang, B., Liu, L., Shen, C., Reid, I.: Towards context-aware interaction recognition. ICCV (2017)

Scene graphs used for image retrieval

[26] Johnson, J., Krishna, R., Stark, M., Li, L.J., Shamma, D.A., Bernstein, M.S., Fei-Fei, L.: Image retrieval using scene graphs. In: CVPR (2015)
[45] Ramanathan, V., Li, C., Deng, J., Han, W., Li, Z., Gu, K., Song, Y., Bengio, S., Rossenberg, C., Fei-Fei, L.: Learning semantic relationships for better action retrieval in images. In: CVPR (2015)

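Two approaches to scene graph generation
1. Two-stage methods
First detect the objects, then recognize the relationships between them.

[6] Dai, B., Zhang, Y., Lin, D.: Detecting visual relationships with deep relational networks. CVPR (2017)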
[36] Liao, W., Shuai, L., Rosenhahn, B., Yang, M.Y.: Natural language guided visual relationship detection. arXiv preprint arXiv:1711.06032 (2017)
[37] Lu, C., Krishna, R., Bernstein, M., Fei-Fei, L.: Visual relationship detection with language priors. In: ECCV (2016)
[58] Xu, D., Zhu, Y., Choy, C.B., Fei-Fei, L.: Scene graph generation by iterative message passing. CVPR (2017)
[62] Yu, R., Li, A., Morariu, V.I., Davis, L.S.: Visual relationship detection with internal and external linguistic knowledge distillation. ICCV (2017)

2. Joint inference
Jointly infer the objects and their relationships [34,35,58] based on the object region proposals.

[34] Li, Y., Ouyang, W., Wang, X., Tang, X.: Vip-cnn: Visual phrase guided convolutional neural network. CVPR (2017)
[35] Li, Y., Ouyang, W., Zhou, B., Wang, K., Wang, X.: Scene graph generation from objects, phrases and region captions. In: ICCV (2017)
[58] Xu, D., Zhu, Y., Choy, C.B., Fei-Fei, L.: Scene graph generation by iterative message passing. CVPR (2017)

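To generate a complete scene graph, both approaches have to consider all object pairs, using the features of each pair's union region to infer the relationship.

However, the number of pairs grows quadratically with the number of objects, so the problem quickly becomes intractable as the input gets large. Using fewer objects, or discarding some pairs with a simple filtering rule, is one way out, but both options degrade model performance.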
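
To make the scaling concrete, here is a small sketch that counts the ordered (subject, object) pairs for a given number of object proposals. The function name `num_ordered_pairs` is hypothetical and not taken from the paper; it only illustrates the arithmetic, under the assumption that every object can be paired with every other object in both directions.

```python
def num_ordered_pairs(num_objects: int) -> int:
    """Count ordered (subject, object) pairs with two distinct objects."""
    return num_objects * (num_objects - 1)


# Hypothetical proposal counts, chosen only to show the growth rate.
for n in (10, 100, 300):
    print(n, num_ordered_pairs(n))
```

Going from 10 to 300 proposals multiplies the number of objects by 30 but the number of candidate pairs by roughly 1000, which is why a per-pair feature becomes the bottleneck.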

Key point: finding a more concise intermediate representation of the scene graph should be the key to solving this problem.

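Core idea of the method
Object pairs are clustered into subgraphs, and the pairs in a subgraph share one phrase representation. This significantly improves efficiency.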
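
The clustering idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that two object pairs belong to the same subgraph when their union boxes overlap strongly, and it uses a simple greedy IoU grouping as a stand-in for whatever clustering procedure the paper applies. The names `union_box`, `iou`, `cluster_pairs` and the threshold value are all hypothetical.

```python
from itertools import permutations


def union_box(a, b):
    """Smallest box (x1, y1, x2, y2) that covers both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cluster_pairs(boxes, iou_threshold=0.5):
    """Greedily group ordered object pairs by the overlap of their union boxes.

    Each pair joins the first subgraph whose representative box overlaps its
    union box by at least iou_threshold; otherwise it starts a new subgraph.
    """
    subgraphs = []
    for s, o in permutations(range(len(boxes)), 2):
        candidate = union_box(boxes[s], boxes[o])
        for subgraph in subgraphs:
            if iou(subgraph["box"], candidate) >= iou_threshold:
                subgraph["pairs"].append((s, o))
                break
        else:
            subgraphs.append({"box": candidate, "pairs": [(s, o)]})
    return subgraphs


# Two nearby objects and one far away (made-up coordinates).
boxes = [(0, 0, 10, 10), (2, 2, 12, 12), (100, 100, 110, 110)]
subgraphs = cluster_pairs(boxes)
print(len(subgraphs), [sg["pairs"] for sg in subgraphs])
```

In this toy input, six ordered pairs collapse into two subgraphs, so only two shared region features would need to be computed instead of six.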

Spatial-weighted Message Passing (SMP) is used to preserve the spatial information of the subgraphs.
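
One way to picture a spatially weighted message from a subgraph to an object is attention pooling over a 2D feature map. The sketch below is an assumption-laden illustration rather than the paper's formulation: it omits all learned parameters, scores each spatial cell by a plain dot product with the object vector, and the name `attention_pool` is hypothetical.

```python
import math


def attention_pool(object_vec, feature_map):
    """Pool an H x W map of feature vectors into one vector.

    The weight of each cell is a softmax over dot products between the
    object vector and the cell's feature vector (an illustrative choice).
    """
    cells = [cell for row in feature_map for cell in row]
    scores = [sum(o * f for o, f in zip(object_vec, cell)) for cell in cells]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * cell[d] for w, cell in zip(weights, cells))
        for d in range(len(object_vec))
    ]
    return pooled, weights


# Toy 2 x 2 feature map with 2-dimensional features (made-up values).
feature_map = [
    [[5.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
pooled, weights = attention_pool([1.0, 0.0], feature_map)
print(pooled, weights)
```

The point of the sketch is that different objects attend to different locations of the same subgraph map, which is only possible if the subgraph feature keeps its spatial layout instead of being flattened into a single vector.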

Spatial information has already been shown to be useful for predicate recognition.

[6] Dai, B., Zhang, Y., Lin, D.: Detecting visual relationships with deep relational networks. CVPR (2017)
[36] Liao, W., Shuai, L., Rosenhahn, B., Yang, M.Y.: Natural language guided visual relationship detection. arXiv preprint arXiv:1711.06032 (2017)
[62] Yu, R., Li, A., Morariu, V.I., Davis, L.S.: Visual relationship detection with internal and external linguistic knowledge distillation. ICCV (2017)

To exploit this spatial information, the Spatial-sensitive Relation Inference (SRI) module is designed. It fuses the features of an object pair with the subgraph features for the final relationship inference.
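
As a rough illustration of fusing an object pair with a spatial subgraph feature, the sketch below computes, at every location of the subgraph map, one response for the subject vector and one for the object vector. This per-location dot product is an assumed fusion operator chosen for simplicity; the notes do not specify the operator used by SRI, and the name `spatial_fusion` is hypothetical.

```python
def spatial_fusion(subject_vec, object_vec, feature_map):
    """Return an H x W map of [subject response, object response] pairs.

    Each response is the dot product between an object-level vector and
    the subgraph feature at that location, so the spatial layout is kept.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return [
        [[dot(subject_vec, cell), dot(object_vec, cell)] for cell in row]
        for row in feature_map
    ]


# Toy 2 x 2 subgraph map with 2-dimensional features (made-up values).
subgraph_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
fused = spatial_fusion([1.0, 0.0], [0.0, 1.0], subgraph_map)
print(fused)
```

A downstream classifier could then predict the predicate from the fused map, using where the subject and object respond inside the subgraph region.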

Summary