The More You Know: Using Knowledge Graphs for Image Classification (Paper Summary)

Jay_Tang

Category: Graph Neural Networks. Tags: machine learning, deep learning, image recognition

Article link: https://blog.csdn.net/Jay_Tang/article/details/108353255

Overview

Note: This previous post I wrote might be helpful for reading this paper summary:

Introduction to Graph Neural Network (GNN)

This paper investigates the use of structured prior knowledge in the form of knowledge graphs and shows that using this knowledge improves performance on image classification.

It introduces the Graph Search Neural Network (GSNN) as a way of efficiently incorporating large knowledge graphs into a vision classification pipeline. The GSNN outperforms standard neural network baselines for multi-label classification.

Intuition

While modern learning-based approaches can recognize some categories with high accuracy, they usually require thousands of labeled examples for each category. Building a large dataset for every concept does not scale. One way to address this is to use structured prior knowledge and to reason over it, which is what humans usually do but current approaches do not.

For example, when people try to identify the animal shown in the figure, they first recognize the animal, then recall relevant knowledge, and finally reason about it. With this information, even someone who has seen only one or two pictures of the animal can classify it. The hope is that a model can follow a similar reasoning process.

Previous Work

There has been a lot of work on end-to-end learning on graphs and on neural networks trained on graphs. Most of these approaches either extract features from the graph or learn a propagation model that transfers evidence between nodes conditional on the type of edge. An example is the Gated Graph Neural Network (GGNN), which takes an arbitrary graph as input. Given some task-specific initialization, it learns how to propagate information and predict the output for every node in the graph.

Previous work has focused on building and then querying knowledge bases rather than on using existing knowledge bases as side information for a vision task.

This work uses not only the attribute relationships that appear in the knowledge graph but also relationships between objects, and it reasons directly on the graph rather than on object-attribute pairs.

Major Contribution

The introduction of the GSNN as a way of incorporating potentially large knowledge graphs into an end-to-end learning system while remaining computationally feasible for large graphs;
A framework for using noisy knowledge graphs for image classification (in vision problems, graphs encode contextual and common-sense relationships and are significantly larger and noisier);
The ability to explain image classifications by using the propagation model (interpretability).

Graph Search Neural Network (GSNN)

GSNN Explanation

The idea is that, rather than performing the recurrent update over all of the nodes of the graph at once, the model starts with some initial nodes based on the input and only chooses to expand nodes that are useful for the final output. Thus, the model only computes the update steps over a subset of the graph.

Steps in GSNN:

Determine Initial Nodes

They determine the initial nodes in the graph based on the likelihood of the visual concept being present, as determined by an object detector or classifier. For their experiments, they use Faster R-CNN for each of the 80 COCO categories. For scores over some chosen threshold, they choose the corresponding nodes in the graph as the initial set of active nodes. Once they have the initial nodes, they also add the nodes adjacent to the initial nodes to the active set.
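The selection of initial nodes can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `initial_active_nodes`, the toy graph, the detector scores, and the default threshold of 0.5 are all assumptions made for the example (the post only says "some chosen threshold").

```python
def initial_active_nodes(scores, adjacency, threshold=0.5):
    """Return (detected, active): detected nodes, and detected nodes plus neighbors.

    scores:    dict mapping node name -> detector confidence (hypothetical values)
    adjacency: dict mapping node name -> iterable of neighboring node names
    threshold: assumed cutoff; the source does not give a concrete value
    """
    # Nodes whose detector score exceeds the threshold become the initial nodes.
    detected = {node for node, score in scores.items() if score > threshold}
    # The active set also includes every node adjacent to an initial node.
    active = set(detected)
    for node in detected:
        active.update(adjacency.get(node, ()))
    return detected, active


# Toy knowledge graph and detector scores, made up for illustration.
adjacency = {
    "person": ["surfboard", "wetsuit"],
    "surfboard": ["person", "wave"],
    "wave": ["surfboard", "ocean"],
    "pizza": ["plate"],
}
scores = {"person": 0.92, "surfboard": 0.71, "pizza": 0.05}

detected, active = initial_active_nodes(scores, adjacency)
print(sorted(detected))  # ['person', 'surfboard']
print(sorted(active))    # ['person', 'surfboard', 'wave', 'wetsuit']
```

Note that "ocean" and "plate" stay inactive: only the detected nodes and their direct neighbors enter the active set at this stage.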

Propagation

Given the initial nodes, they first propagate the beliefs about the initial nodes to all of the adjacent nodes (the propagation network). This process is similar to the one used in the Gated Graph Neural Network (GGNN).
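To make the propagation step more concrete, here is a simplified gated message-passing update in the spirit of GGNN, restricted to the active nodes. It is only a sketch under strong assumptions: the scalar weights in `params` stand in for the learned weight matrices, edge types are ignored, and the function name `propagate_step` and the toy values are hypothetical. The post does not give the exact update equations.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def propagate_step(hidden, adjacency, active, params):
    """One simplified GRU-style update over the active nodes only.

    hidden:    dict node -> list of floats (hidden state of that node)
    adjacency: dict node -> iterable of neighbor names
    active:    set of nodes that take part in this step
    params:    dict of scalar weights standing in for learned matrices (assumption)
    """
    new_hidden = dict(hidden)
    for v in active:
        h_v = hidden[v]
        neighbors = [u for u in adjacency.get(v, ()) if u in active]
        # Aggregate messages: per-dimension sum of the active neighbors' states.
        a_v = [sum(hidden[u][k] for u in neighbors) for k in range(len(h_v))]
        updated = []
        for a, h in zip(a_v, h_v):
            z = sigmoid(params["wz"] * a + params["uz"] * h)  # update gate
            r = sigmoid(params["wr"] * a + params["ur"] * h)  # reset gate
            cand = math.tanh(params["w"] * a + params["u"] * r * h)  # candidate state
            updated.append((1.0 - z) * h + z * cand)
        new_hidden[v] = updated
    return new_hidden


# Toy example: only "person" carries evidence before propagation.
adjacency = {
    "person": ["surfboard"],
    "surfboard": ["person", "wave"],
    "wave": ["surfboard"],
}
active = {"person", "surfboard", "wave"}
hidden = {"person": [1.0, 0.0], "surfboard": [0.0, 0.0], "wave": [0.0, 0.0]}
params = {"wz": 1.0, "uz": 1.0, "wr": 1.0, "ur": 1.0, "w": 1.0, "u": 1.0}

after_one_step = propagate_step(hidden, adjacency, active, params)
print(after_one_step["surfboard"])  # first entry becomes positive
print(after_one_step["wave"])       # still all zeros after one step
```

After one step the evidence from "person" reaches its neighbor "surfboard" but not yet "wave", which is two hops away. Repeating the step would carry it further.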

Decide which nodes to expand next

After the first time step, they need a way of deciding which nodes to expand next. Therefore, a per-node scoring function is learned to estimate how “important” each node is. After each propagation step, for every node in the current graph, the model predicts an importance score

$i_{v}^{(t)}=g_{i}\left(h_{v}, x_{v}\right)$

where $g_{i}$ is the learned per-node scoring function.
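The importance score can then drive the choice of nodes to expand. The sketch below is an assumption-heavy illustration: `importance` uses a logistic score on the concatenation of $h_v$ and $x_v$ as a stand-in for the learned function $g_i$, $x_v$ is treated simply as a per-node input feature vector, and the rule "expand the `top_p` highest-scoring unexpanded nodes and activate their neighbors" is an assumed selection policy, since the text here does not spell one out. All names and values are hypothetical.

```python
import math


def importance(h_v, x_v, weights):
    """Hypothetical stand-in for g_i: logistic score on the features [h_v, x_v]."""
    features = list(h_v) + list(x_v)
    logit = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def expand_active_set(active, expanded, hidden, annotation, adjacency, weights, top_p=1):
    """Score unexpanded active nodes, expand the top_p, and activate their neighbors."""
    candidates = [v for v in active if v not in expanded]
    scores = {v: importance(hidden[v], annotation[v], weights) for v in candidates}
    # Sort by descending score; ties are broken by name so the result is deterministic.
    chosen = sorted(candidates, key=lambda v: (-scores[v], v))[:top_p]
    new_active = set(active)
    for v in chosen:
        new_active.update(adjacency.get(v, ()))
    return new_active, set(expanded) | set(chosen), scores


# Toy state after one propagation step (values made up for illustration).
adjacency = {"wave": ["surfboard", "ocean"], "wetsuit": ["person"]}
active = {"person", "surfboard", "wave", "wetsuit"}
expanded = {"person", "surfboard"}
hidden = {"wave": [0.8], "wetsuit": [0.1]}
annotation = {"wave": [0.0], "wetsuit": [0.0]}
weights = [2.0, 1.0]

new_active, new_expanded, scores = expand_active_set(
    active, expanded, hidden, annotation, adjacency, weights, top_p=1
)
print(sorted(new_expanded))  # ['person', 'surfboard', 'wave']
print(sorted(new_active))    # ['ocean', 'person', 'surfboard', 'wave', 'wetsuit']
```

Here "wave" scores higher than "wetsuit", so it is expanded and its neighbor "ocean" joins the active set, while the rest of the graph is never touched.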
