[Paper Reading] Contrastive Graph Neural Network Explanation

Paper link: [2010.13663] Contrastive Graph Neural Network Explanation (arxiv.org)

Paper code: GitHub - lukasjf/contrastive-gnn-explanation

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Some unavoidable spelling and grammar mistakes may appear; if you spot any, feel free to point them out in the comments! This post is written as study notes, so read it with that in mind.

Table of Contents

1. Takeaways

2. Section-by-Section Reading of the Paper

2.1. Abstract

2.2. Introduction

2.3. Related work

2.4. Method

2.5. Experiments

2.5.1. CoGE Implementation

2.5.2. Qualitative Analysis

2.5.3. Quantitative Analysis

2.6. Conclusion

3. Reference


1. Takeaways

(1)The main text is only four pages long, so it is a quick read!

(2)The whole method boils down to a very simple loss~

2. Section-by-Section Reading of the Paper

2.1. Abstract

        ①They argue that occlusion fails on graphs because removing even a single element can cause a large change in the prediction

        ②They call this requirement Distribution Compliant Explanation (DCE): only data consistent with the training distribution should be used for model interpretation

        ③They propose the Contrastive GNN Explanation (CoGE) technique

2.2. Introduction

        ①⭐Occlusion can be used for GNN explanation, but it is too extreme: removing one node can greatly change the structure of a sparse graph

        ②⭐Removing a single edge may even disconnect the graph

        ③⭐CoGE looks for similarity to graphs with the same label and dissimilarity to graphs with a different label

        ④Edge explanation methods:

2.3. Related work

(1)Graph Neural Networks

(2)Explainability Methods for Graphs

(3)Adversarial Graph Attacks

2.4. Method

(1)Preliminaries

        ①Consider an undirected graph G=(V,E) with node set V and edge set E

        ②Feature matrix X

(2)Explanations for graph classification

        ①They measure the similarity between graphs by the Optimal Transport (OT) distance

        ②An example of how to calculate the distance between the left graph and the middle graph:

Each node holds a weight, and the weights within one graph sum to 1. A node's capacity is its weight, and the cost of transport is the moved weight multiplied by the (L2) distance between node embeddings.
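
Below is a minimal sketch (not the authors' code) of this OT distance between two graphs, each given as node embeddings plus node weights. It builds the L2 cost matrix and solves the transport problem as a linear program with scipy; the function names and the solver choice are my own assumptions.

```python
# A minimal sketch (not the authors' code) of the OT distance between two
# graphs, each represented by node embeddings and node weights.
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def ot_distance(Z_G, w_G, Z_H, w_H):
    """OT cost: weights act as node capacities, cost = moved mass * L2 distance."""
    M = cdist(Z_G, Z_H, metric="euclidean")      # pairwise L2 costs, shape (n, m)
    n, m = M.shape
    # Equality constraints: node i of G ships exactly w_G[i],
    # node j of H receives exactly w_H[j].
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0         # row sums of the transport plan
    for j in range(m):
        A_eq[n + j, j::m] = 1.0                  # column sums of the transport plan
    b_eq = np.concatenate([w_G, w_H])
    res = linprog(M.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# toy usage: two graphs with 3 and 2 nodes, 4-dim embeddings, uniform weights
Z_G, Z_H = np.random.rand(3, 4), np.random.rand(2, 4)
print(ot_distance(Z_G, np.ones(3) / 3, Z_H, np.ones(2) / 2))
```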

        ③They aim to find node weights w that minimize:

w_{opt}(G)=\arg\min_{w}\mathcal{L}_{w}^{\not\approx}(G)-\mathcal{L}_{w}^{\approx}(G)+\mathcal{L}_{w}^{=}(G)

where the first term is the average distance to the k most similar graphs with a different label, the second term is the average distance to the k most similar graphs with the same label, and the third term is the distance between the weighted graph G and its uniformly-weighted version.

        ④Formally, the loss terms are:

\begin{aligned} &\mathcal{L}_{w}^{\not\approx}(G) =\frac{1}{k}\sum_{H\in\mathbb{G}_{k}^{\not\approx}}d_{w}(Z_{G},Z_{H}) \\ &\mathcal{L}_{w}^{\approx}(G) =\frac{1}{k}\sum_{H\in\mathbb{G}_{k}^{\approx}}d_{w}(Z_{G},Z_{H}) \\ &\mathcal{L}_{w}^{=}(G) =d_{w}(Z_{G},Z_{G}) \end{aligned}

where \mathbb{G}_{k}^{\not\approx} and \mathbb{G}_{k}^{\approx} are the k most similar graphs with a different label and with the same label, respectively.
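
The following is a rough sketch of how these three terms could be combined and optimized over the node weights with gradient descent. Since the exact OT computation used by the authors is not specified in these notes, the sketch substitutes a differentiable entropic (Sinkhorn) approximation; the softmax parametrization of w and the number of steps are assumptions.

```python
# A minimal, self-contained sketch of the CoGE objective (not the authors' code).
# Assumptions: node embeddings are precomputed by a trained GNN; the k most
# similar graphs of the same / different label were selected beforehand; the OT
# distance is approximated with entropic regularization so it stays differentiable.
import torch

def sinkhorn_ot(Za, wa, Zb, wb, eps=0.1, iters=100):
    """Entropic approximation of the OT distance between two weighted embedding sets."""
    C = torch.cdist(Za, Zb, p=2)            # pairwise L2 costs
    K = torch.exp(-C / eps)                 # Gibbs kernel
    u = torch.ones_like(wa)
    for _ in range(iters):                  # Sinkhorn fixed-point iterations
        v = wb / (K.t() @ u)
        u = wa / (K @ v)
    P = u[:, None] * K * v[None, :]         # approximate transport plan
    return (P * C).sum()

def uniform(n):
    return torch.full((n,), 1.0 / n)

def coge_weights(Z_G, Z_same, Z_diff, steps=200, lr=0.1):
    """Optimize node weights for graph G against the k most similar
    same-label graphs (Z_same) and different-label graphs (Z_diff)."""
    n = Z_G.shape[0]
    logits = torch.zeros(n, requires_grad=True)    # softmax keeps w a distribution
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        w = torch.softmax(logits, dim=0)
        # average distance to the k most similar graphs with a different label
        l_diff = sum(sinkhorn_ot(Z_G, w, Z, uniform(len(Z))) for Z in Z_diff) / len(Z_diff)
        # average distance to the k most similar graphs with the same label
        l_same = sum(sinkhorn_ot(Z_G, w, Z, uniform(len(Z))) for Z in Z_same) / len(Z_same)
        # distance between the weighted graph and its uniformly-weighted version
        l_reg = sinkhorn_ot(Z_G, w, Z_G, uniform(n))
        loss = l_diff - l_same + l_reg             # the objective written above
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0).detach()   # high weight = important node
```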

2.5. Experiments

2.5.1. CoGE Implementation

        ①Number of compared graphs: k=10

        ②Optimizer: Adam

        ③Learning rate: 0.1 (0.01 for REDDIT); a hypothetical usage with these settings follows below
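
With these settings, a hypothetical call to the coge_weights sketch from the Method section would look as follows; the 200 optimization steps and the neighbor-selection step are assumptions, not values reported in the notes.

```python
# Z_same / Z_diff: node embeddings of the k = 10 most similar graphs with the
# same / a different label, selected beforehand.
weights = coge_weights(Z_G, Z_same[:10], Z_diff[:10], steps=200, lr=0.1)  # lr=0.01 for REDDIT
print(weights.topk(5))  # the highest-weighted nodes form the explanation
```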

2.5.2. Qualitative Analysis

        ①Graph classification datasets: MUTAG (4337 chemical molecules) and REDDIT-BINARY (2000 Reddit threads)

        ②GNN: GIN

        ③The most important structure in MUTAG:

where the left is the original graph, the middle is a similar graph with the same label, and the right is a similar graph with a different label

        ④The most important structure in REDDIT-BINARY:

where the numbers denote node degrees

2.5.3. Quantitative Analysis

(1)Dataset

        ①Graph classification dataset: CYCLIQ

        ②Aim: measuring how many of the x most important edges identified by the explanation lie on the cycle or clique (a small sketch of this metric follows below)
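
A minimal sketch of this evaluation, assuming each CYCLIQ graph comes with the set of ground-truth motif edges and the explainer returns an importance score per edge; all names here are hypothetical.

```python
# Hypothetical top-x edge evaluation: what fraction of the x highest-scored
# edges belong to the planted motif (cycle or clique)?
def top_x_accuracy(edge_scores, ground_truth_edges, x):
    """edge_scores: dict edge -> importance; ground_truth_edges: set of edges."""
    top_edges = sorted(edge_scores, key=edge_scores.get, reverse=True)[:x]
    return sum(e in ground_truth_edges for e in top_edges) / x

# toy example: 3 of the top-4 edges fall on the planted cycle
scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.7, (3, 4): 0.1, (4, 5): 0.6}
truth = {(0, 1), (1, 2), (2, 3), (3, 0)}
print(top_x_accuracy(scores, truth, x=4))   # -> 0.75
```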

(2)Experiment Setup

        ①GNN: GCN with 5 layers (a rough sketch of the model follows after this list)

        ②Embedding size: 20

        ③Edge features: NONE

        ④Split: 80%/20% train/test
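
For reference, a rough sketch of such a model in PyTorch Geometric: a 5-layer GCN with embedding size 20 and no edge features. The activation, readout, and classifier head are not specified in the notes and are assumptions.

```python
# A rough sketch (assuming PyTorch Geometric) of the evaluation GNN described above.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GCN5(torch.nn.Module):
    def __init__(self, in_dim, num_classes, hidden=20):
        super().__init__()
        dims = [in_dim] + [hidden] * 5
        self.convs = torch.nn.ModuleList(
            GCNConv(dims[i], dims[i + 1]) for i in range(5))   # 5 GCN layers
        self.out = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))    # no edge features are used
        return self.out(global_mean_pool(x, batch))            # graph-level readout
```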

(3)Results

        ①Performance on the CYCLIQ dataset:

(4)Ablation Study

        ①Loss ablation:

They also tried using the Euclidean distance on the weighted average of the node embeddings (L and Average) and got worse results.
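
A one-line sketch of that simpler baseline, assuming it replaces the OT distance with the L2 distance between G's weighted-average embedding and H's (uniform) average embedding:

```python
import torch

def avg_l2_distance(Z_G, w, Z_H):
    """Ablation baseline: L2 distance between weighted and unweighted mean embeddings."""
    return torch.norm(w @ Z_G - Z_H.mean(dim=0), p=2)
```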

2.6. Conclusion

        The authors plan to further extend the method to node classification

3. Reference

Faber, L., Moghaddam, A. K., & Wattenhofer, R. (2020) 'Contrastive Graph Neural Network Explanation', ICML Workshop on Graph Representation Learning and Beyond. doi: https://doi.org/10.48550/arXiv.2010.13663
