paper: Heterogeneous Graph Attention Network | The World Wide Web Conference
GNN, a powerful graph representation technique
problem: it has not been fully considered in graph neural networks for the heterogeneous graph, which contains different types of nodes and links.
- heterogeneity
- rich semantic information
solution: HAN (Heterogeneous Graph Attention Network)
- node-level attention: learn the importance between a node and its meta-path based neighbors
- semantic-level attention: is able to learn the importance of different meta-paths.
-> model can generate node embedding by aggregating features from meta-path based neighbors in a hierarchical manner.
1. Background
GAT: leverages attention mechanism for the homogeneous graph which includes only one type of nodes or links.
GAT only works on homogeneous graphs, i.e., GAT can only pick one meta-path at a time for prediction. But HAN is really just integrating the outputs of multiple GAT-style homo-graph models; not much technical depth there. What a sneaky move, honestly!
- As a matter of fact, real-world graphs usually come with multiple types of nodes and edges, widely known as heterogeneous information networks (HIN).
- meta-path, a composite relation connecting two objects, is a widely used structure to capture semantics, e.g. the meta-path Movie-Actor-Movie (MAM) -> meta-path based neighbors can only be obtained by following a meta-path.
- a heterogeneous graph contains more comprehensive information and rich semantics. Depending on the meta-path, the relation between nodes in a heterogeneous graph can take on different semantics.
(1) heterogeneity of graph
HINs describe real-world graphs.
different meta-paths have different semantics, yet GNNs can't be applied to heterogeneous graphs directly.
-> how to handle the heterogeneity and preserve the diverse feature information simultaneously?
(2) semantic-level attention
background: different meta-paths in a heterogeneous graph may extract diverse semantic information.
problem: how to select the most meaningful meta-paths and fuse the semantic information for a specific task. Treating different meta-paths equally is impractical and will weaken the semantic information brought by some useful meta-paths.
solution: semantic-level attention aims to learn the importance of each meta-path and assign proper weights to them.
(3) Node-level attention
problem: how to distinguish the subtle differences among a node's meta-path based neighbors and select the informative ones, i.e., how to design a model that discovers these differences and learns their weights properly.
e.g. for the movie The Terminator, the neighbors reached via a meta-path are not all equally relevant.
solution: node-level attention aims to learn the importance of meta-path based neighbors and assign different attention values to them.
(4) HAN
solution: HAN
- node-level attention: learn meta-path based neighbors' attention values
- semantic-level attention: learn meta-paths' attention values
-> our model can get the optimal combination of neighbors and multiple meta-paths in a hierarchical manner, which enables the learned node embeddings to better capture the complex structure and rich semantic information in a heterogeneous graph.
contributions
- first attempt to study heterogeneous graph neural networks based on the attention mechanism -> GNN can be applied directly to heterogeneous graphs.
- HAN
- superiority -> good interpretability for heterogeneous graph analysis.
So plain HAN isn't enough: whatever it does or fails to do, Fin-Event does better. What now?
Looking forward to a Re-HAN.
2. Related Work
2.1 GNN
- GCN, a spectral approach, which designs a graph convolutional network via a localized first-order approximation of spectral graph convolutions (graph Fourier transform).
- GraphSAGE, a non-spectral approach, which runs a neural-network-based aggregator over a fixed-size node neighborhood and generates embeddings by aggregating features from a node's local neighborhood.
- GAT, proposed to learn the importance between a node and its neighbors and fuse the neighbors to perform node classification.
2.2 Network Embedding
Network Representation Learning (NRL) is proposed to embed a network into a low-dimensional space while preserving the network structure and properties, so that the learned embeddings can be applied to downstream network tasks.
Heterogeneous graph embedding mainly focuses on preserving the meta-path based structural information.
limitation: these methods need to conduct a grid search to find the optimal weights of meta-paths.
3. Preliminary
background
- different meta-paths always reveal different semantics.
- Given a meta-path Φ, each node has a set of meta-path based neighbors, which can reveal diverse structural information and rich semantics in a heterogeneous graph.
- graph neural networks have been proposed to deal with arbitrary graph-structured data; however, all of them are designed for homogeneous networks.
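As a concrete illustration of meta-path based neighbors, here is a minimal sketch (toy data, made-up names, not from the paper): the MAM relation is just the product of the movie-actor biadjacency matrix with its transpose.

```python
# Minimal sketch: meta-path based neighbors from a toy heterogeneous graph.
# The biadjacency matrix below is made up for illustration.
import numpy as np

# movie_actor[i, j] = 1 if movie i features actor j (3 movies, 2 actors)
movie_actor = np.array([[1, 0],
                        [1, 1],
                        [0, 1]])

# Meta-path Movie-Actor-Movie (MAM): two movies are neighbors iff they share
# at least one actor, i.e., the composite relation is the product of the
# two single-hop relations.
mam = (movie_actor @ movie_actor.T) > 0
np.fill_diagonal(mam, False)  # drop self-loops for readability

for i in range(mam.shape[0]):
    print(f"MAM neighbors of movie {i}: {np.where(mam[i])[0].tolist()}")
# movies 0 and 2 share no actor, so they are not MAM neighbors of each other
```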
4. Proposed Model
a semi-supervised GNN for heterogeneous graphs.
(1) node-level attention -> learn the weights of meta-path based neighbors and aggregate them to get the semantic-specific node embedding.
for node i, under a single meta-path (i.e., one kind of semantics), compute the neighbor weights.
(2) semantic-level attention -> can tell the difference between meta-paths and get the optimal weighted combination of the semantic-specific node embeddings.
for node i, compute the weights of the different meta-paths.
4.1 Node-level attention
Isn't this just multi-head attention, embedding each homo-graph separately? -> there is no neighbor-sampling step, so it is less thorough than Fin-Event!
problem: due to the heterogeneity of nodes, different types of nodes have different feature spaces.
solution: design a type-specific transformation matrix to project the features of different types of nodes into the same feature space.
<- the type-specific transformation matrix is based on node type rather than edge type.
- asymmetry: node-level attention can preserve asymmetry, which is a critical property of heterogeneous graphs.
ideas: challenge: converting a heterogeneous graph into homogeneous graphs loses much semantic and structural information.
- problem-1: it fails to learn the meta-path importance well.
- problem-2: heterogeneous elements can't be fed into a GNN directly; the graph has to be converted into homogeneous graphs first.
the attention weight is generated for a single meta-path; it is semantic-specific and able to capture one kind of semantic information.
-> multi-head attention: repeat the node-level attention K times and concatenate the learned embeddings as the semantic-specific embedding.
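A minimal PyTorch sketch of node-level attention for one meta-path (the module name, shapes, and GAT-style score are my own reading of the paper, not the authors' code): a type-specific projection first maps raw features into a common space, then K independent attention heads aggregate the meta-path based neighbors and are concatenated.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NodeLevelAttention(nn.Module):
    def __init__(self, in_dim, out_dim, num_heads=8):
        super().__init__()
        self.num_heads, self.out_dim = num_heads, out_dim
        # type-specific transformation: projects this node type's raw
        # features into the shared space (one projection per head)
        self.proj = nn.Linear(in_dim, out_dim * num_heads, bias=False)
        # per-head attention vector a, applied to the ordered pair [h_i || h_j]
        self.attn = nn.Parameter(torch.empty(num_heads, 2 * out_dim))
        nn.init.xavier_uniform_(self.attn)

    def forward(self, x, adj):
        # x: (N, in_dim) features; adj: (N, N) meta-path adjacency incl. self-loops
        N = x.size(0)
        h = self.proj(x).view(N, self.num_heads, self.out_dim)   # (N, K, d)
        # e_ij = LeakyReLU(a^T [h_i || h_j]); the ordered concat keeps the
        # score asymmetric (e_ij != e_ji in general)
        src = (h * self.attn[:, :self.out_dim]).sum(-1)          # (N, K)
        dst = (h * self.attn[:, self.out_dim:]).sum(-1)          # (N, K)
        e = F.leaky_relu(src.unsqueeze(1) + dst.unsqueeze(0))    # (N, N, K)
        # softmax only over each node's meta-path based neighbors
        alpha = torch.softmax(
            e.masked_fill((adj == 0).unsqueeze(-1), float("-inf")), dim=1)
        z = torch.einsum("ijk,jkd->ikd", alpha, h)               # (N, K, d)
        return F.elu(z.reshape(N, self.num_heads * self.out_dim))

# toy usage: 4 nodes, 5-dim features, one meta-path adjacency with self-loops
x = torch.randn(4, 5)
adj = torch.eye(4) + torch.tensor([[0., 1, 0, 0],
                                   [1, 0, 1, 0],
                                   [0, 1, 0, 1],
                                   [0, 0, 1, 0]])
z = NodeLevelAttention(in_dim=5, out_dim=8)(x, adj)              # (4, 64)
```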
4.2 Semantic-level attention
Isn't this just using an nn.Linear to merge multiple homo-graph embeddings into one? Fin-Event's version isn't much either: it simply concatenates them. Simple enough.
need to fuse the multiple semantics revealed by the meta-paths.
ideas: semantics
the "semantics" here is only the narrow, traditional NLP notion of meaning in word sequences; in the broader sense, semantics also covers shape, image, timbre, color, and other attributes that pin down what makes an object unique.
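A matching sketch of semantic-level attention (again my own names and shapes, assuming the paper's formulation): a one-layer MLP plus a semantic vector q scores each meta-path, the scores are averaged over all nodes, softmaxed into weights β, and used to weight-sum the semantic-specific embeddings.

```python
# Minimal sketch of semantic-level attention; an assumption-based reading of
# the paper, not the authors' code.
import torch
import torch.nn as nn

class SemanticLevelAttention(nn.Module):
    def __init__(self, in_dim, hidden_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.q = nn.Parameter(torch.empty(hidden_dim, 1))
        nn.init.xavier_uniform_(self.q)

    def forward(self, z):
        # z: (P, N, d) semantic-specific embeddings, one slice per meta-path
        w = (self.mlp(z) @ self.q).mean(dim=1)      # (P, 1): per-meta-path score
        beta = torch.softmax(w, dim=0)              # (P, 1): meta-path weights
        return (beta.unsqueeze(-1) * z).sum(dim=0)  # (N, d): fused embedding

# toy usage: fuse PAP and PLP embeddings of 3025 nodes into one embedding
z_pap, z_plp = torch.randn(3025, 64), torch.randn(3025, 64)
z_final = SemanticLevelAttention(in_dim=64)(torch.stack([z_pap, z_plp]))
```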
4.3 Analysis of the proposed model
4.4 Classification
problem: the variance of graph-structured data can be quite high.
solution: repeat the process 10 times and report the average results.
HAN -> designed for heterogeneous graphs, it successfully captures the rich semantics and shows its superiority.
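A tiny sketch of that repeat-and-average protocol; train_eval is a hypothetical placeholder standing in for one full HAN train+test run, and the number it returns here is a dummy, not a reported result.

```python
import numpy as np

def train_eval(seed: int) -> float:
    """Hypothetical stand-in: would train HAN with this seed, return test acc."""
    rng = np.random.default_rng(seed)
    return 0.88 + 0.01 * rng.standard_normal()  # placeholder value only

accs = [train_eval(seed) for seed in range(10)]
print(f"test acc: {np.mean(accs):.4f} +/- {np.std(accs):.4f}")
```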
4.5 Analysis
- node-level attention: learn the attention values between a node and its neighbors within a specific meta-path
- semantic-level attention: learn the attention values among the diverse meta-paths.
- with node-level and semantic-level attention, the importance of nodes and meta-paths can be fully considered.
4.6 HAN vs. Fin-Event
- Fin-Event uses an intra_gnn to learn and merge node_neighbors, and an inter_gnn to merge multiple homo-graph embeddings -> final adj matrix
- HAN uses multi-head attention to merge node neighbors, then merges the multiple homo-graph embeddings, learning the meta-path importance in the process.
- Fin-Event does RL-based neighbor sampling, while HAN does not.
5. HAN Implementation
5.1 Gripes about the tf2 HAN code
the original TensorFlow implementation: GitHub - Jhy1993/HAN: Heterogeneous Graph Neural Network
the whole graph is fed in, graph embeddings for PAP and PLP are generated separately, and then merged into one comprehensive embedding.
- feat_in_list
- bias_in_list
- lbl_in, y_train, (1,3025,3)
- msk_in, y_train, (1,3025)
5.1.1 HAN mask
HAN's mask really is a boolean list of length (3025,), unlike the Fin-Event mask, which holds indices.
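A tiny illustration of the two mask conventions; the size 3025 follows these notes, while the index values are made up.

```python
# Boolean mask (HAN style) vs. index mask (Fin-Event style); both select the
# same rows. Index values below are made up for illustration.
import numpy as np

train_idx = np.array([0, 5, 7])           # index mask
train_msk = np.zeros(3025, dtype=bool)    # boolean mask, shape (3025,)
train_msk[train_idx] = True

logits = np.random.randn(3025, 3)
assert (logits[train_msk] == logits[train_idx]).all()
```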
5.1.2 HAN train
HAN predicts class probabilities on all the data, then uses train_msk to select the train_pred_probabilities of the training rows.
This HAN training setup is really odd; it looks wrong no matter how you view it!
Because pytorch requires_grad automatically tracks all tensor computations, and backward() plus optimizer.step() run only after the whole forward pass, on the surface only the training rows enter the loss function, yet the backward pass still flows through the computation over all the data. This quietly dilutes the training gradient, especially with a split as lopsided as 600 train vs. 2125 test samples.
After I changed it to mini-batch training, it felt much better! GNN_models/HAN/HAN_torch at master · yuyongsheng1990/GNN_models · GitHub
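A runnable miniature of the full-graph pattern described above; nn.Linear is a hypothetical stand-in for the whole HAN model, and the sizes follow these notes.

```python
# Miniature of the criticized training step: forward on ALL 3025 nodes, loss
# on only the 600 masked training rows. nn.Linear stands in for HAN.
import torch
import torch.nn as nn
import torch.nn.functional as F

N, C = 3025, 3
model = nn.Linear(64, C)                     # hypothetical stand-in model
features = torch.randn(N, 64)
labels = torch.randint(0, C, (N,))
train_msk = torch.zeros(N, dtype=torch.bool)
train_msk[:600] = True                       # 600 training nodes, as in the notes

optimizer = torch.optim.Adam(model.parameters(), lr=5e-3)
logits = model(features)                     # forward pass over the whole graph
loss = F.cross_entropy(logits[train_msk], labels[train_msk])
loss.backward()                              # backward over the full-graph forward
optimizer.step()
optimizer.zero_grad()
```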
5.1.3 Problem 2: train acc changes, but why does val acc never change?
probably the parameters barely change, so the predictions barely change -> not converging!!!
5.1.4 y_train, y_val, y_test are all (3025,3)
5.1.5 too little train data, too much test data
With so little training data and so much test data, isn't this basically transfer learning? No, it's worse than transfer learning: a transfer-learning model is at least similar to the target task and its parameters have been trained, while HAN throws an almost raw model at prediction, so of course the results are poor!
-> improvement: train: 2125; test: 600
5.2 Analysis of the pytorch_HAN code
1) pytorch_HAN does not use the 3-column y_train of tf_HAN; it defines its own my_label, whose shape goes from (3025,3) to (600,).
-> I also changed the whole-graph input embedding generation to mini-batch.
2) Shouldn't multi-head attention split x into 8 parts? Both HAN_tf2 and pytorch_HAN instead replicate the head 8 times and leave x unchanged (see the sketch after this list).
3) embed_list, with length = the number of homogeneous graphs, stores each homogeneous graph's embedding, (3025,1,64).
4) a simple attention layer merges the multiple homogeneous-graph embeddings into one final hete-graph embedding.
5) out: a linear layer makes the prediction -> (3025,3)
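To make 2) concrete, here is a hedged sketch contrasting the two multi-head conventions (toy sizes, not either repo's code): Transformer-style splits the feature dimension across heads, while the GAT/HAN convention gives every head its own full projection of the same x and concatenates the head outputs.

```python
# Transformer-style multi-head: split the 64 feature dims into 8 heads of 8.
# GAT/HAN-style multi-head: 8 independent projections of the SAME 64-dim x.
import torch
import torch.nn as nn

x = torch.randn(3025, 64)

heads_split = x.view(3025, 8, 8)                        # split, no new params

projs = nn.ModuleList(nn.Linear(64, 8, bias=False) for _ in range(8))
heads_repl = torch.stack([p(x) for p in projs], dim=1)  # (3025, 8, 8)
out = heads_repl.flatten(1)                             # concat -> (3025, 64)
```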