[Paper Close Reading] Hypergraph Neural Networks

Paper link: Hypergraph neural networks | Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (acm.org)

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Some spelling and grammar mistakes are unavoidable; if you spot any, feel free to point them out in the comments! This post is closer to personal notes, so read with caution.

1. TL;DR

1.1. Takeaways

(1) No takeaway is the best takeaway, since the first-order approximation turns out to be just GCN all over again. And what is there even to say without released code?

1.2. Paper summary figure

2. Paragraph-by-Paragraph Close Reading

2.1. Abstract

        ①HGNN aims at representing high-order relationships among data

        ②Applicable to multi-modal data and performs excellently

2.2. Introduction

        ①Hypergraph structure in social media:

        ②Comparison between graph and hypergraph:

2.3. Related Work

2.3.1. Hypergraph learning

        ①Transductive inference on a hypergraph focuses on minimizing the difference between strongly connected nodes (a small question: if the hypergraph is constructed from strong correlations in the first place, those nodes may already be very similar)

2.3.2. Neural networks on graph

        ①Introducing related works in the spectral and spatial domains

2.4. Hypergraph Neural Networks

2.4.1. Hypergraph learning statement

        ①They define a hypergraph as G=\left ( V, E,W \right ), where W is a diagonal matrix whose entries are the hyperedge weights

        ②The incidence matrix H can be constructed by:

h(v,e)=\left\{\begin{array}{cc}1,&\text{if}\, v\in e\\0,&\text{if} \, v\not\in e\end{array}\right.

The incidence matrix here is unweighted; the edge degree \delta(e)=\sum_{v\in V}h(v,e) is just a count of the vertices in a hyperedge, while the vertex degree d(v)=\sum_{e\in E}w(e)h(v,e) additionally weights each incident hyperedge by w(e).

        ③\textbf{D}_e and \textbf{D}_v denote the diagonal matrices of the edge degrees \delta(e) and the vertex degrees d\left ( v \right ), respectively
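A minimal NumPy sketch of these degree definitions (the toy H and w below are made up purely for illustration):

```python
import numpy as np

# Toy hypergraph: 4 vertices, 3 hyperedges (values are illustrative only)
H = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)   # H[v, e] = h(v, e)
w = np.array([1.0, 2.0, 0.5])            # hyperedge weights, W = diag(w)

d_v = H @ w              # vertex degrees d(v) = sum_e w(e) h(v, e)
delta_e = H.sum(axis=0)  # edge degrees delta(e) = sum_v h(v, e)

D_v, D_e = np.diag(d_v), np.diag(delta_e)  # the diagonal degree matrices
```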

        ④The vertex labels should be smooth, which is enforced by the regularized optimization:

\arg\min_{f}\left\{\mathcal{R}_{emp}(f)+\Omega(f)\right\}

where \mathcal{R}_{emp}(f) denotes the supervised empirical loss, \Omega(f) denotes the regularizer, and f denotes the classification function

        ⑤\Omega(f) is computed as:

\begin{gathered} \Omega(f)= \frac{1}{2}\sum_{e\in\mathcal{E}}\sum_{\{u,v\}\in\mathcal{V}}\frac{w(e)h(u,e)h(v,e)}{\delta(e)} \Big(\frac{f(u)}{\sqrt{d(u)}}-\frac{f(v)}{\sqrt{d(v)}}\Big)^{2}, \end{gathered}

        ⑥Defining \Theta=\mathbf{D}_{v}^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_{e}^{-1}\mathbf{H}^{\top}\mathbf{D}_{v}^{-1/2} and \Delta=\mathbf{I}-\Theta

        ⑦So \Omega(f) can be rewritten as (really?!):

\Omega(f)=f^{\top}\Delta f

where \Delta is positive semi-definite and is usually called the hypergraph Laplacian. I did not derive this by hand, but the step is short: expanding the square in ⑤, the two quadratic terms each reduce to \frac{1}{2}f^{\top}f (since \sum_{v}h(v,e)=\delta(e) and \sum_{e}w(e)h(u,e)=d(u)), while the cross term gives -f^{\top}\Theta f, hence \Omega(f)=f^{\top}(\mathbf{I}-\Theta)f=f^{\top}\Delta f.
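The identity is also easy to check numerically. A minimal sketch with a random toy hypergraph (all variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n_v, n_e = 6, 4
H = (rng.random((n_v, n_e)) < 0.5).astype(float)
H[0, :] = 1.0   # vertex 0 joins every hyperedge, so no edge is empty
H[:, 0] = 1.0   # hyperedge 0 contains every vertex, so no vertex is isolated
w = rng.random(n_e) + 0.1    # positive hyperedge weights
W = np.diag(w)
d = H @ w                    # vertex degrees d(v)
delta = H.sum(axis=0)        # edge degrees delta(e)

Dv_isqrt = np.diag(d ** -0.5)
Theta = Dv_isqrt @ H @ W @ np.diag(1 / delta) @ H.T @ Dv_isqrt
Delta = np.eye(n_v) - Theta  # hypergraph Laplacian

f = rng.random(n_v)

# Pairwise-difference form of Omega(f) from ⑤ (sum over ordered pairs u, v)
omega = 0.5 * sum(
    w[e] * H[u, e] * H[v, e] / delta[e]
    * (f[u] / np.sqrt(d[u]) - f[v] / np.sqrt(d[v])) ** 2
    for e in range(n_e) for u in range(n_v) for v in range(n_v)
)

print(np.allclose(omega, f @ Delta @ f))   # True: Omega(f) = f^T Delta f
```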

2.4.2. Spectral convolution on hypergraph

        ①The hypergraph is now written as G=\left ( V,E,\Delta \right ), carrying the Laplacian \Delta

        ②Eigendecomposition: \Delta=\Phi\Lambda\Phi^{\top}, where \Phi=\left ( \phi _1,...,\phi_n \right ) collects the orthonormal eigenvectors as its columns and \Lambda =diag\left ( \lambda _1,...,\lambda _n \right ) contains the eigenvalues

        ③Transforming the original signal x=\left ( x_1,...,x_n \right ) into \hat{x}=\Phi ^{\top}x; the eigenvectors serve as the Fourier basis and \Phi ^{\top} defines the Fourier transform

        ④Spectral convolution with filter \mathbf{g}:

\mathbf{g} \star \mathbf{x}=\mathbf{\Phi}\left(\left(\boldsymbol{\Phi}^{\top} \mathbf{g}\right) \odot\left(\boldsymbol{\Phi}^{\top} \mathbf{x}\right)\right)=\mathbf{\Phi} g(\boldsymbol{\Lambda}) \boldsymbol{\Phi}^{\top} \mathbf{x}

where g\left ( \Lambda \right )=diag\left ( g\left ( \lambda _1 \right ),...,g\left ( \lambda _n \right )\right ) collects the Fourier coefficients of the filter
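A tiny numeric illustration of this identity, reusing the Delta from the sketch above (the filter g here is an arbitrary made-up choice):

```python
import numpy as np

# Reuses Delta (the hypergraph Laplacian) from the previous sketch
lam, Phi = np.linalg.eigh(Delta)   # Delta = Phi diag(lam) Phi^T; columns are eigenvectors
x = np.random.default_rng(1).random(Delta.shape[0])

g = lambda l: np.exp(-l)           # arbitrary filter acting on the eigenvalues
x_hat = Phi.T @ x                  # forward Fourier transform of the signal
y = Phi @ (g(lam) * x_hat)         # filter in the spectral domain, then invert

print(np.allclose(y, Phi @ np.diag(g(lam)) @ Phi.T @ x))   # True
```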

        ⑤They approximate the filter with a truncated Chebyshev polynomial expansion and keep only the first order (K=1), which updates the convolution to:

\mathbf{g} \star \mathbf{x}\approx \sum_{k=0}^{K}\theta _kT_k\left ( \hat{\Delta } \right )\mathbf{x} \approx \theta_0\mathbf{x}-\theta_1\mathbf{D}_{v}^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_{e}^{-1}\mathbf{H}^{\top}\mathbf{D}_{v}^{-1/2}\mathbf{x}

where \theta _0 and \theta _1 are the filter parameters, T_k is the Chebyshev polynomial of order k, and \hat{\Delta}=\frac{2}{\lambda_{max}}\Delta-\mathbf{I} (taking \lambda_{max}\approx2 gives \hat{\Delta}\approx\Delta-\mathbf{I})

        ⑥They reparameterize with a single parameter \theta (wait, can these parameters just be designed by hand?):

\left\{\begin{matrix} \theta_1=-\frac{1}{2}\theta\\ \theta_0=\frac{1}{2}\theta \mathbf{D}_{v}^{-1/2}\mathbf{H}\mathbf{D}_{e}^{-1}\mathbf{H}^{\top}\mathbf{D}_{v}^{-1/2}\end{matrix}\right.

        ⑦The convolution will be:

\begin{gathered} \mathbf{g}\star\mathbf{x} \approx{\frac{1}{2}}\theta\mathbf{D}_{v}^{-1/2}\mathbf{H}(\mathbf{W}+\mathbf{I})\mathbf{D}_{e}^{-1}\mathbf{H}^{\top}\mathbf{D}_{v}^{-1/2}\mathbf{x} \\ \approx\theta\mathbf{D}_{v}^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_{e}^{-1}\mathbf{H}^{\top}\mathbf{D}_{v}^{-1/2}\mathbf{x}, \end{gathered}

The authors say W is initialized as I, so why add another I at every layer? It feels like adding it once at the start should be enough; adding it again later might pile up too many self-loops. Oh, I see: in the first layer (W+I) stacks to 2W, which cancels the 1/2 coefficient.

Thus the final convolution function can be:

\mathbf{Y}=\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^{\top}\mathbf{D}_v^{-1/2}\mathbf{X}\mathbf{\Theta}

where \mathbf{W}=\mathrm{diag}(w_{1},\ldots,w_{n}) collects the hyperedge weights, \Theta\in\mathbb{R}^{C_1\times C_2} is the filter parameter matrix, \mathbf{X}\in\mathbb{R}^{N\times C_1} is the input signal, and \mathbf{Y}\in\mathbb{R}^{N\times C_2} is the output
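A quick numeric check of the ⑥→⑦ substitution, reusing the matrices from the \Omega(f) sketch above (theta is an arbitrary scalar of my choosing):

```python
import numpy as np

# Reuses H, W, Theta, Dv_isqrt, delta, n_v, n_e from the Omega(f) sketch
theta = 0.7                              # arbitrary single filter parameter
x = np.random.default_rng(2).random(n_v)
De_inv = np.diag(1 / delta)

theta1 = -0.5 * theta
theta0 = 0.5 * theta * Dv_isqrt @ H @ De_inv @ H.T @ Dv_isqrt

lhs = theta0 @ x - theta1 * (Theta @ x)  # theta_0 x - theta_1 Theta x, from ⑤
rhs = 0.5 * theta * Dv_isqrt @ H @ (W + np.eye(n_e)) @ De_inv @ H.T @ Dv_isqrt @ x
print(np.allclose(lhs, rhs))             # True: matches the (W+I) form in ⑦
```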

2.4.3. Hypergraph neural networks analysis

        ①Process of HGNN:

        ②Convolution layer:

\mathbf{X}^{(l+1)}=\sigma(\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^{\top}\mathbf{D}_v^{-1/2}\mathbf{X}^{(l)}\mathbf{\Theta}^{(l)})

(shape check: [v,f_2]=[v,v]\times[v,e]\times[e,e]\times[e,e]\times[e,v]\times[v,v]\times[v,f_1]\times[f_1,f_2], matching \mathbf{D}_v^{-1/2},\mathbf{H},\mathbf{W},\mathbf{D}_e^{-1},\mathbf{H}^{\top},\mathbf{D}_v^{-1/2},\mathbf{X}^{(l)},\mathbf{\Theta}^{(l)} in order)

where \sigma denotes the nonlinear activation function
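A minimal PyTorch sketch of one such layer (my own reimplementation from the formula above, not the authors' released code; all names are mine):

```python
import torch
import torch.nn as nn

class HGNNConv(nn.Module):
    """One hypergraph convolution layer:
    X^(l+1) = sigma(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X^(l) Theta^(l))."""

    def __init__(self, in_ft: int, out_ft: int):
        super().__init__()
        self.theta = nn.Linear(in_ft, out_ft, bias=False)   # Theta^(l)

    def forward(self, X: torch.Tensor, H: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # X: [N, in_ft] vertex features, H: [N, E] incidence matrix, w: [E] edge weights
        d_v = (H @ w).clamp(min=1e-12)        # vertex degrees d(v)
        d_e = H.sum(dim=0).clamp(min=1e-12)   # edge degrees delta(e)
        Dv_isqrt = torch.diag(d_v.rsqrt())
        G = Dv_isqrt @ H @ torch.diag(w / d_e) @ H.T @ Dv_isqrt
        return torch.relu(G @ self.theta(X))  # sigma = ReLU, as in the experiments
```

Two such layers stacked (hidden width 16, dropout 0.5 in between, softmax on the output) would match the node-classification configuration listed in the experiments below.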

         ③The details of convolution:

2.4.4. Implementation

        ①Hypergraph construction: they construct the hypergraph from feature similarity. For each vertex (taken as a centroid), they find its K nearest neighbors, so each hyperedge connects K+1 vertices. With N vertices this yields N hyperedges, i.e. \textbf{H}\in \mathbb{R}^{N \times N} (a sketch follows below)
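A sketch of this K-NN construction (Euclidean distance in feature space is my assumption, and the function name is mine):

```python
import numpy as np

def knn_hypergraph(X: np.ndarray, K: int) -> np.ndarray:
    """Each vertex spawns one hyperedge containing itself and its K nearest
    neighbors, so N vertices give an incidence matrix H of shape [N, N]."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    H = np.zeros((len(X), len(X)))
    for e, order in enumerate(np.argsort(dist, axis=1)):
        H[order[:K + 1], e] = 1.0   # the centroid is its own nearest neighbor
    return H
```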

        ②Model for node classification: the output layer uses a softmax classifier

2.5. Experiments

2.5.1. Citation network classification

(1)Datasets

        ①Introducing citation network and visual object datasets

        ②Details of datasets:

(2)Experimental settings

        ①Conv layers: 2

        ②Hidden layer: 16

        ③Dropout rate: 0.5

        ④Activation function: ReLU

        ⑤Optimizer: Adam

        ⑥Learning rate: 0.001

(3)Results and discussion

        ①Their results are averaged over 100 runs

        ②Comparison table:

2.5.2. Visual object classification

(1)Datasets and experimental settings

        ①Introducing each dataset

        ②Constructed dataset:

(2)Hypergraph structure construction on visual datasets

        ①Hypergraph construction and ⭐multi-modal hypergraph construction:

        ②⭐For the multi-modal hypergraph, they generate a separate H for each modality and then concatenate them all (a sketch follows below)
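A short sketch of that fusion, reusing the hypothetical knn_hypergraph from above (the feature matrices are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
X_a = rng.random((100, 512))   # placeholder features from modality A
X_b = rng.random((100, 64))    # placeholder features from modality B

# One incidence matrix per modality, concatenated along the hyperedge axis
H_multi = np.concatenate([knn_hypergraph(X_a, 10), knn_hypergraph(X_b, 10)], axis=1)
# H_multi: [N, 2N] -- each modality contributes its own N hyperedges
```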

(3)Results and discussions 

        ①Comparison table on ModelNet40 dataset:

        ②Comparison table on NTU dataset:

        ③Comparison table on ModelNet40:

2.6. Conclusion

        ①HGNN is a more general framework; GCN can be regarded as its special case where every hyperedge connects exactly two vertices

3. Reference List

Feng, Y. et al. (2019) 'Hypergraph Neural Networks', AAAI. doi: 10.1609/aaai.v33i01.33013558
