Unsupervised Graph Embedding and Adversarial Graph Alignment Explained

祥瑞Coding

Categories: Machine Learning, GAN, Paper Analysis

Article link: https://blog.csdn.net/weixin_36474809/article/details/93882645


Goal: understand and walk through the formulas behind graph embedding and adversarial graph alignment.

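1. Unsupervised graph embedding

1.1 Graph embedding

Graph Embedding survey: https://blog.csdn.net/lkxhit/article/details/81391728

Graph: a graph is a highly abstract and expressive data structure that describes entities and the relations between them through nodes and edges. Common examples include social networks, product networks, and knowledge graphs.

Graph embedding: also called network representation learning, graph embedding maps each node of a network to a low-dimensional vector based on the structure of the network. Similarity between nodes can then be measured quantitatively, which makes the representation easier to use in applications.

Graph embedding is an important research direction. DeepWalk, for example, is a typical method that extends language modeling and unsupervised learning from word sequences to graph structures: it treats truncated random walks as sentences and trains the Skip-Gram model from word2vec on them to obtain an embedding vector for each node. LINE samples edges only, while node2vec has parameters that bias its sampling toward BFS or DFS.

The basic recipe of graph embedding is therefore: sample sequences from the graph (sampling), then fit a model on the sampled sequences (embedding).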
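The sampling half of that recipe can be sketched in a few lines. This is a minimal illustration, not DeepWalk's reference implementation: the adjacency-dictionary format, the function name `truncated_random_walks`, and the toy graph are all assumptions made for the example.

```python
import random

def truncated_random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical toy graph: node -> list of neighbors (undirected).
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = truncated_random_walks(toy_graph, walk_length=5, walks_per_node=2)
```

Each walk is then treated like a sentence and passed to a Skip-Gram style model, which is the embedding half of the recipe.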

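1.2 Graph alignment: problem description

Graph notation:

A graph is written $G = (V, E)$.
$V$ is the set of nodes in the graph.
$E \subseteq V \times V$ is the set of edges, i.e. the connections between nodes.
$n = |V|$ is the number of nodes.
A graph can also be written $G = (V, E, X)$, where $X$ is an $n \times m$ matrix of node features: $n$ is the number of nodes and $m$ is the feature dimension of each node.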
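To make the notation concrete, here is a small sketch of $G = (V, E, X)$ as plain Python and NumPy objects. The specific nodes, edges, and random features are made up for illustration.

```python
import numpy as np

V = [0, 1, 2, 3]                      # node set
E = {(0, 1), (0, 2), (1, 2), (2, 3)}  # edge set, a subset of V x V
n = len(V)                            # n = |V|
m = 3                                 # feature dimension per node (assumed)
X = np.random.default_rng(0).normal(size=(n, m))  # n x m feature matrix

# Every edge must connect two nodes of V.
assert all(u in V and v in V for u, v in E)

# Equivalent dense view: symmetric adjacency matrix of the undirected graph.
A = np.zeros((n, n), dtype=int)
for u, v in E:
    A[u, v] = A[v, u] = 1
```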

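Graph alignment:

There are two graphs, one in the source domain and one in the target domain, written $G_s$ and $G_t$.
$G_s = (V_s, E_s)$ is the source graph.
$G_t = (V_t, E_t)$ is the target graph.
Graph alignment means finding the correspondence between the nodes of $G_s$ and the nodes of $G_t$.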
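As a toy illustration of what an alignment is, the sketch below represents it as a dictionary from source nodes to target nodes and checks that it carries every source edge onto a target edge. The two graphs and the mapping are invented for the example; real graphs are rarely exact relabelings of each other.

```python
# Hypothetical source graph with string node names.
source_edges = {("a", "b"), ("b", "c"), ("c", "d")}
# Hypothetical target graph: the same structure with integer node names.
target_edges = {(1, 0), (0, 3), (3, 2)}

# Candidate alignment: source node -> target node.
alignment = {"a": 1, "b": 0, "c": 3, "d": 2}

def normalize(edges):
    """Treat edges as undirected by ignoring the order of endpoints."""
    return {frozenset(edge) for edge in edges}

mapped_edges = {(alignment[u], alignment[v]) for u, v in source_edges}
is_consistent = normalize(mapped_edges) == normalize(target_edges)
```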

1.3 Unsupervised graph embedding

There are many graph embedding methods, for example DeepWalk [23], LINE [24], and node2vec [25].

[23] B. Perozzi, R. Al-Rfou, and S. Skiena, “DeepWalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710, ACM, 2014.
[24] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “LINE: Large-scale information network embedding,” in Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077, International World Wide Web Conferences Steering Committee, 2015.
[25] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864, ACM, 2016.

DeepWalk is one such unsupervised graph embedding method. It belongs to the family of random-walk-based methods.

Paper: DeepWalk: Online Learning of Social Representations, http://www.perozzi.net/publications/14_kdd_deepwalk.pdf
Slides: http://www.perozzi.net/publications/14_kdd_deepwalk-slides.pdf
Code: https://github.com/phanein/deepwalk

The learned embedding $\Phi \in \mathbb{R}^{|V| \times d}$ holds $d$-dimensional feature vectors, where $|V|$ is the number of nodes and each node is represented by one $d$-dimensional vector.

Unsupervised graph embedding therefore has to learn a feature vector $z_i$ for each node $v_i$ in the graph.

1.4 The DeepWalk method

DeepWalk, inspired by Word2Vec [26], borrows the methodology of Skip-grams [27] for training.

[26] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in Neural Information Processing Systems, pp. 3111–3119, 2013.
[27] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.

The Skip-gram word window

Each word is written $w_i$.
Each $w_i$ has a window, where $k$ is half the window width.
The window is $\{w_{i-k}, \dots, w_{i-1}, w_{i+1}, \dots, w_{i+k}\}$.

The model is trained to maximize the probability of the window given the current word:

$$\max \; \log \Pr\left(\{w_{i-k}, \dots, w_{i+k}\} \setminus w_i \mid w_i\right)$$
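The window construction can be sketched as follows. The helper name `context_windows` and the example sequence are assumptions for illustration; the window is simply truncated at the ends of the sequence.

```python
def context_windows(sequence, k):
    """Return (center, context) pairs with up to k items on each side."""
    pairs = []
    for i, center in enumerate(sequence):
        left = sequence[max(0, i - k):i]      # up to k items before position i
        right = sequence[i + 1:i + 1 + k]     # up to k items after position i
        pairs.append((center, left + right))
    return pairs

# Works the same for a sentence of words or a random walk of node ids.
pairs = context_windows(["v1", "v2", "v3", "v4", "v5"], k=1)
```

Each pair supplies one training example: the model sees the center item and is asked to assign high probability to the items in its context.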

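The DeepWalk objective

DeepWalk applies the same idea to nodes: given a node and its window in a random walk, it looks for the mapping under which the window is most likely.

$$\min_{\Phi} \; -\log \Pr\left(\{v_{i-w}, \dots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\right)$$

$w$ is the window size.
$\Phi$ is the mapping function.
$\Phi$ maps every node in the graph to its feature vector.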
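A rough numerical sketch of this kind of objective is shown below. It is not DeepWalk's actual training code: it uses a full softmax over all nodes for clarity (DeepWalk itself uses hierarchical softmax for efficiency), it treats the context nodes as conditionally independent, and the names `phi`, `context_weights`, and `deepwalk_loss` are hypothetical.

```python
import numpy as np

def deepwalk_loss(phi, context_weights, walk, w):
    """Mean of -log Pr(context node | Phi(center)) over one walk.

    phi:             (num_nodes, d) node embeddings, row v is Phi(v)
    context_weights: (num_nodes, d) output vectors of the softmax
    walk:            list of node ids
    w:               window size on each side of the center node
    """
    total, count = 0.0, 0
    for i, center in enumerate(walk):
        scores = context_weights @ phi[center]          # one score per node
        scores = scores - scores.max()                  # numerical stability
        log_probs = scores - np.log(np.exp(scores).sum())
        for j in range(max(0, i - w), min(len(walk), i + w + 1)):
            if j != i:
                total -= log_probs[walk[j]]
                count += 1
    return total / count

# With all-zero parameters every node is equally likely, so the loss is log(4).
loss = deepwalk_loss(np.zeros((4, 2)), np.zeros((4, 2)), walk=[0, 1, 2, 3], w=1)
```

Minimizing this quantity with respect to `phi` (and the output vectors) is what pushes nodes that co-occur in walks toward similar embeddings.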

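2. Graph alignment

2.1 Graph alignment: problem description

Graph notation:

A graph is written $G = (V, E)$.
$V$ is the set of nodes in the graph.
$E \subseteq V \times V$ is the set of edges, i.e. the connections between nodes.

Graph alignment:

There are two graphs, one in the source domain and one in the target domain, written $G_s$ and $G_t$.
$G_s = (V_s, E_s)$ is the source graph.
$G_t = (V_t, E_t)$ is the target graph.
Graph alignment means finding the correspondence between the nodes of $G_s$ and the nodes of $G_t$.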

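2.2 Supervised graph alignment

Reformulating the problem: graph alignment → aligning the nodes $V$ of the two graphs → aligning the embeddings of those nodes → making the mapped source features $Z$ close to the target features → $\arg\min_W \|WX - Y\|_F$.

The graph embedding step above produces two sets:

$$Z_s = \{z_1^s, \dots, z_n^s\}, \qquad Z_t = \{z_1^t, \dots, z_m^t\}$$

$Z_s$ holds the features of the $n$ nodes of the source-domain graph and $Z_t$ holds the features of the $m$ nodes of the target-domain graph.

We want every node in the source domain to match its counterpart in the target domain, i.e. the features $z$ of corresponding nodes should match.

That is, for the $i$-th pair of corresponding nodes, the source feature $z_i^s$ and the target feature $z_i^t$ should agree.
The model has to learn a linear mapping $W$ that makes these features agree:

$$W^{*} = \arg\min_{W \in \mathbb{R}^{d \times d}} \|WX - Y\|_F$$

The formula selects the $W$ for which the features of corresponding nodes are closest after the transformation.
In other words, the required $W$ is the one that brings the mapped source features $WX$ closest to $Y$.
$W$ is a $d \times d$ matrix, $X$ holds the source-domain features, and $Y$ holds the target-domain features of the corresponding nodes, one column per node pair.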
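When paired nodes are known, a problem of the form $\arg\min_W \|WX - Y\|_F$ can be solved directly. The sketch below shows two standard solutions on synthetic data: unconstrained least squares via the pseudo-inverse, and the orthogonal Procrustes solution via SVD. The data, the dimensions, and the choice to store one node pair per column are assumptions for the example, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 50                       # embedding size and number of known pairs
X = rng.normal(size=(d, k))        # source features, one column per node
W_true = rng.normal(size=(d, d))   # hidden mapping used to build the toy data
Y = W_true @ X                     # target features of the paired nodes

# Unconstrained least squares: W = Y X^+ minimizes ||W X - Y||_F.
W_ls = Y @ np.linalg.pinv(X)

# Orthogonal Procrustes: if W is restricted to orthogonal matrices,
# the minimizer is U V^T, where U S V^T is the SVD of Y X^T.
U, _, Vt = np.linalg.svd(Y @ X.T)
W_orth = U @ Vt

residual = np.linalg.norm(W_ls @ X - Y)
```

The unsupervised setting discussed next has no such pairs, which is why the mapping is learned adversarially instead.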

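3. Domain adversarial training on graphs

This brings the GAN idea to graph structures. The final goal is to make the mapped source-domain features and the target-domain features hard to tell apart.

$W$ can be seen as the generator.
The discriminator tries to tell source-domain samples from target-domain samples.
The GAN idea is used to pull the source-domain features close to the target-domain features.

3.1 The discriminator D

For a given $W$, the discriminator, with parameters $\theta_D$, should separate target-domain and source-domain samples as well as possible:

$$L_D(\theta_D \mid W) = -\frac{1}{n}\sum_{i=1}^{n} \log P_{\theta_D}(\mathrm{source}=1 \mid W x_i) - \frac{1}{m}\sum_{i=1}^{m} \log P_{\theta_D}(\mathrm{source}=0 \mid y_i)$$

Here $W x_i$ is a source-domain sample after it has passed through the generator.
$y_i$ is a target-domain sample.
$P_{\theta_D}(\mathrm{source}=1 \mid W x_i)$ is the probability the discriminator assigns to a mapped sample coming from the source domain. Minimizing $L_D$ pushes it toward 1, and likewise pushes $P_{\theta_D}(\mathrm{source}=0 \mid y_i)$ toward 1 for target samples.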
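A discriminator objective of this kind can be sketched with a single logistic unit. This is an illustrative simplification: the discriminator in practice is usually a small neural network, and the names `theta`, `bias`, and `discriminator_loss`, as well as the column-per-sample layout, are assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def discriminator_loss(theta, bias, W, X, Y):
    """Cross-entropy loss of a logistic discriminator.

    The discriminator outputs p = P(source = 1 | input).
    X: (d, n) source features, Y: (d, m) target features, W: (d, d) mapping.
    """
    p_mapped = sigmoid(theta @ (W @ X) + bias)  # mapped source: want p near 1
    p_target = sigmoid(theta @ Y + bias)        # target samples: want p near 0
    return -np.mean(np.log(p_mapped)) - np.mean(np.log(1.0 - p_target))

rng = np.random.default_rng(0)
d = 3
X, Y = rng.normal(size=(d, 10)), rng.normal(size=(d, 8))
# An untrained discriminator (all zeros) outputs 0.5 everywhere: loss = 2 log 2.
loss_untrained = discriminator_loss(np.zeros(d), 0.0, np.eye(d), X, Y)
```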

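3.2 The generator W

The generator $W$ has to fool the discriminator as much as possible:

$$L_W(W \mid \theta_D) = -\frac{1}{n}\sum_{i=1}^{n} \log P_{\theta_D}(\mathrm{source}=0 \mid W x_i) - \frac{1}{m}\sum_{i=1}^{m} \log P_{\theta_D}(\mathrm{source}=1 \mid y_i)$$

This mirrors the discriminator objective with the labels swapped: $W x_i$ is a mapped source sample, $y_i$ is a target sample, and $\theta_D$ are the discriminator parameters, which are held fixed while $W$ is updated.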
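The generator side can be sketched with the same kind of logistic discriminator, only with the labels flipped. As before, this is a simplified illustration; `mapping_loss` and the toy data are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mapping_loss(theta, bias, W, X, Y):
    """Loss minimized by W: low when the discriminator labels things wrongly.

    The discriminator outputs p = P(source = 1 | input).
    """
    p_mapped = sigmoid(theta @ (W @ X) + bias)  # W wants this near 0
    p_target = sigmoid(theta @ Y + bias)        # W wants this near 1
    return -np.mean(np.log(1.0 - p_mapped)) - np.mean(np.log(p_target))

# Toy case where the discriminator separates the two domains confidently:
# mapped source samples score +3, target samples score -3.
theta, bias, W = np.array([1.0, 0.0]), 0.0, np.eye(2)
X = np.array([[3.0, 3.0], [0.0, 1.0]])
Y = np.array([[-3.0, -3.0], [0.0, 1.0]])
loss_when_caught = mapping_loss(theta, bias, W, X, Y)
```

A large value here means the discriminator is winning; gradient steps on `W` reduce it by moving the mapped source features toward the region where target features live.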

Through this GAN-like process, $W$ pulls the source-domain features as close as possible to the target domain.

3.3 Orthogonalizing W

S. L. Smith, D. H. Turban, S. Hamblin, and N. Y. Hammerla, “Offline bilingual word vectors, orthogonal transformations and the inverted softmax,” arXiv preprint arXiv:1702.03859, 2017.

This paper finds that making $W$ orthogonal improves the final performance. An extra update step can be applied to keep $W$ close to orthogonal.
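One update that is commonly used for this purpose in adversarial embedding alignment is $W \leftarrow (1+\beta)W - \beta (W W^\top) W$, applied after each training step on $W$. Whether this is exactly the step the original post showed is an assumption, as is the value $\beta = 0.01$ used below. The sketch checks numerically that repeated application drives $W W^\top$ toward the identity when $W$ starts near an orthogonal matrix.

```python
import numpy as np

def orthogonalize_step(W, beta=0.01):
    """One step of W <- (1 + beta) W - beta (W W^T) W."""
    return (1.0 + beta) * W - beta * (W @ W.T) @ W

def orthogonality_error(W):
    """Frobenius distance between W W^T and the identity."""
    return np.linalg.norm(W @ W.T - np.eye(W.shape[0]))

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))   # exactly orthogonal matrix
W = Q + 0.01 * rng.normal(size=(5, 5))         # slightly perturbed copy

error_before = orthogonality_error(W)
for _ in range(200):
    W = orthogonalize_step(W)
error_after = orthogonality_error(W)
```

The step leaves the singular vectors of $W$ unchanged and moves each singular value $\sigma$ to $\sigma(1 + \beta(1 - \sigma^2))$, which nudges values near 1 back toward 1.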