Node Embeddings
1. Graph Representation Learning
Graph representation learning alleviates the need to do manual feature engineering every single time (the features are learned automatically)
Goal: efficient task-independent feature learning for machine learning with graphs
Why embeddings?
- Similarity of embeddings between nodes indicates their similarity in the network
- Encode network information
- Potentially useful for many downstream prediction tasks (node classification, link prediction, graph prediction, anomalous node detection, clustering…)
2. Node Embeddings: Encoder and Decoder
Goal: encode nodes so that similarity in the embedding space approximates similarity in the graph
a) Encoder ENC maps from nodes to embeddings (a low-dimensional vector)
b) Define a node similarity function (i.e., a measure of similarity in the original network)
c) Decoder DEC maps from embeddings to the similarity score
d) Optimize the parameters of the encoder so that $\text{similarity}(u, v) \approx z_v^T z_u$
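To make the encoder/decoder pairing concrete, here is a minimal sketch (not the notes' reference code) in which the encoder is a matrix of per-node embeddings and the decoder is the dot product; `num_nodes`, `embedding_dim`, and the random initialization are illustrative assumptions:

```python
import numpy as np

# Illustrative sizes; in practice these depend on the graph and the task.
num_nodes, embedding_dim = 5, 3
rng = np.random.default_rng(0)

# Encoder parameters: one d-dimensional embedding per node (learned during training).
Z = rng.normal(size=(embedding_dim, num_nodes))

def encode(node):
    """ENC: map a node index to its embedding vector (a column of Z)."""
    return Z[:, node]

def decode(z_u, z_v):
    """DEC: map a pair of embeddings to a similarity score via the dot product."""
    return float(z_u @ z_v)

# Training would adjust Z so that decode(encode(u), encode(v))
# approximates similarity(u, v) defined on the original graph.
print(decode(encode(0), encode(1)))
```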
1). “Shallow” Encoding
Simplest encoding approach: encoder is just an embedding-lookup so each node is assigned a unique embedding vector
$\text{ENC}(v) = z_v = Z \cdot v$
where $Z$ is a matrix whose columns are the node embeddings and $v$ is an indicator vector that is all zeroes except for a one in the column indicating node $v$
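A minimal sketch of this lookup, assuming a small toy $Z$: multiplying $Z$ by the one-hot indicator vector selects the column for node $v$.

```python
import numpy as np

num_nodes, embedding_dim = 4, 2  # toy sizes, for illustration only
Z = np.arange(embedding_dim * num_nodes, dtype=float).reshape(embedding_dim, num_nodes)

def one_hot(node, n=num_nodes):
    """Indicator vector: all zeroes except a one in the position of `node`."""
    v = np.zeros(n)
    v[node] = 1.0
    return v

z_v = Z @ one_hot(2)              # shallow encoding: ENC(v) = Z · v
assert np.allclose(z_v, Z[:, 2])  # identical to directly looking up column 2
```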
Methods: DeepWalk, node2vec
3. Random Walk Approaches for Node Embeddings
- Vector $z_u$ is the embedding of node $u$
- Probability $P(v \mid z_u)$ is the (predicted) probability of visiting node $v$ on random walks starting from node $u$
- Random walk: given a graph and a starting point, we select one of its neighbors at random and move to this neighbor; then we select a neighbor of this point at random and move to it, etc. The (random) sequence of points visited this way is a random walk on the graph.
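As a minimal sketch, assuming an adjacency-list representation of a toy undirected graph (the graph, `start`, and `walk_length` below are illustrative):

```python
import random

# Toy undirected graph as an adjacency list (illustrative only).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(graph, start, walk_length, seed=None):
    """Uniform random walk: repeatedly hop to a uniformly random neighbor."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(walk_length):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

print(random_walk(graph, start=0, walk_length=5, seed=42))
```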
1). Random-walk Embeddings
$z_u^T z_v \approx$ probability that $u$ and $v$ co-occur on a random walk over the graph
- Estimate the probability $P_R(v \mid u)$ of visiting node $v$ on a random walk starting from node $u$ under the random walk strategy $R$ (a small empirical sketch follows this list)
- Optimize embeddings to encode these random walk statistics
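These statistics can be estimated empirically. A minimal sketch under the uniform-walk strategy on the same toy graph as above (`num_walks` and `walk_length` are illustrative assumptions):

```python
import random
from collections import Counter

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # same toy graph as above

def estimate_visit_prob(graph, u, num_walks=1000, walk_length=5, seed=0):
    """Empirical estimate of P_R(v | u): fraction of uniform walks from u that visit v."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_walks):
        node, visited = u, set()
        for _ in range(walk_length):
            node = rng.choice(graph[node])
            visited.add(node)
        counts.update(visited)
    return {v: c / num_walks for v, c in sorted(counts.items())}

print(estimate_visit_prob(graph, u=0))
```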
Why random walks?
- Expressivity: a flexible, stochastic definition of node similarity that incorporates both local and higher-order neighborhood information (if a random walk starting from node $u$ visits $v$ with high probability, then $u$ and $v$ are similar)
- Efficiency: do not need to consider all node pairs when training; only need to consider pairs that co-occur on random walks
2). Unsupervised Feature Learning
Intuition: find an embedding of nodes in $d$-dimensional space that preserves similarity
Idea: learn node embeddings such that nodes that are nearby in the network end up close together in the embedding space
$N_R(u)$: neighborhood of $u$ obtained by the strategy $R$
Goal: learn a mapping $f: u \rightarrow \mathbb{R}^d$ such that $f(u) = z_u$
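A common way to make this goal concrete (as in DeepWalk/node2vec-style training; stated here as a sketch rather than the notes' exact formulation) is to maximize the log-likelihood of each node's random-walk neighborhood given its embedding:

$\max_{f} \sum_{u \in V} \log P(N_R(u) \mid z_u)$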