Graph as Matrix: PageRank, Random Walks and Embeddings
0. Graph as Matrix
Investigate graph analysis and learning from a matrix perspective to
- Determine node importance via random walk (PageRank)
- Obtain node embeddings via matrix factorization (MF)
- View other node embeddings (e.g., Node2Vec) as MF
1. PageRank: Google Algorithm
0). Example: the Web as a Graph
- Web as a graph: nodes $\to$ web pages, edges $\to$ hyperlinks
- In the early days of the Web, links were navigational; today many links are transactional (post, comment, like, buy, …)
- The Web is a directed graph
1). Link Analysis Algorithms
- PageRank
- Personalized PageRank (PPR)
- Random Walk with Restarts
2). PageRank: the “Flow” Model
Idea: links as votes (a page is more important if it has more in-links)
- Links from important pages count more
- Recursive question: a vote from an important page is worth more
- Each link’s vote is proportional to the importance of its source page
- If page $i$ with importance $r_i$ has $d_i$ out-links, each link gets $r_i/d_i$ votes
- Page $j$'s own importance $r_j$ is the sum of the votes on its in-links:
$$r_j=\sum_{i\to j} \frac{r_i}{d_i}$$
3). PageRank: Matrix Formulation
- Stochastic adjacency matrix $M$: if $j \to i$, then $M_{ij}=\frac{1}{d_j}$; $M$ is a column stochastic matrix (columns sum to 1)
- Rank vector $r$: an entry per page; $r_i$ is the importance score of page $i$ ($\sum_i r_i = 1$)
- The flow equation can be written as
$$r = M \cdot r$$
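The stochastic adjacency matrix can be built directly from out-link lists. A minimal sketch, assuming a tiny 3-page web that is not from the notes:

```python
import numpy as np

# Hypothetical 3-page web: page 0 links to 1 and 2, page 1 to 2, page 2 to 0.
out_links = {0: [1, 2], 1: [2], 2: [0]}
N = len(out_links)

# Stochastic adjacency matrix: M[i, j] = 1/d_j whenever j -> i.
M = np.zeros((N, N))
for j, targets in out_links.items():
    for i in targets:
        M[i, j] = 1.0 / len(targets)

# Every column sums to 1, i.e., M is column stochastic.
print(M.sum(axis=0))
```

Each column $j$ splits page $j$'s importance evenly over its $d_j$ out-links, matching $M_{ij}=1/d_j$.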
4). Connection to Random Walk
Imagine a random web surfer
a) At any time $t$, the surfer is on some page $i$
b) At time $t+1$, the surfer follows an out-link from $i$ uniformly at random
c) Ends up on some page $j$ linked from $i$
d) Process repeats indefinitely
Let $p(t)$ denote the vector whose $i^{th}$ coordinate is the probability that the surfer is at page $i$ at time $t$. So $p(t)$ is a probability distribution over pages
5). The Stationary Distribution
Follow a link uniformly at random:
$$p(t+1) = M \cdot p(t)$$
Suppose the random walk reaches a state where
$$p(t+1) = M \cdot p(t) = p(t)$$
then $p(t)$ is the stationary distribution of the random walk
Since $r = M \cdot r$, $r$ is a stationary distribution for the random walk
6). Eigenvector Formulation
The flow equation is $1 \cdot r = M \cdot r$, so the rank vector $r$ is an eigenvector of the stochastic adjacency matrix $M$ with eigenvalue 1
PageRank = limiting distribution = principal eigenvector of $M$; $r$ is the principal eigenvector of $M$ with eigenvalue 1
2. PageRank: How to Solve
Given a graph with $n$ nodes, we use an iterative procedure:
- Assign each node an initial page rank
- Repeat until convergence ($\sum_i |r_i^{t+1} - r_i^t| < \epsilon$), where $r_j^{t+1}=\sum_{i\to j} \frac{r_i^t}{d_i}$
1). Power Iteration Method
Given a web graph with $N$ nodes, where the nodes are pages and edges are hyperlinks
- Initialize: $r^0=[1/N, \dots, 1/N]^T$
- Iterate: $r^{t+1}=M \cdot r^t$
- Stop when $|r^{t+1} - r^t| < \epsilon$
About 50 iterations is sufficient to estimate the limiting solution
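The iteration above can be sketched in NumPy; the 3-page web (0 → {1, 2}, 1 → 2, 2 → 0) is an assumed toy example:

```python
import numpy as np

def pagerank_power(M, eps=1e-9, max_iter=1000):
    """Power iteration: r^{t+1} = M r^t until the L1 change is below eps."""
    N = M.shape[0]
    r = np.full(N, 1.0 / N)              # r^0 = [1/N, ..., 1/N]^T
    for _ in range(max_iter):
        r_next = M @ r
        if np.abs(r_next - r).sum() < eps:
            return r_next
        r = r_next
    return r

# 3-page example: 0 -> {1, 2}, 1 -> 2, 2 -> 0 (column-stochastic M)
M = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
r = pagerank_power(M)
```

For this graph the flow equations give $r = [0.4, 0.2, 0.4]$, which the iteration recovers.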
2). Problems
a). Dead ends
Some pages have no out-links $\to$ importance ‘leaks out’
Solution: from dead ends, follow random teleport links with total probability 1.0
- Adjust the matrix accordingly
b). Spider traps
All out-links of some group of pages stay within the group $\to$ the group eventually absorbs all importance
Solution: at each time step, the random surfer has two options
- With probability $\beta$, follow a link at random
- With probability $1-\beta$, jump to a random page
- Common values for $\beta$ are in [0.8, 0.9]
The surfer will teleport out of a spider trap within a few time steps
c). Why teleports solve the problems
Spider traps are not a mathematical problem, but with traps the PageRank scores are not what we want
Solution: never get stuck in a spider trap by teleporting out of it in a finite number of steps
Dead ends are a problem: the matrix is not column stochastic, so our initial assumptions are not met
Solution: make the matrix column stochastic by always teleporting when there is nowhere else to go
3). Solution: Random Teleports
PageRank equation:
$$r_j^{t+1}=\sum_{i\to j}\beta \frac{r_i^t}{d_i}+(1-\beta)\frac{1}{N}$$
The Google Matrix $G$:
$$G=\beta M+(1-\beta)\left[\frac{1}{N}\right]_{N\times N}$$
We have a recursive problem: $r = G \cdot r$, and the power method still works
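A sketch of building $G$ and iterating on it; the 3-page example (with page 2 as a dead end) and the helper name are assumptions, not from the notes:

```python
import numpy as np

def google_matrix(A, beta=0.85):
    """Build G = beta*M + (1-beta)*[1/N]; dead-end columns of M become 1/N."""
    N = A.shape[0]
    d = A.sum(axis=0)                     # out-degree of page j (A[i, j] = 1 iff j -> i)
    M = np.zeros((N, N))
    for j in range(N):
        if d[j] > 0:
            M[:, j] = A[:, j] / d[j]      # split importance over out-links
        else:
            M[:, j] = 1.0 / N             # dead end: always teleport
    return beta * M + (1 - beta) / N

# Page 2 is a dead end: 0 -> {1, 2}, 1 -> 2
A = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [1., 1., 0.]])
G = google_matrix(A)

r = np.full(3, 1 / 3)
for _ in range(200):                      # power iteration: r = G · r
    r = G @ r
```

Because the dead-end column is replaced by uniform teleports, $G$ is column stochastic and the iteration converges to $r = G \cdot r$.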
3. Random Walk with Restarts and Personalized PageRank
1). Proximity on Graphs
- PageRank: teleports with uniform probability to any node in the network
- Personalized PageRank: ranks proximity of nodes to the teleport nodes $S$
- Proximity on graphs: random walks with restarts, i.e., teleport back to the starting node
2). Random Walks
Idea
- Every node has some importance
- Importance gets evenly split among all edges and pushed to the neighbors
Given a set of QUERY_NODES, we simulate a random walk
- Make a step to a random neighbor and record the visit (visit count)
- With probability $\alpha$, restart the walk at one of the QUERY_NODES
- The nodes with the highest visit count have highest proximity to the QUERY_NODES
Benefits: the “similarity” considers
- Multiple connections
- Multiple paths
- Direct and indirect connections
- Degree of the node
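The simulation described above can be sketched as follows; the small example graph, the $\alpha$ value, and the function name are illustrative assumptions:

```python
import random
from collections import Counter

def random_walk_with_restarts(neighbors, query_nodes, alpha=0.15,
                              num_steps=100_000, seed=0):
    """Estimate proximity to QUERY_NODES via visit counts of a restarting walk."""
    rng = random.Random(seed)
    visits = Counter()
    node = rng.choice(query_nodes)
    for _ in range(num_steps):
        if rng.random() < alpha or not neighbors[node]:
            node = rng.choice(query_nodes)       # restart the walk
        else:
            node = rng.choice(neighbors[node])   # step to a random neighbor
        visits[node] += 1                        # record the visit
    return visits

# Toy undirected graph as adjacency lists; node 0 is the single query node
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
counts = random_walk_with_restarts(adj, query_nodes=[0])
```

Nodes close to the query node accumulate more visits; here node 0 should clearly outrank the peripheral node 3.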
3). PageRank Variants
PageRank: teleports to any node; all nodes have the same probability of the surfer landing there
$$S=[0.2, 0.2, 0.2, 0.2, 0.2]$$
Topic-specific PageRank, aka Personalized PageRank: teleports to a specific set of nodes; nodes can have different probabilities of the surfer landing there
$$S=[0.3, 0, 0.5, 0.2, 0]$$
Random walks with restarts: topic-specific PageRank where the teleport is always to the same node
$$S=[0, 0, 0, 1, 0]$$
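All three variants run the same power iteration and differ only in the teleport vector $S$. A minimal sketch, assuming a 5-node directed cycle (not a graph from the notes):

```python
import numpy as np

def personalized_pagerank(M, S, beta=0.85, num_iter=200):
    """Power iteration r = beta*M·r + (1-beta)*S with teleport distribution S."""
    S = np.asarray(S, dtype=float)
    r = S.copy()
    for _ in range(num_iter):
        r = beta * (M @ r) + (1 - beta) * S
    return r

# 5-node directed cycle 0 -> 1 -> 2 -> 3 -> 4 -> 0 (column-stochastic M)
M = np.roll(np.eye(5), 1, axis=0)

# Random walk with restarts: always teleport back to node 3
r = personalized_pagerank(M, S=[0, 0, 0, 1, 0])
```

Scores decay with distance from the restart node along the cycle, so node 3 ranks highest and its successor 4 outranks the farther node 2.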
4. Matrix Factorization and Node Embeddings
0). Relationship between Node Embeddings and Matrix Factorization
Node embeddings
Objective: maximize $z_v^T z_u$ for node pairs $(u, v)$ that are similar
Matrix factorization
Simplest node similarity: nodes $u, v$ are similar if they are connected by an edge ($z_v^T z_u = A_{uv}$ and therefore $Z^T Z = A$)
1). Matrix Factorization
- The embedding dimension (number of rows in $Z$) is much smaller than the number of nodes $n$
- Exact factorization $A=Z^TZ$ is generally not possible
- However, we can learn $Z$ approximately
- Objective: $\min_Z \|A-Z^TZ\|_2$
- Conclusion: an inner product decoder with node similarity defined by edge connectivity is equivalent to matrix factorization of $A$
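As an illustrative sketch of the objective (the notes only state it; the gradient-descent optimizer and the 6-node cycle graph are my assumptions), $Z$ can be fit by minimizing $\|A - Z^TZ\|$ directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3                               # n nodes, embedding dimension d << n

A = np.zeros((n, n))                      # adjacency matrix of an undirected 6-cycle
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

Z = 0.1 * rng.standard_normal((d, n))     # one d-dimensional embedding per column
loss_init = np.linalg.norm(A - Z.T @ Z)

lr = 0.01
for _ in range(5000):
    E = Z.T @ Z - A                       # symmetric residual
    Z -= lr * 4 * Z @ E                   # gradient of ||A - Z^T Z||_F^2 w.r.t. Z

loss_final = np.linalg.norm(A - Z.T @ Z)
```

Because $A$ here has negative eigenvalues while $Z^TZ$ is positive semidefinite, the loss decreases but cannot reach zero, illustrating why exact factorization is generally not possible.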
2). Random Walk-based Similarity
DeepWalk and node2vec have a more complex node similarity definition based on random walks
- DeepWalk is equivalent to matrix factorization of the following matrix expression:
$$\log\left(\mathrm{vol}(G)\left(\frac{1}{T}\sum_{r=1}^T (D^{-1}A)^r\right)D^{-1}\right)-\log b$$
- Node2vec can also be formulated as a more complex matrix factorization
3). Limitations
- Cannot obtain embeddings for nodes not in the training set: if new nodes are added at test time (e.g., a new user in a social network), we need to recompute all node embeddings
- Cannot capture structural similarity: if two nodes are far from each other, they will have very different embeddings, because it is unlikely that a random walk will reach one node from the other
- Cannot utilize node, edge, and graph features
Solutions: Deep Representation Learning and Graph Neural Networks