Features
·Arbitrary types of networks
·Directed, undirected, and/or weighted
·Clear objective function
·Preserve the first-order and second-order proximity
·Scalable
·Asynchronous stochastic gradient descent
·Millions of nodes and billions of edges: a couple of hours on a single machine
First-order Proximity
·The local pairwise proximity between the nodes
·However, many links between the nodes are not observed
·Not sufficient for preserving the entire network structure
Second-order Proximity
Preserving the First-order Proximity (LINE 1st)
·Distributions (defined on the undirected edge i - j)
·Objective:
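For reference, the first-order model can be written out as below. This is a sketch in the notation of the LINE paper (Tang et al., WWW 2015), which may differ from the notation on the tutorial slide: u_i is the embedding of node i, w_ij is the weight of edge (i, j), and W is the total edge weight.

```latex
% Model distribution over the undirected edge (i, j)
p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-\vec{u}_i^{\top} \vec{u}_j\right)}

% Empirical distribution from the observed edge weights
\hat{p}_1(v_i, v_j) = \frac{w_{ij}}{W}, \qquad W = \sum_{(i,j) \in E} w_{ij}

% Objective: KL divergence between the two, with constants dropped
O_1 = -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)
```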
Preserving the Second-order Proximity (LINE 2nd)
·Distributions (defined on the directed edge i -> j)
·Objective:
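For reference, the second-order model can be written out as below, again in the notation of the LINE paper rather than necessarily that of the slide. Each node i has a vertex embedding u_i and a separate context embedding u'_i, and d_i is the weighted out-degree of node i.

```latex
% Model distribution: probability of context v_j given vertex v_i
p_2(v_j \mid v_i) =
  \frac{\exp\left({\vec{u}'_j}^{\top} \vec{u}_i\right)}
       {\sum_{k=1}^{|V|} \exp\left({\vec{u}'_k}^{\top} \vec{u}_i\right)}

% Empirical distribution from the observed edge weights
\hat{p}_2(v_j \mid v_i) = \frac{w_{ij}}{d_i}, \qquad d_i = \sum_{k \in N(i)} w_{ik}

% Objective: KL divergence weighted by d_i, with constants dropped
O_2 = -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)
```

Two nodes that share many neighbors see similar conditional distributions, so minimizing O_2 pushes their vertex embeddings close together.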
Optimization Tricks
·Stochastic gradient descent + Negative Sampling
·Randomly sample an edge and multiple negative edges
·The gradient w.r.t. the embedding with edge (i, j)
·Problematic when the variance of the edge weights is large
·The gradient is multiplied by the edge weight, so the variance of the gradients is large as well
·Solution: edge sampling
·Sample the edges according to their weights and treat the edges as binary
·Complexity: O(d * K * |E|)
·Linear in the dimensionality d, the number of negative samples K, and the number of edges |E|
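The optimization steps above can be sketched in plain Python. This is a minimal illustration, not the reference implementation. It draws edges in proportion to their weights with an alias table, treats each sampled edge as binary, and applies one negative-sampling update for the second-order objective. The function names, the hyperparameter defaults, and the choice to drop negatives that coincide with the sampled edge's endpoints are all assumptions of this sketch. The noise distribution proportional to out-degree^0.75 follows the LINE paper.

```python
import math
import random


def build_alias_table(weights):
    """Vose's alias method: O(n) setup, then O(1) per weighted draw."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob, alias = [1.0] * n, list(range(n))
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]  # l donates mass to fill slot s
        (small if scaled[l] < 1.0 else large).append(l)
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index with probability proportional to its weight."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sgd_step(emb, ctx, i, j, negatives, lr):
    """One update for edge i -> j, treated as binary (weight 1)."""
    grad_i = [0.0] * len(emb[i])
    for target, label in [(j, 1.0)] + [(n, 0.0) for n in negatives]:
        score = sum(a * b for a, b in zip(emb[i], ctx[target]))
        g = lr * (label - sigmoid(score))
        for d in range(len(grad_i)):
            grad_i[d] += g * ctx[target][d]
            ctx[target][d] += g * emb[i][d]
    for d in range(len(grad_i)):
        emb[i][d] += grad_i[d]


def train_line_2nd(edges, num_nodes, dim=8, num_neg=5, steps=20000,
                   lr=0.025, seed=0):
    """edges: list of directed (i, j, weight); undirected edges go in twice."""
    rng = random.Random(seed)
    emb = [[(rng.random() - 0.5) / dim for _ in range(dim)]
           for _ in range(num_nodes)]
    ctx = [[0.0] * dim for _ in range(num_nodes)]
    # Edge sampling: the weight only affects how often an edge is drawn.
    e_prob, e_alias = build_alias_table([w for _, _, w in edges])
    # Noise distribution for negatives: out-degree ** 0.75.
    degree = [0.0] * num_nodes
    for i, _, w in edges:
        degree[i] += w
    n_prob, n_alias = build_alias_table([d ** 0.75 for d in degree])
    for _ in range(steps):
        i, j, _ = edges[alias_draw(e_prob, e_alias, rng)]
        drawn = [alias_draw(n_prob, n_alias, rng) for _ in range(num_neg)]
        negatives = [n for n in drawn if n != i and n != j]
        sgd_step(emb, ctx, i, j, negatives, lr)
    return emb, ctx
```

Because the weight enters only through the sampling frequency, every update has the same scale no matter how skewed the weights are. Each step costs O(d * K), which matches the O(d * K * |E|) total when the number of steps is proportional to |E|.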
Discussion
·Embed nodes with few neighbors
·Expand the neighbors by adding higher-order neighbors
·Breadth-first search (BFS)
·Adding only second-order neighbors works well in most cases
·Embed new nodes
·Fix the embeddings of existing nodes
·Optimize the objective w.r.t. the embeddings of the new nodes only
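Both discussion points can be sketched in a few lines of Python. The first function adds second-order neighbors for a sparse node, weighting each new link as w_ij = sum over k of w_ik * w_kj / d_k, the rule given in the LINE paper. The second function embeds a new node against fixed existing embeddings using the first-order (sigmoid) objective. In that function, the negative sampling, the uniform noise over non-neighbors, and the zero initialization are simplifying assumptions of this sketch. All function and parameter names are hypothetical.

```python
import math
import random


def expand_with_second_order(adj, node):
    """adj: dict node -> {neighbor: weight}. Returns {j: weight} for
    neighbors-of-neighbors j that are not already linked to `node`."""
    new_weights = {}
    for k, w_ik in adj[node].items():
        nbrs_of_k = adj.get(k, {})
        d_k = sum(nbrs_of_k.values())  # weighted degree of the middle node
        for j, w_kj in nbrs_of_k.items():
            if j == node or j in adj[node]:
                continue
            new_weights[j] = new_weights.get(j, 0.0) + w_ik * w_kj / d_k
    return new_weights


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def embed_new_node(fixed, neighbors, dim, num_neg=2, steps=500,
                   lr=0.05, seed=0):
    """fixed: dict node -> embedding (never modified here).
    neighbors: dict existing node -> weight of its edge to the new node."""
    rng = random.Random(seed)
    vec = [0.0] * dim  # only this vector is optimized
    nbrs = list(neighbors)
    weights = [neighbors[j] for j in nbrs]
    candidates = [n for n in fixed if n not in neighbors]
    for _ in range(steps):
        # Sample a neighbor in proportion to its edge weight.
        j = rng.choices(nbrs, weights=weights, k=1)[0]
        samples = [(j, 1.0)]
        if candidates:
            samples += [(rng.choice(candidates), 0.0)
                        for _ in range(num_neg)]
        for target, label in samples:
            score = sum(a * b for a, b in zip(vec, fixed[target]))
            g = lr * (label - sigmoid(score))
            for d in range(dim):
                vec[d] += g * fixed[target][d]
    return vec
```

For example, with `adj = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 2.0, "d": 2.0}, "c": {"b": 2.0}, "d": {"b": 2.0}}`, calling `expand_with_second_order(adj, "a")` links "a" to "c" and "d" with weight 1 * 2 / 5 = 0.4 each.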
…To be continued
[Note]: Parts of this article are drawn from the Tutorial on Graph Representation Learning, AAAI 2019 (Mila & McGill University)