Graph Convolutional Neural Networks for Web-Scale Recommender Systems
-
LINK: https://arxiv.org/abs/1806.01973
-
CLASSIFICATION: RECOMMENDER-SYSTEM, GCN
-
YEAR: Submitted on 6 Jun 2018
-
FROM: KDD 2018
-
WHAT PROBLEM TO SOLVE: The main challenge is to scale both the training and the inference of GCN-based node embeddings to graphs with billions of nodes and tens of billions of edges. Scaling up GCNs is difficult because many of the core assumptions underlying their design are violated when working in a big data environment.
-
SOLUTION: We develop a data-efficient Graph Convolutional Network (GCN) algorithm, PinSage, which combines efficient random walks and graph convolutions to generate embeddings of nodes (i.e., items) that incorporate both graph structure and node feature information.
-
CORE POINT:
-
Key insights to drastically improve the scalability of GCNs
-
On-the-fly convolutions
The PinSage algorithm performs efficient, localized convolutions by sampling the neighborhood around a node and dynamically constructing a computation graph from this sampled neighborhood.
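
A minimal sketch of how a computation graph might be assembled on the fly for a small minibatch. The function name `build_computation_graph`, the adjacency-dict graph format, and the use of uniform neighbor sampling are assumptions made for brevity; they are not taken from the paper.

```python
import random

def build_computation_graph(graph, targets, num_layers=2, sample_size=2, seed=0):
    """Return the node sets needed at each layer, input layer first.

    graph: dict mapping node -> list of neighbor nodes (hypothetical format).
    targets: nodes whose embeddings the minibatch must produce.
    """
    rng = random.Random(seed)
    layers = [set(targets)]   # start from the output layer
    sampled = {}              # node -> its sampled neighbors, reused across layers
    for _ in range(num_layers):
        needed = set(layers[-1])
        for node in sorted(layers[-1]):  # sorted for reproducible sampling
            if node not in sampled:
                neighbors = graph[node]
                k = min(sample_size, len(neighbors))
                sampled[node] = rng.sample(neighbors, k)
            needed.update(sampled[node])
        layers.append(needed)
    # Reverse so that layers[0] is the widest (input) layer.
    return layers[::-1], sampled

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
layers, sampled = build_computation_graph(graph, ["a"])
```

Only the nodes in `layers[0]` need their input features fetched, so the cost of a minibatch depends on the sample size and depth rather than on the size of the full graph.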
-
Producer-consumer minibatch construction
A large-memory, CPU-bound producer efficiently samples node network neighborhoods and fetches the necessary features to define local convolutions, while a GPU-bound TensorFlow model consumes these pre-defined computation graphs to efficiently run stochastic gradient descent.
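
The producer-consumer split can be sketched with a bounded queue between two threads. This is an illustration only: the names `producer`, `consumer`, and `run_pipeline` are hypothetical, the "minibatch" is a toy dictionary, and the "training step" is a plain sum standing in for a GPU model update.

```python
import queue
import threading

def producer(num_batches, out_queue):
    """CPU side: prepare each minibatch and hand it over."""
    for batch_id in range(num_batches):
        # Stand-in for neighborhood sampling and feature lookup.
        minibatch = {"id": batch_id, "nodes": list(range(batch_id, batch_id + 4))}
        out_queue.put(minibatch)   # blocks while the queue is full
    out_queue.put(None)            # sentinel: no more batches

def consumer(in_queue, results):
    """GPU side: consume prepared minibatches (here, a dummy 'training step')."""
    while True:
        minibatch = in_queue.get()
        if minibatch is None:
            break
        results.append(sum(minibatch["nodes"]))

def run_pipeline(num_batches=3, max_prefetch=2):
    q = queue.Queue(maxsize=max_prefetch)  # bounds how far the producer runs ahead
    results = []
    t_prod = threading.Thread(target=producer, args=(num_batches, q))
    t_cons = threading.Thread(target=consumer, args=(q, results))
    t_prod.start()
    t_cons.start()
    t_prod.join()
    t_cons.join()
    return results
```

The bounded queue lets the producer prepare the next batches while the consumer is busy, without letting prepared batches pile up in memory.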
-
Efficient MapReduce inference
Given a fully-trained GCN model, we design an efficient MapReduce pipeline that can distribute the trained model to generate embeddings for billions of nodes, while minimizing repeated computations.
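
One way to avoid repeated computation is to compute embeddings layer by layer for all nodes, so that each node's layer-k representation is produced once and reused by every neighbor that needs it. The sketch below runs in a single process with scalar features; the function name, the mean pooling, and the fixed 0.5/0.5 mixing are placeholders for the trained model, and the real pipeline distributes these steps as MapReduce jobs.

```python
def layerwise_embeddings(graph, features, num_layers=2):
    """Compute embeddings for every node, one layer at a time.

    graph: dict mapping node -> list of neighbor nodes (hypothetical format).
    features: dict mapping node -> float (toy scalar feature).
    """
    current = dict(features)
    for _ in range(num_layers):
        nxt = {}
        for node, neighbors in graph.items():   # "map": one task per node
            neigh = [current[n] for n in neighbors]
            # "reduce": pool the neighbors' previous-layer values
            pooled = sum(neigh) / len(neigh) if neigh else 0.0
            # Placeholder for the learned combine step.
            nxt[node] = 0.5 * current[node] + 0.5 * pooled
        current = nxt   # every node's layer output is stored once and reused
    return current

embeddings = layerwise_embeddings({"a": ["b"], "b": ["a"]}, {"a": 1.0, "b": 3.0})
```

Computing each node's embedding independently would instead re-derive the same intermediate values for every overlapping neighborhood.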
-
-
New training techniques and algorithmic innovations
-
Constructing convolutions via random walks
Random sampling is suboptimal, and we develop a new technique using short random walks to sample the computation graph. An additional benefit is that each node now has an importance score, which we use in the pooling/aggregation step.
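
A rough sketch of random-walk neighborhood sampling: simulate short walks from a node, count visits, and keep the most-visited nodes together with normalized visit counts as importance scores. The function name, the parameter defaults, and the exact normalization are assumptions for illustration.

```python
import random
from collections import Counter

def random_walk_neighborhood(graph, start, num_walks=200, walk_length=3,
                             top_t=2, seed=0):
    """Return up to top_t (node, importance) pairs for `start`.

    graph: dict mapping node -> list of neighbor nodes (hypothetical format).
    Importance is the node's share of all visits made by the walks.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_length):
            neighbors = graph[node]
            if not neighbors:
                break
            node = rng.choice(neighbors)
            if node != start:        # the start node is not its own neighbor
                visits[node] += 1
    total = sum(visits.values())
    # most_common returns nodes in descending order of visit count
    return [(n, c / total) for n, c in visits.most_common(top_t)]

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
neighborhood = random_walk_neighborhood(graph, "a")
```

Unlike sampling from direct neighbors only, this can select nodes that are several hops away if the walks reach them often.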
-
Importance pooling
We introduce a method to weigh the importance of node features in this aggregation based upon random-walk similarity measures.
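
Importance pooling can be illustrated as a weighted mean of neighbor feature vectors, with weights taken from random-walk importance scores. The function name `importance_pool` and the plain-list vector format are assumptions; a trained model would apply learned transformations before and after this step.

```python
def importance_pool(neighbor_features, weights):
    """Weighted mean of neighbor feature vectors.

    neighbor_features: list of equal-length lists of floats.
    weights: one non-negative importance score per neighbor.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(neighbor_features[0])
    pooled = [0.0] * dim
    for feats, w in zip(neighbor_features, weights):
        for i, x in enumerate(feats):
            pooled[i] += (w / total) * x   # normalize so the weights sum to 1
    return pooled

# A neighbor with importance 3 contributes three times as much as one with importance 1.
pooled = importance_pool([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```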
-
Curriculum training
The algorithm is fed harder-and-harder examples during training.
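
One possible curriculum schedule is to mix a growing number of hard negative examples into each item's negative set as training progresses. The sketch below assumes a schedule of `epoch - 1` hard negatives, capped at `max_hard`; the function name, the pools, and the cap are hypothetical choices for illustration, not details stated in these notes.

```python
import random

def build_negatives(epoch, easy_pool, hard_pool, num_easy=4, max_hard=3, seed=0):
    """Return the negatives for one item at a given (1-indexed) epoch."""
    rng = random.Random(seed + epoch)
    # Assumed schedule: no hard negatives in epoch 1, then one more per epoch.
    num_hard = min(max(epoch - 1, 0), max_hard, len(hard_pool))
    easy = rng.sample(easy_pool, min(num_easy, len(easy_pool)))
    hard = rng.sample(hard_pool, num_hard)
    return easy + hard

easy_pool = list(range(100))        # stand-in for randomly sampled items
hard_pool = list(range(100, 110))   # stand-in for items similar to the query
for epoch in (1, 2, 3):
    negatives = build_negatives(epoch, easy_pool, hard_pool)
```

Starting with only easy negatives lets the model learn a coarse notion of similarity before it is asked to separate closely related items.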
-
-
Difference between GraphSage and PinSage
We fundamentally im
-