Note
This is the first post of the Graph Neural Networks (GNNs) series.
Background and Intuition
There is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependencies between objects. For example, in e-commerce, a graph-based learning system can exploit the interactions between users and products to make highly accurate recommendations. In chemistry, molecules are modeled as graphs.
Another example is image data. We can represent an image as a regular grid in Euclidean space. A convolutional neural network (CNN) is able to exploit the shift-invariance, local connectivity, and compositionality of image data. As a result, CNNs can extract local meaningful features that are shared across the entire dataset for various image analysis tasks.
Graphs, by contrast, can be irregular: a graph may have a variable number of unordered nodes, and different nodes may have different numbers of neighbors. As a result, some important operations (e.g., convolutions) that are easy to compute in the image domain are difficult to apply to the graph domain.
Furthermore, a core assumption of existing machine learning algorithms is that instances are independent of each other. This assumption no longer holds for graph data because each node is related to others by links of various types, such as citations, friendships, and interactions.
The complexity of graph data has imposed significant challenges on existing machine learning algorithms, and many studies extending deep learning approaches to graph data have emerged.
Intro to Graph Neural Networks
Graph neural networks (GNNs) are categorized into four groups:
- Recurrent graph neural networks (RecGNNs)
- Convolutional graph neural networks (ConvGNNs)
- Graph autoencoders (GAEs)
- Spatial-temporal graph neural networks (STGNNs)
Early studies on GNNs fall into the category of recurrent graph neural networks (RecGNNs). They learn a target node's representation by propagating neighbor information in an iterative manner until a stable fixed point is reached. This process is computationally expensive, and recently there have been increasing efforts to overcome these challenges.
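The fixed-point idea can be sketched in a few lines. Below is a minimal illustration (not any specific RecGNN from the literature): each node's state is repeatedly updated from its neighbors' states via a contraction mapping until the states stop changing. The damping factor `w` and the row-normalized adjacency are assumptions made so the iteration provably converges.

```python
import numpy as np

def recgnn_fixed_point(adj, x, w=0.5, tol=1e-6, max_iter=1000):
    """Iterate h <- w * A_norm @ h + x until a stable fixed point is reached."""
    deg = adj.sum(axis=1, keepdims=True)
    a_norm = adj / np.maximum(deg, 1)        # row-normalized adjacency
    h = np.zeros_like(x, dtype=float)
    for _ in range(max_iter):
        h_new = w * a_norm @ h + x           # contraction mapping (|w| < 1)
        if np.abs(h_new - h).max() < tol:    # states have stabilized
            return h_new
        h = h_new
    return h

# Tiny triangle graph with one scalar feature per node
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
x = np.array([[1.0], [0.0], [0.0]])
h = recgnn_fixed_point(adj, x)               # converged node states, shape (3, 1)
```

Because `|w| < 1` makes the update a contraction, the result agrees with the closed-form solution of $h = wA_{\text{norm}}h + x$, which is what "stable fixed point" means here.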
Encouraged by the success of CNNs in the computer vision domain, a large number of methods that re-define the notion of convolution for graph data are developed in parallel. These approaches are under the umbrella of convolutional graph neural networks (ConvGNNs). ConvGNNs are divided into two main streams, the spectral-based approaches and the spatial-based approaches.
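As a concrete taste of a graph convolution, here is a sketch of one widely used layer, the GCN rule of Kipf and Welling, which is spectrally motivated but admits a spatial interpretation (each node aggregates its normalized neighborhood): $H' = \mathrm{ReLU}(\hat{A} X W)$ with $\hat{A} = D^{-1/2}(A+I)D^{-1/2}$. The weight matrix below is a random placeholder, not trained parameters.

```python
import numpy as np

def gcn_layer(adj, x, w):
    """One GCN layer: H' = ReLU(D^{-1/2} (A+I) D^{-1/2} @ X @ W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ w, 0)          # ReLU activation

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
x = rng.normal(size=(3, 4))                       # n=3 nodes, d=4 input features
w = rng.normal(size=(4, 2))                       # project to b=2 hidden dims
h = gcn_layer(adj, x, w)                          # node embeddings, shape (3, 2)
```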
GNNs Framework
With the graph structure and node content information as inputs, the outputs of GNNs can focus on different graph analytics tasks with one of the following mechanisms:
- Node-level outputs relate to node regression and node classification tasks. RecGNNs and ConvGNNs can extract high-level node representations via information propagation/graph convolution. With a multilayer perceptron or a softmax layer as the output layer, GNNs are able to perform node-level tasks in an end-to-end manner.
- Edge-level outputs relate to edge classification and link prediction tasks. With two nodes' hidden representations from GNNs as inputs, a similarity function or a neural network can be utilized to predict the label/connection strength of an edge.
- Graph-level outputs relate to the graph classification task. To obtain a compact representation at the graph level, GNNs are often combined with pooling and readout operations.
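The three output mechanisms above can be sketched on top of a node embedding matrix `H`. The values below are made up for illustration; in practice `H` would come from a trained RecGNN or ConvGNN.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

H = np.array([[0.2, 1.0], [0.9, 0.1], [0.4, 0.5]])   # n=3 nodes, b=2 hidden dims

# Node level: classify each node with a softmax output layer
W_out = np.array([[1.0, -1.0], [-0.5, 0.5]])          # placeholder weights
node_probs = softmax(H @ W_out)                       # one class distribution per node

# Edge level: score a candidate edge (u, v) with a similarity function
edge_score = H[0] @ H[1]                              # dot-product similarity

# Graph level: a readout (mean pooling) yields one vector for the whole graph
graph_repr = H.mean(axis=0)                           # shape (b,)
```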
Definition
A graph is represented as $G=(V, E)$ where
- $V$ is the set of nodes/vertices; $v_i \in V$ denotes a node in $V$;
- $E$ is the set of edges;
- The neighborhood of a node $v$ is defined as $N(v)=\{u \in V \mid (v, u) \in E\}$.
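The neighborhood definition translates directly into code. Here is a small helper matching $N(v)=\{u \in V \mid (v, u) \in E\}$; the edge list is a made-up example.

```python
def neighborhood(v, edges):
    """Return N(v) = {u | (v, u) is an edge}."""
    return {u for (a, u) in edges if a == v}

V = {0, 1, 2, 3}
E = {(0, 1), (0, 2), (1, 0), (2, 0), (2, 3), (3, 2)}  # undirected: both directions listed
neighborhood(0, E)   # → {1, 2}
```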
We are going to use the following notations in the post:
- $X \in \mathbf{R}^{n \times d}$: the feature matrix of the graph, which is a concatenation of all node representations of the graph. $n$ is the number of nodes and $d$ is the dimension of a node feature vector.
- $\mathbf{x}_v$: the feature vector of node $v$.
- $\mathbf{x}_{(v, u)}$: the feature vector of the edge between node $v$ and node $u$.
- $\mathbf{H} \in \mathbf{R}^{n \times b}$: the node hidden feature matrix, where $b$ is the dimension of a hidden node feature vector.
- $\mathbf{h}_v \in \mathbf{R}^b$: the hidden feature vector of node $v$.
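To make the shapes concrete, here is a tiny example of the notation above; the numbers are arbitrary.

```python
import numpy as np

n, d, b = 4, 3, 2          # 4 nodes, 3 input features, 2 hidden features
X = np.zeros((n, d))       # feature matrix: row v is x_v
X[1] = [0.5, -1.0, 2.0]    # the feature vector x_v of node v = 1
H = np.zeros((n, b))       # hidden feature matrix: row v is h_v in R^b
```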