CS224W: Machine Learning with Graphs - 09 How Expressive are GNNs

How Expressive are GNNs

0. Theory of GNNs

How powerful are GNNs?

  • Many GNN models have been proposed (e.g., GCN, GAT, GraphSAGE)
  • What is the expressive power (ability to distinguish different graph structures) of these models?
  • How to design a maximally expressive GNN model?

1. Local Neighborhood Structures

We specifically consider the local neighborhood structure around each node in a graph.
Key question: can GNN node embeddings distinguish different nodes’ local neighborhood structures?
Next: we need to understand how a GNN captures local neighborhood structures

1). Computational Graph
  • In each layer, a GNN aggregates neighboring node embeddings.
  • A GNN generates node embeddings through a computational graph defined by the neighborhood.
  • But GNN only sees node features (not IDs).
  • A GNN will generate the same embedding for two nodes if their computational graphs are the same and their node features are identical.
  • In general, different local neighborhoods define different computation graphs
  • Computational graphs are identical to rooted subtree structures around each node.
  • A GNN's node embeddings capture rooted subtree structures (see the code sketch after this list).
  • The most expressive GNN maps different rooted subtrees to different node embeddings.
  • In other words, the most expressive GNN maps rooted subtrees to node embeddings injectively.

Key observation: subtrees of the same depth can be recursively characterized from the leaf nodes to the root nodes

  • If each step of GNN’s aggregation can fully retain the neighboring information, the generated node embeddings can distinguish different rooted subtrees.
  • In other words, most expressive GNN would use an injective neighbor aggregation function at each step (maps different neighbors to different embeddings).
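
To make the rooted-subtree view concrete, below is a minimal sketch in plain Python (the toy graph, node features, and the `rooted_subtree` helper are illustrative, not part of the lecture): it expands the depth-2 computational graph of a node from an adjacency list using only node features, so two nodes with identical rooted subtrees and identical features are indistinguishable, exactly as a message-passing GNN would see them.

```python
# Minimal sketch: expand the rooted subtree (computational graph) of a node.
# The toy graph and features are hypothetical, chosen only for illustration.

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}   # adjacency lists
feat = {0: "A", 1: "A", 2: "B", 3: "A", 4: "A"}                  # node features (no IDs!)

def rooted_subtree(v, depth):
    """Depth-limited computational graph rooted at v, described only by
    node features; a GNN never sees node IDs."""
    if depth == 0:
        return feat[v]
    # Children are the subtrees of v's neighbors; sorting makes the
    # representation order-invariant, i.e., a multi-set.
    children = tuple(sorted(rooted_subtree(u, depth - 1) for u in adj[v]))
    return (feat[v], children)

# Nodes 0 and 1 have the same features and the same depth-2 rooted subtree,
# so any message-passing GNN must assign them the same embedding.
print(rooted_subtree(0, 2) == rooted_subtree(1, 2))   # True
```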

2. Design the Most Powerful GNN

1). Neighbor Aggregation

Observation: neighbor aggregation can be abstracted as a function over a multi-set (a set with repeating elements)

a). GCN (mean-pool)
  • Take the element-wise mean, followed by a linear function and ReLU activation
  • Theorem: GCN’s aggregation function cannot distinguish different multi-sets with the same color proportion.
b). GraphSAGE (max-pool)
  • Apply an MLP, then take element-wise max
  • Theorem: GraphSAGE’s aggregation function cannot distinguish different multi-sets with the same set of distinct colors.
c). Summary
  • Expressive power of GNNs can be characterized by that of the neighbor aggregation function
  • Neighbor aggregation is a function over multi-sets (sets with repeating elements)
  • GCN's and GraphSAGE's aggregation functions cannot distinguish some basic multi-sets and hence are not injective (see the numerical sketch after this list)
  • Therefore, GCN and GraphSAGE are not maximally powerful GNNs.
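
The two failure cases above are easy to verify numerically. Below is a small NumPy sketch (the one-hot "color" vectors and multi-sets are hypothetical examples): element-wise mean collapses multi-sets with the same color proportions, element-wise max collapses multi-sets with the same set of distinct colors, while element-wise sum separates both pairs.

```python
import numpy as np

yellow = np.array([1.0, 0.0])   # hypothetical one-hot "colors"
blue   = np.array([0.0, 1.0])

# Two different multi-sets with the same color proportions (1 : 1)
A = np.stack([yellow, blue])
B = np.stack([yellow, yellow, blue, blue])
print(np.allclose(A.mean(axis=0), B.mean(axis=0)))   # True  -> mean-pool (GCN) fails
print(np.allclose(A.sum(axis=0),  B.sum(axis=0)))    # False -> sum distinguishes them

# Two different multi-sets with the same set of distinct colors
C = np.stack([yellow, blue])
D = np.stack([yellow, blue, blue])
print(np.allclose(C.max(axis=0), D.max(axis=0)))     # True  -> max-pool (GraphSAGE) fails
print(np.allclose(C.sum(axis=0), D.sum(axis=0)))     # False -> sum distinguishes them
```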
2). Neighbor Aggregation of Graph Isomorphism Network (GIN)
  • Goal: design maximally powerful GNNs in the class of message-passing GNNs
  • This can be achieved by designing injective neighbor aggregation function over multi-sets
  • We design a neural network that can model an injective multi-set function
a). Injective multi-set function

Theorem: any injective multi-set function can be expressed as
$$\Phi\Big(\sum_{x\in S} f(x)\Big)$$
where $\Phi$ and $f$ are some non-linear functions.
Proof intuition: $f$ produces one-hot encodings of colors. Summation of the one-hot encodings retains all the information about the input multi-set.
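
The proof intuition can be spelled out in a few lines of Python (the color vocabulary and the multi-set below are made-up examples): summing one-hot encodings yields per-color counts, and the counts uniquely determine the multi-set, so the inner sum loses no information.

```python
import numpy as np

palette = ["yellow", "blue", "red"]           # hypothetical color vocabulary
multiset = ["yellow", "blue", "blue", "red"]  # hypothetical input multi-set S

def f(color):
    """The injective per-element map f: one-hot encoding of a color."""
    v = np.zeros(len(palette))
    v[palette.index(color)] = 1.0
    return v

summed = sum(f(c) for c in multiset)
print(summed)   # [1. 2. 1.] -- per-color counts

# The counts recover the multi-set exactly, so S -> sum_x f(x) is injective;
# Phi can then be any function applied on top of this lossless summary.
print({color: int(n) for color, n in zip(palette, summed)})
# {'yellow': 1, 'blue': 2, 'red': 1}
```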

b). Universal approximation theorem

A 1-hidden-layer MLP with sufficiently large hidden dimensionality and an appropriate non-linearity $\sigma(\cdot)$ can approximate any continuous function to arbitrary accuracy.
We have arrived at a neural network that can model any injective multi-set function:
$$\text{MLP}_{\Phi}\Big(\sum_{x\in S}\text{MLP}_f(x)\Big)$$
In practice, MLP hidden dimensionality of 100 to 500 is sufficient.

c). Graph isomorphism network (GIN)

Apply an MLP, take the element-wise sum, then apply another MLP:
$$\text{MLP}_{\Phi}\Big(\sum_{x\in S}\text{MLP}_f(x)\Big)$$
Theorem: GIN’s neighbor aggregation function is injective
GIN is the most expressive GNN in the class of message-passing GNNs
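
As a concrete (hypothetical) implementation of the formula above, the PyTorch sketch below aggregates a multi-set of neighbor embeddings with $\text{MLP}_f$, an element-wise sum, and $\text{MLP}_\Phi$; the module name and dimensions are illustrative only.

```python
import torch
import torch.nn as nn

class GINAggregator(nn.Module):
    """Sketch of MLP_Phi( sum_{x in S} MLP_f(x) ) over a multi-set of neighbor embeddings."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        # 1-hidden-layer MLPs; per the lecture, 100-500 hidden units suffice in practice.
        self.mlp_f = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                                   nn.Linear(hidden_dim, hidden_dim))
        self.mlp_phi = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                     nn.Linear(hidden_dim, hidden_dim))

    def forward(self, neighbor_embs):
        # neighbor_embs: (num_neighbors, in_dim), i.e., the multi-set S
        return self.mlp_phi(self.mlp_f(neighbor_embs).sum(dim=0))

agg = GINAggregator(in_dim=16, hidden_dim=128)   # hypothetical sizes
out = agg(torch.randn(5, 16))                    # aggregate 5 neighbor embeddings
print(out.shape)                                 # torch.Size([128])
```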

3). Full Model of GIN
a). WL graph kernel

Recall: color refinement algorithm in WL kernel
Given a graph $G$ with a set of nodes $V$, assign an initial color $c^0(v)$ to each node $v$. Then iteratively refine node colors by
$$c^{k+1}(v)=\text{HASH}\Big(\big\{c^k(v),\ \{c^k(u)\}_{u\in N(v)}\big\}\Big)$$
where HASH maps different inputs to different colors. After $K$ steps of color refinement, $c^K(v)$ summarizes the structure of the $K$-hop neighborhood.
Process continues until a stable coloring is reached
Two graphs are considered isomorphic by the WL test if they end up with the same multiset of node colors (if the multisets differ, the graphs are certainly non-isomorphic)
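
Color refinement itself is only a few lines of code. The sketch below (plain Python; the two toy graphs are hypothetical) emulates the injective HASH with canonical string colors and reproduces a well-known limitation: a 6-cycle and two disjoint triangles receive identical color multisets, so the WL test cannot separate them.

```python
from collections import Counter

def wl_colors(adj, num_iters):
    """WL color refinement on a graph given as adjacency lists. Colors are
    canonical strings, so separately refined graphs are directly comparable
    (the string construction plays the role of an injective HASH)."""
    colors = {v: "0" for v in adj}                 # uniform initial color c^0(v)
    for _ in range(num_iters):
        colors = {v: str((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return colors

# Hypothetical toy graphs: a 6-cycle vs. two disjoint triangles -- a classic
# non-isomorphic pair that the WL test (and any message-passing GNN) cannot separate.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}

print(Counter(wl_colors(cycle6, 3).values())
      == Counter(wl_colors(two_triangles, 3).values()))   # True: same color multiset
```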

b). Complete GIN model

GIN uses an NN to model the injective HASH function
$$c^{k+1}(v)=\text{HASH}\Big(\big\{c^k(v),\ \{c^k(u)\}_{u\in N(v)}\big\}\Big)$$
Specifically, we will model an injective function over the tuple $\big(c^k(v), \{c^k(u)\}_{u\in N(v)}\big)$, where $c^k(v)$ is the root node's features and $\{c^k(u)\}_{u\in N(v)}$ is the multiset of neighboring node colors.
Theorem: any injective function over the tuple $\big(c^k(v), \{c^k(u)\}_{u\in N(v)}\big)$ can be modeled as
$$\text{MLP}_{\Phi}\Big((1+\epsilon)\cdot\text{MLP}_f\big(c^k(v)\big)+\sum_{u\in N(v)}\text{MLP}_f\big(c^k(u)\big)\Big)$$
where $\epsilon$ is a learnable scalar.
If the input feature $c^0(v)$ is represented as a one-hot vector, direct summation is already injective.
We then only need $\Phi$ to ensure injectivity of the overall function:
$$\text{GINConv}\big(c^k(v), \{c^k(u)\}_{u\in N(v)}\big)=\text{MLP}_{\Phi}\Big((1+\epsilon)\cdot c^k(v)+\sum_{u\in N(v)}c^k(u)\Big)$$
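
A minimal PyTorch sketch of this GINConv update follows (the class name, the dense-adjacency interface, and the dimensions are assumptions made for illustration, not a reference implementation):

```python
import torch
import torch.nn as nn

class GINConvSketch(nn.Module):
    """Sketch of GINConv: MLP_Phi( (1 + eps) * c(v) + sum_{u in N(v)} c(u) )."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))            # learnable scalar epsilon
        self.mlp_phi = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                     nn.Linear(out_dim, out_dim))

    def forward(self, x, adj):
        # x:   (num_nodes, in_dim) node colors/embeddings c^k(v)
        # adj: (num_nodes, num_nodes) dense 0/1 adjacency matrix
        neighbor_sum = adj @ x                              # sum of neighbor colors
        return self.mlp_phi((1 + self.eps) * x + neighbor_sum)

# Hypothetical usage on a 4-node path graph with one-hot input features c^0(v)
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
layer = GINConvSketch(in_dim=4, out_dim=8)
print(layer(torch.eye(4), adj).shape)                      # torch.Size([4, 8])
```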

c). GIN and WL graph kernel

GIN can be understood as a differentiable neural version of the WL graph kernel. They have exactly the same expressive power, and both are powerful enough to distinguish most real-world graphs.

|  | Update target | Update function |
| --- | --- | --- |
| WL graph kernel | Node colors (one-hot) | HASH |
| GIN | Node embeddings (low-dim vectors) | GINConv |

Advantages of GIN over the WL graph kernel are:

  • Node embeddings are low-dimensional; hence, they can capture the fine-grained similarity of different nodes
  • Parameters of the update function can be learned for the downstream tasks
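
In practice there is no need to hand-roll the layer: PyTorch Geometric ships a GINConv that takes the MLP as an argument. A minimal usage sketch, assuming torch_geometric is installed and with made-up dimensions:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GINConv

# The MLP plays the role of MLP_Phi; train_eps=True makes epsilon learnable.
mlp = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64))
conv = GINConv(mlp, train_eps=True)

# Hypothetical 3-node graph: node features x and edges in COO format
x = torch.randn(3, 16)
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
print(conv(x, edge_index).shape)   # torch.Size([3, 64])
```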