1 Main Contributions
- Build a Deep Attention Diffusion Graph Neural Network (DADGNN) that achieves broad receptive fields via a diffusion mechanism.
- Propose to decouple the propagation and transformation processes of GNNs, and thus construct a GNN layer that can be stacked many more times.
- Extensive experimental results demonstrate the effectiveness of the proposed model.
2 Method
Decouple propagation and transformation
After decoupling, the message passing process of DADGNN can be formulated as follows:
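(The equation itself is omitted in these notes. A plausible reconstruction, assuming the standard decoupled-GNN notation in which the feature transformation is applied once up front and propagation is parameter-free, would be:

$$Z = \mathrm{MLP}(X), \qquad H = \hat{A}^{\,n} Z,$$

where $X$ are the input node features, $\hat{A}$ is the normalized attention/adjacency matrix, and $n$ is the number of propagation steps. The exact form used in the paper may differ.)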
For a clear comparison, the formulation of traditional GNNs is presented below:
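(Again the equation is omitted; the standard coupled GNN layer, for comparison, is:

$$H^{(l+1)} = \sigma\!\left(\hat{A}\, H^{(l)} W^{(l)}\right),$$

where propagation by $\hat{A}$ and transformation by the learnable $W^{(l)}$ are entangled in every layer, so stacking many layers multiplies both.)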
It shows that DADGNN transforms the feature dimension at an early stage and removes the transformation from the propagation process.
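A minimal numpy sketch of this decoupling, assuming a two-layer MLP transformation and a row-normalized adjacency matrix (the function names, toy graph, and weight shapes here are illustrative, not from the paper):

```python
import numpy as np

def mlp_transform(X, W1, W2):
    """Feature transformation applied once, before any propagation."""
    return np.maximum(X @ W1, 0) @ W2  # two-layer MLP with ReLU

def propagate(A_hat, Z, n_steps):
    """Parameter-free propagation: repeatedly multiply by the
    normalized adjacency (or attention) matrix."""
    H = Z
    for _ in range(n_steps):
        H = A_hat @ H
    return H

# toy graph: 4 nodes, adjacency with self-loops, row-normalized
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))     # 4 nodes, 8 input features
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 4))

Z = mlp_transform(X, W1, W2)          # transform first ...
H = propagate(A_hat, Z, n_steps=5)    # ... then propagate deeply
```

Because `propagate` has no learnable weights, stacking many propagation steps does not add parameters or deepen the backpropagation path through weight matrices, which is what makes deep stacking feasible.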
Add diffusion mechanism
A^n is the n-th power of the attention matrix, which takes into account the influence of all neighboring nodes j with path lengths up to n on the target node i, analogous to powers of the graph adjacency matrix.
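A small numpy sketch of such a diffusion step, assuming a row-stochastic attention matrix and a uniform weighting over powers (the uniform `theta` is an assumption for illustration; the paper's actual weighting may differ):

```python
import numpy as np

def diffusion(A_att, X, n, theta=None):
    """Combine powers A_att^0 .. A_att^n so each node aggregates
    neighbors up to n hops away."""
    if theta is None:
        # uniform weights over powers -- an illustrative assumption
        theta = np.full(n + 1, 1.0 / (n + 1))
    H = np.zeros_like(X)
    A_pow = np.eye(A_att.shape[0])  # A_att^0 = identity
    for k in range(n + 1):
        H += theta[k] * (A_pow @ X)
        A_pow = A_pow @ A_att       # advance to the next power
    return H

# toy row-stochastic attention matrix over 3 nodes
A_att = np.array([[0.5, 0.5, 0.0],
                  [0.3, 0.4, 0.3],
                  [0.0, 0.5, 0.5]])
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 2))
H = diffusion(A_att, X, n=2)
```

With n = 2, node 0 is influenced by node 2 even though they are not directly connected, via the 2-hop path through node 1.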
Graph-Level Representation
Then the graph (document) representation is obtained using an attention-based summation over node representations. This graph representation can then be used for classification.
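An attention-based summation readout can be sketched as follows; the scalar scoring vector `w` is one common parameterization and is assumed here, not taken from the paper:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_readout(H, w):
    """Graph-level representation: attention-weighted sum of node vectors."""
    alpha = softmax(H @ w)  # one scalar attention score per node
    return alpha @ H        # weighted sum -> single graph vector

rng = np.random.default_rng(2)
H = rng.normal(size=(5, 8))  # 5 node embeddings of dimension 8
w = rng.normal(size=8)       # learnable scoring vector (assumed form)
g = attention_readout(H, w)  # graph/document vector of dimension 8
```

The resulting vector `g` would then be fed to a linear classifier over the document classes.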
3 Experiment
Overall Performance
The proposed model is compared with (1) sequence-based deep learning models such as CNN and BiLSTM; (2) word embedding-based models such as PV-DM; and (3) graph-based models such as SGC and TextGCN.
The results show that DADGNN consistently achieves the best results on all datasets. The authors attribute this to the diffusion mechanism and the decoupling of propagation and transformation.
GPU consumption