[Paper Close Reading] Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications

Paper: Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications - ScienceDirect

Full title: Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications

Code: GitHub - mullakaeva/GiG: Graph-in-Graph (GiG)

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Some unavoidable spelling and grammar mistakes may remain; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution!

目录

1. TL;DR

1.1. Thoughts

1.2. Paper framework diagram

2. Section-by-section reading

2.1. Abstract

2.2. Introduction

2.3. Related work

2.4. Method

2.4.1. Graph-in-graph model

2.4.2. Node degree distribution loss (NDDL)

2.5. Experiments and results

2.5.1. Datasets

2.5.2. Implementation details

2.5.3. Quantitative results

2.5.4. Knowledge discovery analysis

2.6. Discussion

2.7. Conclusion

3. Background knowledge

3.1. Upstream and downstream

3.2. Soft threshold

3.3. Soft assignment and hard assignment

3.4. CATH

4. Reference List


1. TL;DR

1.1. Thoughts

(1) The opening says graphs suit non-Euclidean spaces, and that is true: edges between ROIs in a brain graph are basically never computed from distance. But doesn't that silently assume the functions of different brain regions are that cleanly separated? What if distance actually matters? Are brain regions really as separate as organs, or do they spread diffusely?

(2) This is a 2023 paper; why keep stressing "non-Euclidean"?

(3) The math part is really... brief. Fine, the emphasis is probably on NDDL.

(4) Goodness, the table analysis is really over-specific.

(5) Again, curious typesetting.

(6) The interpretability figures are very clear.

1.2. Paper framework diagram

2. Section-by-section reading

2.1. Abstract

        ①Graph structures go beyond traditional Euclidean space and distance

        ②Therefore, graph representations are suitable for brain connectome analysis and molecular property prediction

        ③The authors put forward Graph-in-Graph (GiG) with a Node Degree Distribution Loss (NDDL) to classify proteins and brain images

2.2. Introduction

        ①Graph representations have recently been used in CV, CG, physics, chemistry, and medicine

        ②Most graph models take an individual graph as input, then aggregate neighbors (nodes or edges) into a new graph

2.3. Related work

(1)Protein classification (enzymes or non-enzymes)

        ①SVM

        ②C-SVM

        ③GCNNs

        ④GNN

(2)Brain connectome analysis

        ①Brain graph prediction: Kullback–Leibler (KL) divergence

        ②Brain graph integration/fusion: multi-view graph normalizer network (MGN-Net)

        ③Brain graph classification (multi-graph/multi-modal classification): MICNet, GNN selection (RG-Select), DMBN, rs-fMRI-GIN, Graph Isomorphism Network (GIN), BrainNetCNN, ElasticNet, ST-GCN, DECENNT

(3)Molecular toxicity prediction (drug discovery)

        ①DNNs

        ②Tree-based ensemble methods

        ③Simplified Molecular-Input Line-Entry System (SMILES)

        ④Bridging works

        ⑤Graph Multiset Transformer (GMT)

(4)GCNNs in different areas:

        ①Social sciences

        ②CV and CG

        ③Physical

        ④Medical/biological sciences

(5)Methods of GiG:

        ①Fully inductive

        ②Test data is not necessary (at training time)

toxicology  n. the study of poisons    enzyme  n. a protein that catalyzes biochemical reactions    proliferation  n. rapid increase; multiplication

2.4. Method

There are N graphs, G=\{G_{1},G_{2},\ldots,G_{N}\}, and each G_{i}=\left ( V_{i},E_{i},X_{i} \right );

V_{i} denotes the node (vertex) set, E_{i} the edge set, X_{i}\in \mathbb{R}^{\left | V_{i} \right |\times D} the node feature matrix, and D the number of features;

the output for each graph is a vector p\in \mathbb{R}^{C}, which gives the predicted class probabilities;

C denotes the number of possible classes;
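(To make the notation concrete, a minimal sketch of one input graph G_{i} in PyTorch Geometric, the framework the authors use; all sizes here are made up:)

```python
import torch
from torch_geometric.data import Data

x = torch.randn(5, 16)                    # |V_i| = 5 nodes, D = 16 features per node
edge_index = torch.tensor([[0, 1, 2, 3],  # E_i as a 2 x |E_i| list of (source, target) pairs
                           [1, 2, 3, 4]])
g_i = Data(x=x, edge_index=edge_index)    # one graph G_i = (V_i, E_i, X_i)
```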

2.4.1. Graph-in-graph model

         GiG framework:

where F_{1} calculates graph features, F_{2} learns latent connections between graphs, and F_{3} combines the outputs of F_{1} and F_{2} to make predictions

(1)Node-level module F_1

        ①Graph feature vector h_{i}=F_{1}\left ( G_{i} \right ), where F_{1} consists of a GCNN and pooling operators

        ②Graph feature matrix h=\left [ h_{1},\ldots,h_{N} \right ]\in \mathbb{R}^{N\times H} (I never quite got this kind of thing: the paper says h_{i} is 1×H, so isn't that a row vector? Does h then concatenate row vectors horizontally? I stacked them vertically here; I am not sure that is right, as I have not read the code yet. My guess is that each 1×H vector fills one row, since the first dimension N indexes rows. A sketch of F_{1} follows below.)
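(A minimal sketch of how I imagine F_{1} in PyTorch Geometric; the depth, layer sizes, and mean pooling are my assumptions, since the text only says GCNN plus pooling:)

```python
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class NodeLevelModule(nn.Module):
    """Sketch of F1: a per-graph GCNN followed by pooling, producing h_i."""
    def __init__(self, d: int, h: int):
        super().__init__()
        self.conv1 = GCNConv(d, h)   # D input features -> H hidden
        self.conv2 = GCNConv(h, h)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index).relu()
        return global_mean_pool(x, batch)   # one H-dimensional vector per graph
```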

    

(2)Population-level module F_2

        ①The input of F_{2} is the output of F_{1}

        ②Function in this layer: A_{p}=F_{2}\left ( h \right ), where each entry

a_{ij}=\frac{1}{1+e^{-t\left \| \tilde{h}_{i}-\tilde{h}_{j} \right \|_{2}+\theta }},\qquad \tilde{h}_{i}=MLP\left ( h_{i} \right )

and \theta and t are learnable soft-threshold and temperature parameters (a PyTorch sketch follows after ③)

        ③ A_{p}\in \left ( 0,1 \right )^{N\times N} represents the weighted adjacency matrix
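(A minimal PyTorch sketch of F_{2} following the formula above; the MLP architecture and the initial values of t and \theta are my assumptions:)

```python
import torch
import torch.nn as nn

class PopulationModule(nn.Module):
    """Sketch of F2: learns a weighted adjacency A_p over the N graph embeddings."""
    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, hidden_dim))
        self.t = nn.Parameter(torch.tensor(1.0))      # learnable temperature
        self.theta = nn.Parameter(torch.tensor(0.5))  # learnable soft threshold

    def forward(self, h: torch.Tensor) -> torch.Tensor:   # h: (N, H)
        h_tilde = self.mlp(h)                              # (N, H')
        dist = torch.cdist(h_tilde, h_tilde, p=2)          # pairwise L2 distances, (N, N)
        # a_ij = 1 / (1 + exp(-t * dist + theta)), sign convention as written above
        return torch.sigmoid(self.t * dist - self.theta)
```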

(3)GNN classifier F_3

        ①The final function:

p=F_{3}\left ( h,A_{p} \right )=\left [ p_{1},\ldots,p_{N} \right ], computed by GCN layers with shared weights and ReLU applied to \left ( h,A_{p} \right )

(I wrote this formula myself; the paper only describes it in words.)
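(Correspondingly, a sketch of F_{3} as a single dense graph-convolution layer over A_{p} with shared weights and ReLU, as the text describes; the depth and the row normalization are my assumptions:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PopulationClassifier(nn.Module):
    """Sketch of F3: a GCN over the learned population graph A_p."""
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, in_dim)       # shared weights for all nodes
        self.cls = nn.Linear(in_dim, num_classes)

    def forward(self, h, a_p):                     # h: (N, H), a_p: (N, N)
        deg = a_p.sum(dim=1, keepdim=True).clamp(min=1e-6)
        msg = (a_p / deg) @ self.lin(h)            # degree-normalized neighbor aggregation
        return F.softmax(self.cls(F.relu(msg)), dim=-1)   # p: (N, C) class probabilities
```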

2.4.2. Node degree distribution loss (NDDL)

        ①They proposed a regularizer, the Node Degree Distribution Loss (NDDL), based on the Kullback–Leibler divergence between the computed degree distribution and a target distribution (a Gaussian is chosen). The divergence values of LGL versus LGL+NDD:

        ②The overall steps:

        ③A denotes the adjacency matrix of an undirected graph and A_{p} denotes a weighted fully connected graph. They only retain edges whose weights are greater than 0.5:

\bar{A}=A_{p}\, with\, \left ( A_{p}> 0.5 \right )\in \mathbb{R}^{N\times N}

and the node degree vector \bar{d}\in \mathbb{R}^{N} collects the row sums:

\bar{d}_{i}=\sum_{j=1}^{N}\bar{A}_{i,j}

        ④Computing the soft assignment matrix S:

S_{i,j}=\frac{e^{-\Delta_{i,j}^{2}/\sigma^{2}}}{\sum_{k} e^{-\Delta_{k,j}^{2}/\sigma^{2}}}

where \sigma is a hyperparameter set to 0.6;

\Delta_{i,j}=c_i-\bar{d}_j;

c_{i},i\in \left \{ 1,\ldots,N \right \} denotes the possible degrees

        ⑤Node degree distribution q=\left [ q_{1},...,q_{N} \right ] is calculated by:

q_i=\frac{\sum_jS_{i,j}}{\sum_{k,j}S_{k,j}}

(I think this expression has a small issue: it should really be two Σ's, because he is clearly summing over all entries of the matrix. Or is this way of writing it acceptable in mathematics?)

        ⑥The final NDDL step:

NDDL=D_{KL}\left ( q,r \right )

where r denotes the target discrete normal distribution with learnable parameters (a sketch combining steps ③ through ⑥ follows below)
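(Putting steps ③ through ⑥ together, a minimal PyTorch sketch of NDDL as I read it; the hard 0.5 mask and the numerical clamping are my implementation choices:)

```python
import torch

def nddl(a_p: torch.Tensor, r: torch.Tensor, sigma: float = 0.6) -> torch.Tensor:
    """Node Degree Distribution Loss sketch: KL(q || r) for population graph a_p."""
    n = a_p.size(0)
    a_bar = a_p * (a_p > 0.5)                    # step 3: keep edges with weight > 0.5
    d_bar = a_bar.sum(dim=1)                     # node degree vector, shape (N,)
    c = torch.arange(n, dtype=a_p.dtype)         # candidate degrees c_i
    delta = c.unsqueeze(1) - d_bar.unsqueeze(0)  # Delta_{i,j} = c_i - d_bar_j
    s = torch.softmax(-delta.pow(2) / sigma**2, dim=0)   # step 4: soft assignment S
    q = s.sum(dim=1) / s.sum()                   # step 5: empirical degree distribution
    # step 6: KL divergence between q and the target distribution r
    return (q * (q.clamp_min(1e-10) / r.clamp_min(1e-10)).log()).sum()
```

(For r one could use, e.g., a discretized Gaussian over the same candidate degrees, r = torch.softmax(-(c - mu)**2 / (2 * std**2), dim=0), with learnable mu and std.)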

        ⑦The loss function:

loss=CE_{loss}+\alpha NDDL

where CE_{loss}=-\sum_{c=1}^C label_c\,\log\left(p_c\right) is the cross-entropy loss;

label_{c} is the class membership indicator (1 if the sample belongs to class c, 0 otherwise);

p_{c} denotes the predicted probability of class c
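(Illustrative usage with dummy tensors and the nddl sketch from above; note F.cross_entropy expects pre-softmax logits, and the value of \alpha here is made up:)

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 2)             # dummy pre-softmax scores for N = 8 graphs, C = 2
labels = torch.randint(0, 2, (8,))     # dummy class labels
a_p = torch.rand(8, 8)                 # dummy population adjacency from F2
r = torch.full((8,), 1 / 8)            # dummy target degree distribution
alpha = 0.1                            # illustrative regularization weight
loss = F.cross_entropy(logits, labels) + alpha * nddl(a_p, r)
```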

2.5. Experiments and results

 (1)Dataset information

        ①Statistics of dataset:

        ②Class distribution:

(2)Experiment settings

        ①They test GiG in biological, medical and chemical domains

        ②Each sample is a graph

        ③They test 2 variants, LGL and LGL+NDD

2.5.1. Datasets

        ①Predict sex from brain fMRI data in the Human Connectome Project (HCP)

        ②Classify proteins as enzymes or non-enzymes in PROTEINS

        ③Predict a binary 0/1 value (not active/active) for toxicity

(I did not write this in much detail here; the authors actually wrote a lot)

2.5.2. Implementation details

(1)Settings

        ①Optimizer: Adam

        ②Activation: ReLU, except for KNN on HCP, which adopts Sigmoid followed by batch norm

        ③DL framework: PyTorch and PyTorch Geometric

(2)Dataset split

        ①Training set is 72%, validation set is 8%, test set is 20% in HCP

        ②90% training set and 10% test set in PROTEINS_29, with the training set further divided into training and validation sets via 10-fold cross-validation

        ③Predefined scaffold splits are used for Tox21

(3)Importance of batch size

        ①They tested batch sizes ranging from a single batch up to the entire dataset

        ②Larger batches bring better performance, hence they suggest combining test and training samples

scaffold  n. scaffolding; an execution platform; a gallows; a building framework

2.5.3. Quantitative results

        ①Comparison of classifying models:

        ②Comparison of recent models on HCP:

where G is GroupICA, S is Schaefer, and M is multi-modal parcellation

        ③Comparison of different folds:

        ④They test the relationship between test-set size and accuracy:

where triangles denote mean values and the data comes from PROTEINS_3

        ⑤Distribution of degrees before and after adopting the 0.5 threshold:

where the x-axis is the node degree, and the y-axis is the number of nodes with that degree, in PROTEINS_29

2.5.4. Knowledge discovery analysis

        ①Population graph comparison on HCP, with red outlines marking misclassified samples:

it is easy to see that LGL+NDD clusters better

        ②Population graphs comparison in PROTEINS_29:

where the threshold for LGL is 0.01 and for LGL+NDD is 0.5. In addition, (b) and (d) are colored by CATH classes, i.e. "mainly α", "mainly β", and "a combination of α and β"

        ③Population graphs comparison of GiG LGL+NDD in different datasets:

where threshold for LGL+NDD is 0.5

        ④The influence of \theta on PROTEINS_3, where the first row is LGL and the second row is LGL+NDD:

        ⑤Population graph evaluation:

        ⑥Evaluation of the learned \theta:

        ⑦Hyperparameter optimization, where "bs" represents batch size, "k" the number of neighbors in the KNN graph, and "S" the scheduler; DECL denotes DynamicEdgeConv(ReLu(Linear(2*–,–))), "-" denotes a dimension taken from the previous or subsequent layer, "P" represents Reduce Learning Rate on Plateau, and "C" Cosine Annealing:

        ⑧Hyperparameter selection ranges:

2.6. Discussion

        ①They designed two learnable parameters, the temperature t and \theta, of which \theta can significantly influence the classification results

        ②Proper input-graph representations also have a great impact

        ③Limitations: 1) the choice of target distribution in NDDL, 2) the choice of F_{1} and F_{2}

2.7. Conclusion

        They proposed a graph structure learning method that includes a node-level module, a population-level module, and a GCN classifier

3. Background knowledge

3.1. Upstream and downstream

(1)Upstream tasks mainly refer to pre-training

(2)Downstream tasks usually refer to the remaining, task-specific part of the model

3.2. Soft threshold

Related link: 软阈值(Soft Thresholding)函数解读 - CSDN blog
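For reference (a standard definition, not taken from this paper): the soft-thresholding operator shrinks a value toward zero and zeroes out anything whose magnitude is below the threshold \lambda:

S_{\lambda}(x)=sign(x)\cdot \max\left ( \left | x \right |-\lambda ,0 \right )

In F_{2} above, \theta instead acts as a learnable offset inside the sigmoid.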

3.3. Soft assignment and hard assignment

(1)Soft assignment: gives only a probability for each possible class/cluster

(2)Hard assignment: assigns a data point to one specific cluster
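(A tiny illustration with made-up numbers:)

```python
import torch

scores = torch.tensor([1.2, 0.3, -0.5])   # made-up affinities of one sample to 3 clusters
soft = torch.softmax(scores, dim=0)        # soft assignment: a probability per cluster
hard = int(torch.argmax(scores))           # hard assignment: one specific cluster (index 0 here)
```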

3.4. CATH

(1)Explanation: Class(C), Architecture(A), Topology(T) and Homologous superfamily (H) of protein

(2)CATH classes: 

4. Reference List

Orengo, C. et al. (1997) 'CATH – a hierarchic classification of protein domain structures', Structure, vol. 5, issue 8, pp. 1093-1109.

Zaripova, K. et al. (2023) 'Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications', Medical Image Analysis, vol. 88, 102839.
