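v2, remastered on April 28, 2024
BrainGB website: https://braingb.us
The English here is typed entirely by hand! It summarizes and paraphrases the original paper. Spelling and grammar mistakes are hard to avoid, so corrections in the comments are welcome! This post leans toward personal notes, so read with caution!
1. TL;DR
1.1. Paper summary figure
2. Close reading of the paper, section by section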
2.1. Abstract
①There is still a lack of systematic research on brain network analysis with GNNs
②They propose the Brain Graph Neural Network Benchmark (BrainGB), which standardizes the brain network construction pipelines and modularizes the implementation of GNN designs
2.2. Introduction
①Interactions between brain regions are decisive for analysing neurological function and disease
②Their contributions are: a) establishing a unified framework and evaluation criteria, b) summarizing the preprocessing and construction pipelines for fMRI and sMRI, c) setting up GNN baselines along four dimensions: node features, message passing mechanisms, attention mechanisms, and pooling strategies
③Overall framework:
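(The backbone only offers GCN and GAT, though. Plenty of other convolutions could be covered as well, such as GIN and GraphSAGE.)
motif n. theme (of a literary or musical work); decorative pattern; motive; main idea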
2.3. Preliminaries
2.3.1. Brain Network Analysis
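①A brain network dataset is $\mathcal{D}=\{\mathcal{G}_n, y_n\}_{n=1}^{N}$ with $N$ subjects, where $\mathcal{G}_n=\{\mathcal{V}_n, \mathcal{E}_n\}$, $y_n$ is the true label, $\mathcal{V}_n=\{v_i\}_{i=1}^{M}$ denotes the nodes (ROIs), and $\mathcal{E}_n=\mathcal{V}_n\times\mathcal{V}_n$ denotes the edges. The output of the model is the prediction $\hat{y}_n$
②Graph kernels and tensor factorization are too shallow to analyse the complicated brain structure
③The adjacency matrix $W \in \mathbb{R}^{M \times M}$ is weighted (I am not sure whether it directly replaces the adjacency matrix A, though A could also be generated from W)
aberration n. abnormal behaviour; anomaly; departure from the norm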
2.3.2. Graph Neural Networks
①Brain networks differ from other graphs in 3 ways: a) brain networks lack node features, b) connection weights can be positive or negative, c) the ROIs are fixed (same identity and order across subjects)
2.4. Brain Network Dataset Construction
2.4.1. Background: Diverse Modalities of Brain Imaging
There are many scanning technologies: Magnetic Resonance Imaging (MRI), Electroencephalography (EEG), Magnetoencephalography (MEG), Positron Emission Tomography (PET), Single-Photon Emission Computed Tomography (SPECT), X-ray Computed Tomography (CT), etc.
(1)MRI Data
①Functional MRI (fMRI) indicates changes in blood oxygen and blood flow and reveals the functional activities
②Diffusion-weighted MRI (dMRI) maps brain structure by tracking the motion trajectories of molecules (usually water)
trajectory n. path; trajectory (of a projectile in the air); ballistic path
(2)Challenges in MRI Preprocessing
①There are preprocessing tools such as SPM, AFNI and FSL, but learning and using them takes considerable time
②No single tool covers all the preprocessing functions for dMRI
③Public availability of datasets is also a big problem
④Different modalities need different preprocessing methods
2.4.2. Brain Network Construction From Raw Data
(1)Functional Brain Network Construction
①Some preprocessing functions in different tools:
②Pairwise connectivity between ROIs can be measured with partial correlation, mutual information, coherence, Granger causality, etc.
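As a minimal sketch of how a functional brain network can be built from ROI time series, the snippet below uses plain Pearson correlation. This is my own choice of the simplest pairwise measure; any of the measures listed above would take the place of `np.corrcoef`. The time series is random synthetic data, and the sizes and variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
num_rois, num_timepoints = 5, 200                      # hypothetical sizes
ts = rng.standard_normal((num_rois, num_timepoints))   # one row per ROI

# Pearson correlation between every pair of ROI time series.
W = np.corrcoef(ts)          # (num_rois, num_rois), symmetric, values in [-1, 1]
np.fill_diagonal(W, 0.0)     # remove self-connections
```

The resulting `W` plays the role of the weighted adjacency matrix of one subject's functional brain network.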
(2)Structural Brain Network Construction
①Some preprocessing functions in different tools:
2.4.3. Discussions
Combining sMRI and fMRI might be more effective than using a single modality
metabolic adj. relating to metabolism
2.5. GNN Baselines for Brain Network Analysis
2.5.1. Node Feature Construction
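①Identity: assign a one-hot feature vector to each node
②Eigen: eigendecompose the weighted adjacency matrix and use the top $k$ eigenvectors as $k$-dimensional node features (similar to PCA)
③Degree: a one-dimensional feature that records the degree of the node
④Degree profile: $x_i = [\deg(v_i) \Vert \min(D_i) \Vert \max(D_i) \Vert \mathrm{mean}(D_i) \Vert \mathrm{std}(D_i)]$, where $D_i$ is the set of degrees of the neighbors of $v_i$
⑤Connection profile: the row of the connectivity matrix that corresponds to a node is used directly as its feature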
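The five node feature options can be sketched in NumPy as below. This is an illustrative sketch, not BrainGB's own code: the function name `node_features`, the choice of "top k" as the k largest eigenvalues, and the degree profile statistics (degree plus min, max, mean and std of the neighbors' degrees) are assumptions on my part.

```python
import numpy as np

def node_features(W, kind, k=2):
    """Build an (M, d) node feature matrix from a weighted adjacency matrix W."""
    M = W.shape[0]
    A = (W != 0).astype(float)          # binary structure, used for degrees
    deg = A.sum(axis=1)
    if kind == "identity":
        return np.eye(M)                # one-hot vector per node
    if kind == "eigen":
        # Assumption: "top k" means the k largest eigenvalues of symmetric W.
        vals, vecs = np.linalg.eigh(W)
        return vecs[:, np.argsort(vals)[::-1][:k]]
    if kind == "degree":
        return deg.reshape(-1, 1)
    if kind == "degree_profile":
        rows = []
        for i in range(M):
            nbr = deg[A[i] > 0]         # degrees of node i's neighbors
            if nbr.size == 0:           # isolated node: fall back to zeros
                nbr = np.zeros(1)
            rows.append([deg[i], nbr.min(), nbr.max(), nbr.mean(), nbr.std()])
        return np.array(rows)
    if kind == "connection_profile":
        return W.copy()                 # row i of W is the feature of node i
    raise ValueError(f"unknown feature kind: {kind}")

# Toy 3-ROI network with one negative connection.
W_toy = np.array([[0.0, 0.5, 0.0],
                  [0.5, 0.0, -0.2],
                  [0.0, -0.2, 0.0]])
profile = node_features(W_toy, "degree_profile")
```

Note that only the connection profile keeps the actual edge weights, which may be one reason it carries more information than the degree-based options.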
2.5.2. Message Passing Mechanisms
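①The node feature $h_i^l$ in layer $l$ first collects messages from its neighbors through a sum:
$m_i^l = \sum_{j \in \mathcal{N}_i} m_{ij} = \sum_{j \in \mathcal{N}_i} M_l(h_i^l, h_j^l, w_{ij})$
where $\mathcal{N}_i$ represents all the neighbors of node $v_i$, $w_{ij}$ denotes the edge weight between nodes $v_i$ and $v_j$, and $M_l$ denotes the message function
②The node feature is then updated with:
$h_i^{l+1} = U_l(h_i^l, m_i^l)$
where $U_l$ can be any differentiable function
③The message $m_{ij}$ can be designed in the following ways:
Edge weighted | $m_{ij} = h_j \cdot w_{ij}$. Aggregation as in GCN; the value of $m_{ij}$ is directly tied to the edge weight |
Bin concat | Set $T$ buckets, trying $T$ in [5, 10, 15, 20]. Each bucket has its own representation $b_t$. Rank all the edge weights and divide them into the buckets in ascending order, then apply an MLP: $m_{ij} = \mathrm{MLP}(h_j \Vert b_t)$. It helps to group similar connections. |
Edge weight concat | $m_{ij} = \mathrm{MLP}(h_j \Vert d \cdot w_{ij})$, where $d$ is the dimension of the node feature. This scaling strengthens the impact of the edge feature |
Node edge concat | $m_{ij} = \mathrm{MLP}(h_i \Vert h_j \Vert w_{ij})$. It can reduce the over-smoothing problem because "each message passed from the local neighbors of each central node is reinforced with its representation from the last time step" (? I do not quite understand this. Isn't it a concat between two nodes? What does it have to do with the previous step?) |
Node concat | $m_{ij} = \mathrm{MLP}(h_i \Vert h_j)$ |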
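To make the message and sum-aggregation step concrete, here is a small NumPy sketch of two of the variants in the table: edge weighted and node edge concat. It is a sketch under stated assumptions, not the BrainGB implementation: the function names are hypothetical, the "MLP" is a single random linear layer with tanh standing in for a trained network, and the update function is left out.

```python
import numpy as np

def aggregate_edge_weighted(H, W):
    """Sum over neighbors of m_ij = h_j * w_ij.

    Row i of W @ H equals sum_j w_ij * h_j, so one matrix product does it.
    """
    return W @ H

def aggregate_node_edge_concat(H, W, mlp, out_dim):
    """Sum over neighbors of m_ij = mlp(h_i || h_j || w_ij)."""
    M = H.shape[0]
    out = np.zeros((M, out_dim))
    for i in range(M):
        for j in range(M):
            if W[i, j] != 0:                      # j is a neighbor of i
                out[i] += mlp(np.concatenate([H[i], H[j], [W[i, j]]]))
    return out

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # toy node features
W_mp = np.array([[0.0, 0.5, 0.0],
                 [0.5, 0.0, -0.2],
                 [0.0, -0.2, 0.0]])                   # toy edge weights

rng = np.random.default_rng(0)
theta = rng.standard_normal((2 * H.shape[1] + 1, 4))  # stand-in for a trained MLP
mlp = lambda x: np.tanh(x @ theta)

m_edge = aggregate_edge_weighted(H, W_mp)
m_concat = aggregate_node_edge_concat(H, W_mp, mlp, out_dim=4)
```

In the edge weighted variant a negative weight simply flips the sign of the neighbor's contribution, while in the concat variants the weight is just another input that the MLP can learn to interpret.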
2.5.3. Attention-Enhanced Message Passing
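①Attention mechanisms are useful for picking out the important information
②Unlike traditional graph attention settings such as molecular graphs, brain graphs rely more on edge features and less on node features
③So the attention variants are:
Attention weighted | original GAT without edge features: $m_{ij} = h_j \cdot \alpha_{ij}$, where $\alpha_{ij}$ denotes the attention score and comes from a single-layer feed-forward neural network with a LeakyReLU nonlinearity: $\alpha_{ij} = \frac{\exp(\mathrm{LeakyReLU}(a^\top[\Theta x_i \Vert \Theta x_j]))}{\sum_{k \in \mathcal{N}(i) \cup \{i\}} \exp(\mathrm{LeakyReLU}(a^\top[\Theta x_i \Vert \Theta x_k]))}$ (the authors do not give the values of the learnable linear transformation matrix $\Theta$ and the weight vector $a$) (the authors say "LeakyReLU nonlinearity"; is that an operation (function) or a value?) |
Edge weighted w/ attn | enhanced version of "edge weighted" in GCN: $m_{ij} = h_j \cdot \alpha_{ij} \cdot w_{ij}$ |
Attention edge sum | another enhanced version of "edge weighted" in GCN: $m_{ij} = h_j \cdot (\alpha_{ij} + w_{ij})$ |
Node edge concat w/ attn | enhanced version of "node edge concat" in GCN: $m_{ij} = \mathrm{MLP}(h_i \Vert (h_j \cdot \alpha_{ij}) \Vert w_{ij})$ |
Node concat w/ attn | enhanced version of "node concat" in GCN: $m_{ij} = \mathrm{MLP}(h_i \Vert (h_j \cdot \alpha_{ij}))$ |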
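On the two questions raised in the table: in the standard GAT formulation, LeakyReLU is a function applied element-wise, and the transformation matrix and weight vector are learnable parameters that start from random values and are trained with the rest of the network, so no fixed values exist to report. The NumPy sketch below follows that standard formulation; the names `gat_attention`, `Theta` and `a`, the slope 0.2 and the random initial values are assumptions for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """Element-wise LeakyReLU: x if x > 0, else slope * x."""
    return np.where(x > 0, x, slope * x)

def gat_attention(X, W, Theta, a):
    """Return an (M, M) matrix of attention scores alpha_ij.

    Each row is a softmax over the node's neighbors plus the node itself.
    """
    M = X.shape[0]
    Z = X @ Theta                                  # transformed features, (M, d')
    d = Z.shape[1]
    # a^T [z_i || z_j] splits into a[:d].z_i + a[d:].z_j
    e = leaky_relu((Z @ a[:d])[:, None] + (Z @ a[d:])[None, :])
    mask = (W != 0) | np.eye(M, dtype=bool)        # neighbors and self
    e = np.where(mask, e, -np.inf)                 # non-neighbors get zero weight
    e = e - e.max(axis=1, keepdims=True)           # numerical stability
    alpha = np.exp(e)
    return alpha / alpha.sum(axis=1, keepdims=True)

X_att = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W_att = np.array([[0.0, 0.5, 0.0],
                  [0.5, 0.0, -0.2],
                  [0.0, -0.2, 0.0]])
rng = np.random.default_rng(0)
Theta = rng.standard_normal((2, 3))    # learnable in practice, random here
a = rng.standard_normal(6)             # learnable in practice, random here

alpha = gat_attention(X_att, W_att, Theta, a)
m_attn = alpha @ X_att                       # attention weighted
m_edge_attn = (alpha * W_att) @ X_att        # edge weighted w/ attn
m_attn_edge_sum = (alpha + W_att) @ X_att    # attention edge sum
```

The last three lines show how the first three rows of the table reuse the same score matrix and differ only in how the edge weight is combined with it.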
2.5.4. Pooling Strategies
①The pooling operator is:
$g_n = R(\{h_k \mid v_k \in \mathcal{G}_n\})$
②Provided pooling methods:
Mean pooling | $g_n = \frac{1}{M}\sum_{k=1}^{M} h_k$ |
Sum pooling | $g_n = \sum_{k=1}^{M} h_k$ |
Concat pooling | $g_n = h_1 \Vert h_2 \Vert \dots \Vert h_M$ |
③They consider more complex pooling methods, such as hierarchical pooling, learnable pooling, and clustering readout, to be independent GNN architectures rather than composable modules, so they do not provide them.
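The three readout options are simple enough to show in a few lines. The sketch below assumes the node embeddings of one subject are stacked in an (M, d) array; the function name `readout` is hypothetical.

```python
import numpy as np

def readout(H, mode):
    """Pool an (M, d) matrix of node embeddings into one graph-level vector."""
    if mode == "mean":
        return H.mean(axis=0)      # shape (d,)
    if mode == "sum":
        return H.sum(axis=0)       # shape (d,)
    if mode == "concat":
        return H.reshape(-1)       # shape (M * d,), node order is preserved
    raise ValueError(f"unknown pooling mode: {mode}")

H_pool = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
g_mean = readout(H_pool, "mean")
g_sum = readout(H_pool, "sum")
g_concat = readout(H_pool, "concat")
```

Concat pooling only makes sense because every subject has the same ROIs in the same order, so position k of the output always refers to the same brain region.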
2.6. Experimental Analysis and Insights
2.6.1. Experimental Settings
(1)Datasets
①Four basic datasets: fMRI (HIV, PNC, ABCD) and dMRI(PPMI)
②Tasks: disease classification in HIV and PPMI, sex classification in PNC and ABCD
③Overall information of datasets:
④Human Immunodeficiency Virus Infection (HIV): 35 early HIV patients and 35 seronegative controls. Preprocessing procedures are: a) realignment to the first volume, b) slice timing correction, c) normalization, d) spatial smoothing, e) band-pass filtering, f) linear trend removal of the time series. (Curiously, there are 116 ROIs but the data only cover 90 brain regions, and how they were selected is not stated)
⑤Philadelphia Neuroimaging Cohort (PNC): 289 (57.46%) female. Preprocessing procedures are: a) slice timing correction, b) motion correction, c) registration, d) normalization, e) removal of linear trends, f) band-pass filtering, g) spatial smoothing. They also keep only 232 of the 264 ROIs.
⑥Parkinson's Progression Markers Initiative (PPMI): 596 Parkinson's disease patients and 158 HC. Preprocessing procedures are: a) alignment to correct for head motion and eddy current distortions, b) removal of non-brain tissue, followed by linear alignment and registration of the skull-stripped images. The number of ROIs is 84. The brain network is reconstructed with a deterministic 2nd-order Runge-Kutta (RK2) whole-brain tractography algorithm.
⑦Adolescent Brain Cognitive Development Study (ABCD): subjects are children aged 9-10 from 21 sites. 3961 (50.1%) are female. Preprocessed with the ABCD-HCP BIDS fMRI Pipeline.
⑧⭐For sMRI, each edge weight is divided by the maximum edge weight in the sample so that all values lie in [0,1]. For fMRI, negative values are removed for GCN and kept for GAT (GCN cannot handle them).
seronegative adj. testing negative in a serum test  therapeutics n. treatment; the study of therapies
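The edge weight handling described in ⑧ can be sketched as two small helpers. The function names are hypothetical, and setting a negative weight to zero is my reading of "removed".

```python
import numpy as np

def normalize_structural(W):
    """Divide by the largest edge weight so values fall in [0, 1].

    Assumes W holds non-negative structural connection strengths.
    """
    max_w = W.max()
    return W / max_w if max_w > 0 else W.copy()

def drop_negative_edges(W):
    """Zero out negative edge weights, as done before GCN-style aggregation."""
    return np.where(W < 0, 0.0, W)

W_struct = np.array([[0.0, 2.0, 4.0],
                     [2.0, 0.0, 1.0],
                     [4.0, 1.0, 0.0]])
W_func = np.array([[0.0, 0.5, -0.3],
                   [0.5, 0.0, 0.1],
                   [-0.3, 0.1, 0.0]])

W_struct_norm = normalize_structural(W_struct)
W_func_gcn = drop_negative_edges(W_func)
```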
(2)Baselines
①Shallow models: M2E, MPCA and MK-SVM followed by logistic regression classification
②Deep models: BrainGNN and BrainNetCNN
(3)Implementation Details
①Optimizer: Adam
②Epoch: 20
③Learning rate: 1e-3
④Weight decay: 1e-4 for regularization
⑤Sample split: 80% training set and 20% test set
⑥Cross validation: 10 fold
⑦The mean performance of each model in each dataset:
2.6.2. Performance Report
(1)Node Feature
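①⭐Using the connection profile (each node's row of the connectivity matrix) as the node feature performs best.
②They think this method captures the overall information of the brain network... (although I honestly think its interpretability is about as poor as it gets...)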
(2)Message Passing
Generally discuss these methods and their performances.
(3)Attention Enhanced Message Passing
①⭐Message passing with attention performs better than without it
②Generally discuss these methods and their performances.
(4)Pooling Strategies
Generally discuss these methods and their performances.
(5)Other Baselines
①Deep models perform better than shallow models
②BrainGNN may run out of memory (OOM) on large datasets
(6)Insights on Density Levels
①fMRI graphs are fully connected but sMRI graphs are not. In PPMI, only about 22.64% of the possible edges are present
②⭐They find that the more complex the model is, the more hidden layers it needs.
2.7. Open Source Benchmark Platform
Briefly introduce BrainGB.
2.8. Discussion and Extensions
(1)Limitations
①They did not provide the graph-level module
②They are restricted due to the small sample size of the dataset
(2)Future prospects
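①"Neurology-driven GNN design: design GNN architectures based on a neurological understanding of predictive brain signals, especially disease-specific ones." (I read this in Chinese translation and do not fully understand it. For signals, there would have to be a dataset that contains them, right?)
②Better pretraining
③Sharing information across different diseases (I think I have seen a paper comparing ADHD and AD that said the two share common brain regions)
3. BrainGB library/code
See my other post: [Code reproduction] BrainGB: A Benchmark for Brain Network Analysis With Graph Neural Networks (CSDN blog)
4. Supplementary knowledge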
4.1. Out-of-memory (OOM)
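(1)When running a deep learning model, running out of memory can have several causes:
①High model complexity: deep neural networks usually contain many parameters and layers, which take a lot of memory to store and compute.
②Large data volume: training a deep learning model requires a lot of data, which has to be stored and processed in memory.
③Batch size: during training, data is fed in one batch at a time, and a batch size that is set too large leads to insufficient memory.
④Caching: during training, intermediate results must be cached for use in backpropagation, which also takes up a lot of memory.
(2)The following methods can help with insufficient memory:
①Reduce the batch size: a smaller batch uses less memory, but it also lowers training efficiency.
②Use a smaller model: fewer parameters and layers mean less memory.
③Use a more efficient data format: choose a more compact format according to actual needs, for example float16 instead of float32, to reduce the memory footprint.
④Optimize the model structure: trimming unnecessary computation and parameters lowers memory use.
⑤Use a GPU memory optimization library: such libraries manage the allocation of host memory and GPU memory more efficiently and help avoid running out of memory.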
4.2. Weight decay
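Weight decay is a regularization technique that suppresses overfitting and thereby improves generalization. It adds a penalty on the L2 norm of the model weights to the loss function so that the weights do not grow too large, which reduces model complexity and in turn suppresses overfitting. In practice the weight decay parameter is set on the optimizer rather than written into the loss. Viewed in terms of the loss function, weight decay is the coefficient in front of the regularization term. The regularization term generally reflects model complexity, so weight decay controls how strongly model complexity affects the loss: if weight decay is large, a complex model also has a large loss value.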
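The two views above (a coefficient on an L2 term in the loss, and a parameter of the optimizer) give the same update for plain SGD, which the small sketch below checks numerically. The function names and the toy numbers are assumptions for illustration. For adaptive optimizers such as Adam the two forms generally do not coincide, so this equivalence should not be read as a statement about the Adam setup used in the paper.

```python
import numpy as np

def sgd_step_l2(w, grad, lr, lam):
    """SGD on loss + (lam / 2) * ||w||^2: the penalty adds lam * w to the gradient."""
    return w - lr * (grad + lam * w)

def sgd_step_weight_decay(w, grad, lr, lam):
    """SGD with decay applied by the optimizer: shrink w, then take the step."""
    return (1.0 - lr * lam) * w - lr * grad

w = np.array([1.0, -2.0, 0.5])        # toy weights
grad = np.array([0.1, 0.3, -0.2])     # toy gradient of the data loss
lr, lam = 1e-3, 1e-4                  # learning rate and weight decay from the notes

w_l2 = sgd_step_l2(w, grad, lr, lam)
w_wd = sgd_step_weight_decay(w, grad, lr, lam)
```

Both functions return the same vector here, because expanding the first update gives exactly the second.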
4.3. 2nd-order Runge-Kutta (RK2)
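(1)Introduction: Runge-Kutta is a family of high-accuracy single-step methods that is widely used in engineering and rests on solid mathematical grounds. Compared with the first-order Euler formula, a Runge-Kutta method estimates the slope at several points within the interval and uses their weighted average as an approximation of the mean slope, which yields high-order formulas with high accuracy. This avoids computing higher-order derivatives while still improving accuracy. Specifically, using the weighted average of the slopes at four points gives the family of fourth-order Runge-Kutta formulas, which are fourth-order accurate. The derivation is based on Taylor expansion and requires the solution to be sufficiently smooth. If the solution is not smooth, the numerical solution from the fourth-order Runge-Kutta method may actually be less accurate than the one from the improved Euler method. In practice, the algorithm should be chosen to fit the specific problem.
(2)Further reading 1: Runge-Kutta method | basic idea + second-order scheme + fourth-order scheme (CSDN blog)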
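To make the second-order scheme concrete, the sketch below implements the midpoint variant of RK2 next to the Euler method and applies both to dy/dt = y with y(0) = 1, whose exact value at t = 1 is e. The midpoint form is my choice for illustration; the notes do not say which RK2 variant the tractography algorithm uses, and the function names are hypothetical.

```python
import math

def euler(f, t0, y0, h, steps):
    """First-order Euler method: one slope evaluation per step."""
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y

def rk2_midpoint(f, t0, y0, h, steps):
    """Second-order Runge-Kutta (midpoint): use the slope at the half step."""
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)                           # slope at the start of the interval
        k2 = f(t + h / 2, y + h / 2 * k1)      # slope at the estimated midpoint
        y += h * k2
        t += h
    return y

f = lambda t, y: y                             # dy/dt = y, exact solution e^t
y_euler = euler(f, 0.0, 1.0, 0.1, 10)
y_rk2 = rk2_midpoint(f, 0.0, 1.0, 0.1, 10)
err_euler = abs(y_euler - math.e)
err_rk2 = abs(y_rk2 - math.e)
```

With the same step size, the RK2 error is noticeably smaller than the Euler error, at the cost of two slope evaluations per step instead of one.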
5. Reference List
Cui H. et al. (2023) 'BrainGB: A Benchmark for Brain Network Analysis With Graph Neural Networks', IEEE Transactions on Medical Imaging, 42 (2), pp. 493-506. doi: 10.1109/TMI.2022.3218745